Diffstat (limited to 'Documentation')
158 files changed, 10566 insertions, 2215 deletions
diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX index 161edbcf905e..33f55917f23f 100644 --- a/Documentation/00-INDEX +++ b/Documentation/00-INDEX @@ -157,7 +157,7 @@ firmware_class/ - request_firmware() hotplug interface info. floppy.txt - notes and driver options for the floppy disk driver. -fujitsu/ +frv/ - Fujitsu FR-V Linux documentation. gpio.txt - overview of GPIO (General Purpose Input/Output) access conventions. @@ -265,6 +265,8 @@ mtrr.txt - how to use PPro Memory Type Range Registers to increase performance. mutex-design.txt - info on the generic mutex subsystem. +namespaces/ + - directory with various information about namespaces nbd.txt - info on a TCP implementation of a network block device. netlabel/ @@ -365,8 +367,6 @@ sharedsubtree.txt - a description of shared subtrees for namespaces. smart-config.txt - description of the Smart Config makefile feature. -smp.txt - - a few notes on symmetric multi-processing. sony-laptop.txt - Sony Notebook Control Driver (SNC) Readme. sonypi.txt diff --git a/Documentation/ABI/testing/sysfs-bus-usb b/Documentation/ABI/testing/sysfs-bus-usb index 9734577d1711..11a3c1682cec 100644 --- a/Documentation/ABI/testing/sysfs-bus-usb +++ b/Documentation/ABI/testing/sysfs-bus-usb @@ -52,3 +52,36 @@ Description: facility is inherently dangerous, it is disabled by default for all devices except hubs. For more information, see Documentation/usb/persist.txt. + +What: /sys/bus/usb/device/.../power/connected_duration +Date: January 2008 +KernelVersion: 2.6.25 +Contact: Sarah Sharp <sarah.a.sharp@intel.com> +Description: + If CONFIG_PM and CONFIG_USB_SUSPEND are enabled, then this file + is present. When read, it returns the total time (in msec) + that the USB device has been connected to the machine. This + file is read-only. 
+Users: + PowerTOP <power@bughost.org> + http://www.lesswatts.org/projects/powertop/ + +What: /sys/bus/usb/device/.../power/active_duration +Date: January 2008 +KernelVersion: 2.6.25 +Contact: Sarah Sharp <sarah.a.sharp@intel.com> +Description: + If CONFIG_PM and CONFIG_USB_SUSPEND are enabled, then this file + is present. When read, it returns the total time (in msec) + that the USB device has been active, i.e. not in a suspended + state. This file is read-only. + + Tools can use this file and the connected_duration file to + compute the percentage of time that a device has been active. + For example, + echo $((100 * `cat active_duration` / `cat connected_duration`)) + will give an integer percentage. Note that this does not + account for counter wrap. +Users: + PowerTOP <power@bughost.org> + http://www.lesswatts.org/projects/powertop/ diff --git a/Documentation/ABI/testing/sysfs-kernel-uids b/Documentation/ABI/testing/sysfs-kernel-uids new file mode 100644 index 000000000000..648d65dbc0e7 --- /dev/null +++ b/Documentation/ABI/testing/sysfs-kernel-uids @@ -0,0 +1,14 @@ +What: /sys/kernel/uids/<uid>/cpu_shares +Date: December 2007 +Contact: Dhaval Giani <dhaval@linux.vnet.ibm.com> + Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> +Description: + The /sys/kernel/uids/<uid>/cpu_shares tunable is used + to set the cpu bandwidth a user is allowed. This is a + propotional value. What that means is that if there + are two users logged in, each with an equal number of + shares, then they will get equal CPU bandwidth. Another + example would be, if User A has shares = 1024 and user + B has shares = 2048, User B will get twice the CPU + bandwidth user A will. 
For more details refer + Documentation/sched-design-CFS.txt diff --git a/Documentation/BUG-HUNTING b/Documentation/BUG-HUNTING index 35f5bd243336..65022a87bf17 100644 --- a/Documentation/BUG-HUNTING +++ b/Documentation/BUG-HUNTING @@ -53,7 +53,7 @@ Finding it the old way [Sat Mar 2 10:32:33 PST 1996 KERNEL_BUG-HOWTO lm@sgi.com (Larry McVoy)] -This is how to track down a bug if you know nothing about kernel hacking. +This is how to track down a bug if you know nothing about kernel hacking. It's a brute force approach but it works pretty well. You need: @@ -66,12 +66,12 @@ You will then do: . Rebuild a revision that you believe works, install, and verify that. . Do a binary search over the kernels to figure out which one - introduced the bug. I.e., suppose 1.3.28 didn't have the bug, but + introduced the bug. I.e., suppose 1.3.28 didn't have the bug, but you know that 1.3.69 does. Pick a kernel in the middle and build that, like 1.3.50. Build & test; if it works, pick the mid point between .50 and .69, else the mid point between .28 and .50. . You'll narrow it down to the kernel that introduced the bug. You - can probably do better than this but it gets tricky. + can probably do better than this but it gets tricky. . Narrow it down to a subdirectory @@ -81,27 +81,27 @@ You will then do: directories: Copy the non-working directory next to the working directory - as "dir.63". + as "dir.63". One directory at time, try moving the working directory to - "dir.62" and mv dir.63 dir"time, try + "dir.62" and mv dir.63 dir"time, try mv dir dir.62 mv dir.63 dir find dir -name '*.[oa]' -print | xargs rm -f And then rebuild and retest. Assuming that all related - changes were contained in the sub directory, this should - isolate the change to a directory. + changes were contained in the sub directory, this should + isolate the change to a directory. 
Problems: changes in header files may have occurred; I've - found in my case that they were self explanatory - you may + found in my case that they were self explanatory - you may or may not want to give up when that happens. . Narrow it down to a file - You can apply the same technique to each file in the directory, - hoping that the changes in that file are self contained. - + hoping that the changes in that file are self contained. + . Narrow it down to a routine - You can take the old file and the new file and manually create @@ -130,7 +130,7 @@ You will then do: that makes the difference. Finally, you take all the info that you have, kernel revisions, bug -description, the extent to which you have narrowed it down, and pass +description, the extent to which you have narrowed it down, and pass that off to whomever you believe is the maintainer of that section. A post to linux.dev.kernel isn't such a bad idea if you've done some work to narrow it down. @@ -214,6 +214,23 @@ And recompile the kernel with CONFIG_DEBUG_INFO enabled: gdb vmlinux (gdb) p vt_ioctl (gdb) l *(0x<address of vt_ioctl> + 0xda8) +or, as one command + (gdb) l *(vt_ioctl + 0xda8) + +If you have a call trace, such as :- +>Call Trace: +> [<ffffffff8802c8e9>] :jbd:log_wait_commit+0xa3/0xf5 +> [<ffffffff810482d9>] autoremove_wake_function+0x0/0x2e +> [<ffffffff8802770b>] :jbd:journal_stop+0x1be/0x1ee +> ... +this shows the problem in the :jbd: module. You can load that module in gdb +and list the relevant code. + gdb fs/jbd/jbd.ko + (gdb) p log_wait_commit + (gdb) l *(0x<address> + 0xa3) +or + (gdb) l *(log_wait_commit + 0xa3) + Another very useful option of the Kernel Hacking section in menuconfig is Debug memory allocations. 
This will help you see whether data has been diff --git a/Documentation/DocBook/Makefile b/Documentation/DocBook/Makefile index 054a7ecf64c6..6a0ad4715e9f 100644 --- a/Documentation/DocBook/Makefile +++ b/Documentation/DocBook/Makefile @@ -11,7 +11,7 @@ DOCBOOKS := wanbook.xml z8530book.xml mcabook.xml videobook.xml \ procfs-guide.xml writing_usb_driver.xml \ kernel-api.xml filesystems.xml lsm.xml usb.xml \ gadget.xml libata.xml mtdnand.xml librs.xml rapidio.xml \ - genericirq.xml s390-drivers.xml + genericirq.xml s390-drivers.xml uio-howto.xml scsi.xml ### # The build process is as follows (targets): diff --git a/Documentation/DocBook/kernel-api.tmpl b/Documentation/DocBook/kernel-api.tmpl index aa38cc5692a0..059aaf20951a 100644 --- a/Documentation/DocBook/kernel-api.tmpl +++ b/Documentation/DocBook/kernel-api.tmpl @@ -165,6 +165,7 @@ X!Ilib/string.c !Emm/vmalloc.c !Imm/page_alloc.c !Emm/mempool.c +!Emm/dmapool.c !Emm/page-writeback.c !Emm/truncate.c </sect1> @@ -371,7 +372,6 @@ X!Iinclude/linux/device.h !Edrivers/base/class.c !Edrivers/base/firmware_class.c !Edrivers/base/transport_class.c -!Edrivers/base/dmapool.c <!-- Cannot be included, because attribute_container_add_class_device_adapter and attribute_container_classdev_to_container @@ -419,7 +419,13 @@ X!Edrivers/pnp/system.c <chapter id="blkdev"> <title>Block Devices</title> -!Eblock/ll_rw_blk.c +!Eblock/blk-core.c +!Eblock/blk-map.c +!Iblock/blk-sysfs.c +!Eblock/blk-settings.c +!Eblock/blk-exec.c +!Eblock/blk-barrier.c +!Eblock/blk-tag.c </chapter> <chapter id="chrdev"> diff --git a/Documentation/DocBook/kernel-locking.tmpl b/Documentation/DocBook/kernel-locking.tmpl index 01825ee7db64..2e9d6b41f034 100644 --- a/Documentation/DocBook/kernel-locking.tmpl +++ b/Documentation/DocBook/kernel-locking.tmpl @@ -717,7 +717,7 @@ used, and when it gets full, throws out the least used one. <para> For our first example, we assume that all operations are in user context (ie. from system calls), so we can sleep. 
This means we can -use a semaphore to protect the cache and all the objects within +use a mutex to protect the cache and all the objects within it. Here's the code: </para> @@ -725,7 +725,7 @@ it. Here's the code: #include <linux/list.h> #include <linux/slab.h> #include <linux/string.h> -#include <asm/semaphore.h> +#include <linux/mutex.h> #include <asm/errno.h> struct object @@ -737,7 +737,7 @@ struct object }; /* Protects the cache, cache_num, and the objects within it */ -static DECLARE_MUTEX(cache_lock); +static DEFINE_MUTEX(cache_lock); static LIST_HEAD(cache); static unsigned int cache_num = 0; #define MAX_CACHE_SIZE 10 @@ -789,17 +789,17 @@ int cache_add(int id, const char *name) obj->id = id; obj->popularity = 0; - down(&cache_lock); + mutex_lock(&cache_lock); __cache_add(obj); - up(&cache_lock); + mutex_unlock(&cache_lock); return 0; } void cache_delete(int id) { - down(&cache_lock); + mutex_lock(&cache_lock); __cache_delete(__cache_find(id)); - up(&cache_lock); + mutex_unlock(&cache_lock); } int cache_find(int id, char *name) @@ -807,13 +807,13 @@ int cache_find(int id, char *name) struct object *obj; int ret = -ENOENT; - down(&cache_lock); + mutex_lock(&cache_lock); obj = __cache_find(id); if (obj) { ret = 0; strcpy(name, obj->name); } - up(&cache_lock); + mutex_unlock(&cache_lock); return ret; } </programlisting> @@ -853,7 +853,7 @@ The change is shown below, in standard patch format: the int popularity; }; --static DECLARE_MUTEX(cache_lock); +-static DEFINE_MUTEX(cache_lock); +static spinlock_t cache_lock = SPIN_LOCK_UNLOCKED; static LIST_HEAD(cache); static unsigned int cache_num = 0; @@ -870,22 +870,22 @@ The change is shown below, in standard patch format: the obj->id = id; obj->popularity = 0; -- down(&cache_lock); +- mutex_lock(&cache_lock); + spin_lock_irqsave(&cache_lock, flags); __cache_add(obj); -- up(&cache_lock); +- mutex_unlock(&cache_lock); + spin_unlock_irqrestore(&cache_lock, flags); return 0; } void cache_delete(int id) { -- 
down(&cache_lock); +- mutex_lock(&cache_lock); + unsigned long flags; + + spin_lock_irqsave(&cache_lock, flags); __cache_delete(__cache_find(id)); -- up(&cache_lock); +- mutex_unlock(&cache_lock); + spin_unlock_irqrestore(&cache_lock, flags); } @@ -895,14 +895,14 @@ The change is shown below, in standard patch format: the int ret = -ENOENT; + unsigned long flags; -- down(&cache_lock); +- mutex_lock(&cache_lock); + spin_lock_irqsave(&cache_lock, flags); obj = __cache_find(id); if (obj) { ret = 0; strcpy(name, obj->name); } -- up(&cache_lock); +- mutex_unlock(&cache_lock); + spin_unlock_irqrestore(&cache_lock, flags); return ret; } diff --git a/Documentation/DocBook/rapidio.tmpl b/Documentation/DocBook/rapidio.tmpl index 1becf27ba27e..a8b88c47e809 100644 --- a/Documentation/DocBook/rapidio.tmpl +++ b/Documentation/DocBook/rapidio.tmpl @@ -133,9 +133,9 @@ !Idrivers/rapidio/rio-sysfs.c </sect1> <sect1><title>PPC32 support</title> -!Iarch/ppc/kernel/rio.c -!Earch/ppc/syslib/ppc85xx_rio.c -!Iarch/ppc/syslib/ppc85xx_rio.c +!Iarch/powerpc/kernel/rio.c +!Earch/powerpc/sysdev/fsl_rio.c +!Iarch/powerpc/sysdev/fsl_rio.c </sect1> </chapter> diff --git a/Documentation/DocBook/s390-drivers.tmpl b/Documentation/DocBook/s390-drivers.tmpl index 254e769282a4..4acc73240a6d 100644 --- a/Documentation/DocBook/s390-drivers.tmpl +++ b/Documentation/DocBook/s390-drivers.tmpl @@ -59,7 +59,7 @@ <title>Introduction</title> <para> This document describes the interfaces available for device drivers that - drive s390 based channel attached devices. This includes interfaces for + drive s390 based channel attached I/O devices. This includes interfaces for interaction with the hardware and interfaces for interacting with the common driver core. Those interfaces are provided by the s390 common I/O layer. @@ -86,9 +86,10 @@ The ccw bus typically contains the majority of devices available to a s390 system. 
Named after the channel command word (ccw), the basic command structure used to address its devices, the ccw bus contains - so-called channel attached devices. They are addressed via subchannels, - visible on the css bus. A device driver, however, will never interact - with the subchannel directly, but only via the device on the ccw bus, + so-called channel attached devices. They are addressed via I/O + subchannels, visible on the css bus. A device driver for + channel-attached devices, however, will never interact with the + subchannel directly, but only via the I/O device on the ccw bus, the ccw device. </para> <sect1 id="channelIO"> @@ -146,4 +147,15 @@ </sect1> </chapter> + <chapter id="genericinterfaces"> + <title>Generic interfaces</title> + <para> + Some interfaces are available to other drivers that do not necessarily + have anything to do with the busses described above, but still are + indirectly using basic infrastructure in the common I/O layer. + One example is the support for adapter interrupts. 
+ </para> +!Edrivers/s390/cio/airq.c + </chapter> + </book> diff --git a/Documentation/DocBook/scsi.tmpl b/Documentation/DocBook/scsi.tmpl new file mode 100644 index 000000000000..f299ab182bbe --- /dev/null +++ b/Documentation/DocBook/scsi.tmpl @@ -0,0 +1,409 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" + "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []> + +<book id="scsimid"> + <bookinfo> + <title>SCSI Interfaces Guide</title> + + <authorgroup> + <author> + <firstname>James</firstname> + <surname>Bottomley</surname> + <affiliation> + <address> + <email>James.Bottomley@steeleye.com</email> + </address> + </affiliation> + </author> + + <author> + <firstname>Rob</firstname> + <surname>Landley</surname> + <affiliation> + <address> + <email>rob@landley.net</email> + </address> + </affiliation> + </author> + + </authorgroup> + + <copyright> + <year>2007</year> + <holder>Linux Foundation</holder> + </copyright> + + <legalnotice> + <para> + This documentation is free software; you can redistribute + it and/or modify it under the terms of the GNU General Public + License version 2. + </para> + + <para> + This program is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied + warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + For more details see the file COPYING in the source + distribution of Linux. + </para> + </legalnotice> + </bookinfo> + + <toc></toc> + + <chapter id="intro"> + <title>Introduction</title> + <sect1 id="protocol_vs_bus"> + <title>Protocol vs bus</title> + <para> + Once upon a time, the Small Computer Systems Interface defined both + a parallel I/O bus and a data protocol to connect a wide variety of + peripherals (disk drives, tape drives, modems, printers, scanners, + optical drives, test equipment, and medical devices) to a host + computer. 
+ </para> + <para> + Although the old parallel (fast/wide/ultra) SCSI bus has largely + fallen out of use, the SCSI command set is more widely used than ever + to communicate with devices over a number of different busses. + </para> + <para> + The <ulink url='http://www.t10.org/scsi-3.htm'>SCSI protocol</ulink> + is a big-endian peer-to-peer packet based protocol. SCSI commands + are 6, 10, 12, or 16 bytes long, often followed by an associated data + payload. + </para> + <para> + SCSI commands can be transported over just about any kind of bus, and + are the default protocol for storage devices attached to USB, SATA, + SAS, Fibre Channel, FireWire, and ATAPI devices. SCSI packets are + also commonly exchanged over Infiniband, + <ulink url='http://i2o.shadowconnect.com/faq.php'>I20</ulink>, TCP/IP + (<ulink url='http://en.wikipedia.org/wiki/ISCSI'>iSCSI</ulink>), even + <ulink url='http://cyberelk.net/tim/parport/parscsi.html'>Parallel + ports</ulink>. + </para> + </sect1> + <sect1 id="subsystem_design"> + <title>Design of the Linux SCSI subsystem</title> + <para> + The SCSI subsystem uses a three layer design, with upper, mid, and low + layers. Every operation involving the SCSI subsystem (such as reading + a sector from a disk) uses one driver at each of the 3 levels: one + upper layer driver, one lower layer driver, and the SCSI midlayer. + </para> + <para> + The SCSI upper layer provides the interface between userspace and the + kernel, in the form of block and char device nodes for I/O and + ioctl(). The SCSI lower layer contains drivers for specific hardware + devices. + </para> + <para> + In between is the SCSI mid-layer, analogous to a network routing + layer such as the IPv4 stack. The SCSI mid-layer routes a packet + based data protocol between the upper layer's /dev nodes and the + corresponding devices in the lower layer. It manages command queues, + provides error handling and power management functions, and responds + to ioctl() requests. 
+ </para> + </sect1> + </chapter> + + <chapter id="upper_layer"> + <title>SCSI upper layer</title> + <para> + The upper layer supports the user-kernel interface by providing + device nodes. + </para> + <sect1 id="sd"> + <title>sd (SCSI Disk)</title> + <para>sd (sd_mod.o)</para> +<!-- !Idrivers/scsi/sd.c --> + </sect1> + <sect1 id="sr"> + <title>sr (SCSI CD-ROM)</title> + <para>sr (sr_mod.o)</para> + </sect1> + <sect1 id="st"> + <title>st (SCSI Tape)</title> + <para>st (st.o)</para> + </sect1> + <sect1 id="sg"> + <title>sg (SCSI Generic)</title> + <para>sg (sg.o)</para> + </sect1> + <sect1 id="ch"> + <title>ch (SCSI Media Changer)</title> + <para>ch (ch.c)</para> + </sect1> + </chapter> + + <chapter id="mid_layer"> + <title>SCSI mid layer</title> + + <sect1 id="midlayer_implementation"> + <title>SCSI midlayer implementation</title> + <sect2 id="scsi_device.h"> + <title>include/scsi/scsi_device.h</title> + <para> + </para> +!Iinclude/scsi/scsi_device.h + </sect2> + + <sect2 id="scsi.c"> + <title>drivers/scsi/scsi.c</title> + <para>Main file for the SCSI midlayer.</para> +!Edrivers/scsi/scsi.c + </sect2> + <sect2 id="scsicam.c"> + <title>drivers/scsi/scsicam.c</title> + <para> + <ulink url='http://www.t10.org/ftp/t10/drafts/cam/cam-r12b.pdf'>SCSI + Common Access Method</ulink> support functions, for use with + HDIO_GETGEO, etc. + </para> +!Edrivers/scsi/scsicam.c + </sect2> + <sect2 id="scsi_error.c"> + <title>drivers/scsi/scsi_error.c</title> + <para>Common SCSI error/timeout handling routines.</para> +!Edrivers/scsi/scsi_error.c + </sect2> + <sect2 id="scsi_devinfo.c"> + <title>drivers/scsi/scsi_devinfo.c</title> + <para> + Manage scsi_dev_info_list, which tracks blacklisted and whitelisted + devices. + </para> +!Idrivers/scsi/scsi_devinfo.c + </sect2> + <sect2 id="scsi_ioctl.c"> + <title>drivers/scsi/scsi_ioctl.c</title> + <para> + Handle ioctl() calls for SCSI devices. 
+ </para> +!Edrivers/scsi/scsi_ioctl.c + </sect2> + <sect2 id="scsi_lib.c"> + <title>drivers/scsi/scsi_lib.c</title> + <para> + SCSI queuing library. + </para> +!Edrivers/scsi/scsi_lib.c + </sect2> + <sect2 id="scsi_lib_dma.c"> + <title>drivers/scsi/scsi_lib_dma.c</title> + <para> + SCSI library functions depending on DMA + (map and unmap scatter-gather lists). + </para> +!Edrivers/scsi/scsi_lib_dma.c + </sect2> + <sect2 id="scsi_module.c"> + <title>drivers/scsi/scsi_module.c</title> + <para> + The file drivers/scsi/scsi_module.c contains legacy support for + old-style host templates. It should never be used by any new driver. + </para> + </sect2> + <sect2 id="scsi_proc.c"> + <title>drivers/scsi/scsi_proc.c</title> + <para> + The functions in this file provide an interface between + the PROC file system and the SCSI device drivers + It is mainly used for debugging, statistics and to pass + information directly to the lowlevel driver. + + I.E. plumbing to manage /proc/scsi/* + </para> +!Idrivers/scsi/scsi_proc.c + </sect2> + <sect2 id="scsi_netlink.c"> + <title>drivers/scsi/scsi_netlink.c</title> + <para> + Infrastructure to provide async events from transports to userspace + via netlink, using a single NETLINK_SCSITRANSPORT protocol for all + transports. + + See <ulink url='http://marc.info/?l=linux-scsi&m=115507374832500&w=2'>the + original patch submission</ulink> for more details. + </para> +!Idrivers/scsi/scsi_netlink.c + </sect2> + <sect2 id="scsi_scan.c"> + <title>drivers/scsi/scsi_scan.c</title> + <para> + Scan a host to determine which (if any) devices are attached. + + The general scanning/probing algorithm is as follows, exceptions are + made to it depending on device specific flags, compilation options, + and global variable (boot or module load time) settings. + + A specific LUN is scanned via an INQUIRY command; if the LUN has a + device attached, a scsi_device is allocated and setup for it. 
+ + For every id of every channel on the given host, start by scanning + LUN 0. Skip hosts that don't respond at all to a scan of LUN 0. + Otherwise, if LUN 0 has a device attached, allocate and setup a + scsi_device for it. If target is SCSI-3 or up, issue a REPORT LUN, + and scan all of the LUNs returned by the REPORT LUN; else, + sequentially scan LUNs up until some maximum is reached, or a LUN is + seen that cannot have a device attached to it. + </para> +!Idrivers/scsi/scsi_scan.c + </sect2> + <sect2 id="scsi_sysctl.c"> + <title>drivers/scsi/scsi_sysctl.c</title> + <para> + Set up the sysctl entry: "/dev/scsi/logging_level" + (DEV_SCSI_LOGGING_LEVEL) which sets/returns scsi_logging_level. + </para> + </sect2> + <sect2 id="scsi_sysfs.c"> + <title>drivers/scsi/scsi_sysfs.c</title> + <para> + SCSI sysfs interface routines. + </para> +!Edrivers/scsi/scsi_sysfs.c + </sect2> + <sect2 id="hosts.c"> + <title>drivers/scsi/hosts.c</title> + <para> + mid to lowlevel SCSI driver interface + </para> +!Edrivers/scsi/hosts.c + </sect2> + <sect2 id="constants.c"> + <title>drivers/scsi/constants.c</title> + <para> + mid to lowlevel SCSI driver interface + </para> +!Edrivers/scsi/constants.c + </sect2> + </sect1> + + <sect1 id="Transport_classes"> + <title>Transport classes</title> + <para> + Transport classes are service libraries for drivers in the SCSI + lower layer, which expose transport attributes in sysfs. + </para> + <sect2 id="Fibre_Channel_transport"> + <title>Fibre Channel transport</title> + <para> + The file drivers/scsi/scsi_transport_fc.c defines transport attributes + for Fibre Channel. + </para> +!Edrivers/scsi/scsi_transport_fc.c + </sect2> + <sect2 id="iSCSI_transport"> + <title>iSCSI transport class</title> + <para> + The file drivers/scsi/scsi_transport_iscsi.c defines transport + attributes for the iSCSI class, which sends SCSI packets over TCP/IP + connections. 
+ </para> +!Edrivers/scsi/scsi_transport_iscsi.c + </sect2> + <sect2 id="SAS_transport"> + <title>Serial Attached SCSI (SAS) transport class</title> + <para> + The file drivers/scsi/scsi_transport_sas.c defines transport + attributes for Serial Attached SCSI, a variant of SATA aimed at + large high-end systems. + </para> + <para> + The SAS transport class contains common code to deal with SAS HBAs, + an aproximated representation of SAS topologies in the driver model, + and various sysfs attributes to expose these topologies and managment + interfaces to userspace. + </para> + <para> + In addition to the basic SCSI core objects this transport class + introduces two additional intermediate objects: The SAS PHY + as represented by struct sas_phy defines an "outgoing" PHY on + a SAS HBA or Expander, and the SAS remote PHY represented by + struct sas_rphy defines an "incoming" PHY on a SAS Expander or + end device. Note that this is purely a software concept, the + underlying hardware for a PHY and a remote PHY is the exactly + the same. + </para> + <para> + There is no concept of a SAS port in this code, users can see + what PHYs form a wide port based on the port_identifier attribute, + which is the same for all PHYs in a port. + </para> +!Edrivers/scsi/scsi_transport_sas.c + </sect2> + <sect2 id="SATA_transport"> + <title>SATA transport class</title> + <para> + The SATA transport is handled by libata, which has its own book of + documentation in this directory. + </para> + </sect2> + <sect2 id="SPI_transport"> + <title>Parallel SCSI (SPI) transport class</title> + <para> + The file drivers/scsi/scsi_transport_spi.c defines transport + attributes for traditional (fast/wide/ultra) SCSI busses. + </para> +!Edrivers/scsi/scsi_transport_spi.c + </sect2> + <sect2 id="SRP_transport"> + <title>SCSI RDMA (SRP) transport class</title> + <para> + The file drivers/scsi/scsi_transport_srp.c defines transport + attributes for SCSI over Remote Direct Memory Access. 
+ </para> +!Edrivers/scsi/scsi_transport_srp.c + </sect2> + </sect1> + + </chapter> + + <chapter id="lower_layer"> + <title>SCSI lower layer</title> + <sect1 id="hba_drivers"> + <title>Host Bus Adapter transport types</title> + <para> + Many modern device controllers use the SCSI command set as a protocol to + communicate with their devices through many different types of physical + connections. + </para> + <para> + In SCSI language a bus capable of carrying SCSI commands is + called a "transport", and a controller connecting to such a bus is + called a "host bus adapter" (HBA). + </para> + <sect2 id="scsi_debug.c"> + <title>Debug transport</title> + <para> + The file drivers/scsi/scsi_debug.c simulates a host adapter with a + variable number of disks (or disk like devices) attached, sharing a + common amount of RAM. Does a lot of checking to make sure that we are + not getting blocks mixed up, and panics the kernel if anything out of + the ordinary is seen. + </para> + <para> + To be more realistic, the simulated devices have the transport + attributes of SAS disks. + </para> + <para> + For documentation see + <ulink url='http://www.torque.net/sg/sdebug26.html'>http://www.torque.net/sg/sdebug26.html</ulink> + </para> +<!-- !Edrivers/scsi/scsi_debug.c --> + </sect2> + <sect2 id="todo"> + <title>todo</title> + <para>Parallel (fast/wide/ultra) SCSI, USB, SATA, + SAS, Fibre Channel, FireWire, ATAPI devices, Infiniband, + I20, iSCSI, Parallel ports, netlink... 
+ </para> + </sect2> + </sect1> + </chapter> +</book> diff --git a/Documentation/DocBook/uio-howto.tmpl b/Documentation/DocBook/uio-howto.tmpl index c119484258b8..fdd7f4f887b7 100644 --- a/Documentation/DocBook/uio-howto.tmpl +++ b/Documentation/DocBook/uio-howto.tmpl @@ -30,6 +30,12 @@ <revhistory> <revision> + <revnumber>0.4</revnumber> + <date>2007-11-26</date> + <authorinitials>hjk</authorinitials> + <revremark>Removed section about uio_dummy.</revremark> + </revision> + <revision> <revnumber>0.3</revnumber> <date>2007-04-29</date> <authorinitials>hjk</authorinitials> @@ -94,6 +100,26 @@ interested in translating it, please email me user space. This simplifies development and reduces the risk of serious bugs within a kernel module. </para> + <para> + Please note that UIO is not an universal driver interface. Devices + that are already handled well by other kernel subsystems (like + networking or serial or USB) are no candidates for an UIO driver. + Hardware that is ideally suited for an UIO driver fulfills all of + the following: + </para> +<itemizedlist> +<listitem> + <para>The device has memory that can be mapped. The device can be + controlled completely by writing to this memory.</para> +</listitem> +<listitem> + <para>The device usually generates interrupts.</para> +</listitem> +<listitem> + <para>The device does not fit into one of the standard kernel + subsystems.</para> +</listitem> +</itemizedlist> </sect1> <sect1 id="thanks"> @@ -174,8 +200,9 @@ interested in translating it, please email me For cards that don't generate interrupts but need to be polled, there is the possibility to set up a timer that triggers the interrupt handler at configurable time intervals. - See <filename>drivers/uio/uio_dummy.c</filename> for an - example of this technique. + This interrupt simulation is done by calling + <function>uio_event_notify()</function> + from the timer's event handler. 
</para> <para> @@ -263,63 +290,11 @@ offset = N * getpagesize(); </sect1> </chapter> -<chapter id="using-uio_dummy" xreflabel="Using uio_dummy"> -<?dbhtml filename="using-uio_dummy.html"?> -<title>Using uio_dummy</title> - <para> - Well, there is no real use for uio_dummy. Its only purpose is - to test most parts of the UIO system (everything except - hardware interrupts), and to serve as an example for the - kernel module that you will have to write yourself. - </para> - -<sect1 id="what_uio_dummy_does"> -<title>What uio_dummy does</title> - <para> - The kernel module <filename>uio_dummy.ko</filename> creates a - device that uses a timer to generate periodic interrupts. The - interrupt handler does nothing but increment a counter. The - driver adds two custom attributes, <varname>count</varname> - and <varname>freq</varname>, that appear under - <filename>/sys/devices/platform/uio_dummy/</filename>. - </para> - - <para> - The attribute <varname>count</varname> can be read and - written. The associated file - <filename>/sys/devices/platform/uio_dummy/count</filename> - appears as a normal text file and contains the total number of - timer interrupts. If you look at it (e.g. using - <function>cat</function>), you'll notice it is slowly counting - up. - </para> - - <para> - The attribute <varname>freq</varname> can be read and written. - The content of - <filename>/sys/devices/platform/uio_dummy/freq</filename> - represents the number of system timer ticks between two timer - interrupts. The default value of <varname>freq</varname> is - the value of the kernel variable <varname>HZ</varname>, which - gives you an interval of one second. Lower values will - increase the frequency. Try the following: - </para> -<programlisting format="linespecific"> -cd /sys/devices/platform/uio_dummy/ -echo 100 > freq -</programlisting> - <para> - Use <function>cat count</function> to see how the interrupt - frequency changes. 
- </para> -</sect1> -</chapter> - <chapter id="custom_kernel_module" xreflabel="Writing your own kernel module"> <?dbhtml filename="custom_kernel_module.html"?> <title>Writing your own kernel module</title> <para> - Please have a look at <filename>uio_dummy.c</filename> as an + Please have a look at <filename>uio_cif.c</filename> as an example. The following paragraphs explain the different sections of this file. </para> @@ -354,9 +329,8 @@ See the description below for details. interrupt, it's your modules task to determine the irq number during initialization. If you don't have a hardware generated interrupt but want to trigger the interrupt handler in some other way, set -<varname>irq</varname> to <varname>UIO_IRQ_CUSTOM</varname>. The -uio_dummy module does this as it triggers the event mechanism in a timer -routine. If you had no interrupt at all, you could set +<varname>irq</varname> to <varname>UIO_IRQ_CUSTOM</varname>. +If you had no interrupt at all, you could set <varname>irq</varname> to <varname>UIO_IRQ_NONE</varname>, though this rarely makes sense. </para></listitem> diff --git a/Documentation/DocBook/videobook.tmpl b/Documentation/DocBook/videobook.tmpl index b629da33951d..b3d93ee27693 100644 --- a/Documentation/DocBook/videobook.tmpl +++ b/Documentation/DocBook/videobook.tmpl @@ -96,7 +96,6 @@ static struct video_device my_radio { "My radio", VID_TYPE_TUNER, - VID_HARDWARE_MYRADIO, radio_open. radio_close, NULL, /* no read */ @@ -119,13 +118,6 @@ static struct video_device my_radio way to change channel so it is tuneable. </para> <para> - The VID_HARDWARE_ types are unique to each device. Numbers are assigned by - <email>alan@redhat.com</email> when device drivers are going to be released. Until then you - can pull a suitably large number out of your hat and use it. 10000 should be - safe for a very long time even allowing for the huge number of vendors - making new and different radio cards at the moment. 
- </para> - <para> We declare an open and close routine, but we do not need read or write, which are used to read and write video data to or from the card itself. As we have no read or write there is no poll function. @@ -844,7 +836,6 @@ static struct video_device my_camera "My Camera", VID_TYPE_OVERLAY|VID_TYPE_SCALES|\ VID_TYPE_CAPTURE|VID_TYPE_CHROMAKEY, - VID_HARDWARE_MYCAMERA, camera_open. camera_close, camera_read, /* no read */ diff --git a/Documentation/RCU/RTFP.txt b/Documentation/RCU/RTFP.txt index 6221464d1a7e..39ad8f56783a 100644 --- a/Documentation/RCU/RTFP.txt +++ b/Documentation/RCU/RTFP.txt @@ -9,8 +9,8 @@ The first thing resembling RCU was published in 1980, when Kung and Lehman [Kung80] recommended use of a garbage collector to defer destruction of nodes in a parallel binary search tree in order to simplify its implementation. This works well in environments that have garbage -collectors, but current production garbage collectors incur significant -read-side overhead. +collectors, but most production garbage collectors incur significant +overhead. In 1982, Manber and Ladner [Manber82,Manber84] recommended deferring destruction until all threads running at that time have terminated, again @@ -99,16 +99,25 @@ locking, reduces contention, reduces memory latency for readers, and parallelizes pipeline stalls and memory latency for writers. However, these techniques still impose significant read-side overhead in the form of memory barriers. Researchers at Sun worked along similar lines -in the same timeframe [HerlihyLM02,HerlihyLMS03]. These techniques -can be thought of as inside-out reference counts, where the count is -represented by the number of hazard pointers referencing a given data -structure (rather than the more conventional counter field within the -data structure itself). +in the same timeframe [HerlihyLM02]. 
These techniques can be thought +of as inside-out reference counts, where the count is represented by the +number of hazard pointers referencing a given data structure (rather than +the more conventional counter field within the data structure itself). + +By the same token, RCU can be thought of as a "bulk reference count", +where some form of reference counter covers all reference by a given CPU +or thread during a set timeframe. This timeframe is related to, but +not necessarily exactly the same as, an RCU grace period. In classic +RCU, the reference counter is the per-CPU bit in the "bitmask" field, +and each such bit covers all references that might have been made by +the corresponding CPU during the prior grace period. Of course, RCU +can be thought of in other terms as well. In 2003, the K42 group described how RCU could be used to create -hot-pluggable implementations of operating-system functions. Later that -year saw a paper describing an RCU implementation of System V IPC -[Arcangeli03], and an introduction to RCU in Linux Journal [McKenney03a]. +hot-pluggable implementations of operating-system functions [Appavoo03a]. +Later that year saw a paper describing an RCU implementation of System +V IPC [Arcangeli03], and an introduction to RCU in Linux Journal +[McKenney03a]. 2004 has seen a Linux-Journal article on use of RCU in dcache [McKenney04a], a performance comparison of locking to RCU on several @@ -117,10 +126,19 @@ number of operating-system kernels [PaulEdwardMcKenneyPhD], a paper describing how to make RCU safe for soft-realtime applications [Sarma04c], and a paper describing SELinux performance with RCU [JamesMorris04b]. -2005 has seen further adaptation of RCU to realtime use, permitting +2005 brought further adaptation of RCU to realtime use, permitting preemption of RCU realtime critical sections [PaulMcKenney05a, PaulMcKenney05b]. 
+2006 saw the first best-paper award for an RCU paper [ThomasEHart2006a], +as well as further work on efficient implementations of preemptible +RCU [PaulEMcKenney2006b], but priority-boosting of RCU read-side critical +sections proved elusive. An RCU implementation permitting general +blocking in read-side critical sections appeared [PaulEMcKenney2006c], +and Robert Olsson described an RCU-protected trie-hash combination +[RobertOlsson2006a]. + + Bibtex Entries @article{Kung80 @@ -203,6 +221,41 @@ Bibtex Entries ,Address="New Orleans, LA" } +@conference{Pu95a, +Author = "Calton Pu and Tito Autrey and Andrew Black and Charles Consel and +Crispin Cowan and Jon Inouye and Lakshmi Kethana and Jonathan Walpole and +Ke Zhang", +Title = "Optimistic Incremental Specialization: Streamlining a Commercial +Operating System", +Booktitle = "15\textsuperscript{th} ACM Symposium on +Operating Systems Principles (SOSP'95)", +address = "Copper Mountain, CO", +month="December", +year="1995", +pages="314-321", +annotation=" + Uses a replugger, but with a flag to signal when people are + using the resource at hand. Only one reader at a time. +" +} + +@conference{Cowan96a, +Author = "Crispin Cowan and Tito Autrey and Charles Krasic and +Calton Pu and Jonathan Walpole", +Title = "Fast Concurrent Dynamic Linking for an Adaptive Operating System", +Booktitle = "International Conference on Configurable Distributed Systems +(ICCDS'96)", +address = "Annapolis, MD", +month="May", +year="1996", +pages="108", +isbn="0-8186-7395-8", +annotation=" + Uses a replugger, but with a counter to signal when people are + using the resource at hand. Allows multiple readers. +" +} + @techreport{Slingwine95 ,author="John D. Slingwine and Paul E. McKenney" ,title="Apparatus and Method for Achieving Reduced Overhead Mutual @@ -312,6 +365,49 @@ Andrea Arcangeli and Andi Kleen and Orran Krieger and Rusty Russell" [Viewed June 23, 2004]" } +@conference{Michael02a +,author="Maged M.
Michael" +,title="Safe Memory Reclamation for Dynamic Lock-Free Objects Using Atomic +Reads and Writes" +,Year="2002" +,Month="August" +,booktitle="{Proceedings of the 21\textsuperscript{st} Annual ACM +Symposium on Principles of Distributed Computing}" +,pages="21-30" +,annotation=" + Each thread keeps an array of pointers to items that it is + currently referencing. Sort of an inside-out garbage collection + mechanism, but one that requires the accessing code to explicitly + state its needs. Also requires read-side memory barriers on + most architectures. +" +} + +@conference{Michael02b +,author="Maged M. Michael" +,title="High Performance Dynamic Lock-Free Hash Tables and List-Based Sets" +,Year="2002" +,Month="August" +,booktitle="{Proceedings of the 14\textsuperscript{th} Annual ACM +Symposium on Parallel +Algorithms and Architecture}" +,pages="73-82" +,annotation=" + Like the title says... +" +} + +@InProceedings{HerlihyLM02 +,author={Maurice Herlihy and Victor Luchangco and Mark Moir} +,title="The Repeat Offender Problem: A Mechanism for Supporting Dynamic-Sized, +Lock-Free Data Structures" +,booktitle={Proceedings of 16\textsuperscript{th} International +Symposium on Distributed Computing} +,year=2002 +,month="October" +,pages="339-353" +} + @article{Appavoo03a ,author="J. Appavoo and K. Hui and C. A. N. Soules and R. W. Wisniewski and D. M. {Da Silva} and O. Krieger and M. A. Auslander and D. J. Edelsohn and @@ -447,3 +543,95 @@ Oregon Health and Sciences University" Realtime turns into making RCU yet more realtime friendly. " } + +@conference{ThomasEHart2006a +,Author="Thomas E. Hart and Paul E. 
McKenney and Angela Demke Brown" +,Title="Making Lockless Synchronization Fast: Performance Implications +of Memory Reclamation" +,Booktitle="20\textsuperscript{th} {IEEE} International Parallel and +Distributed Processing Symposium" +,month="April" +,year="2006" +,day="25-29" +,address="Rhodes, Greece" +,annotation=" + Compares QSBR (AKA "classic RCU"), HPBR, EBR, and lock-free + reference counting. +" +} + +@Conference{PaulEMcKenney2006b +,Author="Paul E. McKenney and Dipankar Sarma and Ingo Molnar and +Suparna Bhattacharya" +,Title="Extending RCU for Realtime and Embedded Workloads" +,Booktitle="{Ottawa Linux Symposium}" +,Month="July" +,Year="2006" +,pages="v2 123-138" +,note="Available: +\url{http://www.linuxsymposium.org/2006/view_abstract.php?content_key=184} +\url{http://www.rdrop.com/users/paulmck/RCU/OLSrtRCU.2006.08.11a.pdf} +[Viewed January 1, 2007]" +,annotation=" + Described how to improve the -rt implementation of realtime RCU. +" +} + +@unpublished{PaulEMcKenney2006c +,Author="Paul E. McKenney" +,Title="Sleepable {RCU}" +,month="October" +,day="9" +,year="2006" +,note="Available: +\url{http://lwn.net/Articles/202847/} +Revised: +\url{http://www.rdrop.com/users/paulmck/RCU/srcu.2007.01.14a.pdf} +[Viewed August 21, 2006]" +,annotation=" + LWN article introducing SRCU. +" +} + +@unpublished{RobertOlsson2006a +,Author="Robert Olsson and Stefan Nilsson" +,Title="{TRASH}: A dynamic {LC}-trie and hash data structure" +,month="August" +,day="18" +,year="2006" +,note="Available: +\url{http://www.nada.kth.se/~snilsson/public/papers/trash/trash.pdf} +[Viewed February 24, 2007]" +,annotation=" + RCU-protected dynamic trie-hash combination. +" +} + +@unpublished{ThomasEHart2007a +,Author="Thomas E. Hart and Paul E. McKenney and Angela Demke Brown and Jonathan Walpole" +,Title="Performance of memory reclamation for lockless synchronization" +,journal="J. Parallel Distrib. Comput." +,year="2007" +,note="To appear in J. Parallel Distrib. Comput. 
+ \url{doi=10.1016/j.jpdc.2007.04.010}" +,annotation={ + Compares QSBR (AKA "classic RCU"), HPBR, EBR, and lock-free + reference counting. Journal version of ThomasEHart2006a. +} +} + +@unpublished{PaulEMcKenney2007QRCUspin +,Author="Paul E. McKenney" +,Title="Using Promela and Spin to verify parallel algorithms" +,month="August" +,day="1" +,year="2007" +,note="Available: +\url{http://lwn.net/Articles/243851/} +[Viewed September 8, 2007]" +,annotation=" + LWN article describing Promela and spin, and also using Oleg + Nesterov's QRCU as an example (with Paul McKenney's fastpath). +" +} + diff --git a/Documentation/RCU/rcu.txt b/Documentation/RCU/rcu.txt index f84407cba816..95821a29ae41 100644 --- a/Documentation/RCU/rcu.txt +++ b/Documentation/RCU/rcu.txt @@ -36,6 +36,14 @@ o How can the updater tell when a grace period has completed executed in user mode, or executed in the idle loop, we can safely free up that item. + Preemptible variants of RCU (CONFIG_PREEMPT_RCU) get the + same effect, but require that the readers manipulate CPU-local + counters. These counters allow limited types of blocking + within RCU read-side critical sections. SRCU also uses + CPU-local counters, and permits general blocking within + RCU read-side critical sections. These two variants of + RCU detect grace periods by sampling these counters. + o If I am running on a uniprocessor kernel, which can only do one thing at a time, why should I wait for a grace period? @@ -46,7 +54,10 @@ o How can I see where RCU is currently used in the Linux kernel? Search for "rcu_read_lock", "rcu_read_unlock", "call_rcu", "rcu_read_lock_bh", "rcu_read_unlock_bh", "call_rcu_bh", "srcu_read_lock", "srcu_read_unlock", "synchronize_rcu", - "synchronize_net", and "synchronize_srcu". + "synchronize_net", "synchronize_srcu", and the other RCU + primitives.
Or grab one of the cscope databases from: + + http://www.rdrop.com/users/paulmck/RCU/linuxusage/rculocktab.html o What guidelines should I follow when writing code that uses RCU? @@ -67,7 +78,11 @@ o I hear that RCU is patented? What is with that? o I hear that RCU needs work in order to support realtime kernels? - Yes, work in progress. + This work is largely completed. Realtime-friendly RCU can be + enabled via the CONFIG_PREEMPT_RCU kernel configuration parameter. + However, work is in progress for enabling priority boosting of + preempted RCU read-side critical sections.This is needed if you + have CPU-bound realtime threads. o Where can I find more information on RCU? diff --git a/Documentation/RCU/torture.txt b/Documentation/RCU/torture.txt index 25a3c3f7d378..2967a65269d8 100644 --- a/Documentation/RCU/torture.txt +++ b/Documentation/RCU/torture.txt @@ -46,12 +46,13 @@ stat_interval The number of seconds between output of torture shuffle_interval The number of seconds to keep the test threads affinitied - to a particular subset of the CPUs. Used in conjunction - with test_no_idle_hz. + to a particular subset of the CPUs, defaults to 5 seconds. + Used in conjunction with test_no_idle_hz. test_no_idle_hz Whether or not to test the ability of RCU to operate in a kernel that disables the scheduling-clock interrupt to idle CPUs. Boolean parameter, "1" to test, "0" otherwise. + Defaults to omitting this test. torture_type The type of RCU to test: "rcu" for the rcu_read_lock() API, "rcu_sync" for rcu_read_lock() with synchronous reclamation, @@ -82,8 +83,6 @@ be evident. ;-) The entries are as follows: -o "ggp": The number of counter flips (or batches) since boot. - o "rtc": The hexadecimal address of the structure currently visible to readers. @@ -117,8 +116,8 @@ o "Reader Pipe": Histogram of "ages" of structures seen by readers. 
o "Reader Batch": Another histogram of "ages" of structures seen by readers, but in terms of counter flips (or batches) rather than in terms of grace periods. The legal number of non-zero - entries is again two. The reason for this separate view is - that it is easier to get the third entry to show up in the + entries is again two. The reason for this separate view is that + it is sometimes easier to get the third entry to show up in the "Reader Batch" list than in the "Reader Pipe" list. o "Free-Block Circulation": Shows the number of torture structures diff --git a/Documentation/Smack.txt b/Documentation/Smack.txt new file mode 100644 index 000000000000..989c2fcd8111 --- /dev/null +++ b/Documentation/Smack.txt @@ -0,0 +1,493 @@ + + + "Good for you, you've decided to clean the elevator!" + - The Elevator, from Dark Star + +Smack is the Simplified Mandatory Access Control Kernel. +Smack is a kernel based implementation of mandatory access +control that includes simplicity in its primary design goals. + +Smack is not the only Mandatory Access Control scheme +available for Linux. Those new to Mandatory Access Control +are encouraged to compare Smack with the other mechanisms +available to determine which is best suited to the problem +at hand. + +Smack consists of three major components: + - The kernel + - A start-up script and a few modified applications + - Configuration data + +The kernel component of Smack is implemented as a Linux +Security Modules (LSM) module. It requires netlabel and +works best with file systems that support extended attributes, +although xattr support is not strictly required. +It is safe to run a Smack kernel under a "vanilla" distribution. +Smack kernels use the CIPSO IP option. Some network +configurations are intolerant of IP options and can impede +access to systems that use them as Smack does. + +The startup script etc-init.d-smack should be installed +in /etc/init.d/smack and should be invoked early in the +start-up process.
On Fedora rc5.d/S02smack is recommended. +This script ensures that certain devices have the correct +Smack attributes and loads the Smack configuration if +any is defined. This script invokes two programs that +ensure configuration data is properly formatted. These +programs are /usr/sbin/smackload and /usr/sbin/smackcipso. +The system will run just fine without these programs, +but it will be difficult to set access rules properly. + +A version of "ls" that provides a "-M" option to display +Smack labels on long listing is available. + +A hacked version of sshd that allows network logins by users +with specific Smack labels is available. This version does +not work for scp. You must set the /etc/ssh/sshd_config +line: + UsePrivilegeSeparation no + +The format of /etc/smack/usr is: + + username smack + +In keeping with the intent of Smack, configuration data is +minimal and not strictly required. The most important +configuration step is mounting the smackfs pseudo filesystem. + +Add this line to /etc/fstab: + + smackfs /smack smackfs smackfsdef=* 0 0 + +and create the /smack directory for mounting. + +Smack uses extended attributes (xattrs) to store file labels. +The command to set a Smack label on a file is: + + # attr -S -s SMACK64 -V "value" path + +NOTE: Smack labels are limited to 23 characters. The attr command + does not enforce this restriction and can be used to set + invalid Smack labels on files. + +If you don't do anything special all users will get the floor ("_") +label when they log in. If you do want to log in via the hacked ssh +at other labels use the attr command to set the smack value on the +home directory and its contents. + +You can add access rules in /etc/smack/accesses. They take the form: + + subjectlabel objectlabel access + +access is a combination of the letters rwxa which specify the +kind of access permitted a subject with subjectlabel on an +object with objectlabel. If there is no rule no access is allowed.
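The rule format just described can be read with a few lines of C. What
follows is only a sketch, not the smackload program: the names sk_rule
and sk_parse_rule and the SK_* bit values are invented for illustration.
It accepts the letters rwxa in either case and the dash placeholder,
enforces the 23 character label limit, and rejects a rule whose subject
and object labels are identical, as the "Access Rule Format" section of
this document describes. It does not check which characters a label
contains.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical bit values for this sketch; not taken from the Smack sources. */
#define SK_READ   0x1
#define SK_WRITE  0x2
#define SK_EXEC   0x4
#define SK_APPEND 0x8

#define SK_LABEL_MAX 23	/* label length limit stated in this document */

struct sk_rule {
	char subject[SK_LABEL_MAX + 1];
	char object[SK_LABEL_MAX + 1];
	int access;		/* OR of the SK_* bits above */
};

/*
 * Parse one "subjectlabel objectlabel access" line.
 * Returns 0 and fills *rule on success, -1 if the line is malformed.
 */
int sk_parse_rule(const char *line, struct sk_rule *rule)
{
	char subject[64], object[64], mode[64], extra[2];
	const char *p;
	int access = 0;

	/* Exactly three whitespace-separated fields are expected. */
	if (sscanf(line, "%63s %63s %63s %1s", subject, object, mode, extra) != 3)
		return -1;
	if (strlen(subject) > SK_LABEL_MAX || strlen(object) > SK_LABEL_MAX)
		return -1;
	/* A subject always has access to its own label, so such a rule is rejected. */
	if (strcmp(subject, object) == 0)
		return -1;

	for (p = mode; *p != '\0'; p++) {
		switch (*p) {
		case 'r': case 'R': access |= SK_READ;   break;
		case 'w': case 'W': access |= SK_WRITE;  break;
		case 'x': case 'X': access |= SK_EXEC;   break;
		case 'a': case 'A': access |= SK_APPEND; break;
		case '-': break;		/* placeholder, grants nothing */
		default:  return -1;		/* any other letter is invalid */
		}
	}

	strcpy(rule->subject, subject);
	strcpy(rule->object, object);
	rule->access = access;
	return 0;
}
```

With this sketch the line "TopSecret Secret rx" yields read and execute
access, "Closed Off -" yields a rule granting nothing, and
"Top Secret Secret rx" is rejected because it has four fields.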
+ +A process can see the smack label it is running with by +reading /proc/self/attr/current. A privileged process can +set the process smack by writing there. + +Look for additional programs on http://schaufler-ca.com + +From the Smack Whitepaper: + +The Simplified Mandatory Access Control Kernel + +Casey Schaufler +casey@schaufler-ca.com + +Mandatory Access Control + +Computer systems employ a variety of schemes to constrain how information is +shared among the people and services using the machine. Some of these schemes +allow the program or user to decide what other programs or users are allowed +access to pieces of data. These schemes are called discretionary access +control mechanisms because the access control is specified at the discretion +of the user. Other schemes do not leave the decision regarding what a user or +program can access up to users or programs. These schemes are called mandatory +access control mechanisms because you don't have a choice regarding the users +or programs that have access to pieces of data. + +Bell & LaPadula + +From the middle of the 1980's until the turn of the century Mandatory Access +Control (MAC) was very closely associated with the Bell & LaPadula security +model, a mathematical description of the United States Department of Defense +policy for marking paper documents. MAC in this form enjoyed a following +within the Capital Beltway and Scandinavian supercomputer centers but was +often cited as failing to address general needs. + +Domain Type Enforcement + +Around the turn of the century Domain Type Enforcement (DTE) became popular. +This scheme organizes users, programs, and data into domains that are +protected from each other. This scheme has been widely deployed as a component +of popular Linux distributions.
The administrative overhead required to +maintain this scheme and the detailed understanding of the whole system +necessary to provide a secure domain mapping leads to the scheme being +disabled or used in limited ways in the majority of cases. + +Smack + +Smack is a Mandatory Access Control mechanism designed to provide useful MAC +while avoiding the pitfalls of its predecessors. The limitations of Bell & +LaPadula are addressed by providing a scheme whereby access can be controlled +according to the requirements of the system and its purpose rather than those +imposed by an arcane government policy. The complexity of Domain Type +Enforcement is avoided by defining access controls in terms of the access +modes already in use. + +Smack Terminology + +The jargon used to talk about Smack will be familiar to those who have dealt +with other MAC systems and shouldn't be too difficult for the uninitiated to +pick up. There are four terms that are used in a specific way and that are +especially important: + + Subject: A subject is an active entity on the computer system. + On Smack a subject is a task, which is in turn the basic unit + of execution. + + Object: An object is a passive entity on the computer system. + On Smack files of all types, IPC, and tasks can be objects. + + Access: Any attempt by a subject to put information into or get + information from an object is an access. + + Label: Data that identifies the Mandatory Access Control + characteristics of a subject or an object. + +These definitions are consistent with the traditional use in the security +community. There are also some terms from Linux that are likely to crop up: + + Capability: A task that possesses a capability has permission to + violate an aspect of the system security policy, as identified by + the specific capability. A task that possesses one or more + capabilities is a privileged task, whereas a task with no + capabilities is an unprivileged task.
+ + Privilege: A task that is allowed to violate the system security + policy is said to have privilege. As of this writing a task can + have privilege either by possessing capabilities or by having an + effective user of root. + +Smack Basics + +Smack is an extension to a Linux system. It enforces additional restrictions +on what subjects can access which objects, based on the labels attached to +each of the subject and the object. + +Labels + +Smack labels are ASCII character strings, one to twenty-three characters in +length. Single character labels using special characters, that being anything +other than a letter or digit, are reserved for use by the Smack development +team. Smack labels are unstructured, case sensitive, and the only operation +ever performed on them is comparison for equality. Smack labels cannot +contain unprintable characters or the "/" (slash) character. + +There are some predefined labels: + + _ Pronounced "floor", a single underscore character. + ^ Pronounced "hat", a single circumflex character. + * Pronounced "star", a single asterisk character. + ? Pronounced "huh", a single question mark character. + +Every task on a Smack system is assigned a label. System tasks, such as +init(8) and systems daemons, are run with the floor ("_") label. User tasks +are assigned labels according to the specification found in the +/etc/smack/user configuration file. + +Access Rules + +Smack uses the traditional access modes of Linux. These modes are read, +execute, write, and occasionally append. There are a few cases where the +access mode may not be obvious. These include: + + Signals: A signal is a write operation from the subject task to + the object task. + Internet Domain IPC: Transmission of a packet is considered a + write operation from the source task to the destination task. + +Smack restricts access based on the label attached to a subject and the label +attached to the object it is trying to access. 
The rules enforced are, in +order: + + 1. Any access requested by a task labeled "*" is denied. + 2. A read or execute access requested by a task labeled "^" + is permitted. + 3. A read or execute access requested on an object labeled "_" + is permitted. + 4. Any access requested on an object labeled "*" is permitted. + 5. Any access requested by a task on an object with the same + label is permitted. + 6. Any access requested that is explicitly defined in the loaded + rule set is permitted. + 7. Any other access is denied. + +Smack Access Rules + +With the isolation provided by Smack access separation is simple. There are +many interesting cases where limited access by subjects to objects with +different labels is desired. One example is the familiar spy model of +sensitivity, where a scientist working on a highly classified project would be +able to read documents of lower classifications and anything she writes will +be "born" highly classified. To accommodate such schemes Smack includes a +mechanism for specifying rules allowing access between labels. + +Access Rule Format + +The format of an access rule is: + + subject-label object-label access + +Where subject-label is the Smack label of the task, object-label is the Smack +label of the thing being accessed, and access is a string specifying the sort +of access allowed. The Smack labels are limited to 23 characters. The access +specification is searched for letters that describe access modes: + + a: indicates that append access should be granted. + r: indicates that read access should be granted. + w: indicates that write access should be granted. + x: indicates that execute access should be granted. + +Uppercase values for the specification letters are allowed as well. +Access mode specifications can be in any order. 
Examples of acceptable rules +are: + + TopSecret Secret rx + Secret Unclass R + Manager Game x + User HR w + New Old rRrRr + Closed Off - + +Examples of unacceptable rules are: + + Top Secret Secret rx + Ace Ace r + Odd spells waxbeans + +Spaces are not allowed in labels. Since a subject always has access to files +with the same label specifying a rule for that case is pointless. Only +valid letters (rwxaRWXA) and the dash ('-') character are allowed in +access specifications. The dash is a placeholder, so "a-r" is the same +as "ar". A lone dash is used to specify that no access should be allowed. + +Applying Access Rules + +The developers of Linux rarely define new sorts of things, usually importing +schemes and concepts from other systems. Most often, the other systems are +variants of Unix. Unix has many endearing properties, but consistency of +access control models is not one of them. Smack strives to treat accesses as +uniformly as is sensible while keeping with the spirit of the underlying +mechanism. + +File system objects including files, directories, named pipes, symbolic links, +and devices require access permissions that closely match those used by mode +bit access. To open a file for reading read access is required on the file. To +search a directory requires execute access. Creating a file with write access +requires both read and write access on the containing directory. Deleting a +file requires read and write access to the file and to the containing +directory. It is possible that a user may be able to see that a file exists +but not any of its attributes by the circumstance of having read access to the +containing directory but not to the differently labeled file. This is an +artifact of the file name being data in the directory, not a part of the file. + +IPC objects, message queues, semaphore sets, and memory segments exist in flat +namespaces and access requests are only required to match the object in +question. 
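The seven ordered rules listed under "Access Rules" can be sketched as a
single decision function. This is an illustration only, not the kernel
implementation: the names sk_rule and sk_access and the SK_* bit values
are hypothetical. The sketch also assumes that the rule table holds at
most one entry per subject/object pair, and that an explicit rule
permits a request only when every requested mode appears in it.

```c
#include <string.h>

/* Hypothetical bit values for this sketch; not taken from the Smack sources. */
#define SK_READ   0x1
#define SK_WRITE  0x2
#define SK_EXEC   0x4
#define SK_APPEND 0x8

struct sk_rule {
	const char *subject;
	const char *object;
	int access;		/* OR of the SK_* bits above */
};

/* Returns 1 if the request is permitted, 0 if it is denied. */
int sk_access(const char *subject, const char *object, int request,
	      const struct sk_rule *rules, size_t nrules)
{
	size_t i;

	/* 1. Any access requested by a task labeled "*" is denied. */
	if (strcmp(subject, "*") == 0)
		return 0;
	/* 2. Read or execute requested by a task labeled "^" is permitted. */
	if (strcmp(subject, "^") == 0 &&
	    (request & ~(SK_READ | SK_EXEC)) == 0)
		return 1;
	/* 3. Read or execute requested on an object labeled "_" is permitted. */
	if (strcmp(object, "_") == 0 &&
	    (request & ~(SK_READ | SK_EXEC)) == 0)
		return 1;
	/* 4. Any access requested on an object labeled "*" is permitted. */
	if (strcmp(object, "*") == 0)
		return 1;
	/* 5. Any access on an object with the task's own label is permitted. */
	if (strcmp(subject, object) == 0)
		return 1;
	/* 6. Access explicitly defined in the loaded rule set is permitted. */
	for (i = 0; i < nrules; i++) {
		if (strcmp(rules[i].subject, subject) == 0 &&
		    strcmp(rules[i].object, object) == 0)
			return (rules[i].access & request) == request;
	}
	/* 7. Any other access is denied. */
	return 0;
}
```

Because the checks run in the listed order, a task labeled "*" is denied
even on an object labeled "*", and a task labeled "^" may read a
differently labeled object without any explicit rule.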
+ +Process objects reflect tasks on the system and the Smack label used to access +them is the same Smack label that the task would use for its own access +attempts. Sending a signal via the kill() system call is a write operation +from the signaler to the recipient. Debugging a process requires both reading +and writing. Creating a new task is an internal operation that results in two +tasks with identical Smack labels and requires no access checks. + +Sockets are data structures attached to processes and sending a packet from +one process to another requires that the sender have write access to the +receiver. The receiver is not required to have read access to the sender. + +Setting Access Rules + +The configuration file /etc/smack/accesses contains the rules to be set at +system startup. The contents are written to the special file /smack/load. +Rules can be written to /smack/load at any time and take effect immediately. +For any pair of subject and object labels there can be only one rule, with the +most recently specified overriding any earlier specification. + +The program smackload is provided to ensure data is formatted +properly when written to /smack/load. This program reads lines +of the form + + subjectlabel objectlabel mode. + +Task Attribute + +The Smack label of a process can be read from /proc/<pid>/attr/current. A +process can read its own Smack label from /proc/self/attr/current. A +privileged process can change its own Smack label by writing to +/proc/self/attr/current but not the label of another process. + +File Attribute + +The Smack label of a filesystem object is stored as an extended attribute +named SMACK64 on the file. This attribute is in the security namespace. It can +only be changed by a process with privilege. + +Privilege + +A process with CAP_MAC_OVERRIDE is privileged. + +Smack Networking + +As mentioned before, Smack enforces access control on network protocol +transmissions. 
Every packet sent by a Smack process is tagged with its Smack +label. This is done by adding a CIPSO tag to the header of the IP packet. Each +packet received is expected to have a CIPSO tag that identifies the label and +if it lacks such a tag the network ambient label is assumed. Before the packet +is delivered a check is made to determine that a subject with the label on the +packet has write access to the receiving process and if that is not the case +the packet is dropped. + +CIPSO Configuration + +It is normally unnecessary to specify the CIPSO configuration. The default +values used by the system handle all internal cases. Smack will compose CIPSO +label values to match the Smack labels being used without administrative +intervention. Unlabeled packets that come into the system will be given the +ambient label. + +Smack requires configuration in the case where packets from a system that is +not smack that speaks CIPSO may be encountered. Usually this will be a Trusted +Solaris system, but there are other, less widely deployed systems out there. +CIPSO provides 3 important values, a Domain Of Interpretation (DOI), a level, +and a category set with each packet. The DOI is intended to identify a group +of systems that use compatible labeling schemes, and the DOI specified on the +smack system must match that of the remote system or packets will be +discarded. The DOI is 3 by default. The value can be read from /smack/doi and +can be changed by writing to /smack/doi. + +The label and category set are mapped to a Smack label as defined in +/etc/smack/cipso. + +A Smack/CIPSO mapping has the form: + + smack level [category [category]*] + +Smack does not expect the level or category sets to be related in any +particular way and does not assume or assign accesses based on them. Some +examples of mappings: + + TopSecret 7 + TS:A,B 7 1 2 + SecBDE 5 2 4 6 + RAFTERS 7 12 26 + +The ":" and "," characters are permitted in a Smack label but have no special +meaning. 
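The mapping format shown above, a label followed by a level and an
optional list of categories, can be parsed along the following lines.
This is a sketch under stated assumptions, not the smackcipso program:
the names sk_cipso_map and sk_parse_cipso are invented, the limit of 32
categories is arbitrary, and no range checks beyond "non-negative
integer" are applied to the level or the categories.

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define SK_LABEL_MAX 23	/* label length limit stated in this document */
#define SK_MAX_CATS  32	/* arbitrary limit chosen for this sketch */

struct sk_cipso_map {
	char label[SK_LABEL_MAX + 1];
	int level;
	int cats[SK_MAX_CATS];
	int ncats;
};

/* Convert a token to a non-negative int; returns -1 if it is not one. */
static int sk_parse_number(const char *tok)
{
	char *end;
	long val = strtol(tok, &end, 10);

	if (end == tok || *end != '\0' || val < 0 || val > INT_MAX)
		return -1;
	return (int)val;
}

/*
 * Parse one "smack level [category [category]*]" line.
 * Returns 0 and fills *map on success, -1 if the line is malformed.
 */
int sk_parse_cipso(const char *line, struct sk_cipso_map *map)
{
	char buf[256];
	char *tok;
	int val;

	if (strlen(line) >= sizeof(buf))
		return -1;
	strcpy(buf, line);	/* strtok() modifies its argument */

	/* First field: the Smack label; ':' and ',' have no special meaning. */
	tok = strtok(buf, " \t\n");
	if (tok == NULL || strlen(tok) > SK_LABEL_MAX)
		return -1;
	strcpy(map->label, tok);

	/* Second field: the CIPSO level. */
	tok = strtok(NULL, " \t\n");
	if (tok == NULL || (val = sk_parse_number(tok)) < 0)
		return -1;
	map->level = val;

	/* Remaining fields, if any: the category set. */
	map->ncats = 0;
	while ((tok = strtok(NULL, " \t\n")) != NULL) {
		if (map->ncats == SK_MAX_CATS)
			return -1;
		if ((val = sk_parse_number(tok)) < 0)
			return -1;
		map->cats[map->ncats++] = val;
	}
	return 0;
}
```

Given the line "TS:A,B 7 1 2" the sketch stores the label "TS:A,B",
level 7, and the categories 1 and 2; "TopSecret 7" parses with an empty
category set.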
+ +The mapping of Smack labels to CIPSO values is defined by writing to +/smack/cipso. Again, the format of data written to this special file +is highly restrictive, so the program smackcipso is provided to +ensure the writes are done properly. This program takes mappings +on the standard input and sends them to /smack/cipso properly. + +In addition to explicit mappings Smack supports direct CIPSO mappings. One +CIPSO level is used to indicate that the category set passed in the packet is +in fact an encoding of the Smack label. The level used is 250 by default. The +value can be read from /smack/direct and changed by writing to /smack/direct. + +Socket Attributes + +There are two attributes that are associated with sockets. These attributes +can only be set by privileged tasks, but any task can read them for their own +sockets. + + SMACK64IPIN: The Smack label of the task object. A privileged + program that will enforce policy may set this to the star label. + + SMACK64IPOUT: The Smack label transmitted with outgoing packets. + A privileged program may set this to match the label of another + task with which it hopes to communicate. + +Writing Applications for Smack + +There are three sorts of applications that will run on a Smack system. How an +application interacts with Smack will determine what it will have to do to +work properly under Smack. + +Smack Ignorant Applications + +By far the majority of applications have no reason whatever to care about the +unique properties of Smack. Since invoking a program has no impact on the +Smack label associated with the process the only concern likely to arise is +whether the process has execute access to the program. + +Smack Relevant Applications + +Some programs can be improved by teaching them about Smack, but do not make +any security decisions themselves. The utility ls(1) is one example of such a +program. 
+ +Smack Enforcing Applications + +These are special programs that not only know about Smack, but participate in +the enforcement of system policy. In most cases these are the programs that +set up user sessions. There are also network services that provide information +to processes running with various labels. + +File System Interfaces + +Smack maintains labels on file system objects using extended attributes. The +Smack label of a file, directory, or other file system object can be obtained +using getxattr(2). + + len = getxattr("/", "security.SMACK64", value, sizeof (value)); + +will put the Smack label of the root directory into value. A privileged +process can set the Smack label of a file system object with setxattr(2). + + len = strlen("Rubble"); + rc = setxattr("/foo", "security.SMACK64", "Rubble", len, 0); + +will set the Smack label of /foo to "Rubble" if the program has appropriate +privilege. + +Socket Interfaces + +The socket attributes can be read using fgetxattr(2). + +A privileged process can set the Smack label of outgoing packets with +fsetxattr(2). + + len = strlen("Rubble"); + rc = fsetxattr(fd, "security.SMACK64IPOUT", "Rubble", len, 0); + +will set the Smack label "Rubble" on packets going out from the socket if the +program has appropriate privilege. + + rc = fsetxattr(fd, "security.SMACK64IPIN", "*", strlen("*"), 0); + +will set the Smack label "*" as the object label against which incoming +packets will be checked if the program has appropriate privilege. + +Administration + +Smack supports some mount options: + + smackfsdef=label: specifies the label to give files that lack + the Smack label extended attribute. + + smackfsroot=label: specifies the label to assign the root of the + file system if it lacks the Smack extended attribute. + + smackfshat=label: specifies a label that must have read access to + all labels set on the filesystem. Not yet enforced.
+ + smackfsfloor=label: specifies a label to which all labels set on the + filesystem must have read access. Not yet enforced. + +These mount options apply to all file system types. + diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches index 681e2b36195c..08a1ed1cb5d8 100644 --- a/Documentation/SubmittingPatches +++ b/Documentation/SubmittingPatches @@ -220,20 +220,8 @@ decreasing the likelihood of your MIME-attached change being accepted. Exception: If your mailer is mangling patches then someone may ask you to re-send them using MIME. - -WARNING: Some mailers like Mozilla send your messages with ----- message header ---- -Content-Type: text/plain; charset=us-ascii; format=flowed ----- message header ---- -The problem is that "format=flowed" makes some of the mailers -on receiving side to replace TABs with spaces and do similar -changes. Thus the patches from you can look corrupted. - -To fix this just make your mozilla defaults/pref/mailnews.js file to look like: -pref("mailnews.send_plaintext_flowed", false); // RFC 2646======= -pref("mailnews.display.disable_format_flowed_support", true); - - +See Documentation/email-clients.txt for hints about configuring +your e-mail client so that it sends your patches untouched. 8) E-mail size. diff --git a/Documentation/arm/Sharp-LH/IOBarrier b/Documentation/arm/Sharp-LH/IOBarrier index c0d8853672dc..2e953e228f4d 100644 --- a/Documentation/arm/Sharp-LH/IOBarrier +++ b/Documentation/arm/Sharp-LH/IOBarrier @@ -32,7 +32,7 @@ BARRIER IO before the access to the SMC chip because the AEN latch only needs occurs after the SMC IO write cycle. The routines that implement this work-around make an additional concession which is to disable interrupts during the IO sequence. Other hardware devices -(the LogicPD CPLD) have registers in the same the physical memory +(the LogicPD CPLD) have registers in the same physical memory region as the SMC chip. 
An interrupt might allow an access to one of those registers while SMC IO is being performed. diff --git a/Documentation/cpu-freq/user-guide.txt b/Documentation/cpu-freq/user-guide.txt index 555c8cf3650a..af3b925ece08 100644 --- a/Documentation/cpu-freq/user-guide.txt +++ b/Documentation/cpu-freq/user-guide.txt @@ -45,6 +45,7 @@ The following ARM processors are supported by cpufreq: ARM Integrator ARM-SA1100 ARM-SA1110 +Intel PXA 1.2 x86 diff --git a/Documentation/cpu-hotplug.txt b/Documentation/cpu-hotplug.txt index a741f658a3c9..ba0aacde94fb 100644 --- a/Documentation/cpu-hotplug.txt +++ b/Documentation/cpu-hotplug.txt @@ -50,7 +50,7 @@ additional_cpus=n (*) Use this to limit hotpluggable cpus. This option sets cpu_possible_map = cpu_present_map + additional_cpus (*) Option valid only for following architectures -- x86_64, ia64, s390 +- x86_64, ia64 ia64 and x86_64 use the number of disabled local apics in ACPI tables MADT to determine the number of potentially hot-pluggable cpus. The implementation @@ -109,12 +109,13 @@ Never use anything other than cpumask_t to represent bitmap of CPUs. for_each_cpu_mask(x,mask) - Iterate over some random collection of cpu mask. #include <linux/cpu.h> - lock_cpu_hotplug() and unlock_cpu_hotplug(): + get_online_cpus() and put_online_cpus(): -The above calls are used to inhibit cpu hotplug operations. While holding the -cpucontrol mutex, cpu_online_map will not change. If you merely need to avoid -cpus going away, you could also use preempt_disable() and preempt_enable() -for those sections. Just remember the critical section cannot call any +The above calls are used to inhibit cpu hotplug operations. While the +cpu_hotplug.refcount is non zero, the cpu_online_map will not change. +If you merely need to avoid cpus going away, you could also use +preempt_disable() and preempt_enable() for those sections. +Just remember the critical section cannot call any function that can sleep or schedule this process away. 
The preempt_disable() will work as long as stop_machine_run() is used to take a cpu down. diff --git a/Documentation/crypto/api-intro.txt b/Documentation/crypto/api-intro.txt index a2ac6d294793..8b49302712a8 100644 --- a/Documentation/crypto/api-intro.txt +++ b/Documentation/crypto/api-intro.txt @@ -33,9 +33,16 @@ The idea is to make the user interface and algorithm registration API very simple, while hiding the core logic from both. Many good ideas from existing APIs such as Cryptoapi and Nettle have been adapted for this. -The API currently supports three types of transforms: Ciphers, Digests and -Compressors. The compression algorithms especially seem to be performing -very well so far. +The API currently supports five main types of transforms: AEAD (Authenticated +Encryption with Associated Data), Block Ciphers, Ciphers, Compressors and +Hashes. + +Please note that Block Ciphers is somewhat of a misnomer. It is in fact +meant to support all ciphers including stream ciphers. The difference +between Block Ciphers and Ciphers is that the latter operates on exactly +one block while the former can operate on an arbitrary amount of data, +subject to block size requirements (i.e., non-stream ciphers can only +process multiples of blocks). Support for hardware crypto devices via an asynchronous interface is under development. @@ -69,29 +76,12 @@ Here's an example of how to use the API: Many real examples are available in the regression test module (tcrypt.c). 
-CONFIGURATION NOTES - -As Triple DES is part of the DES module, for those using modular builds, -add the following line to /etc/modprobe.conf: - - alias des3_ede des - -The Null algorithms reside in the crypto_null module, so these lines -should also be added: - - alias cipher_null crypto_null - alias digest_null crypto_null - alias compress_null crypto_null - -The SHA384 algorithm shares code within the SHA512 module, so you'll -also need: - alias sha384 sha512 - - DEVELOPER NOTES Transforms may only be allocated in user context, and cryptographic -methods may only be called from softirq and user contexts. +methods may only be called from softirq and user contexts. For +transforms with a setkey method it too should only be called from +user context. When using the API for ciphers, performance will be optimal if each scatterlist contains data which is a multiple of the cipher's block @@ -130,8 +120,9 @@ might already be working on. BUGS Send bug reports to: -Herbert Xu <herbert@gondor.apana.org.au> -Cc: David S. Miller <davem@redhat.com> +linux-crypto@vger.kernel.org +Cc: Herbert Xu <herbert@gondor.apana.org.au>, + David S. Miller <davem@redhat.com> FURTHER INFORMATION diff --git a/Documentation/debugging-modules.txt b/Documentation/debugging-modules.txt index 24029f65fc94..172ad4aec493 100644 --- a/Documentation/debugging-modules.txt +++ b/Documentation/debugging-modules.txt @@ -16,3 +16,7 @@ echo 'echo "$@" >> /tmp/modprobe.log' >> /tmp/modprobe echo 'exec /sbin/modprobe "$@"' >> /tmp/modprobe chmod a+x /tmp/modprobe echo /tmp/modprobe > /proc/sys/kernel/modprobe + +Note that the above applies only when the *kernel* is requesting +that the module be loaded -- it won't have any effect if that module +is being loaded explicitly using "modprobe" from userspace. 
diff --git a/Documentation/debugging-via-ohci1394.txt b/Documentation/debugging-via-ohci1394.txt new file mode 100644 index 000000000000..de4804e8b396 --- /dev/null +++ b/Documentation/debugging-via-ohci1394.txt @@ -0,0 +1,179 @@ + + Using physical DMA provided by OHCI-1394 FireWire controllers for debugging + --------------------------------------------------------------------------- + +Introduction +------------ + +Basically all FireWire controllers which are in use today are compliant +with the OHCI-1394 specification which defines the controller to be a PCI +bus master which uses DMA to offload data transfers from the CPU and has +a "Physical Response Unit" which executes specific requests by employing +PCI-Bus master DMA after applying filters defined by the OHCI-1394 driver. + +Once properly configured, remote machines can send these requests to +ask the OHCI-1394 controller to perform read and write requests on +physical system memory and, for read requests, send the result of +the physical memory read back to the requester. + +With that, it is possible to debug issues by reading interesting memory +locations such as buffers like the printk buffer or the process table. + +Retrieving a full system memory dump is also possible over FireWire, +at data transfer rates on the order of 10MB/s or more. + +Memory access is currently limited to the low 4G of physical address +space which can be a problem on IA64 machines where memory is located +mostly above that limit, but it is rarely a problem on more common +hardware such as hardware based on x86, x86-64 and PowerPC. + +Together with an early initialization of the OHCI-1394 controller for debugging, +this facility proved most useful for examining long debug logs in the printk +buffer in order to debug early boot problems in areas like ACPI where the system +fails to boot and other means for debugging (serial port) are either not +available (notebooks) or too slow for extensive debug information (like ACPI).
+ +Drivers +------- + +The OHCI-1394 drivers in drivers/firewire and drivers/ieee1394 initialize +the OHCI-1394 controllers to a working state and can be used to enable +physical DMA. By default you only have to load the driver, and physical +DMA access will be granted to all remote nodes, but it can be turned off +when using the ohci1394 driver. + +Because these drivers depend on the PCI enumeration being completed, an +initialization routine which runs pretty early (long before console_init(), +which makes the printk buffer appear on the console, can be called) was written. + +To activate it, enable CONFIG_PROVIDE_OHCI1394_DMA_INIT (Kernel hacking menu: +Provide code for enabling DMA over FireWire early on boot) and pass the +parameter "ohci1394_dma=early" to the recompiled kernel on boot. + +Tools +----- + +firescope - Originally developed by Benjamin Herrenschmidt; Andi Kleen ported +it from PowerPC to x86 and x86_64 and added functionality. firescope can now +be used to view the printk buffer of a remote machine, even with live update. + +Bernhard Kaindl enhanced firescope to support accessing 64-bit machines +from 32-bit firescope and vice versa: +- ftp://ftp.suse.de/private/bk/firewire/tools/firescope-0.2.2.tar.bz2 + +and he implemented fast system dump (alpha version - read README.txt): +- ftp://ftp.suse.de/private/bk/firewire/tools/firedump-0.1.tar.bz2 + +There is also a gdb proxy for firewire which allows gdb to be used to access +data which can be referenced from symbols found by gdb in vmlinux: +- ftp://ftp.suse.de/private/bk/firewire/tools/fireproxy-0.33.tar.bz2 + +The latest version of this gdb proxy (fireproxy-0.34) can communicate (not +yet stable) with kgdb over a memory-based communication module (kgdbom). + +Getting Started +--------------- + +The OHCI-1394 specification requires that the OHCI-1394 controller must +disable all physical DMA on each bus reset.
+ +This means that if you want to debug an issue in a system state where +interrupts are disabled and where no polling of the OHCI-1394 controller +for bus resets takes place, you have to establish any FireWire cable +connections and fully initialize all FireWire hardware __before__ the +system enters such a state. + +Step-by-step instructions for using firescope with early OHCI initialization: + +1) Verify that your hardware is supported: + + Load the ohci1394 or the fw-ohci module and check your kernel logs. + You should see a line similar to + + ohci1394: fw-host0: OHCI-1394 1.1 (PCI): IRQ=[18] MMIO=[fe9ff800-fe9fffff] + ... Max Packet=[2048] IR/IT contexts=[4/8] + + when loading the driver. If you have no supported controller, many PCI, + CardBus and even some Express cards which are fully compliant with the OHCI-1394 + specification are available. If it requires no driver for Windows operating + systems, it most likely is. Only specialized shops have cards which are not + compliant; they are based on TI PCILynx chips and require drivers for + Windows operating systems. + +2) Establish a working FireWire cable connection: + + Any FireWire cable, as long as it provides an electrically and mechanically + stable connection and has matching connectors (there are small 4-pin and + large 6-pin FireWire ports) will do. + + If a driver is running on both machines you should see a line like + + ieee1394: Node added: ID:BUS[0-01:1023] GUID[0090270001b84bba] + + on both machines in the kernel log when the cable is plugged in + and connects the two machines.
+ +3) Test physical DMA using firescope: + + On the debug host, + - load the raw1394 module, + - make sure that /dev/raw1394 is accessible, + then start firescope: + + $ firescope + Port 0 (ohci1394) opened, 2 nodes detected + + FireScope + --------- + Target : <unspecified> + Gen : 1 + [Ctrl-T] choose target + [Ctrl-H] this menu + [Ctrl-Q] quit + + ------> Press Ctrl-T now, the output should be similar to: + + 2 nodes available, local node is: 0 + 0: ffc0, uuid: 00000000 00000000 [LOCAL] + 1: ffc1, uuid: 00279000 ba4bb801 + + Besides the [LOCAL] node, it must show another node without error message. + +4) Prepare for debugging with early OHCI-1394 initialization: + + 4.1) Kernel compilation and installation on debug target + + Compile the kernel to be debugged with CONFIG_PROVIDE_OHCI1394_DMA_INIT + (Kernel hacking: Provide code for enabling DMA over FireWire early on boot) + enabled and install it on the machine to be debugged (debug target). + + 4.2) Transfer the System.map of the debugged kernel to the debug host + + Copy the System.map of the kernel be debugged to the debug host (the host + which is connected to the debugged machine over the FireWire cable). + +5) Retrieving the printk buffer contents: + + With the FireWire cable connected, the OHCI-1394 driver on the debugging + host loaded, reboot the debugged machine, booting the kernel which has + CONFIG_PROVIDE_OHCI1394_DMA_INIT enabled, with the option ohci1394_dma=early. + + Then, on the debugging host, run firescope, for example by using -A: + + firescope -A System.map-of-debug-target-kernel + + Note: -A automatically attaches to the first non-local node. It only works + reliably if only connected two machines are connected using FireWire. + + After having attached to the debug target, press Ctrl-D to view the + complete printk buffer or Ctrl-U to enter auto update mode and get an + updated live view of recent kernel messages logged on the debug target. 
+ + Call "firescope -h" to get more information on firescope's options. + +Notes +----- +Documentation and specifications: ftp://ftp.suse.de/private/bk/firewire/docs + +FireWire is a trademark of Apple Inc. - for more information please refer to: +http://en.wikipedia.org/wiki/FireWire diff --git a/Documentation/dontdiff b/Documentation/dontdiff index f2d658a6a942..c09a96b99354 100644 --- a/Documentation/dontdiff +++ b/Documentation/dontdiff @@ -46,8 +46,6 @@ .mailmap .mm 53c700_d.h -53c7xx_d.h -53c7xx_u.h 53c8xx_d.h* BitKeeper COPYING diff --git a/Documentation/driver-model/platform.txt b/Documentation/driver-model/platform.txt index 2a97320ee17f..83009fdcbbc8 100644 --- a/Documentation/driver-model/platform.txt +++ b/Documentation/driver-model/platform.txt @@ -122,15 +122,15 @@ None the less, there are some APIs to support such legacy drivers. Avoid using these calls except with such hotplug-deficient drivers. struct platform_device *platform_device_alloc( - char *name, unsigned id); + const char *name, int id); You can use platform_device_alloc() to dynamically allocate a device, which you will then initialize with resources and platform_device_register(). A better solution is usually: struct platform_device *platform_device_register_simple( - char *name, unsigned id, - struct resource *res, unsigned nres); + const char *name, int id, + struct resource *res, unsigned int nres); You can use platform_device_register_simple() as a one-step call to allocate and register a device. diff --git a/Documentation/dvb/bt8xx.txt b/Documentation/dvb/bt8xx.txt index ecb47adda063..b7b1d1b1da46 100644 --- a/Documentation/dvb/bt8xx.txt +++ b/Documentation/dvb/bt8xx.txt @@ -78,6 +78,18 @@ Example: For a full list of card ID's please see Documentation/video4linux/CARDLIST.bttv. In case of further problems please subscribe and send questions to the mailing list: linux-dvb@linuxtv.org. 
+2c) Probing the cards with broken PCI subsystem ID +-------------------------------------------------- +There are some TwinHan cards that the EEPROM has become corrupted for some +reason. The cards do not have correct PCI subsystem ID. But we can force +probing the cards with broken PCI subsystem ID + + $ echo 109e 0878 $subvendor $subdevice > \ + /sys/bus/pci/drivers/bt878/new_id + +109e: PCI_VENDOR_ID_BROOKTREE +0878: PCI_DEVICE_ID_BROOKTREE_878 + Authors: Richard Walker, Jamie Honan, Michael Hunold, diff --git a/Documentation/fb/deferred_io.txt b/Documentation/fb/deferred_io.txt index 63883a892120..748328370250 100644 --- a/Documentation/fb/deferred_io.txt +++ b/Documentation/fb/deferred_io.txt @@ -7,10 +7,10 @@ IO. The following example may be a useful explanation of how one such setup works: - userspace app like Xfbdev mmaps framebuffer -- deferred IO and driver sets up nopage and page_mkwrite handlers +- deferred IO and driver sets up fault and page_mkwrite handlers - userspace app tries to write to mmaped vaddress -- we get pagefault and reach nopage handler -- nopage handler finds and returns physical page +- we get pagefault and reach fault handler +- fault handler finds and returns physical page - we get page_mkwrite where we add this page to a list - schedule a workqueue task to be run after a delay - app continues writing to that page with no additional cost. 
this is diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt index 20c4c8bac9d7..68ce1300a360 100644 --- a/Documentation/feature-removal-schedule.txt +++ b/Documentation/feature-removal-schedule.txt @@ -156,22 +156,6 @@ Who: Arjan van de Ven <arjan@linux.intel.com> --------------------------- -What: USB driver API moves to EXPORT_SYMBOL_GPL -When: February 2008 -Files: include/linux/usb.h, drivers/usb/core/driver.c -Why: The USB subsystem has changed a lot over time, and it has been - possible to create userspace USB drivers using usbfs/libusb/gadgetfs - that operate as fast as the USB bus allows. Because of this, the USB - subsystem will not be allowing closed source kernel drivers to - register with it, after this grace period is over. If anyone needs - any help in converting their closed source drivers over to use the - userspace filesystems, please contact the - linux-usb-devel@lists.sourceforge.net mailing list, and the developers - there will be glad to help you out. -Who: Greg Kroah-Hartman <gregkh@suse.de> - ---------------------------- - What: vm_ops.nopage When: Soon, provided in-kernel callers have been converted Why: This interface is replaced by vm_ops.fault, but it has been around @@ -191,15 +175,6 @@ Who: Kay Sievers <kay.sievers@suse.de> --------------------------- -What: i2c_adapter.list -When: July 2007 -Why: Superfluous, this list duplicates the one maintained by the driver - core. -Who: Jean Delvare <khali@linux-fr.org>, - David Brownell <dbrownell@users.sourceforge.net> - ---------------------------- - What: ACPI procfs interface When: July 2008 Why: ACPI sysfs conversion should be finished by January 2008. @@ -225,14 +200,6 @@ Who: Len Brown <len.brown@intel.com> --------------------------- -What: i2c-ixp2000, i2c-ixp4xx and scx200_i2c drivers -When: September 2007 -Why: Obsolete. The new i2c-gpio driver replaces all hardware-specific - I2C-over-GPIO drivers. 
-Who: Jean Delvare <khali@linux-fr.org> - ---------------------------- - What: 'time' kernel boot parameter When: January 2008 Why: replaced by 'printk.time=<value>' so that printk timestamps can be @@ -241,13 +208,6 @@ Who: Randy Dunlap <randy.dunlap@oracle.com> --------------------------- -What: drivers depending on OSS_OBSOLETE -When: options in 2.6.23, code in 2.6.25 -Why: obsolete OSS drivers -Who: Adrian Bunk <bunk@stusta.de> - ---------------------------- - What: libata spindown skipping and warning When: Dec 2008 Why: Some halt(8) implementations synchronize caches for and spin @@ -266,22 +226,6 @@ Who: Tejun Heo <htejun@gmail.com> --------------------------- -What: Legacy RTC drivers (under drivers/i2c/chips) -When: November 2007 -Why: Obsolete. We have a RTC subsystem with better drivers. -Who: Jean Delvare <khali@linux-fr.org> - ---------------------------- - -What: iptables SAME target -When: 1.1. 2008 -Files: net/ipv4/netfilter/ipt_SAME.c, include/linux/netfilter_ipv4/ipt_SAME.h -Why: Obsolete for multiple years now, NAT core provides the same behaviour. - Unfixable broken wrt. 32/64 bit cleanness. -Who: Patrick McHardy <kaber@trash.net> - ---------------------------- - What: The arch/ppc and include/asm-ppc directories When: Jun 2008 Why: The arch/powerpc tree is the merged architecture for ppc32 and ppc64 @@ -295,16 +239,6 @@ Who: linuxppc-dev@ozlabs.org --------------------------- -What: mthca driver's MSI support -When: January 2008 -Files: drivers/infiniband/hw/mthca/*.[ch] -Why: All mthca hardware also supports MSI-X, which provides - strictly more functionality than MSI. So there is no point in - having both MSI-X and MSI support in the driver. -Who: Roland Dreier <rolandd@cisco.com> - ---------------------------- - What: sk98lin network driver When: Feburary 2008 Why: In kernel tree version of driver is unmaintained. 
Sk98lin driver @@ -323,13 +257,77 @@ Who: Thomas Gleixner <tglx@linutronix.de> --------------------------- -What: shaper network driver -When: January 2008 -Files: drivers/net/shaper.c, include/linux/if_shaper.h -Why: This driver has been marked obsolete for many years. - It was only designed to work on lower speed links and has design - flaws that lead to machine crashes. The qdisc infrastructure in - 2.4 or later kernels, provides richer features and is more robust. -Who: Stephen Hemminger <shemminger@linux-foundation.org> +--------------------------- + +What: i2c-i810, i2c-prosavage and i2c-savage4 +When: May 2008 +Why: These drivers are superseded by i810fb, intelfb and savagefb. +Who: Jean Delvare <khali@linux-fr.org> + +--------------------------- + +What: bcm43xx wireless network driver +When: 2.6.26 +Files: drivers/net/wireless/bcm43xx +Why: This driver's functionality has been replaced by the + mac80211-based b43 and b43legacy drivers. +Who: John W. Linville <linville@tuxdriver.com> + +--------------------------- + +What: ieee80211 softmac wireless networking component +When: 2.6.26 (or after removal of bcm43xx and port of zd1211rw to mac80211) +Files: net/ieee80211/softmac +Why: No in-kernel drivers will depend on it any longer. +Who: John W. Linville <linville@tuxdriver.com> + +--------------------------- + +What: rc80211-simple rate control algorithm for mac80211 +When: 2.6.26 +Files: net/mac80211/rc80211-simple.c +Why: This algorithm was provided for reference but always exhibited bad + responsiveness and performance and has some serious flaws. It has been + replaced by rc80211-pid. 
+Who: Stefano Brivio <stefano.brivio@polimi.it> --------------------------- + +What (Why): + - include/linux/netfilter_ipv4/ipt_TOS.h ipt_tos.h header files + (superseded by xt_TOS/xt_tos target & match) + + - "forwarding" header files like ipt_mac.h in + include/linux/netfilter_ipv4/ and include/linux/netfilter_ipv6/ + + - xt_CONNMARK match revision 0 + (superseded by xt_CONNMARK match revision 1) + + - xt_MARK target revisions 0 and 1 + (superseded by xt_MARK match revision 2) + + - xt_connmark match revision 0 + (superseded by xt_connmark match revision 1) + + - xt_conntrack match revision 0 + (superseded by xt_conntrack match revision 1) + + - xt_iprange match revision 0, + include/linux/netfilter_ipv4/ipt_iprange.h + (superseded by xt_iprange match revision 1) + + - xt_mark match revision 0 + (superseded by xt_mark match revision 1) + +When: January 2009 or Linux 2.7.0, whichever comes first +Why: Superseded by newer revisions or modules +Who: Jan Engelhardt <jengelh@computergmbh.de> + +--------------------------- + +What: b43 support for firmware revision < 410 +When: July 2008 +Why: The support code for the old firmware hurts code readability/maintainability + and slightly hurts runtime performance. Bugfixes for the old firmware + are not provided by Broadcom anymore. +Who: Michael Buesch <mb@bu3sch.de> diff --git a/Documentation/filesystems/configfs/configfs.txt b/Documentation/filesystems/configfs/configfs.txt index d1b98257d000..44c97e6accb2 100644 --- a/Documentation/filesystems/configfs/configfs.txt +++ b/Documentation/filesystems/configfs/configfs.txt @@ -377,7 +377,7 @@ more explicit to have a method whereby userspace sees this divergence. Rather than have a group where some items behave differently than others, configfs provides a method whereby one or many subgroups are automatically created inside the parent at its creation. 
Thus, -mkdir("parent) results in "parent", "parent/subgroup1", up through +mkdir("parent") results in "parent", "parent/subgroup1", up through "parent/subgroupN". Items of type 1 can now be created in "parent/subgroup1", and items of type N can be created in "parent/subgroupN". diff --git a/Documentation/filesystems/ext4.txt b/Documentation/filesystems/ext4.txt index 6a4adcae9f9a..560f88dc7090 100644 --- a/Documentation/filesystems/ext4.txt +++ b/Documentation/filesystems/ext4.txt @@ -86,9 +86,21 @@ Alex is working on a new set of patches right now. When mounting an ext4 filesystem, the following option are accepted: (*) == default -extents ext4 will use extents to address file data. The +extents (*) ext4 will use extents to address file data. The file system will no longer be mountable by ext3. +noextents ext4 will not use extents for newly created files + +journal_checksum Enable checksumming of the journal transactions. + This will allow the recovery code in e2fsck and the + kernel to detect corruption in the kernel. It is a + compatible change and will be ignored by older kernels. + +journal_async_commit Commit block can be written to disk without waiting + for descriptor blocks. If enabled older kernels cannot + mount the device. This will enable 'journal_checksum' + internally. + journal=update Update the ext4 file system's journal to the current format. @@ -196,6 +208,12 @@ nobh (a) cache disk block mapping information "nobh" option tries to avoid associating buffer heads (supported only for "writeback" mode). +mballoc (*) Use the multiple block allocator for block allocation +nomballoc disabled multiple block allocator for block allocation. +stripe=n Number of filesystem blocks that mballoc will try + to use for allocation size and alignment. For RAID5/6 + systems this should be the number of data + disks * RAID chunk size in file system blocks. 
Data Mode --------- diff --git a/Documentation/filesystems/ocfs2.txt b/Documentation/filesystems/ocfs2.txt index ed55238023a9..c318a8bbb1ef 100644 --- a/Documentation/filesystems/ocfs2.txt +++ b/Documentation/filesystems/ocfs2.txt @@ -35,7 +35,6 @@ Features which OCFS2 does not support yet: - Directory change notification (F_NOTIFY) - Distributed Caching (F_SETLEASE/F_GETLEASE/break_lease) - POSIX ACLs - - readpages / writepages (not user visible) Mount options ============= @@ -62,3 +61,18 @@ data=writeback Data ordering is not preserved, data may be written preferred_slot=0(*) During mount, try to use this filesystem slot first. If it is in use by another node, the first empty one found will be chosen. Invalid values will be ignored. +commit=nrsec (*) Ocfs2 can be told to sync all its data and metadata + every 'nrsec' seconds. The default value is 5 seconds. + This means that if you lose your power, you will lose + as much as the latest 5 seconds of work (your + filesystem will not be damaged though, thanks to the + journaling). This default value (or any low value) + will hurt performance, but it's good for data-safety. + Setting it to 0 will have the same effect as leaving + it at the default (5 seconds). + Setting it to very large values will improve + performance. +localalloc=8(*) Allows custom localalloc size in MB. If the value is too + large, the fs will silently revert it to the default. + Localalloc is not enabled for local mounts. +localflocks This disables cluster aware flock. diff --git a/Documentation/filesystems/porting b/Documentation/filesystems/porting index dac45c92d872..0f33c77bc14b 100644 --- a/Documentation/filesystems/porting +++ b/Documentation/filesystems/porting @@ -1,6 +1,6 @@ Changes since 2.5.0: ---- +--- [recommended] New helpers: sb_bread(), sb_getblk(), sb_find_get_block(), set_bh(), @@ -10,7 +10,7 @@ Use them. 
(sb_find_get_block() replaces 2.4's get_hash_table()) ---- +--- [recommended] New methods: ->alloc_inode() and ->destroy_inode(). @@ -28,7 +28,7 @@ Declare Use FOO_I(inode) instead of &inode->u.foo_inode_i; -Add foo_alloc_inode() and foo_destory_inode() - the former should allocate +Add foo_alloc_inode() and foo_destroy_inode() - the former should allocate foo_inode_info and return the address of ->vfs_inode, the latter should free FOO_I(inode) (see in-tree filesystems for examples). diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt index dec99455321f..5681e2fa1496 100644 --- a/Documentation/filesystems/proc.txt +++ b/Documentation/filesystems/proc.txt @@ -216,6 +216,7 @@ Table 1-3: Contents of the stat files (as of 2.6.22-rc3) priority priority level nice nice level num_threads number of threads + it_real_value (obsolete, always 0) start_time time the process started after system boot vsize virtual memory size rss resident set memory size @@ -857,6 +858,45 @@ CPUs. The "procs_blocked" line gives the number of processes currently blocked, waiting for I/O to complete. +1.9 Ext4 file system parameters +------------------------------ +Ext4 file system have one directory per partition under /proc/fs/ext4/ +# ls /proc/fs/ext4/hdc/ +group_prealloc max_to_scan mb_groups mb_history min_to_scan order2_req +stats stream_req + +mb_groups: +This file gives the details of mutiblock allocator buddy cache of free blocks + +mb_history: +Multiblock allocation history. + +stats: +This file indicate whether the multiblock allocator should start collecting +statistics. The statistics are shown during unmount + +group_prealloc: +The multiblock allocator normalize the block allocation request to +group_prealloc filesystem blocks if we don't have strip value set. +The stripe value can be specified at mount time or during mke2fs. 
+ +max_to_scan: +How long multiblock allocator can look for a best extent (in found extents) + +min_to_scan: +How long multiblock allocator must look for a best extent + +order2_req: +Multiblock allocator uses 2^N search using buddies only for requests greater +than or equal to order2_req. The request size is specified in file system +blocks. A value of 2 indicates only if the requests are greater than or equal +to 4 blocks. + +stream_req: +Files smaller than stream_req are served by the stream allocator, whose +purpose is to pack requests as close to each other as possible to +produce smooth I/O traffic. A value of 16 indicates that files smaller than 16 +filesystem block size will use group based preallocation. ------------------------------------------------------------------------------ Summary @@ -989,6 +1029,14 @@ nr_inodes Denotes the number of inodes the system has allocated. This number will grow and shrink dynamically. +nr_open +------- + +Denotes the maximum number of file-handles a process can +allocate. Default value is 1024*1024 (1048576) which should be +enough for most machines. Actual limit depends on RLIMIT_NOFILE +resource limit. + nr_free_inodes -------------- @@ -1095,13 +1143,6 @@ check the amount of free space (value is in seconds). Default settings are: 4, resume it if we have a value of 3 or more percent; consider information about the amount of free space valid for 30 seconds -audit_argv_kb -------------- - -The file contains a single value denoting the limit on the argv array size -for execve (in KiB). This limit is only applied when system call auditing for -execve is enabled, otherwise the value is ignored. - ctrl-alt-del ------------ @@ -1282,13 +1323,28 @@ for writeout by the pdflush daemons. It is expressed in 100'ths of a second. Data which has been dirty in-memory for longer than this interval will be written out next time a pdflush daemon wakes up. +highmem_is_dirtyable +-------------------- + +Only present if CONFIG_HIGHMEM is set.
+ +This defaults to 0 (false), meaning that the ratios set above are calculated +as a percentage of lowmem only. This protects against excessive scanning +in page reclaim, swapping and general VM distress. + +Setting this to 1 can be useful on 32 bit machines where you want to make +random changes within an MMAPed file that is larger than your available +lowmem without causing large quantities of random IO. It is safe if the +behavior of all programs running on the machine is known and memory will +not be otherwise stressed. + legacy_va_layout ---------------- If non-zero, this sysctl disables the new 32-bit mmap mmap layout - the kernel will use the legacy (2.4) layout for all processes. -lower_zone_protection +lowmem_reserve_ratio --------------------- For some specialised workloads on highmem machines it is dangerous for @@ -1308,25 +1364,71 @@ captured into pinned user memory. mechanism will also defend that region from allocations which could use highmem or lowmem). -The `lower_zone_protection' tunable determines how aggressive the kernel is -in defending these lower zones. The default value is zero - no -protection at all. +The `lowmem_reserve_ratio' tunable determines how aggressive the kernel is +in defending these lower zones. If you have a machine which uses highmem or ISA DMA and your applications are using mlock(), or if you are running with no swap then -you probably should increase the lower_zone_protection setting. - -The units of this tunable are fairly vague. It is approximately equal -to "megabytes," so setting lower_zone_protection=100 will protect around 100 -megabytes of the lowmem zone from user allocations. It will also make -those 100 megabytes unavailable for use by applications and by -pagecache, so there is a cost. - -The effects of this tunable may be observed by monitoring -/proc/meminfo:LowFree. Write a single huge file and observe the point -at which LowFree ceases to fall. - -A reasonable value for lower_zone_protection is 100.
+you probably should change the lowmem_reserve_ratio setting. + +The lowmem_reserve_ratio is an array. You can see them by reading this file. +- +% cat /proc/sys/vm/lowmem_reserve_ratio +256 256 32 +- +Note: # of this elements is one fewer than number of zones. Because the highest + zone's value is not necessary for following calculation. + +But, these values are not used directly. The kernel calculates # of protection +pages for each zones from them. These are shown as array of protection pages +in /proc/zoneinfo like followings. (This is an example of x86-64 box). +Each zone has an array of protection pages like this. + +- +Node 0, zone DMA + pages free 1355 + min 3 + low 3 + high 4 + : + : + numa_other 0 + protection: (0, 2004, 2004, 2004) + ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + pagesets + cpu: 0 pcp: 0 + : +- +These protections are added to score to judge whether this zone should be used +for page allocation or should be reclaimed. + +In this example, if normal pages (index=2) are required to this DMA zone and +pages_high is used for watermark, the kernel judges this zone should not be +used because pages_free(1355) is smaller than watermark + protection[2] +(4 + 2004 = 2008). If this protection value is 0, this zone would be used for +normal page requirement. If requirement is DMA zone(index=0), protection[0] +(=0) is used. + +zone[i]'s protection[j] is calculated by following expression. + +(i < j): + zone[i]->protection[j] + = (total sums of present_pages from zone[i+1] to zone[j] on the node) + / lowmem_reserve_ratio[i]; +(i = j): + (should not be protected. = 0; +(i > j): + (not necessary, but looks 0) + +The default values of lowmem_reserve_ratio[i] are + 256 (if zone[i] means DMA or DMA32 zone) + 32 (others). +As above expression, they are reciprocal number of ratio. +256 means 1/256. # of protection pages becomes about "0.39%" of total present +pages of higher zones on the node. + +If you would like to protect more pages, smaller values are effective.
+The minimum value is 1 (1/1 -> 100%). page-cluster ------------ @@ -1880,11 +1982,6 @@ max_size Maximum size of the routing cache. Old entries will be purged once the cache reached has this size. -max_delay, min_delay --------------------- - -Delays for flushing the routing cache. - redirect_load, redirect_number ------------------------------ diff --git a/Documentation/filesystems/ramfs-rootfs-initramfs.txt b/Documentation/filesystems/ramfs-rootfs-initramfs.txt index 339c6a4f220e..7be232b44ee4 100644 --- a/Documentation/filesystems/ramfs-rootfs-initramfs.txt +++ b/Documentation/filesystems/ramfs-rootfs-initramfs.txt @@ -118,7 +118,7 @@ All this differs from the old initrd in several ways: with the new root (cd /newmount; mount --move . /; chroot .), attach stdin/stdout/stderr to the new /dev/console, and exec the new init. - Since this is a remarkably persnickity process (and involves deleting + Since this is a remarkably persnickety process (and involves deleting commands before you can run them), the klibc package introduced a helper program (utils/run_init.c) to do all this for you. Most other packages (such as busybox) have named this command "switch_root". diff --git a/Documentation/filesystems/relay.txt b/Documentation/filesystems/relay.txt index 18d23f9a18c7..094f2d2f38b1 100644 --- a/Documentation/filesystems/relay.txt +++ b/Documentation/filesystems/relay.txt @@ -140,7 +140,7 @@ close() decrements the channel buffer's refcount. When the refcount In order for a user application to make use of relay files, the host filesystem must be mounted. 
For example, - mount -t debugfs debugfs /debug + mount -t debugfs debugfs /sys/kernel/debug NOTE: the host filesystem doesn't need to be mounted for kernel clients to create or use channels - it only needs to be diff --git a/Documentation/fujitsu/frv/README.txt b/Documentation/frv/README.txt index a984faa968e8..a984faa968e8 100644 --- a/Documentation/fujitsu/frv/README.txt +++ b/Documentation/frv/README.txt diff --git a/Documentation/fujitsu/frv/atomic-ops.txt b/Documentation/frv/atomic-ops.txt index 96638e9b9fe0..96638e9b9fe0 100644 --- a/Documentation/fujitsu/frv/atomic-ops.txt +++ b/Documentation/frv/atomic-ops.txt diff --git a/Documentation/fujitsu/frv/booting.txt b/Documentation/frv/booting.txt index 4e229056ef22..ace200b7c214 100644 --- a/Documentation/fujitsu/frv/booting.txt +++ b/Documentation/frv/booting.txt @@ -177,5 +177,5 @@ separated by spaces: (*) vdc=... This option configures the MB93493 companion chip visual display - driver. Please see Documentation/fujitsu/mb93493/vdc.txt for more + driver. Please see Documentation/frv/mb93493/vdc.txt for more information. 
diff --git a/Documentation/fujitsu/frv/clock.txt b/Documentation/frv/clock.txt index c72d350e177a..c72d350e177a 100644 --- a/Documentation/fujitsu/frv/clock.txt +++ b/Documentation/frv/clock.txt diff --git a/Documentation/fujitsu/frv/configuring.txt b/Documentation/frv/configuring.txt index 36e76a2336fa..36e76a2336fa 100644 --- a/Documentation/fujitsu/frv/configuring.txt +++ b/Documentation/frv/configuring.txt diff --git a/Documentation/fujitsu/frv/features.txt b/Documentation/frv/features.txt index fa20c0e72833..fa20c0e72833 100644 --- a/Documentation/fujitsu/frv/features.txt +++ b/Documentation/frv/features.txt diff --git a/Documentation/fujitsu/frv/gdbinit b/Documentation/frv/gdbinit index 51517b6f307f..51517b6f307f 100644 --- a/Documentation/fujitsu/frv/gdbinit +++ b/Documentation/frv/gdbinit diff --git a/Documentation/fujitsu/frv/gdbstub.txt b/Documentation/frv/gdbstub.txt index b92bfd902a4e..b92bfd902a4e 100644 --- a/Documentation/fujitsu/frv/gdbstub.txt +++ b/Documentation/frv/gdbstub.txt diff --git a/Documentation/fujitsu/frv/kernel-ABI.txt b/Documentation/frv/kernel-ABI.txt index aaa1cec86f0b..aaa1cec86f0b 100644 --- a/Documentation/fujitsu/frv/kernel-ABI.txt +++ b/Documentation/frv/kernel-ABI.txt diff --git a/Documentation/fujitsu/frv/mmu-layout.txt b/Documentation/frv/mmu-layout.txt index db10250df6be..db10250df6be 100644 --- a/Documentation/fujitsu/frv/mmu-layout.txt +++ b/Documentation/frv/mmu-layout.txt diff --git a/Documentation/gpio.txt b/Documentation/gpio.txt index 6bc2ba215df9..8da724e2a0ff 100644 --- a/Documentation/gpio.txt +++ b/Documentation/gpio.txt @@ -32,7 +32,7 @@ The exact capabilities of GPIOs vary between systems. Common options: - Input values are likewise readable (1, 0). Some chips support readback of pins configured as "output", which is very useful in such "wire-OR" cases (to support bidirectional signaling). GPIO controllers may have - input de-glitch logic, sometimes with software controls. 
+ input de-glitch/debounce logic, sometimes with software controls. - Inputs can often be used as IRQ signals, often edge triggered but sometimes level triggered. Such IRQs may be configurable as system @@ -60,10 +60,13 @@ used on a board that's wired differently. Only least-common-denominator functionality can be very portable. Other features are platform-specific, and that can be critical for glue logic. -Plus, this doesn't define an implementation framework, just an interface. +Plus, this doesn't require any implementation framework, just an interface. One platform might implement it as simple inline functions accessing chip registers; another might implement it by delegating through abstractions -used for several very different kinds of GPIO controller. +used for several very different kinds of GPIO controller. (There is some +optional code supporting such an implementation strategy, described later +in this document, but drivers acting as clients to the GPIO interface must +not care how it's implemented.) That said, if the convention is supported on their platform, drivers should use it when possible. Platforms should declare GENERIC_GPIO support in @@ -121,6 +124,11 @@ before tasking is enabled, as part of early board setup. For output GPIOs, the value provided becomes the initial output value. This helps avoid signal glitching during system startup. +For compatibility with legacy interfaces to GPIOs, setting the direction +of a GPIO implicitly requests that GPIO (see below) if it has not been +requested already. That compatibility may be removed in the future; +explicitly requesting GPIOs is strongly preferred. + Setting the direction can fail if the GPIO number is invalid, or when that particular GPIO can't be used in that mode. 
It's generally a bad idea to rely on boot firmware to have set the direction correctly, since @@ -133,6 +141,7 @@ Spinlock-Safe GPIO access ------------------------- Most GPIO controllers can be accessed with memory read/write instructions. That doesn't need to sleep, and can safely be done from inside IRQ handlers. +(That includes hardirq contexts on RT kernels.) Use these calls to access such GPIOs: @@ -145,7 +154,7 @@ Use these calls to access such GPIOs: The values are boolean, zero for low, nonzero for high. When reading the value of an output pin, the value returned should be what's seen on the pin ... that won't always match the specified output value, because of -issues including wire-OR and output latencies. +issues including open-drain signaling and output latencies. The get/set calls have no error returns because "invalid GPIO" should have been reported earlier from gpio_direction_*(). However, note that not all @@ -170,7 +179,8 @@ get to the head of a queue to transmit a command and get its response. This requires sleeping, which can't be done from inside IRQ handlers. Platforms that support this type of GPIO distinguish them from other GPIOs -by returning nonzero from this call: +by returning nonzero from this call (which requires a valid GPIO number, +either explicitly or implicitly requested): int gpio_cansleep(unsigned gpio); @@ -209,8 +219,11 @@ before tasking is enabled, as part of early board setup. These calls serve two basic purposes. One is marking the signals which are actually in use as GPIOs, for better diagnostics; systems may have several hundred potential GPIOs, but often only a dozen are used on any -given board. Another is to catch conflicts between drivers, reporting -errors when drivers wrongly think they have exclusive use of that signal. +given board. 
Another is to catch conflicts, identifying errors when +(a) two or more drivers wrongly think they have exclusive use of that +signal, or (b) something wrongly believes it's safe to remove drivers +needed to manage a signal that's in active use. That is, requesting a +GPIO can serve as a kind of lock. These two calls are optional because not not all current Linux platforms offer such functionality in their GPIO support; a valid implementation @@ -223,6 +236,9 @@ Note that requesting a GPIO does NOT cause it to be configured in any way; it just marks that GPIO as in use. Separate code must handle any pin setup (e.g. controlling which pin the GPIO uses, pullup/pulldown). +Also note that it's your responsibility to have stopped using a GPIO +before you free it. + GPIOs mapped to IRQs -------------------- @@ -238,7 +254,7 @@ map between them using calls like: Those return either the corresponding number in the other namespace, or else a negative errno code if the mapping can't be done. (For example, -some GPIOs can't used as IRQs.) It is an unchecked error to use a GPIO +some GPIOs can't be used as IRQs.) It is an unchecked error to use a GPIO number that wasn't set up as an input using gpio_direction_input(), or to use an IRQ number that didn't originally come from gpio_to_irq(). @@ -299,17 +315,110 @@ Related to multiplexing is configuration and enabling of the pullups or pulldowns integrated on some platforms. Not all platforms support them, or support them in the same way; and any given board might use external pullups (or pulldowns) so that the on-chip ones should not be used. +(When a circuit needs 5 kOhm, on-chip 100 kOhm resistors won't do.) There are other system-specific mechanisms that are not specified here, like the aforementioned options for input de-glitching and wire-OR output. Hardware may support reading or writing GPIOs in gangs, but that's usually configuration dependent: for GPIOs sharing the same bank. 
(GPIOs are commonly grouped in banks of 16 or 32, with a given SOC having several such -banks.) Some systems can trigger IRQs from output GPIOs. Code relying on -such mechanisms will necessarily be nonportable. +banks.) Some systems can trigger IRQs from output GPIOs, or read values +from pins not managed as GPIOs. Code relying on such mechanisms will +necessarily be nonportable. -Dynamic definition of GPIOs is not currently supported; for example, as +Dynamic definition of GPIOs is not currently standard; for example, as a side effect of configuring an add-on board with some GPIO expanders. These calls are purely for kernel space, but a userspace API could be built -on top of it. +on top of them. + + +GPIO implementor's framework (OPTIONAL) +======================================= +As noted earlier, there is an optional implementation framework making it +easier for platforms to support different kinds of GPIO controller using +the same programming interface. + +As a debugging aid, if debugfs is available a /sys/kernel/debug/gpio file +will be found there. That will list all the controllers registered through +this framework, and the state of the GPIOs currently in use. + + +Controller Drivers: gpio_chip +----------------------------- +In this framework each GPIO controller is packaged as a "struct gpio_chip" +with information common to each controller of that type: + + - methods to establish GPIO direction + - methods used to access GPIO values + - flag saying whether calls to its methods may sleep + - optional debugfs dump method (showing extra state like pullup config) + - label for diagnostics + +There is also per-instance data, which may come from device.platform_data: +the number of its first GPIO, and how many GPIOs it exposes. + +The code implementing a gpio_chip should support multiple instances of the +controller, possibly using the driver model. That code will configure each +gpio_chip and issue gpiochip_add(). 
Removing a GPIO controller should be +rare; use gpiochip_remove() when it is unavoidable. + +Most often a gpio_chip is part of an instance-specific structure with state +not exposed by the GPIO interfaces, such as addressing, power management, +and more. Chips such as codecs will have complex non-GPIO state, + +Any debugfs dump method should normally ignore signals which haven't been +requested as GPIOs. They can use gpiochip_is_requested(), which returns +either NULL or the label associated with that GPIO when it was requested. + + +Platform Support +---------------- +To support this framework, a platform's Kconfig will "select HAVE_GPIO_LIB" +and arrange that its <asm/gpio.h> includes <asm-generic/gpio.h> and defines +three functions: gpio_get_value(), gpio_set_value(), and gpio_cansleep(). +They may also want to provide a custom value for ARCH_NR_GPIOS. + +Trivial implementations of those functions can directly use framework +code, which always dispatches through the gpio_chip: + + #define gpio_get_value __gpio_get_value + #define gpio_set_value __gpio_set_value + #define gpio_cansleep __gpio_cansleep + +Fancier implementations could instead define those as inline functions with +logic optimizing access to specific SOC-based GPIOs. For example, if the +referenced GPIO is the constant "12", getting or setting its value could +cost as little as two or three instructions, never sleeping. When such an +optimization is not possible those calls must delegate to the framework +code, costing at least a few dozen instructions. For bitbanged I/O, such +instruction savings can be significant. + +For SOCs, platform-specific code defines and registers gpio_chip instances +for each bank of on-chip GPIOs. Those GPIOs should be numbered/labeled to +match chip vendor documentation, and directly match board schematics. They +may well start at zero and go up to a platform-specific limit. 
Such GPIOs +are normally integrated into platform initialization to make them always be +available, from arch_initcall() or earlier; they can often serve as IRQs. + + +Board Support +------------- +For external GPIO controllers -- such as I2C or SPI expanders, ASICs, multi +function devices, FPGAs or CPLDs -- most often board-specific code handles +registering controller devices and ensures that their drivers know what GPIO +numbers to use with gpiochip_add(). Their numbers often start right after +platform-specific GPIOs. + +For example, board setup code could create structures identifying the range +of GPIOs that chip will expose, and passes them to each GPIO expander chip +using platform_data. Then the chip driver's probe() routine could pass that +data to gpiochip_add(). + +Initialization order can be important. For example, when a device relies on +an I2C-based GPIO, its probe() routine should only be called after that GPIO +becomes available. That may mean the device should not be registered until +calls for that GPIO can work. One way to address such dependencies is for +such gpio_chip controllers to provide setup() and teardown() callbacks to +board specific code; those board specific callbacks would register devices +once all the necessary resources are available. diff --git a/Documentation/i2c/busses/i2c-i801 b/Documentation/i2c/busses/i2c-i801 index fde4420e3f75..3bd958360159 100644 --- a/Documentation/i2c/busses/i2c-i801 +++ b/Documentation/i2c/busses/i2c-i801 @@ -17,9 +17,8 @@ Supported adapters: Datasheets: Publicly available at the Intel website Authors: - Frodo Looijaard <frodol@dds.nl>, - Philip Edelbrock <phil@netroedge.com>, Mark Studebaker <mdsxyz123@yahoo.com> + Jean Delvare <khali@linux-fr.org> Module Parameters @@ -62,7 +61,7 @@ Not supported. I2C Block Read Support ---------------------- -Not supported at the moment. +I2C block read is supported on the 82801EB (ICH5) and later chips. 
SMBus 2.0 Support diff --git a/Documentation/i2c/busses/i2c-viapro b/Documentation/i2c/busses/i2c-viapro index 06b4be3ef6d8..1405fb69984c 100644 --- a/Documentation/i2c/busses/i2c-viapro +++ b/Documentation/i2c/busses/i2c-viapro @@ -10,7 +10,7 @@ Supported adapters: * VIA Technologies, Inc. VT8231, VT8233, VT8233A Datasheet: available on request from VIA - * VIA Technologies, Inc. VT8235, VT8237R, VT8237A, VT8251 + * VIA Technologies, Inc. VT8235, VT8237R, VT8237A, VT8237S, VT8251 Datasheet: available on request and under NDA from VIA * VIA Technologies, Inc. CX700 @@ -46,6 +46,7 @@ Your lspci -n listing must show one of these : device 1106:3177 (VT8235) device 1106:3227 (VT8237R) device 1106:3337 (VT8237A) + device 1106:3372 (VT8237S) device 1106:3287 (VT8251) device 1106:8324 (CX700) diff --git a/Documentation/i2c/chips/pca9539 b/Documentation/i2c/chips/pca9539 index c4fce6a13537..1d81c530c4a5 100644 --- a/Documentation/i2c/chips/pca9539 +++ b/Documentation/i2c/chips/pca9539 @@ -1,6 +1,9 @@ Kernel driver pca9539 ===================== +NOTE: this driver is deprecated and will be dropped soon, use +drivers/gpio/pca9539.c instead. + Supported chips: * Philips PCA9539 Prefix: 'pca9539' diff --git a/Documentation/i2c/chips/pcf8575 b/Documentation/i2c/chips/pcf8575 new file mode 100644 index 000000000000..25f5698a61cf --- /dev/null +++ b/Documentation/i2c/chips/pcf8575 @@ -0,0 +1,72 @@ +About the PCF8575 chip and the pcf8575 kernel driver +==================================================== + +The PCF8575 chip is produced by the following manufacturers: + + * Philips NXP + http://www.nxp.com/#/pip/cb=[type=product,path=50807/41735/41850,final=PCF8575_3]|pip=[pip=PCF8575_3][0] + + * Texas Instruments + http://focus.ti.com/docs/prod/folders/print/pcf8575.html + + +Some vendors sell small PCB's with the PCF8575 mounted on it. You can connect +such a board to a Linux host via e.g. an USB to I2C interface. 
Examples of +PCB boards with a PCF8575: + + * SFE Breakout Board for PCF8575 I2C Expander by RobotShop + http://www.robotshop.ca/home/products/robot-parts/electronics/adapters-converters/sfe-pcf8575-i2c-expander-board.html + + * Breakout Board for PCF8575 I2C Expander by Spark Fun Electronics + http://www.sparkfun.com/commerce/product_info.php?products_id=8130 + + +Description +----------- +The PCF8575 chip is a 16-bit I/O expander for the I2C bus. Up to eight of +these chips can be connected to the same I2C bus. You can find this +chip on some custom designed hardware, but you won't find it on PC +motherboards. + +The PCF8575 chip consists of a 16-bit quasi-bidirectional port and an I2C-bus +interface. Each of the sixteen I/O's can be independently used as an input or +an output. To set up an I/O pin as an input, you have to write a 1 to the +corresponding output. + +For more information please see the datasheet. + + +Detection +--------- + +There is no method known to detect whether a chip on a given I2C address is +a PCF8575 or whether it is any other I2C device. So there are two alternatives +to let the driver find the installed PCF8575 devices: +- Load this driver after any other I2C driver for I2C devices with addresses + in the range 0x20 .. 0x27. +- Pass the I2C bus and address of the installed PCF8575 devices explicitly to + the driver at load time via the probe=... or force=... parameters. + +/sys interface +-------------- + +For each address on which a PCF8575 chip was found or forced the following +files will be created under /sys: +* /sys/bus/i2c/devices/<bus>-<address>/read +* /sys/bus/i2c/devices/<bus>-<address>/write +where bus is the I2C bus number (0, 1, ...) and address is the four-digit +hexadecimal representation of the 7-bit I2C address of the PCF8575 +(0020 .. 0027). + +The read file is read-only. 
Reading it will trigger an I2C read and will hence +report the current input state for the pins configured as inputs, and the +current output value for the pins configured as outputs. + +The write file is read-write. Writing a value to it will configure all pins +as output for which the corresponding bit is zero. Reading the write file will +return the value last written, or -EAGAIN if no value has yet been written to +the write file. + +On module initialization the configuration of the chip is not changed -- the +chip is left in the state it was already configured in through either power-up +or through previous I2C write actions. diff --git a/Documentation/i2c/i2c-stub b/Documentation/i2c/i2c-stub index 89e69ad3436c..0d8be1c20c16 100644 --- a/Documentation/i2c/i2c-stub +++ b/Documentation/i2c/i2c-stub @@ -25,6 +25,9 @@ The typical use-case is like this: 3. load the target sensors chip driver module 4. observe its behavior in the kernel log +There's a script named i2c-stub-from-dump in the i2c-tools package which +can load register values automatically from a chip dump. + PARAMETERS: int chip_addr[10]: @@ -32,9 +35,6 @@ int chip_addr[10]: CAVEATS: -There are independent arrays for byte/data and word/data commands. Depending -on if/how a target driver mixes them, you'll need to be careful. - If your target driver polls some byte or word waiting for it to change, the stub could lock it up. Use i2cset to unlock it. diff --git a/Documentation/i2c/summary b/Documentation/i2c/summary index 003c7319b8c7..13ab076dcd92 100644 --- a/Documentation/i2c/summary +++ b/Documentation/i2c/summary @@ -1,5 +1,3 @@ -This is an explanation of what i2c is, and what is supported in this package. - I2C and SMBus ============= @@ -33,52 +31,17 @@ When we talk about I2C, we use the following terms: Client An Algorithm driver contains general code that can be used for a whole class -of I2C adapters. Each specific adapter driver depends on one algorithm -driver. +of I2C adapters. 
Each specific adapter driver either depends on one algorithm +driver, or includes its own implementation. A Driver driver (yes, this sounds ridiculous, sorry) contains the general code to access some type of device. Each detected device gets its own data in the Client structure. Usually, Driver and Client are more closely integrated than Algorithm and Adapter. -For a given configuration, you will need a driver for your I2C bus (usually -a separate Adapter and Algorithm driver), and drivers for your I2C devices -(usually one driver for each device). There are no I2C device drivers -in this package. See the lm_sensors project http://www.lm-sensors.nu -for device drivers. +For a given configuration, you will need a driver for your I2C bus, and +drivers for your I2C devices (usually one driver for each device). At this time, Linux only operates I2C (or SMBus) in master mode; you can't use these APIs to make a Linux system behave as a slave/device, either to speak a custom protocol or to emulate some other device. - - -Included Bus Drivers -==================== -Note that only stable drivers are patched into the kernel by 'mkpatch'. 
- - -Base modules ------------- - -i2c-core: The basic I2C code, including the /proc/bus/i2c* interface -i2c-dev: The /dev/i2c-* interface -i2c-proc: The /proc/sys/dev/sensors interface for device (client) drivers - -Algorithm drivers ------------------ - -i2c-algo-bit: A bit-banging algorithm -i2c-algo-pcf: A PCF 8584 style algorithm -i2c-algo-ibm_ocp: An algorithm for the I2C device in IBM 4xx processors (NOT BUILT BY DEFAULT) - -Adapter drivers ---------------- - -i2c-elektor: Elektor ISA card (uses i2c-algo-pcf) -i2c-elv: ELV parallel port adapter (uses i2c-algo-bit) -i2c-pcf-epp: PCF8584 on a EPP parallel port (uses i2c-algo-pcf) (NOT mkpatched) -i2c-philips-par: Philips style parallel port adapter (uses i2c-algo-bit) -i2c-adap-ibm_ocp: IBM 4xx processor I2C device (uses i2c-algo-ibm_ocp) (NOT BUILT BY DEFAULT) -i2c-pport: Primitive parallel port adapter (uses i2c-algo-bit) -i2c-velleman: Velleman K8000 parallel port adapter (uses i2c-algo-bit) - diff --git a/Documentation/i2c/writing-clients b/Documentation/i2c/writing-clients index 2c170032bf37..bfb0a5520817 100644 --- a/Documentation/i2c/writing-clients +++ b/Documentation/i2c/writing-clients @@ -267,9 +267,9 @@ insmod parameter of the form force_<kind>. Fortunately, as a module writer, you just have to define the `normal_i2c' parameter. 
The complete declaration could look like this: - /* Scan 0x37, and 0x48 to 0x4f */ - static unsigned short normal_i2c[] = { 0x37, 0x48, 0x49, 0x4a, 0x4b, 0x4c, - 0x4d, 0x4e, 0x4f, I2C_CLIENT_END }; + /* Scan 0x4c to 0x4f */ + static const unsigned short normal_i2c[] = { 0x4c, 0x4d, 0x4e, 0x4f, + I2C_CLIENT_END }; /* Magic definition of all other variables and things */ I2C_CLIENT_INSMOD; diff --git a/Documentation/ia64/aliasing-test.c b/Documentation/ia64/aliasing-test.c index 773a814d4093..d23610fb2ff9 100644 --- a/Documentation/ia64/aliasing-test.c +++ b/Documentation/ia64/aliasing-test.c @@ -16,6 +16,7 @@ #include <fcntl.h> #include <fnmatch.h> #include <string.h> +#include <sys/ioctl.h> #include <sys/mman.h> #include <sys/stat.h> #include <unistd.h> @@ -65,7 +66,7 @@ int scan_tree(char *path, char *file, off_t offset, size_t length, int touch) { struct dirent **namelist; char *name, *path2; - int i, n, r, rc, result = 0; + int i, n, r, rc = 0, result = 0; struct stat buf; n = scandir(path, &namelist, 0, alphasort); @@ -113,7 +114,7 @@ skip: free(namelist[i]); } free(namelist); - return rc; + return result; } char buf[1024]; @@ -149,7 +150,7 @@ int scan_rom(char *path, char *file) { struct dirent **namelist; char *name, *path2; - int i, n, r, rc, result = 0; + int i, n, r, rc = 0, result = 0; struct stat buf; n = scandir(path, &namelist, 0, alphasort); @@ -180,7 +181,7 @@ int scan_rom(char *path, char *file) * important thing is that no MCA happened. 
*/ if (rc > 0) - fprintf(stderr, "PASS: %s read %ld bytes\n", path2, rc); + fprintf(stderr, "PASS: %s read %d bytes\n", path2, rc); else { fprintf(stderr, "PASS: %s not readable\n", path2); return rc; @@ -201,10 +202,10 @@ skip: free(namelist[i]); } free(namelist); - return rc; + return result; } -int main() +int main(void) { int rc; @@ -256,4 +257,6 @@ int main() scan_tree("/proc/bus/pci", "??.?", 0xA0000, 0x20000, 0); scan_tree("/proc/bus/pci", "??.?", 0xC0000, 0x40000, 1); scan_tree("/proc/bus/pci", "??.?", 0, 1024*1024, 0); + + return rc; } diff --git a/Documentation/ide.txt b/Documentation/ide.txt index 1d50f23a5cab..94e2e3b9e77f 100644 --- a/Documentation/ide.txt +++ b/Documentation/ide.txt @@ -30,7 +30,7 @@ *** *** The CMD640 is also used on some Vesa Local Bus (VLB) cards, and is *NOT* *** automatically detected by Linux. For safe, reliable operation with such -*** interfaces, one *MUST* use the "ide0=cmd640_vlb" kernel option. +*** interfaces, one *MUST* use the "cmd640.probe_vlb" kernel option. *** *** Use of the "serialize" option is no longer necessary. @@ -244,10 +244,6 @@ Summary of ide driver parameters for kernel command line "hdx=nodma" : disallow DMA - "hdx=swapdata" : when the drive is a disk, byte swap all data - - "hdx=bswap" : same as above.......... - "hdx=scsi" : the return of the ide-scsi flag, this is useful for allowing ide-floppy, ide-tape, and ide-cdrom|writers to use ide-scsi emulation on a device specific option. @@ -292,9 +288,6 @@ The following are valid ONLY on ide0, which usually corresponds to the first ATA interface found on the particular host, and the defaults for the base,ctl ports must not be altered. - "ide0=cmd640_vlb" : *REQUIRED* for VLB cards with the CMD640 chip - (not for PCI -- automatically detected) - "ide=doubler" : probe/support IDE doublers on Amiga There may be more options than shown -- use the source, Luke! @@ -310,6 +303,10 @@ i.e. 
to enable probing for ALI M14xx chipsets (ali14xx host driver) use: * "probe" module parameter when ali14xx driver is compiled as module ("modprobe ali14xx probe") +Also for legacy CMD640 host driver (cmd640) you need to use "probe_vlb" +kernel parameter to enable probing for VLB version of the chipset (PCI ones +are detected automatically). + ================================================================================ IDE ATAPI streaming tape driver diff --git a/Documentation/ide/ChangeLog.ide-cd.1994-2004 b/Documentation/ide/ChangeLog.ide-cd.1994-2004 new file mode 100644 index 000000000000..190d17bfff62 --- /dev/null +++ b/Documentation/ide/ChangeLog.ide-cd.1994-2004 @@ -0,0 +1,268 @@ +/* + * 1.00 Oct 31, 1994 -- Initial version. + * 1.01 Nov 2, 1994 -- Fixed problem with starting request in + * cdrom_check_status. + * 1.03 Nov 25, 1994 -- leaving unmask_intr[] as a user-setting (as for disks) + * (from mlord) -- minor changes to cdrom_setup() + * -- renamed ide_dev_s to ide_drive_t, enable irq on command + * 2.00 Nov 27, 1994 -- Generalize packet command interface; + * add audio ioctls. + * 2.01 Dec 3, 1994 -- Rework packet command interface to handle devices + * which send an interrupt when ready for a command. + * 2.02 Dec 11, 1994 -- Cache the TOC in the driver. + * Don't use SCMD_PLAYAUDIO_TI; it's not included + * in the current version of ATAPI. + * Try to use LBA instead of track or MSF addressing + * when possible. + * Don't wait for READY_STAT. + * 2.03 Jan 10, 1995 -- Rewrite block read routines to handle block sizes + * other than 2k and to move multiple sectors in a + * single transaction. + * 2.04 Apr 21, 1995 -- Add work-around for Creative Labs CD220E drives. + * Thanks to Nick Saw <cwsaw@pts7.pts.mot.com> for + * help in figuring this out. Ditto for Acer and + * Aztech drives, which seem to have the same problem.
+ * 2.04b May 30, 1995 -- Fix to match changes in ide.c version 3.16 -ml + * 2.05 Jun 8, 1995 -- Don't attempt to retry after an illegal request + * or data protect error. + * Use HWIF and DEV_HWIF macros as in ide.c. + * Always try to do a request_sense after + * a failed command. + * Include an option to give textual descriptions + * of ATAPI errors. + * Fix a bug in handling the sector cache which + * showed up if the drive returned data in 512 byte + * blocks (like Pioneer drives). Thanks to + * Richard Hirst <srh@gpt.co.uk> for diagnosing this. + * Properly supply the page number field in the + * MODE_SELECT command. + * PLAYAUDIO12 is broken on the Aztech; work around it. + * 2.05x Aug 11, 1995 -- lots of data structure renaming/restructuring in ide.c + * (my apologies to Scott, but now ide-cd.c is independent) + * 3.00 Aug 22, 1995 -- Implement CDROMMULTISESSION ioctl. + * Implement CDROMREADAUDIO ioctl (UNTESTED). + * Use input_ide_data() and output_ide_data(). + * Add door locking. + * Fix usage count leak in cdrom_open, which happened + * when a read-write mount was attempted. + * Try to load the disk on open. + * Implement CDROMEJECT_SW ioctl (off by default). + * Read total cdrom capacity during open. + * Rearrange logic in cdrom_decode_status. Issue + * request sense commands for failed packet commands + * from here instead of from cdrom_queue_packet_command. + * Fix a race condition in retrieving error information. + * Suppress printing normal unit attention errors and + * some drive not ready errors. + * Implement CDROMVOLREAD ioctl. + * Implement CDROMREADMODE1/2 ioctls. + * Fix race condition in setting up interrupt handlers + * when the `serialize' option is used. + * 3.01 Sep 2, 1995 -- Fix ordering of reenabling interrupts in + * cdrom_queue_request. + * Another try at using ide_[input,output]_data. + * 3.02 Sep 16, 1995 -- Stick total disk capacity in partition table as well. + * Make VERBOSE_IDE_CD_ERRORS dump failed command again. 
+ * Dump out more information for ILLEGAL REQUEST errs. + * Fix handling of errors occurring before the + * packet command is transferred. + * Fix transfers with odd bytelengths. + * 3.03 Oct 27, 1995 -- Some Creative drives have an id of just `CD'. + * `DCI-2S10' drives are broken too. + * 3.04 Nov 20, 1995 -- So are Vertos drives. + * 3.05 Dec 1, 1995 -- Changes to go with overhaul of ide.c and ide-tape.c + * 3.06 Dec 16, 1995 -- Add support needed for partitions. + * More workarounds for Vertos bugs (based on patches + * from Holger Dietze <dietze@aix520.informatik.uni-leipzig.de>). + * Try to eliminate byteorder assumptions. + * Use atapi_cdrom_subchnl struct definition. + * Add STANDARD_ATAPI compilation option. + * 3.07 Jan 29, 1996 -- More twiddling for broken drives: Sony 55D, + * Vertos 300. + * Add NO_DOOR_LOCKING configuration option. + * Handle drive_cmd requests w/NULL args (for hdparm -t). + * Work around sporadic Sony55e audio play problem. + * 3.07a Feb 11, 1996 -- check drive->id for NULL before dereferencing, to fix + * problem with "hde=cdrom" with no drive present. -ml + * 3.08 Mar 6, 1996 -- More Vertos workarounds. + * 3.09 Apr 5, 1996 -- Add CDROMCLOSETRAY ioctl. + * Switch to using MSF addressing for audio commands. + * Reformat to match kernel tabbing style. + * Add CDROM_GET_UPC ioctl. + * 3.10 Apr 10, 1996 -- Fix compilation error with STANDARD_ATAPI. + * 3.11 Apr 29, 1996 -- Patch from Heiko Eißfeldt <heiko@colossus.escape.de> + * to remove redundant verify_area calls. + * 3.12 May 7, 1996 -- Rudimentary changer support. Based on patches + * from Gerhard Zuber <zuber@berlin.snafu.de>. + * Let open succeed even if there's no loaded disc. + * 3.13 May 19, 1996 -- Fixes for changer code. + * 3.14 May 29, 1996 -- Add work-around for Vertos 600. + * (From Hennus Bergman <hennus@sky.ow.nl>.) 
+ * 3.15 July 2, 1996 -- Added support for Sanyo 3 CD changers + * from Ben Galliart <bgallia@luc.edu> with + * special help from Jeff Lightfoot + * <jeffml@pobox.com> + * 3.15a July 9, 1996 -- Improved Sanyo 3 CD changer identification + * 3.16 Jul 28, 1996 -- Fix from Gadi to reduce kernel stack usage for ioctl. + * 3.17 Sep 17, 1996 -- Tweak audio reads for some drives. + * Start changing CDROMLOADFROMSLOT to CDROM_SELECT_DISC. + * 3.18 Oct 31, 1996 -- Added module and DMA support. + * + * 4.00 Nov 5, 1996 -- New ide-cd maintainer, + * Erik B. Andersen <andersee@debian.org> + * -- Newer Creative drives don't always set the error + * register correctly. Make sure we see media changes + * regardless. + * -- Integrate with generic cdrom driver. + * -- CDROMGETSPINDOWN and CDROMSETSPINDOWN ioctls, based on + * a patch from Ciro Cattuto <>. + * -- Call set_device_ro. + * -- Implement CDROMMECHANISMSTATUS and CDROMSLOTTABLE + * ioctls, based on patch by Erik Andersen + * -- Add some probes of drive capability during setup. + * + * 4.01 Nov 11, 1996 -- Split into ide-cd.c and ide-cd.h + * -- Removed CDROMMECHANISMSTATUS and CDROMSLOTTABLE + * ioctls in favor of a generalized approach + * using the generic cdrom driver. + * -- Fully integrated with the 2.1.X kernel. + * -- Other stuff that I forgot (lots of changes) + * + * 4.02 Dec 01, 1996 -- Applied patch from Gadi Oxman <gadio@netvision.net.il> + * to fix the drive door locking problems. + * + * 4.03 Dec 04, 1996 -- Added DSC overlap support. 
+ * 4.04 Dec 29, 1996 -- Added CDROMREADRAW ioctl based on patch + * by Ales Makarov (xmakarov@sun.felk.cvut.cz) + * + * 4.05 Nov 20, 1997 -- Modified to print more drive info on init + * Minor other changes + * Fix errors on CDROMSTOP (If you have a "Dolphin", + * you must define IHAVEADOLPHIN) + * Added identifier so new Sanyo CD-changer works + * Better detection if door locking isn't supported + * + * 4.06 Dec 17, 1997 -- fixed endless "tray open" messages -ml + * 4.07 Dec 17, 1997 -- fallback to set pc->stat on "tray open" + * 4.08 Dec 18, 1997 -- spew less noise when tray is empty + * -- fix speed display for ACER 24X, 18X + * 4.09 Jan 04, 1998 -- fix handling of the last block so we return + * an end of file instead of an I/O error (Gadi) + * 4.10 Jan 24, 1998 -- fixed a bug so now changers can change to a new + * slot when there is no disc in the current slot. + * -- Fixed a memory leak where info->changer_info was + * malloc'ed but never free'd when closing the device. + * -- Cleaned up the global namespace a bit by making more + * functions static that should already have been. + * 4.11 Mar 12, 1998 -- Added support for the CDROM_SELECT_SPEED ioctl + * based on a patch for 2.0.33 by Jelle Foks + * <jelle@scintilla.utwente.nl>, a patch for 2.0.33 + * by Toni Giorgino <toni@pcape2.pi.infn.it>, the SCSI + * version, and my own efforts. -erik + * -- Fixed a stupid bug which egcs was kind enough to + * inform me of where "Illegal mode for this track" + * was never returned due to a comparison on data + * types of limited range. + * 4.12 Mar 29, 1998 -- Fixed bug in CDROM_SELECT_SPEED so write speed is + * now set only for CD-R and CD-RW drives. I had + * removed this support because it produced errors. + * It produced errors _only_ for non-writers. duh. + * 4.13 May 05, 1998 -- Suppress useless "in progress of becoming ready" + * messages, since this is not an error.
+ * -- Change error messages to be const + * -- Remove a "\t" which looks ugly in the syslogs + * 4.14 July 17, 1998 -- Change to pointing to .ps version of ATAPI spec + * since the .pdf version doesn't seem to work... + * -- Updated the TODO list to something more current. + * + * 4.15 Aug 25, 1998 -- Updated ide-cd.h to respect machine endianness, + * patch thanks to "Eddie C. Dost" <ecd@skynet.be> + * + * 4.50 Oct 19, 1998 -- New maintainers! + * Jens Axboe <axboe@image.dk> + * Chris Zwilling <chris@cloudnet.com> + * + * 4.51 Dec 23, 1998 -- Jens Axboe <axboe@image.dk> + * - ide_cdrom_reset enabled since the ide subsystem + * handles resets fine now. <axboe@image.dk> + * - Transfer size fix for Samsung CD-ROMs, thanks to + * "Ville Hallik" <ville.hallik@mail.ee>. + * - other minor stuff. + * + * 4.52 Jan 19, 1999 -- Jens Axboe <axboe@image.dk> + * - Detect DVD-ROM/RAM drives + * + * 4.53 Feb 22, 1999 - Include other model Samsung and one Goldstar + * drive in transfer size limit. + * - Fix the I/O error when doing eject without a medium + * loaded on some drives. + * - CDROMREADMODE2 is now implemented through + * CDROMREADRAW, since many drives don't support + * MODE2 (even though ATAPI 2.6 says they must). + * - Added ignore parameter to ide-cd (as a module), eg + * insmod ide-cd ignore='hda hdb' + * Useful when using ide-cd in conjunction with + * ide-scsi. TODO: non-modular way of doing the + * same. + * + * 4.54 Aug 5, 1999 - Support for MMC2 class commands through the generic + * packet interface to cdrom.c. + * - Unified audio ioctl support, most of it. + * - cleaned up various deprecated verify_area(). + * - Added ide_cdrom_packet() as the interface for + * the Uniform generic_packet(). + * - bunch of other stuff, will fill in logs later. + * - report 1 slot for non-changers, like the other + * cd-rom drivers. don't report select disc for + * non-changers as well. + * - mask out audio playing, if the device can't do it.
+ * + * 4.55 Sep 1, 1999 - Eliminated the rest of the audio ioctls, except + * for CDROMREADTOC[ENTRY|HEADER]. Some of the drivers + * use this independently of the actual audio handling. + * They will disappear later when I get the time to + * do it cleanly. + * - Minimize the TOC reading - only do it when we + * know a media change has occurred. + * - Moved all the CDROMREADx ioctls to the Uniform layer. + * - Heiko Eißfeldt <heiko@colossus.escape.de> supplied + * some fixes for CDI. + * - CD-ROM leaving door locked fix from Andries + * Brouwer <Andries.Brouwer@cwi.nl> + * - Erik Andersen <andersen@xmission.com> unified + * commands across the various drivers and how + * sense errors are handled. + * + * 4.56 Sep 12, 1999 - Removed changer support - it is now in the + * Uniform layer. + * - Added partition based multisession handling. + * - Mode sense and mode select moved to the + * Uniform layer. + * - Fixed a problem with WPI CDS-32X drive - it + * failed the capabilities + * + * 4.57 Apr 7, 2000 - Fixed sense reporting. + * - Fixed possible oops in ide_cdrom_get_last_session() + * - Fix locking mania and make ide_cdrom_reset relock + * - Stop spewing errors to log when magicdev polls with + * TEST_UNIT_READY on some drives. + * - Various fixes from Tobias Ringstrom: + * tray if it was locked prior to the reset. + * - cdrom_read_capacity returns one frame too little. + * - Fix real capacity reporting. + * + * 4.58 May 1, 2000 - Clean up ACER50 stuff. + * - Fix small problem with ide_cdrom_capacity + * + * 4.59 Aug 11, 2000 - Fix changer problem in cdrom_read_toc, we weren't + * correctly sensing a disc change. 
+ * - Rearranged some code + * - Use extended sense on drives that support it for + * correctly reporting tray status -- from + * Michael D Johnson <johnsom@orst.edu> + * 4.60 Dec 17, 2003 - Add mt rainier support + * - Bump timeout for packet commands, matches sr + * - Odd stuff + * 4.61 Jan 22, 2004 - support hardware sector sizes other than 2kB, + * Pascal Schmidt <der.eremit@email.de> + */ diff --git a/Documentation/ide/ChangeLog.ide-floppy.1996-2002 b/Documentation/ide/ChangeLog.ide-floppy.1996-2002 new file mode 100644 index 000000000000..46c19ef32a9e --- /dev/null +++ b/Documentation/ide/ChangeLog.ide-floppy.1996-2002 @@ -0,0 +1,63 @@ +/* + * Many thanks to Lode Leroy <Lode.Leroy@www.ibase.be>, who tested so many + * ALPHA patches to this driver on an EASYSTOR LS-120 ATAPI floppy drive. + * + * Ver 0.1 Oct 17 96 Initial test version, mostly based on ide-tape.c. + * Ver 0.2 Oct 31 96 Minor changes. + * Ver 0.3 Dec 2 96 Fixed error recovery bug. + * Ver 0.4 Jan 26 97 Add support for the HDIO_GETGEO ioctl. + * Ver 0.5 Feb 21 97 Add partitions support. + * Use the minimum of the LBA and CHS capacities. + * Avoid hwgroup->rq == NULL on the last irq. + * Fix potential null dereferencing with DEBUG_LOG. + * Ver 0.8 Dec 7 97 Increase irq timeout from 10 to 50 seconds. + * Add media write-protect detection. + * Issue START command only if TEST UNIT READY fails. + * Add work-around for IOMEGA ZIP revision 21.D. + * Remove idefloppy_get_capabilities(). + * Ver 0.9 Jul 4 99 Fix a bug which might have caused the number of + * bytes requested on each interrupt to be zero. + * Thanks to <shanos@es.co.nz> for pointing this out. + * Ver 0.9.sv Jan 6 01 Sam Varshavchik <mrsam@courier-mta.com> + * Implement low level formatting. Reimplemented + * IDEFLOPPY_CAPABILITIES_PAGE, since we need the srfp + * bit. My LS-120 drive barfs on + * IDEFLOPPY_CAPABILITIES_PAGE, but maybe it's just me. + * Compromise by not reporting a failure to get this + * mode page. 
Implemented four IOCTLs in order to + * implement formatting. IOCTls begin with 0x4600, + * 0x46 is 'F' as in Format. + * Jan 9 01 Userland option to select format verify. + * Added PC_SUPPRESS_ERROR flag - some idefloppy drives + * do not implement IDEFLOPPY_CAPABILITIES_PAGE, and + * return a sense error. Suppress error reporting in + * this particular case in order to avoid spurious + * errors in syslog. The culprit is + * idefloppy_get_capability_page(), so move it to + * idefloppy_begin_format() so that it's not used + * unless absolutely necessary. + * If drive does not support format progress indication + * monitor the dsc bit in the status register. + * Also, O_NDELAY on open will allow the device to be + * opened without a disk available. This can be used to + * open an unformatted disk, or get the device capacity. + * Ver 0.91 Dec 11 99 Added IOMEGA Clik! drive support by + * <paul@paulbristow.net> + * Ver 0.92 Oct 22 00 Paul Bristow became official maintainer for this + * driver. Included Powerbook internal zip kludge. + * Ver 0.93 Oct 24 00 Fixed bugs for Clik! drive + * no disk on insert and disk change now works + * Ver 0.94 Oct 27 00 Tidied up to remove strstr(Clik) everywhere + * Ver 0.95 Nov 7 00 Brought across to kernel 2.4 + * Ver 0.96 Jan 7 01 Actually in line with release version of 2.4.0 + * including set_bit patch from Rusty Russell + * Ver 0.97 Jul 22 01 Merge 0.91-0.96 onto 0.9.sv for ac series + * Ver 0.97.sv Aug 3 01 Backported from 2.4.7-ac3 + * Ver 0.98 Oct 26 01 Split idefloppy_transfer_pc into two pieces to + * fix a lost interrupt problem. It appears the busy + * bit was being deasserted by my IOMEGA ATAPI ZIP 100 + * drive before the drive was actually ready. + * Ver 0.98a Oct 29 01 Expose delay value so we can play. + * Ver 0.99 Feb 24 02 Remove duplicate code, modify clik! 
detection code + * to support new PocketZip drives + */ diff --git a/Documentation/ide/ChangeLog.ide-tape.1995-2002 b/Documentation/ide/ChangeLog.ide-tape.1995-2002 new file mode 100644 index 000000000000..877fac8770b3 --- /dev/null +++ b/Documentation/ide/ChangeLog.ide-tape.1995-2002 @@ -0,0 +1,257 @@ +/* + * Ver 0.1 Nov 1 95 Pre-working code :-) + * Ver 0.2 Nov 23 95 A short backup (few megabytes) and restore procedure + * was successful ! (Using tar cvf ... on the block + * device interface). + * A longer backup resulted in major swapping, bad + * overall Linux performance and eventually failed as + * we received non serial read-ahead requests from the + * buffer cache. + * Ver 0.3 Nov 28 95 Long backups are now possible, thanks to the + * character device interface. Linux's responsiveness + * and performance doesn't seem to be much affected + * from the background backup procedure. + * Some general mtio.h magnetic tape operations are + * now supported by our character device. As a result, + * popular tape utilities are starting to work with + * ide tapes :-) + * The following configurations were tested: + * 1. An IDE ATAPI TAPE shares the same interface + * and irq with an IDE ATAPI CDROM. + * 2. An IDE ATAPI TAPE shares the same interface + * and irq with a normal IDE disk. + * Both configurations seemed to work just fine ! + * However, to be on the safe side, it is meanwhile + * recommended to give the IDE TAPE its own interface + * and irq. + * The one thing which needs to be done here is to + * add a "request postpone" feature to ide.c, + * so that we won't have to wait for the tape to finish + * performing a long media access (DSC) request (such + * as a rewind) before we can access the other device + * on the same interface. 
This effect doesn't disturb + * normal operation most of the time because read/write + * requests are relatively fast, and once we are + * performing one tape r/w request, a lot of requests + * from the other device can be queued and ide.c will + * service all of them after this single tape request. + * Ver 1.0 Dec 11 95 Integrated into Linux 1.3.46 development tree. + * On each read / write request, we now ask the drive + * if we can transfer a constant number of bytes + * (a parameter of the drive) only to its buffers, + * without causing actual media access. If we can't, + * we just wait until we can by polling the DSC bit. + * This ensures that while we are not transferring + * more bytes than the constant referred to above, the + * interrupt latency will not become too high and + * we won't cause an interrupt timeout, as happened + * occasionally in the previous version. + * While polling for DSC, the current request is + * postponed and ide.c is free to handle requests from + * the other device. This is handled transparently to + * ide.c. The hwgroup locking method which was used + * in the previous version was removed. + * Use of new general features which are provided by + * ide.c for use with atapi devices. + * (Programming done by Mark Lord) + * Few potential bug fixes (Again, suggested by Mark) + * Single character device data transfers are now + * not limited in size, as they were before. + * We are asking the tape about its recommended + * transfer unit and send a larger data transfer + * as several transfers of the above size. + * For best results, use an integral number of this + * basic unit (which is shown during driver + * initialization). I will soon add an ioctl to get + * this important parameter. + * Our data transfer buffer is allocated on startup, + * rather than before each data transfer. This should + * ensure that we will indeed have a data buffer. 
+ * Ver 1.1 Dec 14 95 Fixed random problems which occurred when the tape + * shared an interface with another device. + * (poll_for_dsc was a complete mess). + * Removed some old (non-active) code which had + * to do with supporting buffer cache originated + * requests. + * The block device interface can now be opened, so + * that general ide driver features like the unmask + * interrupts flag can be selected with an ioctl. + * This is the only use of the block device interface. + * New fast pipelined operation mode (currently only on + * writes). When using the pipelined mode, the + * throughput can potentially reach the maximum + * tape supported throughput, regardless of the + * user backup program. On my tape drive, it sometimes + * boosted performance by a factor of 2. Pipelined + * mode is enabled by default, but since it has a few + * downfalls as well, you may want to disable it. + * A short explanation of the pipelined operation mode + * is available below. + * Ver 1.2 Jan 1 96 Eliminated pipelined mode race condition. + * Added pipeline read mode. As a result, restores + * are now as fast as backups. + * Optimized shared interface behavior. The new behavior + * typically results in better IDE bus efficiency and + * higher tape throughput. + * Pre-calculation of the expected read/write request + * service time, based on the tape's parameters. In + * the pipelined operation mode, this allows us to + * adjust our polling frequency to a much lower value, + * and thus to dramatically reduce our load on Linux, + * without any decrease in performance. + * Implemented additional mtio.h operations. + * The recommended user block size is returned by + * the MTIOCGET ioctl. + * Additional minor changes. + * Ver 1.3 Feb 9 96 Fixed pipelined read mode bug which prevented the + * use of some block sizes during a restore procedure. 
+ * The character device interface will now present a + * continuous view of the media - any mix of block sizes + * during a backup/restore procedure is supported. The + * driver will buffer the requests internally and + * convert them to the tape's recommended transfer + * unit, making performance almost independent of the + * chosen user block size. + * Some improvements in error recovery. + * By cooperating with ide-dma.c, bus mastering DMA can + * now sometimes be used with IDE tape drives as well. + * Bus mastering DMA has the potential to dramatically + * reduce the CPU's overhead when accessing the device, + * and can be enabled by using hdparm -d1 on the tape's + * block device interface. For more info, read the + * comments in ide-dma.c. + * Ver 1.4 Mar 13 96 Fixed serialize support. + * Ver 1.5 Apr 12 96 Fixed shared interface operation, broken in 1.3.85. + * Fixed pipelined read mode inefficiency. + * Fixed nasty null dereferencing bug. + * Ver 1.6 Aug 16 96 Fixed FPU usage in the driver. + * Fixed end of media bug. + * Ver 1.7 Sep 10 96 Minor changes for the CONNER CTT8000-A model. + * Ver 1.8 Sep 26 96 Attempt to find a better balance between good + * interactive response and high system throughput. + * Ver 1.9 Nov 5 96 Automatically cross encountered filemarks rather + * than requiring an explicit FSF command. + * Abort pending requests at end of media. + * MTTELL was sometimes returning incorrect results. + * Return the real block size in the MTIOCGET ioctl. + * Some error recovery bug fixes. + * Ver 1.10 Nov 5 96 Major reorganization. + * Reduced CPU overhead a bit by eliminating internal + * bounce buffers. + * Added module support. + * Added multiple tape drives support. + * Added partition support. + * Rewrote DSC handling. + * Some portability fixes. + * Removed ide-tape.h. + * Additional minor changes. + * Ver 1.11 Dec 2 96 Bug fix in previous DSC timeout handling. + * Use ide_stall_queue() for DSC overlap. 
+ * Use the maximum speed rather than the current speed + * to compute the request service time. + * Ver 1.12 Dec 7 97 Fix random memory overwriting and/or last block data + * corruption, which could occur if the total number + * of bytes written to the tape was not an integral + * number of tape blocks. + * Add support for INTERRUPT DRQ devices. + * Ver 1.13 Jan 2 98 Add "speed == 0" work-around for HP COLORADO 5GB + * Ver 1.14 Dec 30 98 Partial fixes for the Sony/AIWA tape drives. + * Replace cli()/sti() with hwgroup spinlocks. + * Ver 1.15 Mar 25 99 Fix SMP race condition by replacing hwgroup + * spinlock with private per-tape spinlock. + * Ver 1.16 Sep 1 99 Add OnStream tape support. + * Abort read pipeline on EOD. + * Wait for the tape to become ready in case it returns + * "in the process of becoming ready" on open(). + * Fix zero padding of the last written block in + * case the tape block size is larger than PAGE_SIZE. + * Decrease the default disconnection time to tn. + * Ver 1.16e Oct 3 99 Minor fixes. + * Ver 1.16e1 Oct 13 99 Patches by Arnold Niessen, + * niessen@iae.nl / arnold.niessen@philips.com + * GO-1) Undefined code in idetape_read_position + * according to Gadi's email + * AJN-1) Minor fix asc == 11 should be asc == 0x11 + * in idetape_issue_packet_command (did effect + * debugging output only) + * AJN-2) Added more debugging output, and + * added ide-tape: where missing. 
I would also + * like to add tape->name where possible + * AJN-3) Added different debug_level's + * via /proc/ide/hdc/settings + * "debug_level" determines amount of debugging output; + * can be changed using /proc/ide/hdx/settings + * 0 : almost no debugging output + * 1 : 0+output errors only + * 2 : 1+output all sensekey/asc + * 3 : 2+follow all chrdev related procedures + * 4 : 3+follow all procedures + * 5 : 4+include pc_stack rq_stack info + * 6 : 5+USE_COUNT updates + * AJN-4) Fixed timeout for retension in idetape_queue_pc_tail + * from 5 to 10 minutes + * AJN-5) Changed maximum number of blocks to skip when + * reading tapes with multiple consecutive write + * errors from 100 to 1000 in idetape_get_logical_blk + * Proposed changes to code: + * 1) output "logical_blk_num" via /proc + * 2) output "current_operation" via /proc + * 3) Either solve or document the fact that `mt rewind' is + * required after reading from /dev/nhtx to be + * able to rmmod the idetape module; + * Also, sometimes an application finishes but the + * device remains `busy' for some time. Same cause ? + * Proposed changes to release-notes: + * 4) write a simple `quickstart' section in the + * release notes; I volunteer if you don't want to + * 5) include a pointer to video4linux in the doc + * to stimulate video applications + * 6) release notes lines 331 and 362: explain what happens + * if the application data rate is higher than 1100 KB/s; + * similar approach to lower-than-500 kB/s ? + * 7) 6.6 Comparison; wouldn't it be better to allow different + * strategies for read and write ? + * Wouldn't it be better to control the tape buffer + * contents instead of the bandwidth ? + * 8) line 536: replace will by would (if I understand + * this section correctly, a hypothetical and unwanted situation + * is being described) + * Ver 1.16f Dec 15 99 Change place of the secondary OnStream header frames.
+ * Ver 1.17 Nov 2000 / Jan 2001 Marcel Mol, marcel@mesa.nl + * - Add idetape_onstream_mode_sense_tape_parameter_page + * function to get tape capacity in frames: tape->capacity. + * - Add support for DI-50 drives( or any DI- drive). + * - 'workaround' for read error/blank block around block 3000. + * - Implement Early warning for end of media for Onstream. + * - Cosmetic code changes for readability. + * - Idetape_position_tape should not use SKIP bit during + * Onstream read recovery. + * - Add capacity, logical_blk_num and first/last_frame_position + * to /proc/ide/hd?/settings. + * - Module use count was gone in the Linux 2.4 driver. + * Ver 1.17a Apr 2001 Willem Riede osst@riede.org + * - Get drive's actual block size from mode sense block descriptor + * - Limit size of pipeline + * Ver 1.17b Oct 2002 Alan Stern <stern@rowland.harvard.edu> + * Changed IDETAPE_MIN_PIPELINE_STAGES to 1 and actually used + * it in the code! + * Actually removed aborted stages in idetape_abort_pipeline + * instead of just changing the command code. + * Made the transfer byte count for Request Sense equal to the + * actual length of the data transfer. + * Changed handling of partial data transfers: they do not + * cause DMA errors. + * Moved initiation of DMA transfers to the correct place. + * Removed reference to unallocated memory. + * Made __idetape_discard_read_pipeline return the number of + * sectors skipped, not the number of stages. + * Replaced errant kfree() calls with __idetape_kfree_stage(). + * Fixed off-by-one error in testing the pipeline length. + * Fixed handling of filemarks in the read pipeline. + * Small code optimization for MTBSF and MTBSFM ioctls. + * Don't try to unlock the door during device close if it is + * already unlocked! + * Cosmetic fixes to miscellaneous debugging output messages. + * Set the minimum /proc/ide/hd?/settings values for "pipeline", + * "pipeline_min", and "pipeline_max" to 1.
+ */ diff --git a/Documentation/ide/ide-tape.txt b/Documentation/ide/ide-tape.txt new file mode 100644 index 000000000000..658f271a373f --- /dev/null +++ b/Documentation/ide/ide-tape.txt @@ -0,0 +1,146 @@ +/* + * IDE ATAPI streaming tape driver. + * + * This driver is a part of the Linux ide driver. + * + * The driver, in co-operation with ide.c, basically traverses the + * request-list for the block device interface. The character device + * interface, on the other hand, creates new requests, adds them + * to the request-list of the block device, and waits for their completion. + * + * Pipelined operation mode is now supported on both reads and writes. + * + * The block device major and minor numbers are determined from the + * tape's relative position in the ide interfaces, as explained in ide.c. + * + * The character device interface consists of the following devices: + * + * ht0 major 37, minor 0 first IDE tape, rewind on close. + * ht1 major 37, minor 1 second IDE tape, rewind on close. + * ... + * nht0 major 37, minor 128 first IDE tape, no rewind on close. + * nht1 major 37, minor 129 second IDE tape, no rewind on close. + * ... + * + * The general magnetic tape commands compatible interface, as defined by + * include/linux/mtio.h, is accessible through the character device. + * + * General ide driver configuration options, such as the interrupt-unmask + * flag, can be configured by issuing an ioctl to the block device interface, + * as any other ide device. + * + * Our own ide-tape ioctl's can be issued to either the block device or + * the character device interface. + * + * Maximal throughput with minimal bus load will usually be achieved in the + * following scenario: + * + * 1. ide-tape is operating in the pipelined operation mode. + * 2. No buffering is performed by the user backup program. + * + * Testing was done with a 2 GB CONNER CTMA 4000 IDE ATAPI Streaming Tape Drive. 
+ * + * Here are some words from the first releases of hd.c, which are quoted + * in ide.c and apply here as well: + * + * | Special care is recommended. Have Fun! + * + * + * An overview of the pipelined operation mode. + * + * In the pipelined write mode, we will usually just add requests to our + * pipeline and return immediately, before we even start to service them. The + * user program will then have enough time to prepare the next request while + * we are still busy servicing previous requests. In the pipelined read mode, + * the situation is similar - we add read-ahead requests into the pipeline, + * before the user even requested them. + * + * The pipeline can be viewed as a "safety net" which will be activated when + * the system load is high and prevents the user backup program from keeping up + * with the current tape speed. At this point, the pipeline will get + * shorter and shorter but the tape will still be streaming at the same speed. + * Assuming we have enough pipeline stages, the system load will hopefully + * decrease before the pipeline is completely empty, and the backup program + * will be able to "catch up" and refill the pipeline again. + * + * When using the pipelined mode, it would be best to disable any type of + * buffering done by the user program, as ide-tape already provides all the + * benefits in the kernel, where it can be done in a more efficient way. + * As we will usually not block the user program on a request, the most + * efficient user code will then be a simple read-write-read-... cycle. + * Any additional logic will usually just slow down the backup process. + * + * Using the pipelined mode, I get a constant over 400 KBps throughput, + * which seems to be the maximum throughput supported by my tape. + * + * However, there are some downfalls: + * + * 1. We use memory (for data buffers) in proportion to the number + * of pipeline stages (each stage is about 26 KB with my tape). + * 2.
In the pipelined write mode, we cheat and postpone error codes + * to the user task. In read mode, the actual tape position + * will be a bit further than the last requested block. + * + * Concerning (1): + * + * 1. We allocate stages dynamically only when we need them. When + * we don't need them, we don't consume additional memory. In + * case we can't allocate stages, we just manage without them + * (at the expense of decreased throughput) so when Linux is + * tight in memory, we will not pose additional difficulties. + * + * 2. The maximum number of stages (which is, in fact, the maximum + * amount of memory) which we allocate is limited by the compile + * time parameter IDETAPE_MAX_PIPELINE_STAGES. + * + * 3. The maximum number of stages is a controlled parameter - We + * don't start from the user defined maximum number of stages + * but from the lower IDETAPE_MIN_PIPELINE_STAGES (again, we + * will not even allocate this amount of stages if the user + * program can't handle the speed). We then implement a feedback + * loop which checks if the pipeline is empty, and if it is, we + * increase the maximum number of stages as necessary until we + * reach the optimum value which just manages to keep the tape + * busy with minimum allocated memory or until we reach + * IDETAPE_MAX_PIPELINE_STAGES. + * + * Concerning (2): + * + * In pipelined write mode, ide-tape can not return accurate error codes + * to the user program since we usually just add the request to the + * pipeline without waiting for it to be serviced. In case an error + * occurs, I will report it on the next user request. + * + * In the pipelined read mode, subsequent read requests or forward + * filemark spacing will perform correctly, as we preserve all blocks + * and filemarks which we encountered during our excess read-ahead. + * + * For accurate tape positioning and error reporting, disabling + * pipelined mode might be the best option. 
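The feedback loop described under "Concerning (1)" can be sketched roughly as follows. This is only an illustration in Python, not the driver's code: the limits MIN_STAGES and MAX_STAGES are made-up stand-ins for IDETAPE_MIN_PIPELINE_STAGES and IDETAPE_MAX_PIPELINE_STAGES, and the function names and the step size are hypothetical.

```python
# Hypothetical limits for illustration only; the driver takes the real
# values from IDETAPE_MIN_PIPELINE_STAGES and IDETAPE_MAX_PIPELINE_STAGES.
MIN_STAGES = 2
MAX_STAGES = 8


def next_stage_limit(current_limit, pipeline_empty, step=1):
    """Raise the stage cap by `step` when the pipeline ran dry, up to MAX_STAGES."""
    if pipeline_empty:
        return min(current_limit + step, MAX_STAGES)
    return current_limit


def simulate(empty_flags):
    """Replay a sequence of 'pipeline was empty' checks; return the cap after each."""
    limit = MIN_STAGES  # start from the lower bound, not the maximum
    caps = []
    for empty in empty_flags:
        limit = next_stage_limit(limit, empty)
        caps.append(limit)
    return caps


if __name__ == "__main__":
    print(simulate([True, False, True, True]))  # prints [3, 3, 4, 5]
```

The cap only grows when the pipeline is observed empty, so memory use stays near the minimum that keeps the tape streaming, which is the trade-off the text describes.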
+ * + * You can enable/disable/tune the pipelined operation mode by adjusting + * the compile time parameters below. + * + * + * Possible improvements. + * + * 1. Support for the ATAPI overlap protocol. + * + * In order to maximize bus throughput, we currently use the DSC + * overlap method which enables ide.c to service requests from the + * other device while the tape is busy executing a command. The + * DSC overlap method involves polling the tape's status register + * for the DSC bit, and servicing the other device while the tape + * isn't ready. + * + * In the current QIC development standard (December 1995), + * it is recommended that new tape drives will *in addition* + * implement the ATAPI overlap protocol, which is used for the + * same purpose - efficient use of the IDE bus, but is interrupt + * driven and thus has much less CPU overhead. + * + * ATAPI overlap is likely to be supported in most new ATAPI + * devices, including new ATAPI cdroms, and thus provides us + * a method by which we can achieve higher throughput when + * sharing a (fast) ATA-2 disk with any (slow) new ATAPI device. + */ diff --git a/Documentation/initrd.txt b/Documentation/initrd.txt index 74f68b35f7c1..1ba84f3584e3 100644 --- a/Documentation/initrd.txt +++ b/Documentation/initrd.txt @@ -85,7 +85,7 @@ involve special block devices or loopbacks; you merely create a directory on disk with the desired initrd content, cd to that directory, and run (as an example): -find . | cpio --quiet -c -o | gzip -9 -n > /boot/imagefile.img +find . | cpio --quiet -H newc -o | gzip -9 -n > /boot/imagefile.img Examining the contents of an existing image file is just as simple: diff --git a/Documentation/ioctl-number.txt b/Documentation/ioctl-number.txt index 5c7fbf9d96b4..c18363bd8d11 100644 --- a/Documentation/ioctl-number.txt +++ b/Documentation/ioctl-number.txt @@ -138,6 +138,7 @@ Code Seq# Include File Comments 'm' 00-1F net/irda/irmod.h conflict! 
'n' 00-7F linux/ncp_fs.h 'n' E0-FF video/matrox.h matroxfb +'o' 00-1F fs/ocfs2/ocfs2_fs.h OCFS2 'p' 00-0F linux/phantom.h conflict! (OpenHaptics needs this) 'p' 00-3F linux/mc146818rtc.h conflict! 'p' 40-7F linux/nvram.h diff --git a/Documentation/ja_JP/HOWTO b/Documentation/ja_JP/HOWTO index d9d832c010ef..488c77fa3aae 100644 --- a/Documentation/ja_JP/HOWTO +++ b/Documentation/ja_JP/HOWTO @@ -11,14 +11,14 @@ for non English (read: Japanese) speakers and is not intended as a fork. So if you have any comments or updates for this file, please try to update the original English file first. -Last Updated: 2007/09/23 +Last Updated: 2007/11/16 ================================== これは、 -linux-2.6.23/Documentation/HOWTO +linux-2.6.24/Documentation/HOWTO の和訳です。 翻訳団体: JF プロジェクト < http://www.linux.or.jp/JF/ > -翻訳日: 2007/09/19 +翻訳日: 2007/11/10 翻訳者: Tsugikazu Shibata <tshibata at ab dot jp dot nec dot com> 校正者: 松倉さん <nbh--mats at nifty dot com> 小林 雅典さん (Masanori Kobayasi) <zap03216 at nifty dot ne dot jp> @@ -110,7 +110,7 @@ Linux カーネルソースツリーは幅広い範囲のドキュメントを 新しいドキュメントファイルも追加することを勧めます。 カーネルの変更が、カーネルがユーザ空間に公開しているインターフェイスの 変更を引き起こす場合、その変更を説明するマニュアルページのパッチや情報 -をマニュアルページのメンテナ mtk-manpages@gmx.net に送ることを勧めま +をマニュアルページのメンテナ mtk.manpages@gmail.com に送ることを勧めま す。 以下はカーネルソースツリーに含まれている読んでおくべきファイルの一覧で diff --git a/Documentation/ja_JP/stable_kernel_rules.txt b/Documentation/ja_JP/stable_kernel_rules.txt new file mode 100644 index 000000000000..17d87519e468 --- /dev/null +++ b/Documentation/ja_JP/stable_kernel_rules.txt @@ -0,0 +1,79 @@ +NOTE: +This is Japanese translated version of "Documentation/stable_kernel_rules.txt". +This one is maintained by Tsugikazu Shibata <tshibata@ab.jp.nec.com> +and JF Project team <www.linux.or.jp/JF>. +If you find difference with original file or problem in translation, +please contact maintainer of this file or JF project. + +Please also note that purpose of this file is easier to read for non +English natives and do no intended to fork. 
So, if you have any
+comment or update of this file, please try to update Original(English)
+file at first.
+
+==================================
+これは、
+linux-2.6.24/Documentation/stable_kernel_rules.txt
+の和訳です。
+
+翻訳団体： JF プロジェクト < http://www.linux.or.jp/JF/ >
+翻訳日： 2007/12/30
+翻訳者： Tsugikazu Shibata <tshibata at ab dot jp dot nec dot com>
+校正者： 武井伸光さん、<takei at webmasters dot gr dot jp>
+         かねこさん (Seiji Kaneko) <skaneko at a2 dot mbn dot or dot jp>
+         小林 雅典さん (Masanori Kobayasi) <zap03216 at nifty dot ne dot jp>
+         野口さん (Kenji Noguchi) <tokyo246 at gmail dot com>
+         神宮信太郎さん <jin at libjingu dot jp>
+==================================
+
+ずっと知りたかった Linux 2.6 -stable リリースの全て
+
+"-stable" ツリーにどのような種類のパッチが受け入れられるか、どのような
+ものが受け入れられないか、についての規則-
+
+ - 明らかに正しく、テストされているものでなければならない。
+ - 文脈(変更行の前後)を含めて 100 行より大きくてはいけない。
+ - ただ一個のことだけを修正しているべき。
+ - 皆を悩ませている本物のバグを修正しなければならない。("これはバグで
+   あるかもしれないが..." のようなものではない)
+ - ビルドエラー(CONFIG_BROKENになっているものを除く), oops, ハング、デー
+   タ破壊、現実のセキュリティ問題、その他 "ああ、これはダメだね"という
+   ようなものを修正しなければならない。短く言えば、重大な問題。
+ - どのように競合状態が発生するかの説明も一緒に書かれていない限り、
+   "理論的には競合状態になる"ようなものは不可。
+ - いかなる些細な修正も含めることはできない。(スペルの修正、空白のクリー
+   ンアップなど)
+ - 対応するサブシステムメンテナが受け入れたものでなければならない。
+ - Documentation/SubmittingPatches の規則に従ったものでなければならない。
+
+-stable ツリーにパッチを送付する手続き-
+
+ - 上記の規則に従っているかを確認した後に、stable@kernel.org にパッチ
+   を送る。
+ - 送信者はパッチがキューに受け付けられた際には ACK を、却下された場合
+   には NAK を受け取る。この反応は開発者たちのスケジュールによって、数
+   日かかる場合がある。
+ - もし受け取られたら、パッチは他の開発者たちのレビューのために
+   -stable キューに追加される。
+ - セキュリティパッチはこのエイリアス (stable@kernel.org) に送られるべ
+   きではなく、代わりに security@kernel.org のアドレスに送られる。
+
+レビューサイクル-
+
+ - -stable メンテナがレビューサイクルを決めるとき、パッチはレビュー委
+   員会とパッチが影響する領域のメンテナ(提供者がその領域のメンテナで無
+   い限り)に送られ、linux-kernel メーリングリストにCCされる。
+ - レビュー委員会は 48時間の間に ACK か NAK を出す。
+ - もしパッチが委員会のメンバから却下されるか、メンテナ達やメンバが気付
+   かなかった問題が持ちあがり、linux-kernel メンバがパッチに異議を唱え
+   た場合には、パッチはキューから削除される。
+ - レビューサイクルの最後に、ACK を受けたパッチは最新の -stable リリー
+   スに追加され、その後に新しい -stable リリースが行われる。
+ - セキュリティパッチは、通常のレビューサイクルを通らず、セキュリティ
+   カーネルチームから直接 -stable ツリーに受け付けられる。
+   この手続きの詳細については kernel security チームに問い合わせること。
+
+レビュー委員会-
+
+ - この委員会は、このタスクについて活動する多くのボランティアと、少数の
+   非ボランティアのカーネル開発者達で構成されている。
+
diff --git a/Documentation/kbuild/kconfig-language.txt b/Documentation/kbuild/kconfig-language.txt index 616043a6da99..649cb8799890 100644 --- a/Documentation/kbuild/kconfig-language.txt +++ b/Documentation/kbuild/kconfig-language.txt @@ -24,7 +24,7 @@ visible if its parent entry is also visible. Menu entries ------------ -Most entries define a config option, all other entries help to organize +Most entries define a config option; all other entries help to organize them. A single configuration option is defined like this: config MODVERSIONS @@ -50,7 +50,7 @@ applicable everywhere (see syntax). - type definition: "bool"/"tristate"/"string"/"hex"/"int" Every config option must have a type. There are only two basic types: - tristate and string, the other types are based on these two. The type + tristate and string; the other types are based on these two. The type definition optionally accepts an input prompt, so these two examples are equivalent: @@ -108,7 +108,7 @@ applicable everywhere (see syntax). equal to 'y' without visiting the dependencies. So abusing select you are able to select a symbol FOO even if FOO depends on BAR that is not set. In general use select only for - non-visible symbols (no promts anywhere) and for symbols with + non-visible symbols (no prompts anywhere) and for symbols with no dependencies. That will limit the usefulness but on the other hand avoid the illegal configurations all over. kconfig should one day warn about such things. @@ -127,6 +127,27 @@ applicable everywhere (see syntax). used to help visually separate configuration logic from help within the file as an aid to developers. +- misc options: "option" <symbol>[=<value>] + Various less common options can be defined via this option syntax, + which can modify the behaviour of the menu entry and its config + symbol. 
These options are currently possible: + + - "defconfig_list" + This declares a list of default entries which can be used when + looking for the default configuration (which is used when the main + .config doesn't exist yet). + + - "modules" + This declares the symbol to be used as the MODULES symbol, which + enables the third modular state for all config symbols. + + - "env"=<value> + This imports the environment variable into Kconfig. It behaves like + a default, except that the value comes from the environment; this + also means that the behaviour when mixing it with normal defaults is + undefined at this point. The symbol is currently not exported back + to the build environment (if this is desired, it can be done via + another symbol). Menu dependencies ----------------- @@ -162,9 +183,9 @@ An expression can have a value of 'n', 'm' or 'y' (or 0, 1, 2 respectively for calculations). A menu entry becomes visible when its expression evaluates to 'm' or 'y'. -There are two types of symbols: constant and nonconstant symbols. -Nonconstant symbols are the most common ones and are defined with the -'config' statement. Nonconstant symbols consist entirely of alphanumeric +There are two types of symbols: constant and non-constant symbols. +Non-constant symbols are the most common ones and are defined with the +'config' statement. Non-constant symbols consist entirely of alphanumeric characters or underscores. Constant symbols are only part of expressions. Constant symbols are always surrounded by single or double quotes. Within the quote, any @@ -301,3 +322,81 @@ mainmenu: This sets the config program's title bar if the config program chooses to use it. + + +Kconfig hints +------------- +This is a collection of Kconfig tips, most of which aren't obvious at +first glance and most of which have become idioms in several Kconfig +files. 
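The tristate arithmetic mentioned above ('n', 'm', 'y' treated as 0, 1, 2) can be illustrated with a small Python sketch. It assumes the operator rules given in the "Menu dependencies" section of kconfig-language.txt ('&&' as minimum, '||' as maximum, '!' as 2 minus the value, '=' yielding 'y' or 'n'); the helper names are hypothetical and this is not how the kconfig tools are implemented.

```python
# Tristate values as used for Kconfig calculations.
N, M, Y = 0, 1, 2  # 'n', 'm', 'y'


def k_and(a, b):
    """'&&' evaluates to the smaller of the two operands."""
    return min(a, b)


def k_or(a, b):
    """'||' evaluates to the larger of the two operands."""
    return max(a, b)


def k_not(a):
    """'!' evaluates to 2 minus the operand, so 'm' stays 'm'."""
    return 2 - a


def k_eq(a, b):
    """'=' evaluates to 'y' when both sides match, otherwise 'n'."""
    return Y if a == b else N


def visible(value):
    """A menu entry is visible when its expression is 'm' or 'y'."""
    return value in (M, Y)


if __name__ == "__main__":
    print(visible(k_and(Y, M)))  # prints True
    print(visible(k_and(Y, N)))  # prints False
```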
+ +Adding common features and making the usage configurable +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +It is a common idiom to implement a feature/functionality that is +relevant for some architectures but not all. +The recommended way to do so is to use a config variable named HAVE_* +that is defined in a common Kconfig file and selected by the relevant +architectures. +An example is the generic IOMAP functionality. + +In lib/Kconfig we would see: + +# Generic IOMAP is used to ... +config HAVE_GENERIC_IOMAP + +config GENERIC_IOMAP + depends on HAVE_GENERIC_IOMAP && FOO + +And in lib/Makefile we would see: +obj-$(CONFIG_GENERIC_IOMAP) += iomap.o + +For each architecture using the generic IOMAP functionality we would see: + +config X86 + select ... + select HAVE_GENERIC_IOMAP + select ... + +Note: we use the existing config option and avoid creating a new +config variable to select HAVE_GENERIC_IOMAP. + +Note: the internal config variable HAVE_GENERIC_IOMAP is +introduced to overcome the limitation of select, which will force a +config option to 'y' no matter the dependencies. +The dependencies are moved to the symbol GENERIC_IOMAP and we avoid the +situation where select forces a symbol to 'y'. + +Build as module only +~~~~~~~~~~~~~~~~~~~~ +To restrict a component build to module-only, qualify its config symbol +with "depends on m". E.g.: + +config FOO + depends on BAR && m + +limits FOO to module (=m) or disabled (=n). + + +Build limited by a third config symbol which may be =y or =m +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +A common idiom that we see (and sometimes have problems with) is this: + +When option C in B (module or subsystem) uses interfaces from A (module +or subsystem), and both A and B are tristate (could be =y or =m if they +were independent of each other, but they aren't), then we need to limit +C such that it cannot be built statically if A is built as a loadable +module. 
(C already depends on B, so there is no dependency issue to +take care of here.) + +If A is linked statically into the kernel image, C can be built +statically or as loadable module(s). However, if A is built as loadable +module(s), then C must be restricted to loadable module(s) also. This +can be expressed in kconfig language as: + +config C + depends on A = y || A = B + +or for real examples, use this command in a kernel tree: + +$ find . -name Kconfig\* | xargs grep -ns "depends on.*=.*||.*=" | grep -v orig + diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 33121d6c827c..8fd5aa40585f 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -34,6 +34,7 @@ parameter is applicable: ALSA ALSA sound support is enabled. APIC APIC support is enabled. APM Advanced Power Management support is enabled. + AVR32 AVR32 architecture is enabled. AX25 Appropriate AX.25 support is enabled. BLACKFIN Blackfin architecture is enabled. DRM Direct Rendering Management support is enabled. @@ -167,6 +168,11 @@ and is between 256 and 4096 characters. It is defined in the file acpi_irq_isa= [HW,ACPI] If irq_balance, mark listed IRQs used by ISA Format: <irq>,<irq>... + acpi_new_pts_ordering [HW,ACPI] + Enforce the ACPI 2.0 ordering of the _PTS control + method wrt putting devices into low power states + default: pre ACPI 2.0 ordering of _PTS + acpi_no_auto_ssdt [HW,ACPI] Disable automatic loading of SSDT acpi_os_name= [HW,ACPI] Tell ACPI BIOS the name of the OS @@ -369,7 +375,8 @@ and is between 256 and 4096 characters. It is defined in the file configured. Potentially dangerous and should only be used if you are entirely sure of the consequences. - chandev= [HW,NET] Generic channel device initialisation + ccw_timeout_log [S390] + See Documentation/s390/CommonIO for details. checkreqprot [SELINUX] Set initial checkreqprot flag value. 
Format: { "0" | "1" } @@ -381,6 +388,12 @@ and is between 256 and 4096 characters. It is defined in the file Value can be changed at runtime via /selinux/checkreqprot. + cio_ignore= [S390] + See Documentation/s390/CommonIO for details. + + cio_msg= [S390] + See Documentation/s390/CommonIO for details. + clock= [BUGS=X86-32, HW] gettimeofday clocksource override. [Deprecated] Forces specified clocksource (if available) to be used @@ -408,8 +421,21 @@ and is between 256 and 4096 characters. It is defined in the file [SPARC64] tick [X86-64] hpet,tsc - code_bytes [IA32] How many bytes of object code to print in an - oops report. + clearcpuid=BITNUM [X86] + Disable CPUID feature X for the kernel. See + include/asm-x86/cpufeature.h for the valid bit numbers. + Note the Linux specific bits are not necessarily + stable over kernel options, but the vendor specific + ones should be. + Also note that user programs calling CPUID directly + or using the feature without checking anything + will still see it. This just prevents it from + being used by the kernel or shown in /proc/cpuinfo. + Also note the kernel might malfunction if you disable + some critical bits. + + code_bytes [IA32/X86_64] How many bytes of object code to print + in an oops report. Range: 0 - 8192 Default: 64 @@ -523,33 +549,34 @@ and is between 256 and 4096 characters. It is defined in the file 1 will print _a lot_ more information - normally only useful to kernel developers. - decnet= [HW,NET] + decnet.addr= [HW,NET] Format: <area>[,<node>] See also Documentation/networking/decnet.txt. - default_blu= [VT] + vt.default_blu= [VT] Format: <blue0>,<blue1>,<blue2>,...,<blue15> Change the default blue palette of the console. This is a 16-member array composed of values ranging from 0-255. - default_grn= [VT] + vt.default_grn= [VT] Format: <green0>,<green1>,<green2>,...,<green15> Change the default green palette of the console. This is a 16-member array composed of values ranging from 0-255. 
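As an illustration of the vt.default_red=, vt.default_grn= and vt.default_blu= formats in this hunk (16 comma-separated values in the range 0-255 each), the following Python sketch builds the three options from a list of RGB triples. The function name and the sample palette are hypothetical; only the parameter names and the value range come from the text.

```python
def vt_palette_params(palette):
    """Build vt.default_red/grn/blu= options from 16 (r, g, b) tuples."""
    if len(palette) != 16:
        raise ValueError("the console palette has exactly 16 entries")
    for rgb in palette:
        if len(rgb) != 3 or any(not 0 <= c <= 255 for c in rgb):
            raise ValueError(f"invalid colour {rgb!r}")
    names = ("vt.default_red", "vt.default_grn", "vt.default_blu")
    # One option per colour channel, each listing that channel for all 16 entries.
    return " ".join(
        f"{name}=" + ",".join(str(rgb[i]) for rgb in palette)
        for i, name in enumerate(names)
    )


if __name__ == "__main__":
    # Made-up greyscale ramp, purely for demonstration.
    ramp = [(17 * i, 17 * i, 17 * i) for i in range(16)]
    print(vt_palette_params(ramp))
```

The resulting string would be appended to the kernel command line; validating the length and range up front avoids passing a malformed array.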
- default_red= [VT] + vt.default_red= [VT] Format: <red0>,<red1>,<red2>,...,<red15> Change the default red palette of the console. This is a 16-member array composed of values ranging from 0-255. - default_utf8= [VT] + vt.default_utf8= + [VT] Format=<0|1> Set system-wide default UTF-8 mode for all tty's. - Default is 0 and by setting to 1, it enables UTF-8 - mode for all newly opened or allocated terminals. + Default is 1, i.e. UTF-8 mode is enabled for all + newly opened terminals. dhash_entries= [KNL] Set number of hash buckets for dentry cache. @@ -561,6 +588,12 @@ and is between 256 and 4096 characters. It is defined in the file See drivers/char/README.epca and Documentation/digiepca.txt. + disable_mtrr_trim [X86, Intel and AMD only] + By default the kernel will trim any uncacheable + memory out of your available memory pool based on + MTRR settings. This parameter disables that behavior, + possibly causing your machine to run very slowly. + dmasound= [HW,OSS] Sound subsystem buffers dscc4.setup= [NET] @@ -651,6 +684,10 @@ and is between 256 and 4096 characters. It is defined in the file gamma= [HW,DRM] + gart_fix_e820= [X86_64] disable the fix e820 for K8 GART + Format: off | on + default: on + gdth= [HW,SCSI] See header of drivers/scsi/gdth.c. @@ -685,6 +722,7 @@ and is between 256 and 4096 characters. It is defined in the file See Documentation/isdn/README.HiSax. hugepages= [HW,X86-32,IA-64] Maximal number of HugeTLB pages. + hugepagesz= [HW,IA-64,PPC] The size of the HugeTLB pages. i8042.direct [HW] Put keyboard port into non-translated mode i8042.dumbkbd [HW] Pretend that controller can only read data from @@ -742,6 +780,9 @@ and is between 256 and 4096 characters. It is defined in the file loop use the MONITOR/MWAIT idle loop anyways. Performance should be the same as idle=poll. + ide-pci-generic.all-generic-ide [HW] (E)IDE subsystem + Claim all unknown PCI IDE storage controllers. 
+ ignore_loglevel [KNL] Ignore loglevel setting - this will print /all/ kernel messages to the console. Useful for debugging. @@ -785,6 +826,16 @@ and is between 256 and 4096 characters. It is defined in the file for translation below 32 bit and if not available then look in the higher range. + io_delay= [X86-32,X86-64] I/O delay method + 0x80 + Standard port 0x80 based delay + 0xed + Alternate port 0xed based delay (needed on some systems) + udelay + Simple two microseconds delay + none + No delay + io7= [HW] IO7 for Marvel based alpha systems See comment before marvel_specify_io7 in arch/alpha/kernel/core_marvel.c. @@ -882,6 +933,14 @@ and is between 256 and 4096 characters. It is defined in the file lapic_timer_c2_ok [X86-32,x86-64,APIC] trust the local apic timer in C2 power state. + libata.dma= [LIBATA] DMA control + libata.dma=0 Disable all PATA and SATA DMA + libata.dma=1 PATA and SATA Disk DMA only + libata.dma=2 ATAPI (CDROM) DMA only + libata.dma=4 Compact Flash DMA only + Combinations also work, so libata.dma=3 enables DMA + for disks and CDROMs, but not CFs. + libata.noacpi [LIBATA] Disables use of ACPI in libata suspend/resume when set. Format: <int> @@ -1042,6 +1101,11 @@ and is between 256 and 4096 characters. It is defined in the file Multi-Function General Purpose Timers on AMD Geode platforms. + mfgptfix [X86-32] Fix MFGPT timers on AMD Geode platforms when + the BIOS has incorrectly applied a workaround. TinyBIOS + version 0.98 is known to be affected, 0.99 fixes the + problem by letting the user disable the workaround. + mga= [HW,DRM] mousedev.tap_time= @@ -1114,6 +1178,10 @@ and is between 256 and 4096 characters. It is defined in the file of returning the full 64-bit number. The default is to return 64-bit inode numbers. + nmi_debug= [KNL,AVR32] Specify one or more actions to take + when a NMI is triggered. 
+ Format: [state][,regs][,debounce][,die] + nmi_watchdog= [KNL,BUGS=X86-32] Debugging features for SMP kernels no387 [BUGS=X86-32] Tells the kernel to use the 387 maths @@ -1138,6 +1206,8 @@ and is between 256 and 4096 characters. It is defined in the file nodisconnect [HW,SCSI,M68K] Disables SCSI disconnects. + noefi [X86-32,X86-64] Disable EFI runtime services support. + noexec [IA-64] noexec [X86-32,X86-64] @@ -1148,6 +1218,8 @@ and is between 256 and 4096 characters. It is defined in the file register save and restore. The kernel will only save legacy floating-point registers on task switch. + noclflush [BUGS=X86] Don't use the CLFLUSH instruction + nohlt [BUGS=ARM] no-hlt [BUGS=X86-32] Tells the kernel that the hlt @@ -1492,14 +1564,17 @@ and is between 256 and 4096 characters. It is defined in the file ramdisk_size= [RAM] Sizes of RAM disks in kilobytes See Documentation/ramdisk.txt. - rcu.blimit= [KNL,BOOT] Set maximum number of finished - RCU callbacks to process in one batch. + rcupdate.blimit= [KNL,BOOT] + Set maximum number of finished RCU callbacks to process + in one batch. - rcu.qhimark= [KNL,BOOT] Set threshold of queued + rcupdate.qhimark= [KNL,BOOT] + Set threshold of queued RCU callbacks over which batch limiting is disabled. - rcu.qlowmark= [KNL,BOOT] Set threshold of queued - RCU callbacks below which batch limiting is re-enabled. + rcupdate.qlowmark= [KNL,BOOT] + Set threshold of queued RCU callbacks below which + batch limiting is re-enabled. rdinit= [KNL] Format: <full_path> @@ -1584,7 +1659,13 @@ and is between 256 and 4096 characters. It is defined in the file Format: <vendor>:<model>:<flags> (flags are integer value) - scsi_logging= [SCSI] + scsi_logging_level= [SCSI] a bit mask of logging levels + See drivers/scsi/scsi_logging.h for bits. Also + settable via sysctl at dev.scsi.logging_level + (/proc/sys/dev/scsi/logging_level). 
+ There is also a nice 'scsi_logging_level' script in the + S390-tools package, available for download at + http://www-128.ibm.com/developerworks/linux/linux390/s390-tools-1.5.4.html scsi_mod.scan= [SCSI] sync (default) scans SCSI busses as they are discovered. async scans them in kernel threads, @@ -1813,9 +1894,6 @@ and is between 256 and 4096 characters. It is defined in the file st= [HW,SCSI] SCSI tape parameters (buffers, etc.) See Documentation/scsi/st.txt. - st0x= [HW,SCSI] - See header of drivers/scsi/seagate.c. - sti= [PARISC,HW] Format: <num> Set the STI (builtin display/keyboard on the HP-PARISC @@ -1900,9 +1978,6 @@ and is between 256 and 4096 characters. It is defined in the file tipar.delay= [HW,PPT] Set inter-bit delay in microseconds (default 10). - tmc8xx= [HW,SCSI] - See header of drivers/scsi/seagate.c. - tmscsim= [HW,SCSI] See comment before function dc390_setup() in drivers/scsi/tmscsim.c. @@ -1951,6 +2026,11 @@ and is between 256 and 4096 characters. It is defined in the file vdso=1: enable VDSO (default) vdso=0: disable VDSO mapping + vdso32= [X86-32,X86-64] + vdso32=2: enable compat VDSO (default with COMPAT_VDSO) + vdso32=1: enable 32-bit VDSO (default) + vdso32=0: disable 32-bit VDSO mapping + vector= [IA-64,SMP] vector=percpu: enable percpu vector domain diff --git a/Documentation/ko_KR/HOWTO b/Documentation/ko_KR/HOWTO index b51d7ca842ba..029fca914c05 100644 --- a/Documentation/ko_KR/HOWTO +++ b/Documentation/ko_KR/HOWTO @@ -1,6 +1,6 @@ NOTE: This is a version of Documentation/HOWTO translated into korean -This document is maintained by minchan Kim < minchan.kim@gmail.com> +This document is maintained by minchan Kim <minchan.kim@gmail.com> If you find any difference between this document and the original file or a problem with the translation, please contact the maintainer of this file. @@ -14,7 +14,7 @@ try to update the original English file first. Documentation/HOWTO 의 한글 번역입니다. 
-역자: 김민찬 <minchan.kim@gmail.com > +역자: 김민찬 <minchan.kim@gmail.com> 감수: 이제이미 <jamee.lee@samsung.com> ================================== @@ -23,11 +23,11 @@ Documentation/HOWTO 이 문서는 커널 개발에 있어 가장 중요한 문서이다. 이 문서는 리눅스 커널 개발자가 되는 법과 리눅스 커널 개발 커뮤니티와 일하는 -법을 담고있다. 커널 프로그래밍의기술적인 측면과 관련된 내용들은 -포함하지 않으려고 하였지만 올바으로 여러분을 안내하는 데 도움이 +법을 담고있다. 커널 프로그래밍의 기술적인 측면과 관련된 내용들은 +포함하지 않으려고 하였지만 올바른 길로 여러분을 안내하는 데는 도움이 될 것이다. -이 문서에서 오래된 것을 발견하면 문서의 아래쪽에 나열된 메인트너에게 +이 문서에서 오래된 것을 발견하면 문서의 아래쪽에 나열된 메인테이너에게 패치를 보내달라. @@ -36,12 +36,12 @@ Documentation/HOWTO 자, 여러분은 리눅스 커널 개발자가 되는 법을 배우고 싶은가? 아니면 상사로부터"이 장치를 위한 리눅스 드라이버를 작성하시오"라는 말을 -들었는가? 이 문서는 여러분이 겪게 될 과정과 커뮤니티와 일하는 법을 -조언하여 여러분의 목적을 달성하기 위해 필요한 것 모두를 알려주는 -것이다. +들었는가? 이 문서의 목적은 여러분이 겪게 될 과정과 커뮤니티와 협력하는 +법을 조언하여 여러분의 목적을 달성하기 위해 필요한 것 모두를 알려주기 +위함이다. -커널은 대부분은 C로 작성되었어고 몇몇 아키텍쳐의 의존적인 부분은 -어셈블리로 작성되었다. 커널 개발을 위해 C를 잘 이해하고 있어야 한다. +커널은 대부분은 C로 작성되어 있고 몇몇 아키텍쳐의 의존적인 부분은 +어셈블리로 작성되어 있다. 커널 개발을 위해 C를 잘 이해하고 있어야 한다. 여러분이 특정 아키텍쳐의 low-level 개발을 할 것이 아니라면 어셈블리(특정 아키텍쳐)는 잘 알아야 할 필요는 없다. 다음의 참고서적들은 기본에 충실한 C 교육이나 수년간의 경험에 견주지는 @@ -59,11 +59,11 @@ Documentation/HOWTO 어떤 참고문서도 있지 않다. 정보를 얻기 위해서는 gcc info (`info gcc`)페이지를 살펴보라. -여러분은 기존의 개발 커뮤니티와 일하는 법을 배우려고 하고 있다는 것을 -기억하라. 코딩, 스타일, 절차에 관한 훌륭한 표준을 가진 사람들이 모인 +여러분은 기존의 개발 커뮤니티와 협력하는 법을 배우려고 하고 있다는 것을 +기억하라. 코딩, 스타일, 함수에 관한 훌륭한 표준을 가진 사람들이 모인 다양한 그룹이 있다. 이 표준들은 오랜동안 크고 지역적으로 분산된 팀들에 -의해 가장 좋은 방법으로 일하기위하여 찾은 것을 기초로 만들어져왔다. -그 표준들은 문서화가 잘 되어 있기 때문에 가능한한 미리 많은 표준들에 +의해 가장 좋은 방법으로 일하기 위하여 찾은 것을 기초로 만들어져 왔다. +그 표준들은 문서화가 잘 되어있기 때문에 가능한한 미리 많은 표준들에 관하여 배우려고 시도하라. 다른 사람들은 여러분이나 여러분의 회사가 일하는 방식에 적응하는 것을 원하지는 않는다. @@ -73,7 +73,7 @@ Documentation/HOWTO 리눅스 커널 소스 코드는 GPL로 배포(release)되었다. 소스트리의 메인 디렉토리에 있는 라이센스에 관하여 상세하게 쓰여 있는 COPYING이라는 -파일을 봐라.여러분이 라이센스에 관한 더 깊은 문제를 가지고 있다면 +파일을 봐라. 여러분이 라이센스에 관한 더 깊은 문제를 가지고 있다면 리눅스 커널 메일링 리스트에 묻지말고 변호사와 연락하라. 메일링 리스트들에 있는 사람들은 변호사가 아니기 때문에 법적 문제에 관하여 그들의 말에 의지해서는 안된다. @@ -85,12 +85,12 @@ GPL에 관한 잦은 질문들과 답변들은 다음을 참조하라. 
문서 ---- -리눅스 커널 소스 트리는 커널 커뮤니티와 일하는 법을 배우기 위한 많은 -귀중한 문서들을 가지고 있다. 새로운 기능들이 커널에 들어가게 될 때, +리눅스 커널 소스 트리는 커널 커뮤니티와 협력하는 법을 배우기위해 훌륭한 +다양한 문서들을 가지고 있다. 새로운 기능들이 커널에 들어가게 될 때, 그 기능을 어떻게 사용하는지에 관한 설명을 위하여 새로운 문서 파일을 추가하는 것을 권장한다. 커널이 유저스페이스로 노출하는 인터페이스를 변경하게 되면 변경을 설명하는 메뉴얼 페이지들에 대한 패치나 정보를 -mtk-manpages@gmx.net의 메인트너에게 보낼 것을 권장한다. +mtk.manpages@gmail.com의 메인테이너에게 보낼 것을 권장한다. 다음은 커널 소스 트리에 있는 읽어야 할 파일들의 리스트이다. README @@ -105,7 +105,7 @@ mtk-manpages@gmx.net의 메인트너에게 보낼 것을 권장한다. Documentation/CodingStyle 이 문서는 리눅스 커널 코딩 스타일과 그렇게 한 몇몇 이유를 설명한다. 모든 새로운 코드는 이 문서에 가이드라인들을 따라야 한다. 대부분의 - 메인트너들은 이 규칙을 따르는 패치들만을 받아들일 것이고 많은 사람들이 + 메인테이너들은 이 규칙을 따르는 패치들만을 받아들일 것이고 많은 사람들이 그 패치가 올바른 스타일일 경우만 코드를 검토할 것이다. Documentation/SubmittingPatches @@ -115,9 +115,10 @@ mtk-manpages@gmx.net의 메인트너에게 보낼 것을 권장한다. - Email 내용들 - Email 양식 - 그것을 누구에게 보낼지 - 이러한 규칙들을 따르는 것이 성공을 보장하진 않는다(왜냐하면 모든 - 패치들은 내용과 스타일에 관하여 면밀히 검토되기 때문이다). - 그러나 규칙을 따르지 않는다면 거의 성공하지도 못할 것이다. + 이러한 규칙들을 따르는 것이 성공(역자주: 패치가 받아들여 지는 것)을 + 보장하진 않는다(왜냐하면 모든 패치들은 내용과 스타일에 관하여 + 면밀히 검토되기 때문이다). 그러나 규칙을 따르지 않는다면 거의 + 성공하지도 못할 것이다. 올바른 패치들을 만드는 법에 관한 훌륭한 다른 문서들이 있다. "The Perfect Patch" @@ -126,13 +127,13 @@ mtk-manpages@gmx.net의 메인트너에게 보낼 것을 권장한다. http://linux.yyz.us/patch-format.html Documentation/stable_api_nonsense.txt - 이 문서는 의도적으로 커널이 변하지 않는 API를 갖지 않도록 결정한 + 이 문서는 의도적으로 커널이 불변하는 API를 갖지 않도록 결정한 이유를 설명하며 다음과 같은 것들을 포함한다. - 서브시스템 shim-layer(호환성을 위해?) - - 운영 체제들 간의 드라이버 이식성 + - 운영체제들간의 드라이버 이식성 - 커널 소스 트리내에 빠른 변화를 늦추는 것(또는 빠른 변화를 막는 것) 이 문서는 리눅스 개발 철학을 이해하는데 필수적이며 다른 운영체제에서 - 리눅스로 옮겨오는 사람들에게는 매우 중요하다. + 리눅스로 전향하는 사람들에게는 매우 중요하다. Documentation/SecurityBugs @@ -141,10 +142,10 @@ mtk-manpages@gmx.net의 메인트너에게 보낼 것을 권장한다. 도와 달라. Documentation/ManagementStyle - 이 문서는 리눅스 커널 메인트너들이 어떻게 그들의 방법론의 정신을 - 어떻게 공유하고 운영하는지를 설명한다. 이것은 커널 개발에 입문하는 + 이 문서는 리눅스 커널 메인테이너들이 그들의 방법론에 녹아 있는 + 정신을 어떻게 공유하고 운영하는지를 설명한다. 이것은 커널 개발에 입문하는 모든 사람들(또는 커널 개발에 작은 호기심이라도 있는 사람들)이 - 읽어야 할 중요한 문서이다. 왜냐하면 이 문서는 커널 메인트너들의 + 읽어야 할 중요한 문서이다. 
왜냐하면 이 문서는 커널 메인테이너들의 독특한 행동에 관하여 흔히 있는 오해들과 혼란들을 해소하고 있기 때문이다. @@ -160,7 +161,7 @@ mtk-manpages@gmx.net의 메인트너에게 보낼 것을 권장한다. Documentation/applying-patches.txt 패치가 무엇이며 그것을 커널의 다른 개발 브랜치들에 어떻게 - 적용하는지에 관하여 자세히 설명 하고 있는 좋은 입문서이다. + 적용하는지에 관하여 자세히 설명하고 있는 좋은 입문서이다. 커널은 소스 코드 그 자체에서 자동적으로 만들어질 수 있는 많은 문서들을 가지고 있다. 이것은 커널 내의 API에 대한 모든 설명, 그리고 락킹을 @@ -192,7 +193,7 @@ Documentation/DocBook/ 디렉토리 내에서 만들어지며 PDF, Postscript, H 여러분이 어디서 시작해야 할진 모르지만 커널 개발 커뮤니티에 참여할 수 있는 일들을 찾길 원한다면 리눅스 커널 Janitor 프로젝트를 살펴봐라. http://janitor.kernelnewbies.org/ -그곳은 시작하기에 아주 딱 좋은 곳이다. 그곳은 리눅스 커널 소스 트리내에 +그곳은 시작하기에 훌륭한 장소이다. 그곳은 리눅스 커널 소스 트리내에 간단히 정리되고 수정될 수 있는 문제들에 관하여 설명한다. 여러분은 이 프로젝트를 대표하는 개발자들과 일하면서 자신의 패치를 리눅스 커널 트리에 반영하기 위한 기본적인 것들을 배우게 될것이며 여러분이 아직 아이디어를 @@ -212,7 +213,7 @@ Documentation/DocBook/ 디렉토리 내에서 만들어지며 PDF, Postscript, H 것은 Linux Cross-Reference project이며 그것은 자기 참조 방식이며 소스코드를 인덱스된 웹 페이지들의 형태로 보여준다. 최신의 멋진 커널 코드 저장소는 다음을 통하여 참조할 수 있다. - http://sosdg.org/~coywolf/lxr/ + http://users.sosdg.org/~qiyong/lxr/ 개발 프로세스 @@ -233,44 +234,45 @@ Documentation/DocBook/ 디렉토리 내에서 만들어지며 PDF, Postscript, H 2.6.x 커널들은 Linux Torvalds가 관리하며 kernel.org의 pub/linux/kernel/v2.6/ 디렉토리에서 참조될 수 있다.개발 프로세스는 다음과 같다. - 새로운 커널이 배포되자마자 2주의 시간이 주어진다. 이 기간동은 - 메인트너들은 큰 diff들을 Linus에게 제출할 수 있다. 대개 이 패치들은 + 메인테이너들은 큰 diff들을 Linus에게 제출할 수 있다. 대개 이 패치들은 몇 주 동안 -mm 커널내에 이미 있었던 것들이다. 큰 변경들을 제출하는 데 선호되는 방법은 git(커널의 소스 관리 툴, 더 많은 정보들은 http://git.or.cz/ - 에서 참조할 수 있다)를 사용하는 것이지만 순수한 패치파일의 형식으로 보내도 + 에서 참조할 수 있다)를 사용하는 것이지만 순수한 패치파일의 형식으로 보내는 것도 무관하다. - 2주 후에 -rc1 커널이 배포되며 지금부터는 전체 커널의 안정성에 영향을 - 미칠수 있는 새로운 기능들을 포함하지 않는 패치들만을 추가될 수 있다. + 미칠수 있는 새로운 기능들을 포함하지 않는 패치들만이 추가될 수 있다. 완전히 새로운 드라이버(혹은 파일시스템)는 -rc1 이후에만 받아들여진다는 것을 기억해라. 왜냐하면 변경이 자체내에서만 발생하고 추가된 코드가 드라이버 외부의 다른 부분에는 영향을 주지 않으므로 그런 변경은 - 퇴보(regression)를 일으킬 만한 위험을 가지고 있지 않기 때문이다. -rc1이 + 회귀(역자주: 이전에는 존재하지 않았지만 새로운 기능추가나 변경으로 인해 + 생겨난 버그)를 일으킬 만한 위험을 가지고 있지 않기 때문이다. -rc1이 배포된 이후에 git를 사용하여 패치들을 Linus에게 보낼수 있지만 패치들은 공식적인 메일링 리스트로 보내서 검토를 받을 필요가 있다. 
- - 새로운 -rc는 Linus는 현재 git tree가 테스트 하기에 충분히 안정된 상태에 + - 새로운 -rc는 Linus가 현재 git tree가 테스트 하기에 충분히 안정된 상태에 있다고 판단될 때마다 배포된다. 목표는 새로운 -rc 커널을 매주 배포하는 것이다. - - 이러한 프로세스는 커널이 "준비"되었다고 여겨질때까지 계속된다. + - 이러한 프로세스는 커널이 "준비(ready)"되었다고 여겨질때까지 계속된다. 프로세스는 대체로 6주간 지속된다. - - 각 -rc 배포에 있는 알려진 퇴보의 목록들은 다음 URI에 남겨진다. + - 각 -rc 배포에 있는 알려진 회귀의 목록들은 다음 URI에 남겨진다. http://kernelnewbies.org/known_regressions 커널 배포에 있어서 언급할만한 가치가 있는 리눅스 커널 메일링 리스트의 Andrew Morton의 글이 있다. - "커널이 언제 배포될지는 아무로 모른다. 왜냐하면 배포는 알려진 + "커널이 언제 배포될지는 아무도 모른다. 왜냐하면 배포는 알려진 버그의 상황에 따라 배포되는 것이지 미리정해 놓은 시간에 따라 - 배포되는 것은 아니기 때문이다." + 배포되는 것은 아니기 때문이다." 2.6.x.y - 안정 커널 트리 ------------------------ 4 자리 숫자로 이루어진 버젼의 커널들은 -stable 커널들이다. 그것들은 2.6.x -커널에서 발견된 큰 퇴보들이나 보안 문제들 중 비교적 작고 중요한 수정들을 +커널에서 발견된 큰 회귀들이나 보안 문제들 중 비교적 작고 중요한 수정들을 포함한다. 이것은 가장 최근의 안정적인 커널을 원하는 사용자에게 추천되는 브랜치이며, -개발/실험적 버젼을 테스트하는 것을 돕는데는 별로 관심이 없다. +개발/실험적 버젼을 테스트하는 것을 돕고자 하는 사용자들과는 별로 관련이 없다. -어떤 2.6.x.y 커널도 사용가능하지 않다면 그때는 가장 높은 숫자의 2.6.x +어떤 2.6.x.y 커널도 사용할 수 없다면 그때는 가장 높은 숫자의 2.6.x 커널이 현재의 안정 커널이다. 2.6.x.y는 "stable" 팀<stable@kernel.org>에 의해 관리되며 거의 매번 격주로 @@ -294,7 +296,7 @@ Andrew Morton에 의해 배포된 실험적인 커널 패치들이다. Andrew는 서브시스템 커널 트리와 패치들을 가져와서 리눅스 커널 메일링 리스트로 온 많은 패치들과 한데 묶는다. 이 트리는 새로운 기능들과 패치들을 위한 장소를 제공하는 역할을 한다. 하나의 패치가 -mm에 한동안 있으면서 그 가치가 -증명되게 되면 Andrew나 서브시스템 메인트너는 그것을 메인라인에 포함시키기 +증명되게 되면 Andrew나 서브시스템 메인테이너는 그것을 메인라인에 포함시키기 위하여 Linus에게 보낸다. 커널 트리에 포함하고 싶은 모든 새로운 패치들은 Linus에게 보내지기 전에 @@ -327,7 +329,7 @@ Andrew Morton에 의해 배포된 실험적인 커널 패치들이다. Andrew는 - ACPI development tree, Len Brown <len.brown@intel.com > git.kernel.org:/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6.git - - Block development tree, Jens Axboe <axboe@suse.de> + - Block development tree, Jens Axboe <jens.axboe@oracle.com> git.kernel.org:/pub/scm/linux/kernel/git/axboe/linux-2.6-block.git - DRM development tree, Dave Airlie <airlied@linux.ie> @@ -367,8 +369,8 @@ bugzilla.kernel.org는 리눅스 커널 개발자들이 커널의 버그를 추 kernel bugzilla를 사용하는 자세한 방법은 다음을 참조하라. 
http://test.kernel.org/bugzilla/faq.html -메인 커널 소스 디렉토리에 있는 REPORTING-BUGS 파일은 커널 버그일 것 같은 -것을 보고하는는 법에 관한 좋은 템플릿이고 문제를 추적하기 위해서 커널 +메인 커널 소스 디렉토리에 있는 REPORTING-BUGS 파일은 커널 버그라고 생각되는 +것을 보고하는 방법에 관한 좋은 템플릿이며 문제를 추적하기 위해서 커널 개발자들이 필요로 하는 정보가 무엇들인지를 상세히 설명하고 있다. @@ -383,7 +385,7 @@ kernel bugzilla를 사용하는 자세한 방법은 다음을 참조하라. 점수를 얻을 수 있는 가장 좋은 방법중의 하나이다. 왜냐하면 많은 사람들은 다른 사람들의 버그들을 수정하기 위하여 시간을 낭비하지 않기 때문이다. -이미 보고된 버그 리포트들을 가지고 작업하기 위해서 http://bugzilla.kernelorg를 +이미 보고된 버그 리포트들을 가지고 작업하기 위해서 http://bugzilla.kernel.org를 참조하라. 여러분이 앞으로 생겨날 버그 리포트들의 조언자가 되길 원한다면 bugme-new 메일링 리스트나(새로운 버그 리포트들만이 이곳에서 메일로 전해진다) bugme-janitor 메일링 리스트(bugzilla에 모든 변화들이 여기서 메일로 전해진다) @@ -404,8 +406,8 @@ bugme-janitor 메일링 리스트(bugzilla에 모든 변화들이 여기서 메 웹상의 많은 다른 곳에도 메일링 리스트의 아카이브들이 있다. 이러한 아카이브들을 찾으려면 검색 엔진을 사용하라. 예를 들어: http://dir.gmane.org/gmane.linux.kernel -여러분이 새로운 문제에 관해 리스트에 올리기 전에 말하고 싶은 주제에 대한 -것을 아카이브에서 먼저 찾기를 강력히 권장한다. 이미 상세하게 토론된 많은 +여러분이 새로운 문제에 관해 리스트에 올리기 전에 말하고 싶은 주제에 관한 +것을 아카이브에서 먼저 찾아보기를 강력히 권장한다. 이미 상세하게 토론된 많은 것들이 메일링 리스트의 아카이브에 기록되어 있다. 각각의 커널 서브시스템들의 대부분은 자신들의 개발에 관한 노력들로 이루어진 @@ -443,7 +445,7 @@ bugme-janitor 메일링 리스트(bugzilla에 모든 변화들이 여기서 메 무엇보다도 메일링 리스트의 다른 구독자들에게 보여주려 한다는 것을 기억하라. -커뮤니티와 일하는 법 +커뮤니티와 협력하는 법 -------------------- 커널 커뮤니티의 목적은 가능한한 가장 좋은 커널을 제공하는 것이다. 여러분이 @@ -474,7 +476,7 @@ bugme-janitor 메일링 리스트(bugzilla에 모든 변화들이 여기서 메 올바른 방향의 해결책으로 이끌어갈 의지가 있다면 받아들여질 것이라는 점을 기억하라. -여러분의 첫 패치에 여러분이 수정해야하는 십여개 정도의 회신이 오는 +여러분의 첫 패치에 여러분이 수정해야하는 십여개 정도의 회신이 오는 경우도 흔하다. 이것은 여러분의 패치가 받아들여지지 않을 것이라는 것을 의미하는 것이 아니고 개인적으로 여러분에게 감정이 있어서 그러는 것도 아니다. 간단히 여러분의 패치에 제기된 문제들을 수정하고 그것을 다시 @@ -486,12 +488,12 @@ bugme-janitor 메일링 리스트(bugzilla에 모든 변화들이 여기서 메 커널 커뮤니티는 가장 전통적인 회사의 개발 환경과는 다르다. 여기에 여러분들의 문제를 피하기 위한 목록이 있다. 여러분들이 제안한 변경들에 관하여 말할 때 좋은 것들 : - - " 이것은 여러 문제들을 해겹합니다." + - "이것은 여러 문제들을 해겹합니다." - "이것은 2000 라인의 코드를 제거합니다." - "이것은 내가 말하려는 것에 관해 설명하는 패치입니다." - "나는 5개의 다른 아키텍쳐에서 그것을 테스트했슴으로..." - - "여기에 일련의 작은 패치들이 있습음로..." - - "이것은 일반적인 머신에서 성능을 향상시키므로..." + - "여기에 일련의 작은 패치들이 있슴음로..." 
+ - "이것은 일반적인 머신에서 성능을 향상시킴으로..." 여러분들이 말할 때 피해야 할 좋지 않은 것들 : - "우리를 그것을 AIT/ptx/Solaris에서 이러한 방법으로 했다. 그러므로 그것은 좋은 것임에 틀립없다..." @@ -500,7 +502,7 @@ bugme-janitor 메일링 리스트(bugzilla에 모든 변화들이 여기서 메 - "이것은 우리의 엔터프라이즈 상품 라인을 위한 것이다." - "여기에 나의 생각을 말하고 있는 1000 페이지 설계 문서가 있다." - "나는 6달동안 이것을 했으니..." - - "여기세 5000라인 짜리 패치가 있으니..." + - "여기에 5000라인 짜리 패치가 있으니..." - "나는 현재 뒤죽박죽인 것을 재작성했다. 그리고 여기에..." - "나는 마감시한을 가지고 있으므로 이 패치는 지금 적용될 필요가 있다." @@ -510,13 +512,13 @@ bugme-janitor 메일링 리스트(bugzilla에 모든 변화들이 여기서 메 없다는 것이다. 리눅스 커널의 작업 환경에서는 단지 이메일 주소만 알수 있기 때문에 여성과 소수 민족들도 모두 받아들여진다. 국제적으로 일하게 되는 측면은 사람의 이름에 근거하여 성별을 추측할 수 없게 -하기때문에 차별을 없애는 데 도움을 준다. Andrea라는 이름을 가진 남자와 +하기때문에 차별을 없애는 데 도움을 준다. Andrea라는 이름을 가진 남자와 Pat이라는 이름을 가진 여자가 있을 수도 있는 것이다. 리눅스 커널에서 작업하며 생각을 표현해왔던 대부분의 여성들은 긍정적인 경험을 가지고 있다. 언어 장벽은 영어에 익숙하지 않은 몇몇 사람들에게 문제가 될 수도 있다. - 언어의 훌륭한 구사는 메일링 리스트에서 올바르게 자신의 생각을 +언어의 훌륭한 구사는 메일링 리스트에서 올바르게 자신의 생각을 표현하기 위하여 필요하다. 그래서 여러분은 이메일을 보내기 전에 영어를 올바르게 사용하고 있는지를 체크하는 것이 바람직하다. @@ -524,13 +526,13 @@ Pat이라는 이름을 가진 여자가 있을 수도 있는 것이다. 리눅 여러분의 변경을 나누어라 ------------------------ -리눅스 커널 커뮤니티는 한꺼번에 굉장히 큰 코드의 묶음을 쉽게 +리눅스 커널 커뮤니티는 한꺼번에 굉장히 큰 코드의 묶음(chunk)을 쉽게 받아들이지 않는다. 변경은 적절하게 소개되고, 검토되고, 각각의 부분으로 작게 나누어져야 한다. 이것은 회사에서 하는 것과는 정확히 반대되는 것이다. 여러분들의 제안은 개발 초기에 일찍이 소개되야 한다. 그래서 여러분들은 자신이 하고 있는 것에 관하여 피드백을 받을 수 있게 된다. 커뮤니티가 여러분들이 커뮤니티와 함께 일하고 있다는 것을 -느끼도록 만들고 커뮤니티가 여러분의 기능을 위한 쓰레기 장으로서 +느끼도록 만들고 커뮤니티가 여러분의 기능을 위한 쓰레기 장으로써 사용되지 않고 있다는 것을 느끼게 하자. 그러나 메일링 리스트에 한번에 50개의 이메일을 보내지는 말아라. 여러분들의 일련의 패치들은 항상 더 작아야 한다. @@ -539,7 +541,7 @@ Pat이라는 이름을 가진 여자가 있을 수도 있는 것이다. 리눅 1) 작은 패치들은 여러분의 패치들이 적용될 수 있는 확률을 높여준다. 왜냐하면 다른 사람들은 정확성을 검증하기 위하여 많은 시간과 노력을 - 들이기를 원하지 않는다. 5줄의 패치는 메인트너가 거의 몇 초간 힐끗 + 들이기를 원하지 않는다. 5줄의 패치는 메인테이너가 거의 몇 초간 힐끗 보면 적용될 수 있다. 그러나 500 줄의 패치는 정확성을 검토하기 위하여 몇시간이 걸릴 수도 있다(걸리는 시간은 패치의 크기 혹은 다른 것에 비례하여 기하급수적으로 늘어난다). @@ -558,18 +560,18 @@ Pat이라는 이름을 가진 여자가 있을 수도 있는 것이다. 리눅 간결하고 가장 뛰어난 답을 보길 원한다. 훌륭한 학생은 이것을 알고 마지막으로 답을 얻기 전 중간 과정들을 제출하진 않는다. - 커널 개발도 마찬가지이다. 메인트너들과 검토하는 사람들은 문제를 + 커널 개발도 마찬가지이다. 
메인테이너들과 검토하는 사람들은 문제를 풀어나가는 과정속에 숨겨진 과정을 보길 원하진 않는다. 그들은 간결하고 멋진 답을 보길 원한다." -커뮤니티와 함께 일하며 뛰어난 답을 찾고 여러분들의 완성되지 않은 일들 -사이에 균형을 유지해야 하는 어려움이 있을 수 있다. 그러므로 프로세스의 -초반에 여러분의 일을 향상시키기위한 피드백을 얻는 것 뿐만 아니라 +커뮤니티와 협력하며 뛰어난 답을 찾는 것과 여러분들의 끝마치지 못한 작업들 +사이에 균형을 유지해야 하는 것은 어려울지도 모른다. 그러므로 프로세스의 +초반에 여러분의 작업을 향상시키기위한 피드백을 얻는 것 뿐만 아니라 여러분들의 변경들을 작은 묶음으로 유지해서 심지어는 여러분의 작업의 -모든 부분이 지금은 포함될 준비가 되어있지 않지만 작은 부분은 이미 +모든 부분이 지금은 포함될 준비가 되어있지 않지만 작은 부분은 벌써 받아들여질 수 있도록 유지하는 것이 바람직하다. -또한 완성되지 않았고 "나중에 수정될 것이다." 와 같은 것들은 포함하는 +또한 완성되지 않았고 "나중에 수정될 것이다." 와 같은 것들을 포함하는 패치들은 받아들여지지 않을 것이라는 점을 유념하라. 변경을 정당화해라 @@ -577,7 +579,7 @@ Pat이라는 이름을 가진 여자가 있을 수도 있는 것이다. 리눅 여러분들의 나누어진 패치들을 리눅스 커뮤니티가 왜 반영해야 하는지를 알도록 하는 것은 매우 중요하다. 새로운 기능들이 필요하고 유용하다는 -것은 반드시 그에 맞는 이유가 있어야 한다. +것은 반드시 그에 합당한 이유가 있어야 한다. 변경을 문서화해라 @@ -588,7 +590,7 @@ Pat이라는 이름을 가진 여자가 있을 수도 있는 것이다. 리눅 것이다. 그리고 항상 그 내용을 보길 원하는 모든 사람들을 위해 보존될 것이다. 패치는 완벽하게 다음과 같은 내용들을 포함하여 설명해야 한다. - 변경이 왜 필요한지 - - 패치에 관한 전체 설계 어프로치 + - 패치에 관한 전체 설계 접근(approach) - 구현 상세들 - 테스트 결과들 @@ -600,7 +602,7 @@ Pat이라는 이름을 가진 여자가 있을 수도 있는 것이다. 리눅 이 모든 것을 하는 것은 매우 어려운 일이다. 완벽히 소화하는 데는 적어도 몇년이 -걸릴 수도 있다. 많은 인내와 결의가 필요한 계속되는 개선의 과정이다. 그러나 +걸릴 수도 있다. 많은 인내와 결심이 필요한 계속되는 개선의 과정이다. 그러나 가능한한 포기하지 말라. 많은 사람들은 이전부터 해왔던 것이고 그 사람들도 정확하게 여러분들이 지금 서 있는 그 곳부터 시작했었다. @@ -620,4 +622,4 @@ David A. Wheeler, Junio Hamano, Michael Kerrisk, and Alex Shepard에게도 감 -메인트너: Greg Kroah-Hartman <greg@kroah.com> +메인테이너: Greg Kroah-Hartman <greg@kroah.com> diff --git a/Documentation/ko_KR/stable_api_nonsense.txt b/Documentation/ko_KR/stable_api_nonsense.txt new file mode 100644 index 000000000000..8f2b0e1d98c4 --- /dev/null +++ b/Documentation/ko_KR/stable_api_nonsense.txt @@ -0,0 +1,195 @@ +NOTE: +This is a version of Documentation/stable_api_nonsense.txt translated +into korean +This document is maintained by barrios <minchan.kim@gmail.com> +If you find any difference between this document and the original file or +a problem with the translation, please contact the maintainer of this file. 
+ +Please also note that the purpose of this file is to be easier to +read for non English (read: korean) speakers and is not intended as +a fork. So if you have any comments or updates for this file please +try to update the original English file first. + +================================== +이 문서는 +Documentation/stable_api_nonsense.txt +의 한글 번역입니다. + +역자: 김민찬 <minchan.kim@gmail.com> +감수: 이제이미 <jamee.lee@samsung.com> +================================== + +리눅스 커널 드라이버 인터페이스 +(여러분들의 모든 질문에 대한 답 그리고 다른 몇가지) + +Greg Kroah-Hartman <greg@kroah.com> + +이 문서는 리눅스가 왜 바이너리 커널 인터페이스를 갖지 않는지, 왜 변하지 +않는(stable) 커널 인터페이스를 갖지 않는지를 설명하기 위해 쓰여졌다. +이 문서는 커널과 유저공간 사이의 인터페이스가 아니라 커널 내부의 +인터페이스들을 설명하고 있다는 것을 유념하라. 커널과 유저공간 사이의 +인터페이스는 응용프로그램이 사용하는 syscall 인터페이스이다. 그 인터페이스는 +오랫동안 거의 변하지 않았고 앞으로도 변하지 않을 것이다. 나는 pre 0.9에서 +만들어졌지만 최신의 2.6 커널 배포에서도 잘 동작하는 프로그램을 가지고 +있다. 이 인터페이스는 사용자와 응용프로그램 개발자들이 변하지 않을 것이라고 +여길수 있는 것이다. + + +초록 +---- +여러분은 변하지 않는 커널 인터페이스를 원한다고 생각하지만 실제로는 +그렇지 않으며 심지어는 그것을 알아채지 못한다. 여러분이 원하는 것은 +안정되게 실행되는 드라이버이며 드라이버가 메인 커널 트리에 있을 때 +그런 안정적인 드라이버를 얻을 수 있게 된다. 또한 여러분의 드라이버가 +메인 커널 트리에 있다면 다른 많은 좋은 이점들을 얻게 된다. 그러한 것들이 +리눅스를 강건하고, 안정적이며, 성숙한 운영체제로 만들어 놓음으로써 +여러분들로 하여금 바로 리눅스를 사용하게 만드는 이유이다. + + +소개 +---- + +커널 내부의 인터페이스가 바뀌는 것을 걱정하며 커널 드라이버를 작성하고 +싶어하는 사람은 정말 이상한 사람이다. 세상의 대다수의 사람들은 이 인터페이스를 +보지못할 것이며 전혀 걱정하지도 않는다. + +먼저, 나는 closed 소스, hidden 소스, binary blobs, 소스 wrappers, 또는 GPL로 +배포되었지만 소스 코드를 갖고 있지 않은 커널 드라이버들을 설명하는 어떤 다른 +용어들에 관한 어떤 법적인 문제에 관해서는 언급하지 않을 것이다. 어떤 법적인 +질문들을 가지고 있다면 변호사와 연락하라. 나는 프로그래머이므로 여기서 기술적인 +문제들만을 설명하려고 한다. (법적인 문제를 경시하는 것은 아니다. 그런 문제들은 +엄연히 현실에 있고 여러분들은 항상 그 문제들을 인식하고 있을 필요는 있다.) + +자, 두가지의 주요 주제가 있다. 바이너리 커널 인터페이스들과 변하지 않는 +커널 소스 인터페이들. 그것들은 서로 의존성을 가지고 있지만 바이너리 +문제를 먼저 풀고 넘어갈 것이다. + + + +바이너리 커널 인터페이스 +------------------------ +우리가 변하지 않는 커널 소스 인터페이스를 가지고 있다고 가정하자. 그러면 +바이너리 인터페이스 또한 자연적으로 변하지 않을까? 틀렸다. 리눅스 커널에 +관한 다음 사실들을 생각해보라. 
+ - 여러분들이 사용하는 C 컴파일러의 버젼에 따라 다른 커널 자료 구조들은 + 다른 alignmnet들을 갖게 될것이고 다른 방법으로(함수들을 inline으로 + 했느냐, 아니냐) 다른 함수들을 포함하는 것도 가능한다. 중요한 것은 + 개별적인 함수 구성이 아니라 자료 구조 패딩이 달라진다는 점이다. + - 여러분이 선택한 커널 빌드 옵션에 따라서 커널은 다양한 것들을 가정할 + 수 있다. + - 다른 구조체들은 다른 필드들을 포함할 수 있다. + - 몇몇 함수들은 전혀 구현되지 않을 수도 있다(즉, 몇몇 lock들은 + non-SMP 빌드에서는 사라져 버릴수도 있다). + - 커널내에 메모리는 build optoin들에 따라 다른 방법으로 align될수 + 있다. + - 리눅스는 많은 다양한 프로세서 아키텍쳐에서 실행된다. 한 아키텍쳐의 + 바이너리 드라이버를 다른 아키텍쳐에서 정상적으로 실행시킬 방법은 + 없다. + +커널을 빌드했던 C 컴파일러와 정확하게 같은 것을 사용하고 정확하게 같은 +커널 구성(configuration)을 사용하여 여러분들의 모듈을 빌드하면 간단히 +많은 문제들을 해결할 수 있다. 이렇게 하는 것은 여러분들이 하나의 리눅스 +배포판의 하나의 배포 버젼을 위한 모듈만을 제공한다면 별일 아닐 것이다. +그러나 각기 다른 리눅스 배포판마다 한번씩 빌드하는 수를 각 리눅스 배포판마다 +제공하는 다른 릴리즈의 수와 곱하게 되면 이번에는 각 릴리즈들의 다른 빌드 +옵션의 악몽과 마주하게 것이다. 또한 각 리눅스 배포판들은 다른 하드웨어 +종류에(다른 프로세서 타입과 다른 옵션들) 맞춰져 있는 많은 다른 커널들을 +배포한다. 그러므로 한번의 배포에서조차 여러분들의 모듈은 여러 버젼을 +만들 필요가 있다. + +나를 믿어라. 여러분들은 이러한 종류의 배포를 지원하려고 시도한다면 시간이 +지나면 미칠지경이 될 것이다. 난 이러한 것을 오래전에 아주 어렵게 배웠다... + + + +변하지않는 커널 소스 인터페이스들 +--------------------------------- + +리눅스 커널 드라이버를 계속해서 메인 커널 트리에 반영하지 않고 +유지보수하려고 하는 사름들과 이 문제를 논의하게 되면 훨씬 더 +"논란의 여지가 많은" 주제가 될 것이다. + +리눅스 커널 개발은 끊임없이 빠른 속도로 이루어지고 있으며 결코 +느슨해진 적이 없다. 커널 개발자들이 현재 인터페이스들에서 버그를 +발견하거나 무엇인가 할수 있는 더 좋은 방법을 찾게 되었다고 하자. +그들이 발견한 것을 실행한다면 아마도 더 잘 동작하도록 현재 인터페이스들을 +수정하게 될 것이다. 그들이 그런 일을 하게되면 함수 이름들은 변하게 되고, +구조체들은 늘어나거나 줄어들게 되고, 함수 파라미터들은 재작업될 것이다. +이러한 일이 발생되면 커널 내에 이 인터페이스를 사용했던 인스턴스들이 동시에 +수정될 것이며 이러한 과정은 모든 것이 계속해서 올바르게 동작할 것이라는 +것을 보장한다. + +이러한 것의 한 예로써, 커널 내부의 USB 인터페이스들은 이 서브시스템이 +생긴 이후로 적어도 3번의 다른 재작업을 겪었다. 이 재작업들은 많은 다른 +문제들을 풀었다. + - 데이터 스트림들의 동기적인 모델에서 비동기적인 모델로의 변화. 이것은 + 많은 드라이버들의 복잡성을 줄이고 처리량을 향상시켜 현재는 거의 모든 + USB 장치들의 거의 최대 속도로 실행되고 있다. + - USB 드라이버가 USB 코어로부터 데이터 패킷들을 할당받로록 한 변경으로 + 인해서 지금의 모든 드라이버들은 많은 문서화된 데드락을 수정하기 위하여 + USB 코어에게 더 많은 정보를 제공해야만 한다. + +이것은 오랫동안 자신의 오래된 USB 인터페이스들을 유지해야 하는 closed 운영체제들과는 +완전히 반대되는 것이다. closed된 운영체제들은 새로운 개발자들에게 우연히 낡은 +인터페이스를 사용하게 할 기회를 주게되며, 적절하지 못한 방법으로 처리하게 되어 +운영체제의 안정성을 해치는 문제를 야기하게 된다. 
+ +이 두가지의 예들 모두, 모든 개발자들은 꼭 이루어져야 하는 중요한 변화들이라고 +동의를 하였고 비교적 적은 고통으로 변경되어졌다. 리눅스가 변하지 않는 소스 +인터페이스를 고집한다면, 새로운 인터페이스가 만들어지게 되며 반면 기존의 오래된 +것들, 그리고 깨진 것들은 계속해서 유지되어야 하며 이러한 일들은 USB 개발자들에게 +또 다른 일거리를 주게 된다. 모든 리눅스 USB 개발자들에게 자신의 그들의 업무를 +마친 후 시간을 투자하여 아무 득도 없는 무료 봉사를 해달라고 하는 것은 가능성이 +희박한 일이다. + +보안 문제 역시 리눅스에게는 매우 중요하다. 보안 문제가 발견되면 그것은 +매우 짧은 시간 안에 수정된다. 보안 문제는 그 문제를 해결하기 위하여 +여러번 내부 커널 인터페이스들을 재작업하게 만들었다. 이러한 문제가 +발생하였을 때 그 인터페이스들을 사용하는 모든 드라이버들도 동시에 +수정되어 보안 문제가 앞으로 갑작스럽게 생기지는 않을 것이라는 것을 +보장한다. 내부 인터페이스들의 변경이 허락되지 않으면 이러한 종류의 보안 +문제를 수정하고 그것이 다시 발생하지 않을 것이라고 보장하는 것은 가능하지 +않을 것이다. + +커널 인터페이스들은 계속해서 정리되고 있다. 현재 인터페이스를 사용하는 +사람이 한명도 없다면 그것은 삭제된다. 이것은 커널이 가능한한 가장 작게 +유지되며 존재하는 모든 가능성이 있는 인터페이스들이 테스트된다는 것을 +보장한다(사용되지 않는 인터페이스들은 유효성 검증을 하기가 거의 불가능하다). + + +무엇을 해야 하나 +--------------- +자, 여러분이 메인 커널 트리에 있지 않은 리눅스 커널 드라이버를 가지고 +있다면 여러분은 즉, 개발자는 무엇을 해야 하나? 모든 배포판마다 다른 +커널 버젼을 위한 바이너리 드라이버를 배포하는 것은 악몽이며 계속해서 +변하고 있는 커널 인터페이스들의 맞처 유지보수하려고 시도하는 것은 힘든 +일이다. + +간단하다. 여러분의 커널 드라이버를 메인 커널 트리에 반영하라(우리는 여기서 +GPL을 따르는 배포 드라이버에 관해 얘기하고 있다는 것을 상기하라. 여러분의 +코드가 이러한 분류에 해당되지 않는다면 행운을 빈다. 여러분 스스로 어떻게든 +해야만 한다). 여러분의 드라이버가 트리에 있게되면 커널 인터페이스가 +변경되더라도 가장 먼저 커널에 변경을 가했던 사람에 의해서 수정될 것이다. +이것은 여러분의 드라이버가 여러분의 별다른 노력없이 항상 빌드가 가능하며 +동작하는 것을 보장한다. + +메인 커널 트리에 여러분의 드라이버를 반영하면 얻게 되는 장점들은 다음과 같다. + - 관리의 드는 비용(원래 개발자의)은 줄어줄면서 드라이버의 질은 향상될 것이다. + - 다른 개발자들이 여러분의 드라이버에 기능들을 추가 할 것이다. + - 다른 사람들은 여러분의 드라이버에 버그를 발견하고 수정할 것이다. + - 다른 사람들은 여러분의 드라이버의 개선점을 찾을 줄 것이다. + - 외부 인터페이스 변경으로 인해 여러분의 드라이버의 수정이 필요하다면 다른 + 사람들이 드라이버를 업데이트할 것이다. + - 여러분의 드라이버는 별다른 노력 없이 모든 리눅스 배포판에 자동적으로 + 추가될 것이다. + +리눅스는 다른 운영 체제보다 "쉽게 쓸수 있는(out of the box)" 많은 다른 장치들을 +지원하고 어떤 다른 운영 체제보다 다양한 아키텍쳐위에서 이러한 장치들을 지원하기 때문에 +이러한 증명된 개발 모델은 틀림없이 바로 가고 있는 것이다. + + + +------ + +이 문서의 초안을 검토해주고 코멘트 해준 Randy Dunlap, Andrew Morton, David Brownell, +Hanna Linder, Robert Love, 그리고 Nishanth Aravamudan에게 감사한다. 
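Editorial illustration, not part of the patch: the binary-interface section of the translated stable_api_nonsense.txt above argues that the same structure can contain different fields depending on kernel build options, so a binary module built against one configuration cannot safely run on another. The following is a minimal userspace C sketch of that argument. The structure names, the `lock` member and the helper functions are hypothetical and do not correspond to any real kernel structure.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structure as an SMP-style build would see it:
 * the lock member is compiled in. */
struct dev_state_smp {
	int id;
	unsigned long long lock;	/* stand-in for a lock only SMP builds carry */
	void *priv;
};

/* The same structure as a non-SMP build would see it:
 * the lock member is compiled out. */
struct dev_state_up {
	int id;
	void *priv;
};

/* Offset at which each build expects to find the priv member. */
size_t priv_offset_smp(void)
{
	return offsetof(struct dev_state_smp, priv);
}

size_t priv_offset_up(void)
{
	return offsetof(struct dev_state_up, priv);
}

/* Returns 1 when code compiled against one layout would look for priv
 * in the wrong place if handed an object built with the other layout. */
int layouts_disagree(void)
{
	/* Removing a member can never make the structure larger. */
	assert(sizeof(struct dev_state_up) <= sizeof(struct dev_state_smp));
	return priv_offset_smp() != priv_offset_up();
}
```

A module compiled with the first definition reads `priv` at a different offset than a kernel compiled with the second one stores it, which is why rebuilding the module with the same compiler and configuration as the kernel is the only reliable fix the text offers.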
diff --git a/Documentation/kobject.txt b/Documentation/kobject.txt index ca86a885ad8f..bf3256e04027 100644 --- a/Documentation/kobject.txt +++ b/Documentation/kobject.txt @@ -1,289 +1,386 @@ -The kobject Infrastructure +Everything you never wanted to know about kobjects, ksets, and ktypes -Patrick Mochel <mochel@osdl.org> +Greg Kroah-Hartman <gregkh@suse.de> -Updated: 3 June 2003 +Based on an original article by Jon Corbet for lwn.net written October 1, +2003 and located at http://lwn.net/Articles/51437/ +Last updated December 19, 2007 -Copyright (c) 2003 Patrick Mochel -Copyright (c) 2003 Open Source Development Labs +Part of the difficulty in understanding the driver model - and the kobject +abstraction upon which it is built - is that there is no obvious starting +place. Dealing with kobjects requires understanding a few different types, +all of which make reference to each other. In an attempt to make things +easier, we'll take a multi-pass approach, starting with vague terms and +adding detail as we go. To that end, here are some quick definitions of +some terms we will be working with. -0. Introduction + - A kobject is an object of type struct kobject. Kobjects have a name + and a reference count. A kobject also has a parent pointer (allowing + objects to be arranged into hierarchies), a specific type, and, + usually, a representation in the sysfs virtual filesystem. -The kobject infrastructure performs basic object management that larger -data structures and subsystems can leverage, rather than reimplement -similar functionality. This functionality primarily concerns: + Kobjects are generally not interesting on their own; instead, they are + usually embedded within some other structure which contains the stuff + the code is really interested in. -- Object reference counting. -- Maintaining lists (sets) of objects. -- Object set locking. -- Userspace representation. + No structure should EVER have more than one kobject embedded within it. 
+ If it does, the reference counting for the object is sure to be messed + up and incorrect, and your code will be buggy. So do not do this. -The infrastructure consists of a number of object types to support -this functionality. Their programming interfaces are described below -in detail, and briefly here: + - A ktype is the type of object that embeds a kobject. Every structure + that embeds a kobject needs a corresponding ktype. The ktype controls + what happens to the kobject when it is created and destroyed. -- kobjects a simple object. -- kset a set of objects of a certain type. -- ktype a set of helpers for objects of a common type. + - A kset is a group of kobjects. These kobjects can be of the same ktype + or belong to different ktypes. The kset is the basic container type for + collections of kobjects. Ksets contain their own kobjects, but you can + safely ignore that implementation detail as the kset core code handles + this kobject automatically. + When you see a sysfs directory full of other directories, generally each + of those directories corresponds to a kobject in the same kset. -The kobject infrastructure maintains a close relationship with the -sysfs filesystem. Each kobject that is registered with the kobject -core receives a directory in sysfs. Attributes about the kobject can -then be exported. Please see Documentation/filesystems/sysfs.txt for -more information. +We'll look at how to create and manipulate all of these types. A bottom-up +approach will be taken, so we'll go back to kobjects. -The kobject infrastructure provides a flexible programming interface, -and allows kobjects and ksets to be used without being registered -(i.e. with no sysfs representation). This is also described later. +Embedding kobjects -1. kobjects +It is rare for kernel code to create a standalone kobject, with one major +exception explained below. Instead, kobjects are used to control access to +a larger, domain-specific object. 
To this end, kobjects will be found +embedded in other structures. If you are used to thinking of things in +object-oriented terms, kobjects can be seen as a top-level, abstract class +from which other classes are derived. A kobject implements a set of +capabilities which are not particularly useful by themselves, but which are +nice to have in other objects. The C language does not allow for the +direct expression of inheritance, so other techniques - such as structure +embedding - must be used. -1.1 Description +So, for example, the UIO code has a structure that defines the memory +region associated with a uio device: +struct uio_mem { + struct kobject kobj; + unsigned long addr; + unsigned long size; + int memtype; + void __iomem *internal_addr; +}; -struct kobject is a simple data type that provides a foundation for -more complex object types. It provides a set of basic fields that -almost all complex data types share. kobjects are intended to be -embedded in larger data structures and replace fields they duplicate. +If you have a struct uio_mem structure, finding its embedded kobject is +just a matter of using the kobj member. Code that works with kobjects will +often have the opposite problem, however: given a struct kobject pointer, +what is the pointer to the containing structure? You must avoid tricks +(such as assuming that the kobject is at the beginning of the structure) +and, instead, use the container_of() macro, found in <linux/kernel.h>: -1.2 Definition + container_of(pointer, type, member) -struct kobject { - const char * k_name; - struct kref kref; - struct list_head entry; - struct kobject * parent; - struct kset * kset; - struct kobj_type * ktype; - struct sysfs_dirent * sd; - wait_queue_head_t poll; -}; +where pointer is the pointer to the embedded kobject, type is the type of +the containing structure, and member is the name of the structure field to +which pointer points. The return value from container_of() is a pointer to +the given type. 
So, for example, a pointer "kp" to a struct kobject +embedded within a struct uio_mem could be converted to a pointer to the +containing uio_mem structure with: -void kobject_init(struct kobject *); -int kobject_add(struct kobject *); -int kobject_register(struct kobject *); + struct uio_mem *u_mem = container_of(kp, struct uio_mem, kobj); -void kobject_del(struct kobject *); -void kobject_unregister(struct kobject *); +Programmers often define a simple macro for "back-casting" kobject pointers +to the containing type. -struct kobject * kobject_get(struct kobject *); -void kobject_put(struct kobject *); +Initialization of kobjects -1.3 kobject Programming Interface +Code which creates a kobject must, of course, initialize that object. Some +of the internal fields are setup with a (mandatory) call to kobject_init(): -kobjects may be dynamically added and removed from the kobject core -using kobject_register() and kobject_unregister(). Registration -includes inserting the kobject in the list of its dominant kset and -creating a directory for it in sysfs. + void kobject_init(struct kobject *kobj, struct kobj_type *ktype); -Alternatively, one may use a kobject without adding it to its kset's list -or exporting it via sysfs, by simply calling kobject_init(). An -initialized kobject may later be added to the object hierarchy by -calling kobject_add(). An initialized kobject may be used for -reference counting. +The ktype is required for a kobject to be created properly, as every kobject +must have an associated kobj_type. After calling kobject_init(), to +register the kobject with sysfs, the function kobject_add() must be called: -Note: calling kobject_init() then kobject_add() is functionally -equivalent to calling kobject_register(). + int kobject_add(struct kobject *kobj, struct kobject *parent, const char *fmt, ...); -When a kobject is unregistered, it is removed from its kset's list, -removed from the sysfs filesystem, and its reference count is decremented. 
-List and sysfs removal happen in kobject_del(), and may be called -manually. kobject_put() decrements the reference count, and may also -be called manually. +This sets up the parent of the kobject and the name for the kobject +properly. If the kobject is to be associated with a specific kset, +kobj->kset must be assigned before calling kobject_add(). If a kset is +associated with a kobject, then the parent for the kobject can be set to +NULL in the call to kobject_add() and then the kobject's parent will be the +kset itself. -A kobject's reference count may be incremented with kobject_get(), -which returns a valid reference to a kobject; and decremented with -kobject_put(). An object's reference count may only be incremented if -it is already positive. +As the name of the kobject is set when it is added to the kernel, the name +of the kobject should never be manipulated directly. If you must change +the name of the kobject, call kobject_rename(): -When a kobject's reference count reaches 0, the method struct -kobj_type::release() (which the kobject's kset points to) is called. -This allows any memory allocated for the object to be freed. + int kobject_rename(struct kobject *kobj, const char *new_name); +There is a function called kobject_set_name() but that is legacy cruft and +is being removed. If your code needs to call this function, it is +incorrect and needs to be fixed. -NOTE!!! +To properly access the name of the kobject, use the function +kobject_name(): -It is _imperative_ that you supply a destructor for dynamically -allocated kobjects to free them if you are using kobject reference -counts. The reference count controls the lifetime of the object. -If it goes to 0, then it is assumed that the object will -be freed and cannot be used. + const char *kobject_name(const struct kobject * kobj); -More importantly, you must free the object there, and not immediately -after an unregister call. If someone else is referencing the object -(e.g. 
through a sysfs file), they will obtain a reference to the -object, assume it's valid and operate on it. If the object is -unregistered and freed in the meantime, the operation will then -reference freed memory and go boom. +There is a helper function to both initialize and add the kobject to the +kernel at the same time, called, surprisingly enough, kobject_init_and_add(): -This can be prevented, in the simplest case, by defining a release -method and freeing the object from there only. Note that this will not -secure reference count/object management models that use a dual -reference count or do other wacky things with the reference count -(like the networking layer). + int kobject_init_and_add(struct kobject *kobj, struct kobj_type *ktype, + struct kobject *parent, const char *fmt, ...); +The arguments are the same as the individual kobject_init() and +kobject_add() functions described above. -1.4 sysfs -Each kobject receives a directory in sysfs. This directory is created -under the kobject's parent directory. +Uevents -If a kobject does not have a parent when it is registered, its parent -becomes its dominant kset. +After a kobject has been registered with the kobject core, you need to +announce to the world that it has been created. This can be done with a +call to kobject_uevent(): -If a kobject does not have a parent nor a dominant kset, its directory -is created at the top-level of the sysfs partition. + int kobject_uevent(struct kobject *kobj, enum kobject_action action); +Use the KOBJ_ADD action for when the kobject is first added to the kernel. +This should be done only after any attributes or children of the kobject +have been initialized properly, as userspace will instantly start to look +for them when this call happens. +When the kobject is removed from the kernel (details on how to do that are +below), the uevent for KOBJ_REMOVE will be automatically created by the +kobject core, so the caller does not have to worry about doing that by +hand. -2. 
ksets -2.1 Description +Reference counts -A kset is a set of kobjects that are embedded in the same type. +One of the key functions of a kobject is to serve as a reference counter +for the object in which it is embedded. As long as references to the object +exist, the object (and the code which supports it) must continue to exist. +The low-level functions for manipulating a kobject's reference counts are: + struct kobject *kobject_get(struct kobject *kobj); + void kobject_put(struct kobject *kobj); -struct kset { - struct kobj_type * ktype; - struct list_head list; - struct kobject kobj; - struct kset_uevent_ops * uevent_ops; -}; +A successful call to kobject_get() will increment the kobject's reference +counter and return the pointer to the kobject. +When a reference is released, the call to kobject_put() will decrement the +reference count and, possibly, free the object. Note that kobject_init() +sets the reference count to one, so the code which sets up the kobject will +need to do a kobject_put() eventually to release that reference. -void kset_init(struct kset * k); -int kset_add(struct kset * k); -int kset_register(struct kset * k); -void kset_unregister(struct kset * k); +Because kobjects are dynamic, they must not be declared statically or on +the stack, but instead, always allocated dynamically. Future versions of +the kernel will contain a run-time check for kobjects that are created +statically and will warn the developer of this improper usage. -struct kset * kset_get(struct kset * k); -void kset_put(struct kset * k); +If all that you want to use a kobject for is to provide a reference counter +for your structure, please use the struct kref instead; a kobject would be +overkill. For more information on how to use struct kref, please see the +file Documentation/kref.txt in the Linux kernel source tree. 
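Editorial illustration, not part of the patch: the embedding, container_of() and reference-count rules described in the added text above can be modelled in plain userspace C. This is only a sketch under stated assumptions. `struct mini_obj`, `struct widget` and every function below are hypothetical stand-ins rather than the kernel API, the counter is not atomic, and the release callback is stored in the object itself for brevity (the new kobject.txt text places it in the ktype instead).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical miniature of an embeddable, reference-counted object. */
struct mini_obj {
	int refcount;
	void (*release)(struct mini_obj *obj);
};

/* Like kobject_init(): the creator starts out holding one reference. */
void mini_obj_init(struct mini_obj *obj, void (*release)(struct mini_obj *))
{
	obj->refcount = 1;
	obj->release = release;
}

/* Take an additional reference and hand the pointer back. */
struct mini_obj *mini_obj_get(struct mini_obj *obj)
{
	assert(obj->refcount > 0);
	obj->refcount++;
	return obj;
}

/* Drop a reference; the last put triggers the release callback. */
void mini_obj_put(struct mini_obj *obj)
{
	assert(obj->refcount > 0);
	if (--obj->refcount == 0)
		obj->release(obj);
}

/* Containing structure: exactly one embedded mini_obj, always heap-allocated. */
struct widget {
	int value;
	struct mini_obj obj;
};

int widgets_freed;	/* demonstration counter, bumped by the release callback */

/* Recover the containing widget from the embedded object and free it. */
void widget_release(struct mini_obj *obj)
{
	struct widget *w = container_of(obj, struct widget, obj);

	widgets_freed++;
	free(w);
}

struct widget *widget_create(int value)
{
	struct widget *w = malloc(sizeof(*w));

	if (!w)
		return NULL;
	w->value = value;
	mini_obj_init(&w->obj, widget_release);
	return w;
}
```

A caller would create a widget, take extra references with `mini_obj_get()` wherever a pointer is stored, and balance each one with `mini_obj_put()`; the memory is freed only from the release callback, never directly by the creator.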
-struct kobject * kset_find_obj(struct kset *, char *); +Creating "simple" kobjects -The type that the kobjects are embedded in is described by the ktype -pointer. +Sometimes all that a developer wants is a way to create a simple directory +in the sysfs hierarchy, and not have to mess with the whole complication of +ksets, show and store functions, and other details. This is the one +exception where a single kobject should be created. To create such an +entry, use the function: -A kset contains a kobject itself, meaning that it may be registered in -the kobject hierarchy and exported via sysfs. More importantly, the -kset may be embedded in a larger data type, and may be part of another -kset (of that object type). + struct kobject *kobject_create_and_add(char *name, struct kobject *parent); -For example, a block device is an object (struct gendisk) that is -contained in a set of block devices. It may also contain a set of -partitions (struct hd_struct) that have been found on the device. The -following code snippet illustrates how to express this properly. +This function will create a kobject and place it in sysfs in the location +underneath the specified parent kobject. To create simple attributes +associated with this kobject, use: - struct gendisk * disk; - ... - disk->kset.kobj.kset = &block_kset; - disk->kset.ktype = &partition_ktype; - kset_register(&disk->kset); + int sysfs_create_file(struct kobject *kobj, struct attribute *attr); +or + int sysfs_create_group(struct kobject *kobj, struct attribute_group *grp); -- The kset that the disk's embedded object belongs to is the - block_kset, and is pointed to by disk->kset.kobj.kset. +Both types of attributes used here, with a kobject that has been created +with the kobject_create_and_add(), can be of type kobj_attribute, so no +special custom attribute is needed to be created. -- The type of objects on the disk's _subordinate_ list are partitions, - and is set in disk->kset.ktype. 
+See the example module, samples/kobject/kobject-example.c for an +implementation of a simple kobject and attributes. -- The kset is then registered, which handles initializing and adding - the embedded kobject to the hierarchy. -2.2 kset Programming Interface +ktypes and release methods -All kset functions, except kset_find_obj(), eventually forward the -calls to their embedded kobjects after performing kset-specific -operations. ksets offer a similar programming model to kobjects: they -may be used after they are initialized, without registering them in -the hierarchy. +One important thing still missing from the discussion is what happens to a +kobject when its reference count reaches zero. The code which created the +kobject generally does not know when that will happen; if it did, there +would be little point in using a kobject in the first place. Even +predictable object lifecycles become more complicated when sysfs is brought +in as other portions of the kernel can get a reference on any kobject that +is registered in the system. -kset_find_obj() may be used to locate a kobject with a particular -name. The kobject, if found, is returned. +The end result is that a structure protected by a kobject cannot be freed +before its reference count goes to zero. The reference count is not under +the direct control of the code which created the kobject. So that code must +be notified asynchronously whenever the last reference to one of its +kobjects goes away. -There are also some helper functions which names point to the formerly -existing "struct subsystem", whose functions have been taken over by -ksets. +Once you registered your kobject via kobject_add(), you must never use +kfree() to free it directly. The only safe way is to use kobject_put(). It +is good practice to always use kobject_put() after kobject_init() to avoid +errors creeping in. +This notification is done through a kobject's release() method. 
Usually +such a method has a form like: -decl_subsys(name,type,uevent_ops) + void my_object_release(struct kobject *kobj) + { + struct my_object *mine = container_of(kobj, struct my_object, kobj); -Declares a kset named '<name>_subsys' of type <type> with -uevent_ops <uevent_ops>. For example, + /* Perform any additional cleanup on this object, then... */ + kfree(mine); + } -decl_subsys(devices, &ktype_device, &device_uevent_ops); +One important point cannot be overstated: every kobject must have a +release() method, and the kobject must persist (in a consistent state) +until that method is called. If these constraints are not met, the code is +flawed. Note that the kernel will warn you if you forget to provide a +release() method. Do not try to get rid of this warning by providing an +"empty" release function; you will be mocked mercilessly by the kobject +maintainer if you attempt this. -is equivalent to doing: +Note, the name of the kobject is available in the release function, but it +must NOT be changed within this callback. Otherwise there will be a memory +leak in the kobject core, which makes people unhappy. -struct kset devices_subsys = { - .ktype = &ktype_devices, - .uevent_ops = &device_uevent_ops, -}; -kobject_set_name(&devices_subsys, name); +Interestingly, the release() method is not stored in the kobject itself; +instead, it is associated with the ktype. So let us introduce struct +kobj_type: + + struct kobj_type { + void (*release)(struct kobject *); + struct sysfs_ops *sysfs_ops; + struct attribute **default_attrs; + }; -The objects that are registered with a subsystem that use the -subsystem's default list must have their kset ptr set properly. These -objects may have embedded kobjects or ksets. The -following helper makes setting the kset easier: +This structure is used to describe a particular type of kobject (or, more +correctly, of containing object). 
Every kobject needs to have an associated +kobj_type structure; a pointer to that structure must be specified when you +call kobject_init() or kobject_init_and_add(). +The release field in struct kobj_type is, of course, a pointer to the +release() method for this type of kobject. The other two fields (sysfs_ops +and default_attrs) control how objects of this type are represented in +sysfs; they are beyond the scope of this document. -kobj_set_kset_s(obj,subsys) +The default_attrs pointer is a list of default attributes that will be +automatically created for any kobject that is registered with this ktype. -- Assumes that obj->kobj exists, and is a struct kobject. -- Sets the kset of that kobject to the kset <subsys>. -int subsystem_register(struct kset *s); -void subsystem_unregister(struct kset *s); +ksets -These are just wrappers around the respective kset_* functions. +A kset is merely a collection of kobjects that want to be associated with +each other. There is no restriction that they be of the same ktype, but be +very careful if they are not. -2.3 sysfs +A kset serves these functions: -ksets are represented in sysfs when their embedded kobjects are -registered. They follow the same rules of parenting, with one -exception. If a kset does not have a parent, nor is its embedded -kobject part of another kset, the kset's parent becomes its dominant -subsystem. + - It serves as a bag containing a group of objects. A kset can be used by + the kernel to track "all block devices" or "all PCI device drivers." -If the kset does not have a parent, its directory is created at the -sysfs root. This should only happen when the kset registered is -embedded in a subsystem itself. + - A kset is also a subdirectory in sysfs, where the associated kobjects + with the kset can show up. Every kset contains a kobject which can be + set up to be the parent of other kobjects; the top-level directories of + the sysfs hierarchy are constructed in this way. 
+ - Ksets can support the "hotplugging" of kobjects and influence how + uevent events are reported to user space. -3. struct ktype +In object-oriented terms, "kset" is the top-level container class; ksets +contain their own kobject, but that kobject is managed by the kset code and +should not be manipulated by any other user. -3.1. Description +A kset keeps its children in a standard kernel linked list. Kobjects point +back to their containing kset via their kset field. In almost all cases, +the kobjects belonging to a kset have that kset (or, strictly, its embedded +kobject) in their parent. -struct kobj_type { - void (*release)(struct kobject *); - struct sysfs_ops * sysfs_ops; - struct attribute ** default_attrs; +As a kset contains a kobject within it, it should always be dynamically +created and never declared statically or on the stack. To create a new +kset use: + struct kset *kset_create_and_add(const char *name, + struct kset_uevent_ops *u, + struct kobject *parent); + +When you are finished with the kset, call: + void kset_unregister(struct kset *kset); +to destroy it. + +An example of using a kset can be seen in the +samples/kobject/kset-example.c file in the kernel tree. + +If a kset wishes to control the uevent operations of the kobjects +associated with it, it can use the struct kset_uevent_ops to handle it: + +struct kset_uevent_ops { + int (*filter)(struct kset *kset, struct kobject *kobj); + const char *(*name)(struct kset *kset, struct kobject *kobj); + int (*uevent)(struct kset *kset, struct kobject *kobj, + struct kobj_uevent_env *env); }; -Object types require specific functions for converting between the -generic object and the more complex type. struct kobj_type provides -the object-specific fields, which include: +The filter function allows a kset to prevent a uevent from being emitted to +userspace for a specific kobject. If the function returns 0, the uevent +will not be emitted.
+ +The name function will be called to override the default name of the kset +that the uevent sends to userspace. By default, the name will be the same +as the kset itself, but this function, if present, can override that name. + +The uevent function will be called when the uevent is about to be sent to +userspace to allow more environment variables to be added to the uevent. + +One might ask how, exactly, a kobject is added to a kset, given that no +functions which perform that function have been presented. The answer is +that this task is handled by kobject_add(). When a kobject is passed to +kobject_add(), its kset member should point to the kset to which the +kobject will belong. kobject_add() will handle the rest. + +If the kobject belonging to a kset has no parent kobject set, it will be +added to the kset's directory. Not all members of a kset do necessarily +live in the kset directory. If an explicit parent kobject is assigned +before the kobject is added, the kobject is registered with the kset, but +added below the parent kobject. + + +Kobject removal -- release: Called when the kobject's reference count reaches 0. This - should convert the object to the more complex type and free it. +After a kobject has been registered with the kobject core successfully, it +must be cleaned up when the code is finished with it. To do that, call +kobject_put(). By doing this, the kobject core will automatically clean up +all of the memory allocated by this kobject. If a KOBJ_ADD uevent has been +sent for the object, a corresponding KOBJ_REMOVE uevent will be sent, and +any other sysfs housekeeping will be handled for the caller properly. -- sysfs_ops: Provides conversion functions for sysfs access. Please - see the sysfs documentation for more information. +If you need to do a two-stage delete of the kobject (say you are not +allowed to sleep when you need to destroy the object), then call +kobject_del() which will unregister the kobject from sysfs. 
This makes the +kobject "invisible", but it is not cleaned up, and the reference count of +the object is still the same. At a later time call kobject_put() to finish +the cleanup of the memory associated with the kobject. -- default_attrs: Default attributes to be exported via sysfs when the - object is registered.Note that the last attribute has to be - initialized to NULL ! You can find a complete implementation - in block/genhd.c +kobject_del() can be used to drop the reference to the parent object, if +circular references are constructed. It is valid in some cases, that a +parent object references a child. Circular references _must_ be broken +with an explicit call to kobject_del(), so that the release functions will be +called, and the objects in the former circle release each other. -Instances of struct kobj_type are not registered; only referenced by -the kset. A kobj_type may be referenced by an arbitrary number of -ksets, as there may be disparate sets of identical objects. +Example code to copy from +For a more complete example of using ksets and kobjects properly, see the +samples/kobject/kset-example.c code. diff --git a/Documentation/kprobes.txt b/Documentation/kprobes.txt index cb12ae175aa2..30c101761d0d 100644 --- a/Documentation/kprobes.txt +++ b/Documentation/kprobes.txt @@ -96,7 +96,9 @@ or in registers (e.g., for x86_64 or for an i386 fastcall function). The jprobe will work in either case, so long as the handler's prototype matches that of the probed function. -1.3 How Does a Return Probe Work? +1.3 Return Probes + +1.3.1 How Does a Return Probe Work? When you call register_kretprobe(), Kprobes establishes a kprobe at the entry to the function. When the probed function is called and this @@ -107,9 +109,9 @@ At boot time, Kprobes registers a kprobe at the trampoline. When the probed function executes its return instruction, control passes to the trampoline and that probe is hit.
Kprobes' trampoline -handler calls the user-specified handler associated with the kretprobe, -then sets the saved instruction pointer to the saved return address, -and that's where execution resumes upon return from the trap. +handler calls the user-specified return handler associated with the +kretprobe, then sets the saved instruction pointer to the saved return +address, and that's where execution resumes upon return from the trap. While the probed function is executing, its return address is stored in an object of type kretprobe_instance. Before calling @@ -131,6 +133,30 @@ zero when the return probe is registered, and is incremented every time the probed function is entered but there is no kretprobe_instance object available for establishing the return probe. +1.3.2 Kretprobe entry-handler + +Kretprobes also provides an optional user-specified handler which runs +on function entry. This handler is specified by setting the entry_handler +field of the kretprobe struct. Whenever the kprobe placed by kretprobe at the +function entry is hit, the user-defined entry_handler, if any, is invoked. +If the entry_handler returns 0 (success) then a corresponding return handler +is guaranteed to be called upon function return. If the entry_handler +returns a non-zero error then Kprobes leaves the return address as is, and +the kretprobe has no further effect for that particular function instance. + +Multiple entry and return handler invocations are matched using the unique +kretprobe_instance object associated with them. Additionally, a user +may also specify per return-instance private data to be part of each +kretprobe_instance object. This is especially useful when sharing private +data between corresponding user entry and return handlers. The size of each +private data object can be specified at kretprobe registration time by +setting the data_size field of the kretprobe struct. This data can be +accessed through the data field of each kretprobe_instance object. 
+ +In case probed function is entered but there is no kretprobe_instance +object available, then in addition to incrementing the nmissed count, +the user entry_handler invocation is also skipped. + 2. Architectures Supported Kprobes, jprobes, and return probes are implemented on the following @@ -141,6 +167,7 @@ architectures: - ppc64 - ia64 (Does not support probes on instruction slot1.) - sparc64 (Return probes not yet implemented.) +- arm 3. Configuring Kprobes @@ -273,6 +300,8 @@ of interest: - ret_addr: the return address - rp: points to the corresponding kretprobe object - task: points to the corresponding task struct +- data: points to per return-instance private data; see "Kretprobe + entry-handler" for details. The regs_return_value(regs) macro provides a simple abstraction to extract the return value from the appropriate register as defined by @@ -555,23 +584,52 @@ report failed calls to sys_open(). #include <linux/kernel.h> #include <linux/module.h> #include <linux/kprobes.h> +#include <linux/ktime.h> + +/* per-instance private data */ +struct my_data { + ktime_t entry_stamp; +}; static const char *probed_func = "sys_open"; -/* Return-probe handler: If the probed function fails, log the return value. */ -static int ret_handler(struct kretprobe_instance *ri, struct pt_regs *regs) +/* Timestamp function entry. */ +static int entry_handler(struct kretprobe_instance *ri, struct pt_regs *regs) +{ + struct my_data *data; + + if(!current->mm) + return 1; /* skip kernel threads */ + + data = (struct my_data *)ri->data; + data->entry_stamp = ktime_get(); + return 0; +} + +/* If the probed function failed, log the return value and duration. + * Duration may turn out to be zero consistently, depending upon the + * granularity of time accounting on the platform. 
*/ +static int return_handler(struct kretprobe_instance *ri, struct pt_regs *regs) { int retval = regs_return_value(regs); + struct my_data *data = (struct my_data *)ri->data; + s64 delta; + ktime_t now; + if (retval < 0) { - printk("%s returns %d\n", probed_func, retval); + now = ktime_get(); + delta = ktime_to_ns(ktime_sub(now, data->entry_stamp)); + printk("%s: return val = %d (duration = %lld ns)\n", + probed_func, retval, delta); } return 0; } static struct kretprobe my_kretprobe = { - .handler = ret_handler, - /* Probe up to 20 instances concurrently. */ - .maxactive = 20 + .handler = return_handler, + .entry_handler = entry_handler, + .data_size = sizeof(struct my_data), + .maxactive = 20, /* probe up to 20 instances concurrently */ }; static int __init kretprobe_init(void) @@ -583,7 +641,7 @@ static int __init kretprobe_init(void) printk("register_kretprobe failed, returned %d\n", ret); return -1; } - printk("Planted return probe at %p\n", my_kretprobe.kp.addr); + printk("Kretprobe active on %s\n", my_kretprobe.kp.symbol_name); return 0; } @@ -593,7 +651,7 @@ static void __exit kretprobe_exit(void) printk("kretprobe unregistered\n"); /* nmissed > 0 suggests that maxactive was set too low. */ printk("Missed probing %d instances of %s\n", - my_kretprobe.nmissed, probed_func); + my_kretprobe.nmissed, probed_func); } module_init(kretprobe_init) diff --git a/Documentation/kref.txt b/Documentation/kref.txt index f38b59d00c63..130b6e87aa7e 100644 --- a/Documentation/kref.txt +++ b/Documentation/kref.txt @@ -141,10 +141,10 @@ The last rule (rule 3) is the nastiest one to handle. Say, for instance, you have a list of items that are each kref-ed, and you wish to get the first one. You can't just pull the first item off the list and kref_get() it. That violates rule 3 because you are not already -holding a valid pointer. You must add locks or semaphores. For -instance: +holding a valid pointer. You must add a mutex (or some other lock). 
+For instance: -static DECLARE_MUTEX(sem); +static DEFINE_MUTEX(mutex); static LIST_HEAD(q); struct my_data { @@ -155,12 +155,12 @@ struct my_data static struct my_data *get_entry() { struct my_data *entry = NULL; - down(&sem); + mutex_lock(&mutex); if (!list_empty(&q)) { entry = container_of(q.next, struct my_q_entry, link); kref_get(&entry->refcount); } - up(&sem); + mutex_unlock(&mutex); return entry; } @@ -174,9 +174,9 @@ static void release_entry(struct kref *ref) static void put_entry(struct my_data *entry) { - down(&sem); + mutex_lock(&mutex); kref_put(&entry->refcount, release_entry); - up(&sem); + mutex_unlock(&mutex); } The kref_put() return value is useful if you do not want to hold the @@ -191,13 +191,13 @@ static void release_entry(struct kref *ref) static void put_entry(struct my_data *entry) { - down(&sem); + mutex_lock(&mutex); if (kref_put(&entry->refcount, release_entry)) { list_del(&entry->link); - up(&sem); + mutex_unlock(&mutex); kfree(entry); } else - up(&sem); + mutex_unlock(&mutex); } This is really more useful if you have to call other routines as part diff --git a/Documentation/lguest/lguest.c b/Documentation/lguest/lguest.c index 42008395534d..0f23d67f958f 100644 --- a/Documentation/lguest/lguest.c +++ b/Documentation/lguest/lguest.c @@ -34,6 +34,8 @@ #include <zlib.h> #include <assert.h> #include <sched.h> +#include <limits.h> +#include <stddef.h> #include "linux/lguest_launcher.h" #include "linux/virtio_config.h" #include "linux/virtio_net.h" @@ -79,6 +81,9 @@ static void *guest_base; /* The maximum guest physical address allowed, and maximum possible. */ static unsigned long guest_limit, guest_max; +/* a per-cpu variable indicating whose vcpu is currently running */ +static unsigned int __thread cpu_id; + /* This is our list of devices. */ struct device_list { @@ -96,13 +101,11 @@ struct device_list /* The descriptor page for the devices. */ u8 *descpage; - /* The tail of the last descriptor. 
*/ - unsigned int desc_used; - /* A single linked list of devices. */ struct device *dev; - /* ... And an end pointer so we can easily append new devices */ - struct device **lastdev; + /* And a pointer to the last device for easy append and also for + * configuration appending. */ + struct device *lastdev; }; /* The list of Guest devices, based on command line arguments. */ @@ -153,6 +156,9 @@ struct virtqueue void (*handle_output)(int fd, struct virtqueue *me); }; +/* Remember the arguments to the program so we can "reboot" */ +static char **main_args; + /* Since guest is UP and we don't run at the same time, we don't need barriers. * But I include them in the code in case others copy it. */ #define wmb() @@ -185,7 +191,14 @@ static void *_convert(struct iovec *iov, size_t size, size_t align, #define cpu_to_le64(v64) (v64) #define le16_to_cpu(v16) (v16) #define le32_to_cpu(v32) (v32) -#define le64_to_cpu(v32) (v64) +#define le64_to_cpu(v64) (v64) + +/* The device virtqueue descriptors are followed by feature bitmasks. */ +static u8 *get_feature_bits(struct device *dev) +{ + return (u8 *)(dev->desc + 1) + + dev->desc->num_vq * sizeof(struct lguest_vqconfig); +} /*L:100 The Launcher code itself takes us out into userspace, that scary place * where pointers run wild and free! Unfortunately, like most userspace @@ -554,7 +567,7 @@ static void wake_parent(int pipefd, int lguest_fd) else FD_CLR(-fd - 1, &devices.infds); } else /* Send LHREQ_BREAK command. */ - write(lguest_fd, args, sizeof(args)); + pwrite(lguest_fd, args, sizeof(args), cpu_id); } } @@ -908,21 +921,58 @@ static void enable_fd(int fd, struct virtqueue *vq) write(waker_fd, &vq->dev->fd, sizeof(vq->dev->fd)); } +/* Resetting a device is fairly easy. */ +static void reset_device(struct device *dev) +{ + struct virtqueue *vq; + + verbose("Resetting device %s\n", dev->name); + /* Clear the status. */ + dev->desc->status = 0; + + /* Clear any features they've acked. 
*/ + memset(get_feature_bits(dev) + dev->desc->feature_len, 0, + dev->desc->feature_len); + + /* Zero out the virtqueues. */ + for (vq = dev->vq; vq; vq = vq->next) { + memset(vq->vring.desc, 0, + vring_size(vq->config.num, getpagesize())); + vq->last_avail_idx = 0; + } +} + /* This is the generic routine we call when the Guest uses LHCALL_NOTIFY. */ static void handle_output(int fd, unsigned long addr) { struct device *i; struct virtqueue *vq; - /* Check each virtqueue. */ + /* Check each device and virtqueue. */ for (i = devices.dev; i; i = i->next) { + /* Notifications to device descriptors reset the device. */ + if (from_guest_phys(addr) == i->desc) { + reset_device(i); + return; + } + + /* Notifications to virtqueues mean output has occurred. */ for (vq = i->vq; vq; vq = vq->next) { - if (vq->config.pfn == addr/getpagesize() - && vq->handle_output) { - verbose("Output to %s\n", vq->dev->name); - vq->handle_output(fd, vq); + if (vq->config.pfn != addr/getpagesize()) + continue; + + /* Guest should acknowledge (and set features!) before + * using the device. */ + if (i->desc->status == 0) { + warnx("%s gave early output", i->name); return; } + + if (strcmp(vq->dev->name, "console") != 0) + verbose("Output to %s\n", vq->dev->name); + if (vq->handle_output) + vq->handle_output(fd, vq); + return; } } @@ -980,54 +1030,44 @@ static void handle_input(int fd) * * All devices need a descriptor so the Guest knows it exists, and a "struct * device" so the Launcher can keep track of it. We have common helper - * routines to allocate them. - * - * This routine allocates a new "struct lguest_device_desc" from descriptor - * table just above the Guest's normal memory. It returns a pointer to that - * descriptor. */ -static struct lguest_device_desc *new_dev_desc(u16 type) -{ - struct lguest_device_desc *d; - - /* We only have one page for all the descriptors. 
*/ - if (devices.desc_used + sizeof(*d) > getpagesize()) - errx(1, "Too many devices"); + * routines to allocate and manage them. */ - /* We don't need to set config_len or status: page is 0 already. */ - d = (void *)devices.descpage + devices.desc_used; - d->type = type; - devices.desc_used += sizeof(*d); - - return d; +/* The layout of the device page is a "struct lguest_device_desc" followed by a + * number of virtqueue descriptors, then two sets of feature bits, then an + * array of configuration bytes. This routine returns the configuration + * pointer. */ +static u8 *device_config(const struct device *dev) +{ + return (void *)(dev->desc + 1) + + dev->desc->num_vq * sizeof(struct lguest_vqconfig) + + dev->desc->feature_len * 2; } -/* Each device descriptor is followed by some configuration information. - * Each configuration field looks like: u8 type, u8 len, [... len bytes...]. - * - * This routine adds a new field to an existing device's descriptor. It only - * works for the last device, but that's OK because that's how we use it. */ -static void add_desc_field(struct device *dev, u8 type, u8 len, const void *c) +/* This routine allocates a new "struct lguest_device_desc" from descriptor + * table page just above the Guest's normal memory. It returns a pointer to + * that descriptor. */ +static struct lguest_device_desc *new_dev_desc(u16 type) { - /* This is the last descriptor, right? */ - assert(devices.descpage + devices.desc_used - == (u8 *)(dev->desc + 1) + dev->desc->config_len); + struct lguest_device_desc d = { .type = type }; + void *p; - /* We only have one page of device descriptions. */ - if (devices.desc_used + 2 + len > getpagesize()) - errx(1, "Too many devices"); + /* Figure out where the next device config is, based on the last one. */ + if (devices.lastdev) + p = device_config(devices.lastdev) + + devices.lastdev->desc->config_len; + else + p = devices.descpage; - /* Copy in the new config header: type then length. 
*/ - devices.descpage[devices.desc_used++] = type; - devices.descpage[devices.desc_used++] = len; - memcpy(devices.descpage + devices.desc_used, c, len); - devices.desc_used += len; + /* We only have one page for all the descriptors. */ + if (p + sizeof(d) > (void *)devices.descpage + getpagesize()) + errx(1, "Too many devices"); - /* Update the device descriptor length: two byte head then data. */ - dev->desc->config_len += 2 + len; + /* p might not be aligned, so we memcpy in. */ + return memcpy(p, &d, sizeof(d)); } -/* This routine adds a virtqueue to a device. We specify how many descriptors - * the virtqueue is to have. */ +/* Each device descriptor is followed by the description of its virtqueues. We + * specify how many descriptors the virtqueue is to have. */ static void add_virtqueue(struct device *dev, unsigned int num_descs, void (*handle_output)(int fd, struct virtqueue *me)) { @@ -1040,6 +1080,11 @@ static void add_virtqueue(struct device *dev, unsigned int num_descs, / getpagesize(); p = get_pages(pages); + /* Initialize the virtqueue */ + vq->next = NULL; + vq->last_avail_idx = 0; + vq->dev = dev; + /* Initialize the configuration. */ vq->config.num = num_descs; vq->config.irq = devices.next_irq++; @@ -1048,27 +1093,60 @@ static void add_virtqueue(struct device *dev, unsigned int num_descs, /* Initialize the vring. */ vring_init(&vq->vring, num_descs, p, getpagesize()); - /* Add the configuration information to this device's descriptor. */ - add_desc_field(dev, VIRTIO_CONFIG_F_VIRTQUEUE, - sizeof(vq->config), &vq->config); + /* Append virtqueue to this device's descriptor. We use + * device_config() to get the end of the device's current virtqueues; + * we check that we haven't added any config or feature information + * yet, otherwise we'd be overwriting them. 
*/ + assert(dev->desc->config_len == 0 && dev->desc->feature_len == 0); + memcpy(device_config(dev), &vq->config, sizeof(vq->config)); + dev->desc->num_vq++; + + verbose("Virtqueue page %#lx\n", to_guest_phys(p)); /* Add to tail of list, so dev->vq is first vq, dev->vq->next is * second. */ for (i = &dev->vq; *i; i = &(*i)->next); *i = vq; - /* Link virtqueue back to device. */ - vq->dev = dev; - /* Set the routine to call when the Guest does something to this * virtqueue. */ vq->handle_output = handle_output; - /* Set the "Don't Notify Me" flag if we don't have a handler */ + /* As an optimization, set the advisory "Don't Notify Me" flag if we + * don't have a handler */ if (!handle_output) vq->vring.used->flags = VRING_USED_F_NO_NOTIFY; } +/* The first half of the feature bitmask is for us to advertise features. The + * second half is for the Guest to accept features. */ +static void add_feature(struct device *dev, unsigned bit) +{ + u8 *features = get_feature_bits(dev); + + /* We can't extend the feature bits once we've added config bytes */ + if (dev->desc->feature_len <= bit / CHAR_BIT) { + assert(dev->desc->config_len == 0); + dev->desc->feature_len = (bit / CHAR_BIT) + 1; + } + + features[bit / CHAR_BIT] |= (1 << (bit % CHAR_BIT)); +} + +/* This routine sets the configuration fields for an existing device's + * descriptor. It only works for the last device, but that's OK because that's + * how we use it. */ +static void set_config(struct device *dev, unsigned len, const void *conf) +{ + /* Check we haven't overflowed our single page. */ + if (device_config(dev) + len > devices.descpage + getpagesize()) + errx(1, "Too many devices"); + + /* Copy in the config information, and store the length. */ + memcpy(device_config(dev), conf, len); + dev->desc->config_len = len; +} + /* This routine does all the creation and setup of a new device, including * calling new_dev_desc() to allocate the descriptor and device memory.
*/ static struct device *new_device(const char *name, u16 type, int fd, @@ -1076,14 +1154,6 @@ static struct device *new_device(const char *name, u16 type, int fd, { struct device *dev = malloc(sizeof(*dev)); - /* Append to device list. Prepending to a single-linked list is - * easier, but the user expects the devices to be arranged on the bus - * in command-line order. The first network device on the command line - * is eth0, the first block device /dev/vda, etc. */ - *devices.lastdev = dev; - dev->next = NULL; - devices.lastdev = &dev->next; - /* Now we populate the fields one at a time. */ dev->fd = fd; /* If we have an input handler for this file descriptor, then we add it @@ -1093,6 +1163,18 @@ static struct device *new_device(const char *name, u16 type, int fd, dev->desc = new_dev_desc(type); dev->handle_input = handle_input; dev->name = name; + dev->vq = NULL; + + /* Append to device list. Prepending to a single-linked list is + * easier, but the user expects the devices to be arranged on the bus + * in command-line order. The first network device on the command line + * is eth0, the first block device /dev/vda, etc. */ + if (devices.lastdev) + devices.lastdev->next = dev; + else + devices.dev = dev; + devices.lastdev = dev; + return dev; } @@ -1217,7 +1299,7 @@ static void setup_tun_net(const char *arg) int netfd, ipfd; u32 ip; const char *br_name = NULL; - u8 hwaddr[6]; + struct virtio_net_config conf; /* We open the /dev/net/tun device and tell it we want a tap device. A * tap device is like a tun device, only somehow different. To tell @@ -1256,12 +1338,13 @@ static void setup_tun_net(const char *arg) ip = str2ip(arg); /* Set up the tun device, and get the mac address for the interface. */ - configure_device(ipfd, ifr.ifr_name, ip, hwaddr); + configure_device(ipfd, ifr.ifr_name, ip, conf.mac); /* Tell Guest what MAC address to use. 
*/ - add_desc_field(dev, VIRTIO_CONFIG_NET_MAC_F, sizeof(hwaddr), hwaddr); + add_feature(dev, VIRTIO_NET_F_MAC); + set_config(dev, sizeof(conf), &conf); - /* We don't seed the socket any more; setup is done. */ + /* We don't need the socket any more; setup is done. */ close(ipfd); verbose("device %u: tun net %u.%u.%u.%u\n", @@ -1449,8 +1532,7 @@ static void setup_block_file(const char *filename) struct device *dev; struct vblk_info *vblk; void *stack; - u64 cap; - unsigned int val; + struct virtio_blk_config conf; /* This is the pipe the I/O thread will use to tell us I/O is done. */ pipe(p); @@ -1468,14 +1550,18 @@ static void setup_block_file(const char *filename) vblk->fd = open_or_die(filename, O_RDWR|O_LARGEFILE); vblk->len = lseek64(vblk->fd, 0, SEEK_END); + /* We support barriers. */ + add_feature(dev, VIRTIO_BLK_F_BARRIER); + /* Tell Guest how many sectors this device has. */ - cap = cpu_to_le64(vblk->len / 512); - add_desc_field(dev, VIRTIO_CONFIG_BLK_F_CAPACITY, sizeof(cap), &cap); + conf.capacity = cpu_to_le64(vblk->len / 512); /* Tell Guest not to put in too many descriptors at once: two are used * for the in and out elements. */ - val = cpu_to_le32(VIRTQUEUE_NUM - 2); - add_desc_field(dev, VIRTIO_CONFIG_BLK_F_SEG_MAX, sizeof(val), &val); + add_feature(dev, VIRTIO_BLK_F_SEG_MAX); + conf.seg_max = cpu_to_le32(VIRTQUEUE_NUM - 2); + + set_config(dev, sizeof(conf), &conf); /* The I/O thread writes to this end of the pipe when done. */ vblk->done_fd = p[1]; @@ -1486,7 +1572,9 @@ static void setup_block_file(const char *filename) /* Create stack for thread and run it */ stack = malloc(32768); - if (clone(io_thread, stack + 32768, CLONE_VM, dev) == -1) + /* SIGCHLD - We dont "wait" for our cloned thread, so prevent it from + * becoming a zombie. */ + if (clone(io_thread, stack + 32768, CLONE_VM | SIGCHLD, dev) == -1) err(1, "Creating clone"); /* We don't need to keep the I/O thread's end of the pipes open. 
*/ @@ -1494,9 +1582,23 @@ static void setup_block_file(const char *filename) close(vblk->workpipe[0]); verbose("device %u: virtblock %llu sectors\n", - devices.device_num, cap); + devices.device_num, le64_to_cpu(conf.capacity)); +} +/* That's the end of device setup. :*/ + +/* Reboot */ +static void __attribute__((noreturn)) restart_guest(void) +{ + unsigned int i; + + /* Closing pipes causes the waker thread and io_threads to die, and + * closing /dev/lguest cleans up the Guest. Since we don't track all + * open fds, we simply close everything beyond stderr. */ + for (i = 3; i < FD_SETSIZE; i++) + close(i); + execv(main_args[0], main_args); + err(1, "Could not exec %s", main_args[0]); } -/* That's the end of device setup. */ /*L:220 Finally we reach the core of the Launcher, which runs the Guest, serves * its input and output, and finally, lays it to rest. */ @@ -1508,7 +1610,8 @@ static void __attribute__((noreturn)) run_guest(int lguest_fd) int readval; /* We read from the /dev/lguest device to run the Guest. */ - readval = read(lguest_fd, &notify_addr, sizeof(notify_addr)); + readval = pread(lguest_fd, &notify_addr, + sizeof(notify_addr), cpu_id); /* One unsigned long means the Guest did HCALL_NOTIFY */ if (readval == sizeof(notify_addr)) { @@ -1518,16 +1621,23 @@ static void __attribute__((noreturn)) run_guest(int lguest_fd) /* ENOENT means the Guest died. Reading tells us why. */ } else if (errno == ENOENT) { char reason[1024] = { 0 }; - read(lguest_fd, reason, sizeof(reason)-1); + pread(lguest_fd, reason, sizeof(reason)-1, cpu_id); errx(1, "%s", reason); + /* ERESTART means that we need to reboot the guest */ + } else if (errno == ERESTART) { + restart_guest(); /* EAGAIN means the Waker wanted us to look at some input. * Anything else means a bug or incompatible change. */ } else if (errno != EAGAIN) err(1, "Running guest failed"); + /* Only service input on thread for CPU 0.
*/ + if (cpu_id != 0) + continue; + /* Service input, then unset the BREAK to release the Waker. */ handle_input(lguest_fd); - if (write(lguest_fd, args, sizeof(args)) < 0) + if (pwrite(lguest_fd, args, sizeof(args), cpu_id) < 0) err(1, "Resetting break"); } } @@ -1568,17 +1678,24 @@ int main(int argc, char *argv[]) /* If they specify an initrd file to load. */ const char *initrd_name = NULL; + /* Save the args: we "reboot" by execing ourselves again. */ + main_args = argv; + /* We don't "wait" for the children, so prevent them from becoming + * zombies. */ + signal(SIGCHLD, SIG_IGN); + /* First we initialize the device list. Since console and network * device receive input from a file descriptor, we keep an fdset * (infds) and the maximum fd number (max_infd) with the head of the - * list. We also keep a pointer to the last device, for easy appending - * to the list. Finally, we keep the next interrupt number to hand out - * (1: remember that 0 is used by the timer). */ + * list. We also keep a pointer to the last device. Finally, we keep + * the next interrupt number to hand out (1: remember that 0 is used by + * the timer). */ FD_ZERO(&devices.infds); devices.max_infd = -1; - devices.lastdev = &devices.dev; + devices.lastdev = NULL; devices.next_irq = 1; + cpu_id = 0; /* We need to know how much memory so we can set up the device * descriptor and memory pages for the devices as we parse the command * line. So we quickly look through the arguments to find the amount diff --git a/Documentation/lguest/lguest.txt b/Documentation/lguest/lguest.txt index 7885ab2d5f53..722d4e7fbebe 100644 --- a/Documentation/lguest/lguest.txt +++ b/Documentation/lguest/lguest.txt @@ -109,10 +109,6 @@ Running Lguest: See http://linux-net.osdl.org/index.php/Bridge for general information on how to get bridging working. -- You can also create an inter-guest network using - "--sharenet=<filename>": any two guests using the same file are on - the same network. 
This file is created if it does not exist. - There is a helpful mailing list at http://ozlabs.org/mailman/listinfo/lguest Good luck! diff --git a/Documentation/local_ops.txt b/Documentation/local_ops.txt index 1a45f11e645e..4269a1105b37 100644 --- a/Documentation/local_ops.txt +++ b/Documentation/local_ops.txt @@ -68,29 +68,6 @@ typedef struct { atomic_long_t a; } local_t; variable can be read when reading some _other_ cpu's variables. -* Rules to follow when using local atomic operations - -- Variables touched by local ops must be per cpu variables. -- _Only_ the CPU owner of these variables must write to them. -- This CPU can use local ops from any context (process, irq, softirq, nmi, ...) - to update its local_t variables. -- Preemption (or interrupts) must be disabled when using local ops in - process context to make sure the process won't be migrated to a - different CPU between getting the per-cpu variable and doing the - actual local op. -- When using local ops in interrupt context, no special care must be - taken on a mainline kernel, since they will run on the local CPU with - preemption already disabled. I suggest, however, to explicitly - disable preemption anyway to make sure it will still work correctly on - -rt kernels. -- Reading the local cpu variable will provide the current copy of the - variable. -- Reads of these variables can be done from any CPU, because updates to - "long", aligned, variables are always atomic. Since no memory - synchronization is done by the writer CPU, an outdated copy of the - variable can be read when reading some _other_ cpu's variables. - - * How to use local atomic operations #include <linux/percpu.h> diff --git a/Documentation/m68k/kernel-options.txt b/Documentation/m68k/kernel-options.txt index 248589e8bcf5..c93bed66e25d 100644 --- a/Documentation/m68k/kernel-options.txt +++ b/Documentation/m68k/kernel-options.txt @@ -867,66 +867,6 @@ controller and should be autodetected by the driver. 
An example is the 24 bit region which is specified by a mask of 0x00fffffe. -5.5) 53c7xx= ------------- - -Syntax: 53c7xx=<sub-options...> - -These options affect the A4000T, A4091, WarpEngine, Blizzard 603e+, -and GForce 040/060 SCSI controllers on the Amiga, as well as the -builtin MVME 16x SCSI controller. - -The <sub-options> is a comma-separated list of the sub-options listed -below. - -5.5.1) nosync -------------- - -Syntax: nosync:0 - - Disables sync negotiation for all devices. Any value after the - colon is acceptable (and has the same effect). - -5.5.2) noasync --------------- - -[OBSOLETE, REMOVED] - -5.5.3) nodisconnect -------------------- - -Syntax: nodisconnect:0 - - Disables SCSI disconnects. Any value after the colon is acceptable - (and has the same effect). - -5.5.4) validids ---------------- - -Syntax: validids:0xNN - - Specify which SCSI ids the driver should pay attention to. This is - a bitmask (i.e. to only pay attention to ID#4, you'd use 0x10). - Default is 0x7f (devices 0-6). - -5.5.5) opthi -5.5.6) optlo ------------- - -Syntax: opthi:M,optlo:N - - Specify options for "hostdata->options". The acceptable definitions - are listed in drivers/scsi/53c7xx.h; the 32 high bits should be in - opthi and the 32 low bits in optlo. They must be specified in the - order opthi=M,optlo=N. - -5.5.7) next ------------ - - No argument. Used to separate blocks of keywords when there's more - than one 53c7xx host adapter in the system. - - /* Local Variables: */ /* mode: text */ /* End: */ diff --git a/Documentation/md.txt b/Documentation/md.txt index 5818628207b5..396cdd982c26 100644 --- a/Documentation/md.txt +++ b/Documentation/md.txt @@ -416,6 +416,16 @@ also have sectors in total that could need to be processed. The two numbers are separated by a '/' thus effectively showing one value, a fraction of the process that is complete. 
+ A 'select' on this attribute will return when resync completes, + when it reaches the current sync_max (below) and possibly at + other times. + + sync_max + This is a number of sectors at which point a resync/recovery + process will pause. When a resync is active, the value can + only ever be increased, never decreased. The value of 'max' + effectively disables the limit. + sync_speed This shows the current actual speed, in K/sec, of the current diff --git a/Documentation/mips/00-INDEX b/Documentation/mips/00-INDEX index 3f13bf8043d2..8ae9cffc2262 100644 --- a/Documentation/mips/00-INDEX +++ b/Documentation/mips/00-INDEX @@ -2,5 +2,3 @@ - this file. AU1xxx_IDE.README - README for MIPS AU1XXX IDE driver. -GT64120.README - - README for dir with info on MIPS boards using GT-64120 or GT-64120A. diff --git a/Documentation/mips/GT64120.README b/Documentation/mips/GT64120.README deleted file mode 100644 index 2d0eec91dc59..000000000000 --- a/Documentation/mips/GT64120.README +++ /dev/null @@ -1,65 +0,0 @@ -README for arch/mips/gt64120 directory and subdirectories - -Jun Sun, jsun@mvista.com or jsun@junsun.net -01/27, 2001 - -MOTIVATION ----------- - -Many MIPS boards share the same system controller (or CPU companian chip), -such as GT-64120. It is highly desirable to let these boards share -the same controller code instead of duplicating them. - -This directory is meant to hold all MIPS boards that use GT-64120 or GT-64120A. - - -HOW TO ADD A BOARD ------------------- - -. Create a subdirectory include/asm/gt64120/<board>. - -. Create a file called gt64120_dep.h under that directory. - -. Modify include/asm/gt64120/gt64120.h file to include the new gt64120_dep.h - based on config options. The board-dep section is at the end of - include/asm/gt64120/gt64120.h file. There you can find all required - definitions include/asm/gt64120/<board>/gt64120_dep.h file must supply. - -. Create a subdirectory arch/mips/gt64120/<board> directory to hold - board specific routines. - -. 
The GT-64120 common code is supplied under arch/mips/gt64120/common directory. - It includes: - 1) arch/mips/gt64120/pci.c - - common PCI routine, include the top-level pcibios_init() - 2) arch/mips/gt64120/irq.c - - common IRQ routine, include the top-level do_IRQ() - [This part really belongs to arch/mips/kernel. jsun] - 3) arch/mips/gt64120/gt_irq.c - - common IRQ routines for GT-64120 chip. Currently it only handles - the timer interrupt. - -. Board-specific routines are supplied under arch/mips/gt64120/<board> dir. - 1) arch/mips/gt64120/<board>/pci.c - it provides bus fixup routine - 2) arch/mips/gt64120/<board>/irq.c - it provides enable/disable irqs - and board irq setup routine (irq_setup) - 3) arch/mips/gt64120/<board>/int-handler.S - - The first-level interrupt dispatching routine. - 4) a bunch of other "normal" stuff (setup, prom, dbg_io, reset, etc) - -. Follow other "normal" procedure to modify configuration files, etc. - - -TO-DO LIST ----------- - -. Expand arch/mips/gt64120/gt_irq.c to handle all GT-64120 interrupts. - We probably need to introduce GT_IRQ_BASE in board-dep header file, - which is used the starting irq_nr for all GT irqs. - - A function, gt64120_handle_irq(), will be added so that the first-level - irq dispatcher will call this function if it detects an interrupt - from GT-64120. - -. More support for GT-64120 PCI features (2nd PCI bus, perhaps) - diff --git a/Documentation/namespaces/compatibility-list.txt b/Documentation/namespaces/compatibility-list.txt new file mode 100644 index 000000000000..defc5589bfcd --- /dev/null +++ b/Documentation/namespaces/compatibility-list.txt @@ -0,0 +1,39 @@ + Namespaces compatibility list + +This document contains the information about the problems user +may have when creating tasks living in different namespaces. + +Here's the summary. 
This matrix shows the known problems, that +occur when tasks share some namespace (the columns) while living +in different other namespaces (the rows): + + UTS IPC VFS PID User Net +UTS X +IPC X 1 +VFS X +PID 1 1 X +User 2 2 X +Net X + +1. Both the IPC and the PID namespaces provide IDs to address + object inside the kernel. E.g. semaphore with IPCID or + process group with pid. + + In both cases, tasks shouldn't try exposing this ID to some + other task living in a different namespace via a shared filesystem + or IPC shmem/message. The fact is that this ID is only valid + within the namespace it was obtained in and may refer to some + other object in another namespace. + +2. Intentionally, two equal user IDs in different user namespaces + should not be equal from the VFS point of view. In other + words, user 10 in one user namespace shouldn't have the same + access permissions to files, belonging to user 10 in another + namespace. + + The same is true for the IPC namespaces being shared - two users + from different user namespaces should not access the same IPC objects + even having equal UIDs. + + But currently this is not so. + diff --git a/Documentation/networking/00-INDEX b/Documentation/networking/00-INDEX index 563e442f2d42..02e56d447a8f 100644 --- a/Documentation/networking/00-INDEX +++ b/Documentation/networking/00-INDEX @@ -24,6 +24,8 @@ baycom.txt - info on the driver for Baycom style amateur radio modems bridge.txt - where to get user space programs for ethernet bridging with Linux. +can.txt + - documentation on CAN protocol family. cops.txt - info on the COPS LocalTalk Linux driver cs89x0.txt @@ -82,8 +84,6 @@ policy-routing.txt - IP policy-based routing ray_cs.txt - Raylink Wireless LAN card driver info. -shaper.txt - - info on the module that can shape/limit transmitted traffic. 
sk98lin.txt - Marvell Yukon Chipset / SysKonnect SK-98xx compliant Gigabit Ethernet Adapter family driver info diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt index 11340625e363..a0cda062bc33 100644 --- a/Documentation/networking/bonding.txt +++ b/Documentation/networking/bonding.txt @@ -1,7 +1,7 @@ Linux Ethernet Bonding Driver HOWTO - Latest update: 24 April 2006 + Latest update: 12 November 2007 Initial release : Thomas Davis <tadavis at lbl.gov> Corrections, HA extensions : 2000/10/03-15 : @@ -166,12 +166,17 @@ to use ifenslave. 2. Bonding Driver Options ========================= - Options for the bonding driver are supplied as parameters to -the bonding module at load time. They may be given as command line -arguments to the insmod or modprobe command, but are usually specified -in either the /etc/modules.conf or /etc/modprobe.conf configuration -file, or in a distro-specific configuration file (some of which are -detailed in the next section). + Options for the bonding driver are supplied as parameters to the +bonding module at load time, or are specified via sysfs. + + Module options may be given as command line arguments to the +insmod or modprobe command, but are usually specified in either the +/etc/modules.conf or /etc/modprobe.conf configuration file, or in a +distro-specific configuration file (some of which are detailed in the next +section). + + Details on bonding support for sysfs are provided in the +"Configuring Bonding Manually via Sysfs" section, below. The available bonding driver parameters are listed below. If a parameter is not specified the default value is used. When initially @@ -554,6 +559,30 @@ xmit_hash_policy This algorithm is 802.3ad compliant. + layer2+3 + + This policy uses a combination of layer2 and layer3 + protocol information to generate the hash. + + Uses XOR of hardware MAC addresses and IP addresses to + generate the hash.
The formula is + + (((source IP XOR dest IP) AND 0xffff) XOR + ( source MAC XOR destination MAC )) + modulo slave count + + This algorithm will place all traffic to a particular + network peer on the same slave. For non-IP traffic, + the formula is the same as for the layer2 transmit + hash policy. + + This policy is intended to provide a more balanced + distribution of traffic than layer2 alone, especially + in environments where a layer3 gateway device is + required to reach most destinations. + + This algorithm is 802.3ad compliant. + layer3+4 This policy uses upper layer protocol information, @@ -589,8 +618,9 @@ xmit_hash_policy or may not tolerate this noncompliance. The default value is layer2. This option was added in bonding -version 2.6.3. In earlier versions of bonding, this parameter does -not exist, and the layer2 policy is the only policy. + version 2.6.3. In earlier versions of bonding, this parameter + does not exist, and the layer2 policy is the only policy. The + layer2+3 value was added for bonding version 3.2.2. 3. Configuring Bonding Devices @@ -787,11 +817,13 @@ the system /etc/modules.conf or /etc/modprobe.conf configuration file. 3.2 Configuration with Initscripts Support ------------------------------------------ - This section applies to distros using a version of initscripts -with bonding support, for example, Red Hat Linux 9 or Red Hat -Enterprise Linux version 3 or 4. On these systems, the network -initialization scripts have some knowledge of bonding, and can be -configured to control bonding devices. + This section applies to distros using a recent version of +initscripts with bonding support, for example, Red Hat Enterprise Linux +version 3 or later, Fedora, etc. On these systems, the network +initialization scripts have knowledge of bonding, and can be configured to +control bonding devices. Note that older versions of the initscripts +package have lower levels of support for bonding; this will be noted where +applicable.
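As a rough illustration of the layer2+3 xmit_hash_policy formula quoted earlier in this diff, the sketch below evaluates it in C. The function name l23_hash is hypothetical, and reducing each MAC address to its last octet before the XOR is an assumption of this example; the documentation text gives only the abstract formula.

```c
#include <stdint.h>

/*
 * Sketch of the layer2+3 formula:
 *   (((source IP XOR dest IP) AND 0xffff) XOR (source MAC XOR dest MAC))
 *       modulo slave count
 * Assumption of this example: each MAC address contributes only its
 * last octet. The caller must pass a nonzero slave_count.
 */
unsigned int l23_hash(uint32_t src_ip, uint32_t dst_ip,
                      const uint8_t src_mac[6],
                      const uint8_t dst_mac[6],
                      unsigned int slave_count)
{
    uint32_t ip_part = (src_ip ^ dst_ip) & 0xffff;
    uint32_t mac_part = (uint32_t)(src_mac[5] ^ dst_mac[5]);

    return (ip_part ^ mac_part) % slave_count;
}
```

Only addresses enter the hash, so all traffic between the same pair of hosts yields the same index and stays on one slave, which is the behaviour the policy description states.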
These distros will not automatically load the network adapter driver unless the ethX device is configured with an IP address. @@ -839,11 +871,31 @@ USERCTL=no Be sure to change the networking specific lines (IPADDR, NETMASK, NETWORK and BROADCAST) to match your network configuration. - Finally, it is necessary to edit /etc/modules.conf (or -/etc/modprobe.conf, depending upon your distro) to load the bonding -module with your desired options when the bond0 interface is brought -up. The following lines in /etc/modules.conf (or modprobe.conf) will -load the bonding module, and select its options: + For later versions of initscripts, such as that found with Fedora +7 and Red Hat Enterprise Linux version 5 (or later), it is possible, and, +indeed, preferable, to specify the bonding options in the ifcfg-bond0 +file, e.g. a line of the format: + +BONDING_OPTS="mode=active-backup arp_interval=60 arp_ip_target=+192.168.1.254" + + will configure the bond with the specified options. The options +specified in BONDING_OPTS are identical to the bonding module parameters +except for the arp_ip_target field. Each target should be included as a +separate option and should be preceded by a '+' to indicate it should be +added to the list of queried targets, e.g., + + arp_ip_target=+192.168.1.1 arp_ip_target=+192.168.1.2 + + is the proper syntax to specify multiple targets. When specifying +options via BONDING_OPTS, it is not necessary to edit /etc/modules.conf or +/etc/modprobe.conf. + + For older versions of initscripts that do not support +BONDING_OPTS, it is necessary to edit /etc/modules.conf (or +/etc/modprobe.conf, depending upon your distro) to load the bonding module +with your desired options when the bond0 interface is brought up. The +following lines in /etc/modules.conf (or modprobe.conf) will load the +bonding module, and select its options: alias bond0 bonding options bond0 mode=balance-alb miimon=100 @@ -858,9 +910,10 @@ up and running. 
3.2.1 Using DHCP with Initscripts --------------------------------- - Recent versions of initscripts (the version supplied with -Fedora Core 3 and Red Hat Enterprise Linux 4 is reported to work) do -have support for assigning IP information to bonding devices via DHCP. + Recent versions of initscripts (the versions supplied with Fedora +Core 3 and Red Hat Enterprise Linux 4, or later versions, are reported to +work) have support for assigning IP information to bonding devices via +DHCP. To configure bonding for DHCP, configure it as described above, except replace the line "BOOTPROTO=none" with "BOOTPROTO=dhcp" @@ -870,18 +923,14 @@ is case sensitive. 3.2.2 Configuring Multiple Bonds with Initscripts ------------------------------------------------- - At this writing, the initscripts package does not directly -support loading the bonding driver multiple times, so the process for -doing so is the same as described in the "Configuring Multiple Bonds -Manually" section, below. - - NOTE: It has been observed that some Red Hat supplied kernels -are apparently unable to rename modules at load time (the "-o bond1" -part). Attempts to pass that option to modprobe will produce an -"Operation not permitted" error. This has been reported on some -Fedora Core kernels, and has been seen on RHEL 4 as well. On kernels -exhibiting this problem, it will be impossible to configure multiple -bonds with differing parameters. + Initscripts packages that are included with Fedora 7 and Red Hat +Enterprise Linux 5 support multiple bonding interfaces by simply +specifying the appropriate BONDING_OPTS= in ifcfg-bondX where X is the +number of the bond. This support requires sysfs support in the kernel, +and a bonding driver of version 3.0.0 or later. Other configurations may +not support this method for specifying multiple bonding interfaces; for +those instances, see the "Configuring Multiple Bonds Manually" section, +below. 
3.3 Configuring Bonding Manually with Ifenslave ----------------------------------------------- @@ -952,15 +1001,58 @@ initialization scripts lack support for configuring multiple bonds. options, you may wish to use the "max_bonds" module parameter, documented above. - To create multiple bonding devices with differing options, it -is necessary to use bonding parameters exported by sysfs, documented -in the section below. + To create multiple bonding devices with differing options, it is +preferable to use bonding parameters exported by sysfs, documented in the +section below. + + For versions of bonding without sysfs support, the only means to +provide multiple instances of bonding with differing options is to load +the bonding driver multiple times. Note that current versions of the +sysconfig network initialization scripts handle this automatically; if +your distro uses these scripts, no special action is needed. See the +section Configuring Bonding Devices, above, if you're not sure about your +network initialization scripts. + + To load multiple instances of the module, it is necessary to +specify a different name for each instance (the module loading system +requires that every loaded module, even multiple instances of the same +module, have a unique name). This is accomplished by supplying multiple +sets of bonding options in /etc/modprobe.conf, for example: + +alias bond0 bonding +options bond0 -o bond0 mode=balance-rr miimon=100 + +alias bond1 bonding +options bond1 -o bond1 mode=balance-alb miimon=50 + + will load the bonding module two times. The first instance is +named "bond0" and creates the bond0 device in balance-rr mode with an +miimon of 100. The second instance is named "bond1" and creates the +bond1 device in balance-alb mode with an miimon of 50. + + In some circumstances (typically with older distributions), +the above does not work, and the second bonding instance never sees +its options.
In that case, the second options line can be substituted +as follows: + +install bond1 /sbin/modprobe --ignore-install bonding -o bond1 \ + mode=balance-alb miimon=50 + This may be repeated any number of times, specifying a new and +unique name in place of bond1 for each subsequent instance. + + It has been observed that some Red Hat supplied kernels are unable +to rename modules at load time (the "-o bond1" part). Attempts to pass +that option to modprobe will produce an "Operation not permitted" error. +This has been reported on some Fedora Core kernels, and has been seen on +RHEL 4 as well. On kernels exhibiting this problem, it will be impossible +to configure multiple bonds with differing parameters (as they are older +kernels, and also lack sysfs support). 3.4 Configuring Bonding Manually via Sysfs ------------------------------------------ - Starting with version 3.0, Channel Bonding may be configured + Starting with version 3.0.0, Channel Bonding may be configured via the sysfs interface. This interface allows dynamic configuration of all bonds in the system without unloading the module. It also allows for adding and removing bonds at runtime. Ifenslave is no @@ -1005,9 +1097,6 @@ To enslave interface eth0 to bond bond0: To free slave eth0 from bond bond0: # echo -eth0 > /sys/class/net/bond0/bonding/slaves - NOTE: The bond must be up before slaves can be added. All -slaves are freed when the interface is brought down. - When an interface is enslaved to a bond, symlinks between the two are created in the sysfs filesystem. In this case, you would get /sys/class/net/bond0/slave_eth0 pointing to /sys/class/net/eth0, and @@ -1597,6 +1686,15 @@ one for each switch in the network). This will insure that, regardless of which switch is active, the ARP monitor has a suitable target to query. + Note, also, that of late many switches now support a functionality +generally referred to as "trunk failover." 
This is a feature of the +switch that causes the link state of a particular switch port to be set +down (or up) when the state of another switch port goes down (or up). +Its purpose is to propagate link failures from logically "exterior" ports +to the logically "interior" ports that bonding is able to monitor via +miimon. Availability and configuration for trunk failover vary by +switch, but this can be a viable alternative to the ARP monitor when using +suitable switches. 12. Configuring Bonding for Maximum Throughput ============================================== @@ -1684,7 +1782,7 @@ balance-rr: This mode is the only mode that will permit a single interfaces. It is therefore the only mode that will allow a single TCP/IP stream to utilize more than one interface's worth of throughput. This comes at a cost, however: the - striping often results in peer systems receiving packets out + striping generally results in peer systems receiving packets out of order, causing TCP/IP's congestion control system to kick in, often by retransmitting segments. @@ -1696,22 +1794,20 @@ balance-rr: This mode is the only mode that will permit a single interface's worth of throughput, even after adjusting tcp_reordering. - Note that this out of order delivery occurs when both the - sending and receiving systems are utilizing a multiple - interface bond. Consider a configuration in which a - balance-rr bond feeds into a single higher capacity network - channel (e.g., multiple 100Mb/sec ethernets feeding a single - gigabit ethernet via an etherchannel capable switch). In this - configuration, traffic sent from the multiple 100Mb devices to - a destination connected to the gigabit device will not see - packets out of order. However, traffic sent from the gigabit - device to the multiple 100Mb devices may or may not see - traffic out of order, depending upon the balance policy of the - switch.
Many switches do not support any modes that stripe - traffic (instead choosing a port based upon IP or MAC level - addresses); for those devices, traffic flowing from the - gigabit device to the many 100Mb devices will only utilize one - interface. + Note that the fraction of packets that will be delivered out of + order is highly variable, and is unlikely to be zero. The level + of reordering depends upon a variety of factors, including the + networking interfaces, the switch, and the topology of the + configuration. Speaking in general terms, higher speed network + cards produce more reordering (due to factors such as packet + coalescing), and a "many to many" topology will reorder at a + higher rate than a "many slow to one fast" configuration. + + Many switches do not support any modes that stripe traffic + (instead choosing a port based upon IP or MAC level addresses); + for those devices, traffic for a particular connection flowing + through the switch to a balance-rr bond will not utilize greater + than one interface's worth of bandwidth. If you are utilizing protocols other than TCP/IP, UDP for example, and your application can tolerate out of order @@ -1911,6 +2007,10 @@ Failover may be delayed via the downdelay bonding module option. 13.2 Duplicated Incoming Packets -------------------------------- + NOTE: Starting with version 3.0.2, the bonding driver has logic to +suppress duplicate packets, which should largely eliminate this problem. +The following description is kept for reference. + It is not uncommon to observe a short burst of duplicated traffic when the bonding device is first used, or after it has been idle for some period of time. This is most easily observed by issuing @@ -2071,6 +2171,9 @@ The new driver was designed to be SMP safe from the start. EtherExpress PRO/100 and a 3com 3c905b, for example). For most modes, devices need not be of the same speed. 
+ Starting with version 3.2.1, bonding also supports Infiniband +slaves in active-backup mode. + 3. How many bonding devices can I have? There is no limit. @@ -2129,11 +2232,15 @@ switches currently available support 802.3ad. 8. Where does a bonding device get its MAC address from? - If not explicitly configured (with ifconfig or ip link), the -MAC address of the bonding device is taken from its first slave -device. This MAC address is then passed to all following slaves and -remains persistent (even if the first slave is removed) until the -bonding device is brought down or reconfigured. + When using slave devices that have fixed MAC addresses, or when +the fail_over_mac option is enabled, the bonding device's MAC address is +the MAC address of the active slave. + + For other configurations, if not explicitly configured (with +ifconfig or ip link), the MAC address of the bonding device is taken from +its first slave device. This MAC address is then passed to all following +slaves and remains persistent (even if the first slave is removed) until +the bonding device is brought down or reconfigured. 
If you wish to change the MAC address, you can set it with ifconfig or ip link: diff --git a/Documentation/networking/can.txt b/Documentation/networking/can.txt new file mode 100644 index 000000000000..f1b2de170929 --- /dev/null +++ b/Documentation/networking/can.txt @@ -0,0 +1,629 @@ +============================================================================ + +can.txt + +Readme file for the Controller Area Network Protocol Family (aka Socket CAN) + +This file contains + + 1 Overview / What is Socket CAN + + 2 Motivation / Why using the socket API + + 3 Socket CAN concept + 3.1 receive lists + 3.2 local loopback of sent frames + 3.3 network security issues (capabilities) + 3.4 network problem notifications + + 4 How to use Socket CAN + 4.1 RAW protocol sockets with can_filters (SOCK_RAW) + 4.1.1 RAW socket option CAN_RAW_FILTER + 4.1.2 RAW socket option CAN_RAW_ERR_FILTER + 4.1.3 RAW socket option CAN_RAW_LOOPBACK + 4.1.4 RAW socket option CAN_RAW_RECV_OWN_MSGS + 4.2 Broadcast Manager protocol sockets (SOCK_DGRAM) + 4.3 connected transport protocols (SOCK_SEQPACKET) + 4.4 unconnected transport protocols (SOCK_DGRAM) + + 5 Socket CAN core module + 5.1 can.ko module params + 5.2 procfs content + 5.3 writing own CAN protocol modules + + 6 CAN network drivers + 6.1 general settings + 6.2 local loopback of sent frames + 6.3 CAN controller hardware filters + 6.4 currently supported CAN hardware + 6.5 todo + + 7 Credits + +============================================================================ + +1. Overview / What is Socket CAN +-------------------------------- + +The socketcan package is an implementation of CAN protocols +(Controller Area Network) for Linux. CAN is a networking technology +which has widespread use in automation, embedded devices, and +automotive fields. 
While there have been other CAN implementations +for Linux based on character devices, Socket CAN uses the Berkeley +socket API, the Linux network stack and implements the CAN device +drivers as network interfaces. The CAN socket API has been designed +as similar as possible to the TCP/IP protocols to allow programmers, +familiar with network programming, to easily learn how to use CAN +sockets. + +2. Motivation / Why using the socket API +---------------------------------------- + +There have been CAN implementations for Linux before Socket CAN so the +question arises, why we have started another project. Most existing +implementations come as a device driver for some CAN hardware, they +are based on character devices and provide comparatively little +functionality. Usually, there is only a hardware-specific device +driver which provides a character device interface to send and +receive raw CAN frames, directly to/from the controller hardware. +Queueing of frames and higher-level transport protocols like ISO-TP +have to be implemented in user space applications. Also, most +character-device implementations support only one single process to +open the device at a time, similar to a serial interface. Exchanging +the CAN controller requires employment of another device driver and +often the need for adaption of large parts of the application to the +new driver's API. + +Socket CAN was designed to overcome all of these limitations. A new +protocol family has been implemented which provides a socket interface +to user space applications and which builds upon the Linux network +layer, so to use all of the provided queueing functionality. A device +driver for CAN controller hardware registers itself with the Linux +network layer as a network device, so that CAN frames from the +controller can be passed up to the network layer and on to the CAN +protocol family module and also vice-versa. 
Also, the protocol family +module provides an API for transport protocol modules to register, so +that any number of transport protocols can be loaded or unloaded +dynamically. In fact, the can core module alone does not provide any +protocol and cannot be used without loading at least one additional +protocol module. Multiple sockets can be opened at the same time, +on different or the same protocol module and they can listen/send +frames on different or the same CAN IDs. Several sockets listening on +the same interface for frames with the same CAN ID are all passed the +same received matching CAN frames. An application wishing to +communicate using a specific transport protocol, e.g. ISO-TP, just +selects that protocol when opening the socket, and then can read and +write application data byte streams, without having to deal with +CAN-IDs, frames, etc. + +Similar functionality visible from user-space could be provided by a +character device, too, but this would lead to a technically inelegant +solution for a couple of reasons: + +* Intricate usage. Instead of passing a protocol argument to + socket(2) and using bind(2) to select a CAN interface and CAN ID, an + application would have to do all these operations using ioctl(2)s. + +* Code duplication. A character device cannot make use of the Linux + network queueing code, so all that code would have to be duplicated + for CAN networking. + +* Abstraction. In most existing character-device implementations, the + hardware-specific device driver for a CAN controller directly + provides the character device for the application to work with. + This is at least very unusual in Unix systems for both, char and + block devices. For example you don't have a character device for a + certain UART of a serial interface, a certain sound chip in your + computer, a SCSI or IDE controller providing access to your hard + disk or tape streamer device. 
Instead, you have abstraction layers + which provide a unified character or block device interface to the + application on the one hand, and an interface for hardware-specific + device drivers on the other hand. These abstractions are provided + by subsystems like the tty layer, the audio subsystem or the SCSI + and IDE subsystems for the devices mentioned above. + + The easiest way to implement a CAN device driver is as a character + device without such a (complete) abstraction layer, as is done by most + existing drivers. The right way, however, would be to add such a + layer with all the functionality like registering for certain CAN + IDs, supporting several open file descriptors and (de)multiplexing + CAN frames between them, (sophisticated) queueing of CAN frames, and + providing an API for device drivers to register with. However, then + it would be no more difficult, or maybe even easier, to use the + networking framework provided by the Linux kernel, and this is what + Socket CAN does. + + The use of the networking framework of the Linux kernel is just the + natural and most appropriate way to implement CAN for Linux. + +3. Socket CAN concept +--------------------- + + As described in chapter 2 it is the main goal of Socket CAN to + provide a socket interface to user space applications which builds + upon the Linux network layer. In contrast to the commonly known + TCP/IP and ethernet networking, the CAN bus is a broadcast-only(!) + medium that has no MAC-layer addressing like ethernet. The CAN-identifier + (can_id) is used for arbitration on the CAN-bus. Therefore the CAN-IDs + have to be chosen uniquely on the bus. When designing a CAN-ECU + network the CAN-IDs are mapped to be sent by a specific ECU. + For this reason a CAN-ID can be treated best as a kind of source address.
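Since every frame is broadcast and is identified only by its CAN-ID, a receiver decides what to keep by comparing identifiers. A minimal sketch of mask-based identifier matching follows; struct id_filter and id_matches() are hypothetical names chosen for this illustration and are not taken from the Socket CAN headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical filter: accept a frame when its identifier agrees with
 * `id` on every bit that is set in `mask`. */
struct id_filter {
    uint32_t id;
    uint32_t mask;
};

/* Returns true when received_id passes the filter. */
bool id_matches(const struct id_filter *f, uint32_t received_id)
{
    return (received_id & f->mask) == (f->id & f->mask);
}
```

With a full 11-bit mask (0x7FF) the filter accepts exactly one identifier; clearing low mask bits widens it to a block of identifiers, for example 0x120 with mask 0x7F0 accepts 0x120 through 0x12F.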
+ + 3.1 receive lists + + The network transparent access of multiple applications leads to the + problem that different applications may be interested in the same + CAN-IDs from the same CAN network interface. The Socket CAN core + module - which implements the protocol family CAN - provides several + highly efficient receive lists for this reason. If e.g. a user space + application opens a CAN RAW socket, the raw protocol module itself + requests the (range of) CAN-IDs from the Socket CAN core that are + requested by the user. The subscription and unsubscription of + CAN-IDs can be done for specific CAN interfaces or for all(!) known + CAN interfaces with the can_rx_(un)register() functions provided to + CAN protocol modules by the SocketCAN core (see chapter 5). + To optimize the CPU usage at runtime the receive lists are split up + into several specific lists per device that match the requested + filter complexity for a given use-case. + + 3.2 local loopback of sent frames + + As known from other networking concepts the data exchanging + applications may run on the same or different nodes without any + change (except for the according addressing information): + + ___ ___ ___ _______ ___ + | _ | | _ | | _ | | _ _ | | _ | + ||A|| ||B|| ||C|| ||A| |B|| ||C|| + |___| |___| |___| |_______| |___| + | | | | | + -----------------(1)- CAN bus -(2)--------------- + + To ensure that application A receives the same information in the + example (2) as it would receive in example (1) there is need for + some kind of local loopback of the sent CAN frames on the appropriate + node. + + The Linux network devices (by default) just can handle the + transmission and reception of media dependent frames. Due to the + arbitration on the CAN bus the transmission of a low prio CAN-ID + may be delayed by the reception of a high prio CAN frame. To + reflect the correct* traffic on the node the loopback of the sent + data has to be performed right after a successful transmission.
If + the CAN network interface is not capable of performing the loopback for + some reason the SocketCAN core can do this task as a fallback solution. + See chapter 6.2 for details (recommended). + + The loopback functionality is enabled by default to reflect standard + networking behaviour for CAN applications. Due to some requests from + the RT-SocketCAN group the loopback optionally may be disabled for each + separate socket. See sockopts from the CAN RAW sockets in chapter 4.1. + + * = you really want to have this when you're running analyser tools + like 'candump' or 'cansniffer' on the (same) node. + + 3.3 network security issues (capabilities) + + The Controller Area Network is a local field bus transmitting only + broadcast messages without any routing and security concepts. + In the majority of cases the user application has to deal with + raw CAN frames. Therefore it might be reasonable NOT to restrict + the CAN access only to the user root, as known from other networks. + Since the currently implemented CAN_RAW and CAN_BCM sockets can only + send and receive frames to/from CAN interfaces it does not affect + security of other networks to allow all users to access the CAN. + To enable non-root users to access CAN_RAW and CAN_BCM protocol + sockets the Kconfig options CAN_RAW_USER and/or CAN_BCM_USER may be + selected at kernel compile time. + + 3.4 network problem notifications + + The use of the CAN bus may lead to several problems on the physical + and media access control layer. Detecting and logging of these lower + layer problems is a vital requirement for CAN users to identify + hardware issues on the physical transceiver layer as well as + arbitration problems and error frames caused by the different + ECUs. The occurrence of detected errors is important for diagnosis + and has to be logged together with the exact timestamp.
For this + reason the CAN interface driver can generate so called Error Frames + that can optionally be passed to the user application in the same + way as other CAN frames. Whenever an error on the physical layer + or the MAC layer is detected (e.g. by the CAN controller) the driver + creates an appropriate error frame. Error frames can be requested by + the user application using the common CAN filter mechanisms. Inside + this filter definition the (interested) type of errors may be + selected. The reception of error frames is disabled by default. + +4. How to use Socket CAN +------------------------ + + Like TCP/IP, you first need to open a socket for communicating over a + CAN network. Since Socket CAN implements a new protocol family, you + need to pass PF_CAN as the first argument to the socket(2) system + call. Currently, there are two CAN protocols to choose from, the raw + socket protocol and the broadcast manager (BCM). So to open a socket, + you would write + + s = socket(PF_CAN, SOCK_RAW, CAN_RAW); + + and + + s = socket(PF_CAN, SOCK_DGRAM, CAN_BCM); + + respectively. After the successful creation of the socket, you would + normally use the bind(2) system call to bind the socket to a CAN + interface (which is different from TCP/IP due to different addressing + - see chapter 3). After binding (CAN_RAW) or connecting (CAN_BCM) + the socket, you can read(2) and write(2) from/to the socket or use + send(2), sendto(2), sendmsg(2) and the recv* counterpart operations + on the socket as usual. There are also CAN specific socket options + described below. + + The basic CAN frame structure and the sockaddr structure are defined + in include/linux/can.h: + + struct can_frame { + canid_t can_id; /* 32 bit CAN_ID + EFF/RTR/ERR flags */ + __u8 can_dlc; /* data length code: 0 .. 
8 */ + __u8 data[8] __attribute__((aligned(8))); + }; + + The alignment of the (linear) payload data[] to a 64bit boundary + allows the user to define their own structs and unions to easily access the + CAN payload. There is no given byteorder on the CAN bus by + default. A read(2) system call on a CAN_RAW socket transfers a + struct can_frame to the user space. + + The sockaddr_can structure has an interface index like the + PF_PACKET socket, that also binds to a specific interface: + + struct sockaddr_can { + sa_family_t can_family; + int can_ifindex; + union { + struct { canid_t rx_id, tx_id; } tp16; + struct { canid_t rx_id, tx_id; } tp20; + struct { canid_t rx_id, tx_id; } mcnet; + struct { canid_t rx_id, tx_id; } isotp; + } can_addr; + }; + + To determine the interface index an appropriate ioctl() has to + be used (example for CAN_RAW sockets without error checking): + + int s; + struct sockaddr_can addr; + struct ifreq ifr; + + s = socket(PF_CAN, SOCK_RAW, CAN_RAW); + + strcpy(ifr.ifr_name, "can0"); + ioctl(s, SIOCGIFINDEX, &ifr); + + addr.can_family = AF_CAN; + addr.can_ifindex = ifr.ifr_ifindex; + + bind(s, (struct sockaddr *)&addr, sizeof(addr)); + + (..) + + To bind a socket to all(!) CAN interfaces the interface index must + be 0 (zero). In this case the socket receives CAN frames from every + enabled CAN interface. To determine the originating CAN interface + the system call recvfrom(2) may be used instead of read(2). To send + on a socket that is bound to 'any' interface sendto(2) is needed to + specify the outgoing interface. + + Reading CAN frames from a bound CAN_RAW socket (see above) consists + of reading a struct can_frame: + + struct can_frame frame; + + nbytes = read(s, &frame, sizeof(struct can_frame)); + + if (nbytes < 0) { + perror("can raw socket read"); + return 1; + } + + /* paranoid check ...
*/ + if (nbytes < sizeof(struct can_frame)) { + fprintf(stderr, "read: incomplete CAN frame\n"); + return 1; + } + + /* do something with the received CAN frame */ + + Writing CAN frames can be done similarly, with the write(2) system call: + + nbytes = write(s, &frame, sizeof(struct can_frame)); + + When the socket is bound to 'any' existing CAN interface + (addr.can_ifindex = 0) it is recommended to use recvfrom(2) if the + information about the originating CAN interface is needed: + + struct sockaddr_can addr; + struct ifreq ifr; + socklen_t len = sizeof(addr); + struct can_frame frame; + + nbytes = recvfrom(s, &frame, sizeof(struct can_frame), + 0, (struct sockaddr*)&addr, &len); + + /* get interface name of the received CAN frame */ + ifr.ifr_ifindex = addr.can_ifindex; + ioctl(s, SIOCGIFNAME, &ifr); + printf("Received a CAN frame from interface %s", ifr.ifr_name); + + To write CAN frames on sockets bound to 'any' CAN interface the + outgoing interface has to be specified explicitly: + + strcpy(ifr.ifr_name, "can0"); + ioctl(s, SIOCGIFINDEX, &ifr); + addr.can_ifindex = ifr.ifr_ifindex; + addr.can_family = AF_CAN; + + nbytes = sendto(s, &frame, sizeof(struct can_frame), + 0, (struct sockaddr*)&addr, sizeof(addr)); + + 4.1 RAW protocol sockets with can_filters (SOCK_RAW) + + Using CAN_RAW sockets is largely comparable to the commonly + known access to CAN character devices. To meet the new possibilities + provided by the multi user SocketCAN approach, some reasonable + defaults are set at RAW socket binding time: + + - The filters are set to exactly one filter receiving everything + - The socket only receives valid data frames (=> no error frames) + - The loopback of sent CAN frames is enabled (see chapter 3.2) + - The socket does not receive its own sent frames (in loopback mode) + + These default settings may be changed before or after binding the socket.
+ To use the referenced definitions of the socket options for CAN_RAW + sockets, include <linux/can/raw.h>. + + 4.1.1 RAW socket option CAN_RAW_FILTER + + The reception of CAN frames using CAN_RAW sockets can be controlled + by defining 0 .. n filters with the CAN_RAW_FILTER socket option. + + The CAN filter structure is defined in include/linux/can.h: + + struct can_filter { + canid_t can_id; + canid_t can_mask; + }; + + A filter matches when + + <received_can_id> & mask == can_id & mask + + which is analogous to the hardware filter semantics of known CAN + controllers. The filter can be inverted in this semantic when the CAN_INV_FILTER + bit is set in the can_id element of the can_filter structure. In + contrast to CAN controller hardware filters the user may set 0 .. n + receive filters for each open socket separately: + + struct can_filter rfilter[2]; + + rfilter[0].can_id = 0x123; + rfilter[0].can_mask = CAN_SFF_MASK; + rfilter[1].can_id = 0x200; + rfilter[1].can_mask = 0x700; + + setsockopt(s, SOL_CAN_RAW, CAN_RAW_FILTER, &rfilter, sizeof(rfilter)); + + To disable the reception of CAN frames on the selected CAN_RAW socket: + + setsockopt(s, SOL_CAN_RAW, CAN_RAW_FILTER, NULL, 0); + + Setting the filters to zero filters is largely unnecessary, as data that is + not read causes the raw socket to discard the received CAN frames. But + for this 'send only' use-case the receive list in the + kernel may be removed to save a little (really a very little!) CPU usage. + + 4.1.2 RAW socket option CAN_RAW_ERR_FILTER + + As described in chapter 3.4 the CAN interface driver can generate so + called Error Frames that can optionally be passed to the user + application in the same way as other CAN frames. The possible + errors are divided into different error classes that may be filtered + using the appropriate error mask. To register for every possible + error condition CAN_ERR_MASK can be used as value for the error mask. + The values for the error mask are defined in linux/can/error.h .
+ + can_err_mask_t err_mask = ( CAN_ERR_TX_TIMEOUT | CAN_ERR_BUSOFF ); + + setsockopt(s, SOL_CAN_RAW, CAN_RAW_ERR_FILTER, + &err_mask, sizeof(err_mask)); + + 4.1.3 RAW socket option CAN_RAW_LOOPBACK + + To meet multi user needs the local loopback is enabled by default + (see chapter 3.2 for details). But in some embedded use-cases + (e.g. when only one application uses the CAN bus) this loopback + functionality can be disabled (separately for each socket): + + int loopback = 0; /* 0 = disabled, 1 = enabled (default) */ + + setsockopt(s, SOL_CAN_RAW, CAN_RAW_LOOPBACK, &loopback, sizeof(loopback)); + + 4.1.4 RAW socket option CAN_RAW_RECV_OWN_MSGS + + When the local loopback is enabled, all the sent CAN frames are + looped back to the open CAN sockets that registered for the CAN + frames' CAN-ID on this given interface to meet the multi user + needs. The reception of the CAN frames on the same socket that was + sending the CAN frame is assumed to be unwanted and therefore + disabled by default. This default behaviour may be changed on + demand: + + int recv_own_msgs = 1; /* 0 = disabled (default), 1 = enabled */ + + setsockopt(s, SOL_CAN_RAW, CAN_RAW_RECV_OWN_MSGS, + &recv_own_msgs, sizeof(recv_own_msgs)); + + 4.2 Broadcast Manager protocol sockets (SOCK_DGRAM) + 4.3 connected transport protocols (SOCK_SEQPACKET) + 4.4 unconnected transport protocols (SOCK_DGRAM) + + +5. Socket CAN core module +------------------------- + + The Socket CAN core module implements the protocol family + PF_CAN. CAN protocol modules are loaded by the core module at + runtime. The core module provides an interface for CAN protocol + modules to subscribe to needed CAN IDs (see chapter 3.1). + + 5.1 can.ko module params + + - stats_timer: To calculate the Socket CAN core statistics + (e.g. current/maximum frames per second) this 1 second timer is + invoked at can.ko module start time by default. This timer can be + disabled by using stats_timer=0 on the module commandline.
+ + - debug: (removed since SocketCAN SVN r546) + + 5.2 procfs content + + As described in chapter 3.1 the Socket CAN core uses several filter + lists to deliver received CAN frames to CAN protocol modules. These + receive lists, their filters and the count of filter matches can be + checked in the appropriate receive list. All entries contain the + device and a protocol module identifier: + + foo@bar:~$ cat /proc/net/can/rcvlist_all + + receive list 'rx_all': + (vcan3: no entry) + (vcan2: no entry) + (vcan1: no entry) + device can_id can_mask function userdata matches ident + vcan0 000 00000000 f88e6370 f6c6f400 0 raw + (any: no entry) + + In this example an application requests any CAN traffic from vcan0. + + rcvlist_all - list for unfiltered entries (no filter operations) + rcvlist_eff - list for single extended frame (EFF) entries + rcvlist_err - list for error frames masks + rcvlist_fil - list for mask/value filters + rcvlist_inv - list for mask/value filters (inverse semantic) + rcvlist_sff - list for single standard frame (SFF) entries + + Additional procfs files in /proc/net/can + + stats - Socket CAN core statistics (rx/tx frames, match ratios, ...) + reset_stats - manual statistic reset + version - prints the Socket CAN core version and the ABI version + + 5.3 writing own CAN protocol modules + + To implement a new protocol in the protocol family PF_CAN a new + protocol has to be defined in include/linux/can.h . + The prototypes and definitions to use the Socket CAN core can be + accessed by including include/linux/can/core.h . 
+ In addition to functions that register the CAN protocol and the + CAN device notifier chain there are functions to subscribe CAN + frames received by CAN interfaces and to send CAN frames: + + can_rx_register - subscribe CAN frames from a specific interface + can_rx_unregister - unsubscribe CAN frames from a specific interface + can_send - transmit a CAN frame (optional with local loopback) + + For details see the kerneldoc documentation in net/can/af_can.c or + the source code of net/can/raw.c or net/can/bcm.c . + +6. CAN network drivers +---------------------- + + Writing a CAN network device driver is much easier than writing a + CAN character device driver. Similar to other known network device + drivers you mainly have to deal with: + + - TX: Put the CAN frame from the socket buffer to the CAN controller. + - RX: Put the CAN frame from the CAN controller to the socket buffer. + + See e.g. at Documentation/networking/netdevices.txt . The differences + for writing CAN network device driver are described below: + + 6.1 general settings + + dev->type = ARPHRD_CAN; /* the netdevice hardware type */ + dev->flags = IFF_NOARP; /* CAN has no arp */ + + dev->mtu = sizeof(struct can_frame); + + The struct can_frame is the payload of each socket buffer in the + protocol family PF_CAN. + + 6.2 local loopback of sent frames + + As described in chapter 3.2 the CAN network device driver should + support a local loopback functionality similar to the local echo + e.g. of tty devices. In this case the driver flag IFF_ECHO has to be + set to prevent the PF_CAN core from locally echoing sent frames + (aka loopback) as fallback solution: + + dev->flags = (IFF_NOARP | IFF_ECHO); + + 6.3 CAN controller hardware filters + + To reduce the interrupt load on deep embedded systems some CAN + controllers support the filtering of CAN IDs or ranges of CAN IDs. 
+ These hardware filter capabilities vary from controller to + controller and have to be identified as not feasible in a multi-user + networking approach. The use of the very controller specific + hardware filters could make sense in a very dedicated use-case, as a + filter on driver level would affect all users in the multi-user + system. The highly efficient filter sets inside the PF_CAN core allow + setting multiple different filters for each socket separately. + Therefore the use of hardware filters goes to the category 'handmade + tuning on deep embedded systems'. The author is running an MPC603e + @133MHz with four SJA1000 CAN controllers from 2002 under heavy bus + load without any problems ... + + 6.4 currently supported CAN hardware (September 2007) + + On the project website http://developer.berlios.de/projects/socketcan + there are different drivers available: + + vcan: Virtual CAN interface driver (if no real hardware is available) + sja1000: Philips SJA1000 CAN controller (recommended) + i82527: Intel i82527 CAN controller + mscan: Motorola/Freescale CAN controller (e.g. inside SOC MPC5200) + ccan: CCAN controller core (e.g. inside SOC h7202) + slcan: For a bunch of CAN adaptors that are attached via a + serial line ASCII protocol (for serial / USB adaptors) + + Additionally the different CAN adaptors (ISA/PCI/PCMCIA/USB/Parport) + from PEAK Systemtechnik support the CAN netdevice driver model + since Linux driver v6.0: http://www.peak-system.com/linux/index.htm + + Please check the Mailing Lists on the berlios OSS project website. + + 6.5 todo (September 2007) + + The configuration interface for CAN network drivers is still an open + issue that has not been finalized in the socketcan project. Also the + idea of having a library module (candev.ko) that holds functions + that are needed by all CAN netdevices is not ready to ship. + Your contribution is welcome. + +7.
Credits +---------- + + Oliver Hartkopp (PF_CAN core, filters, drivers, bcm) + Urs Thuermann (PF_CAN core, kernel integration, socket interfaces, raw, vcan) + Jan Kiszka (RT-SocketCAN core, Socket-API reconciliation) + Wolfgang Grandegger (RT-SocketCAN core & drivers, Raw Socket-API reviews) + Robert Schwebel (design reviews, PTXdist integration) + Marc Kleine-Budde (design reviews, Kernel 2.6 cleanups, drivers) + Benedikt Spranger (reviews) + Thomas Gleixner (LKML reviews, coding style, posting hints) + Andrey Volkov (kernel subtree structure, ioctls, mscan driver) + Matthias Brukner (first SJA1000 CAN netdevice implementation Q2/2003) + Klaus Hitschler (PEAK driver integration) + Uwe Koppe (CAN netdevices with PF_PACKET approach) + Michael Schulze (driver layer loopback requirement, RT CAN drivers review) diff --git a/Documentation/networking/dccp.txt b/Documentation/networking/dccp.txt index afb66f9a8aff..39131a3c78f8 100644 --- a/Documentation/networking/dccp.txt +++ b/Documentation/networking/dccp.txt @@ -14,24 +14,35 @@ Introduction ============ Datagram Congestion Control Protocol (DCCP) is an unreliable, connection -based protocol designed to solve issues present in UDP and TCP particularly -for real time and multimedia traffic. +oriented protocol designed to solve issues present in UDP and TCP, particularly +for real-time and multimedia (streaming) traffic. +It divides into a base protocol (RFC 4340) and pluggable congestion control +modules called CCIDs. Like pluggable TCP congestion control, at least one CCID +needs to be enabled in order for the protocol to function properly. In the Linux +implementation, this is the TCP-like CCID2 (RFC 4341). Additional CCIDs, such as +the TCP-friendly CCID3 (RFC 4342), are optional. +For a brief introduction to CCIDs and suggestions for choosing a CCID to match +given applications, see section 10 of RFC 4340. It has a base protocol and pluggable congestion control IDs (CCIDs).
-It is at proposed standard RFC status and the homepage for DCCP as a protocol -is at: - http://www.read.cs.ucla.edu/dccp/ +DCCP is a Proposed Standard (RFC 2026), and the homepage for DCCP as a protocol +is at http://www.ietf.org/html.charters/dccp-charter.html Missing features ================ -The DCCP implementation does not currently have all the features that are in -the RFC. +The Linux DCCP implementation does not currently support all the features that are +specified in RFCs 4340...42. The known bugs are at: http://linux-net.osdl.org/index.php/TODO#DCCP +For more up-to-date versions of the DCCP implementation, please consider using +the experimental DCCP test tree; instructions for checking this out are on: +http://linux-net.osdl.org/index.php/DCCP_Testing#Experimental_DCCP_source_tree + + Socket options ============== @@ -46,6 +57,12 @@ can be set before calling bind(). DCCP_SOCKOPT_GET_CUR_MPS is read-only and retrieves the current maximum packet size (application payload size) in bytes, see RFC 4340, section 14. +DCCP_SOCKOPT_SERVER_TIMEWAIT enables the server (listening socket) to hold +timewait state when closing the connection (RFC 4340, 8.3). The usual case is +that the closing server sends a CloseReq, whereupon the client holds timewait +state. When this boolean socket option is on, the server sends a Close instead +and will enter TIMEWAIT. This option must be set after accept() returns. + DCCP_SOCKOPT_SEND_CSCOV and DCCP_SOCKOPT_RECV_CSCOV are used for setting the partial checksum coverage (RFC 4340, sec. 9.2). The default is that checksums always cover the entire packet and that only fully covered application data is @@ -72,6 +89,8 @@ DCCP_SOCKOPT_CCID_TX_INFO Returns a `struct tfrc_tx_info' in optval; the buffer for optval and optlen must be set to at least sizeof(struct tfrc_tx_info). +On unidirectional connections it is useful to close the unused half-connection +via shutdown (SHUT_WR or SHUT_RD): this will reduce per-packet processing costs. 
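The half-close recommended for unidirectional connections could be sketched as follows. This is only an illustration under stated assumptions: the helper name close_unused_half() and its is_sender flag are hypothetical and not part of any DCCP API, and fd is assumed to be an already connected DCCP socket. Only shutdown(2) with SHUT_RD and SHUT_WR comes from the text above.

```c
#include <assert.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not part of any DCCP API): shut down the
 * direction of a connected socket that the application does not use.
 * A pure sender never reads application data, so it closes the
 * receiving half (SHUT_RD); a pure receiver never writes, so it
 * closes the sending half (SHUT_WR).
 * Returns the result of shutdown(2): 0 on success, -1 with errno set.
 */
int close_unused_half(int fd, int is_sender)
{
	assert(fd >= 0);
	return shutdown(fd, is_sender ? SHUT_RD : SHUT_WR);
}
```

A streaming server that only sends would call close_unused_half(fd, 1) right after accept(), and the receiving client would call close_unused_half(fd, 0) after connect().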
Sysctl variables ================ @@ -123,6 +142,12 @@ sync_ratelimit = 125 ms sequence-invalid packets on the same socket (RFC 4340, 7.5.4). The unit of this parameter is milliseconds; a value of 0 disables rate-limiting. +IOCTLS +====== +FIONREAD + Works as in udp(7): returns in the `int' argument pointer the size of + the next pending datagram in bytes, or 0 when no datagram is pending. + Notes ===== diff --git a/Documentation/networking/decnet.txt b/Documentation/networking/decnet.txt index badb7480ea62..d8968958d839 100644 --- a/Documentation/networking/decnet.txt +++ b/Documentation/networking/decnet.txt @@ -60,7 +60,7 @@ operation of the local communications in any other way though. The kernel command line takes options looking like the following: - decnet=1,2 + decnet.addr=1,2 the two numbers are the node address 1,2 = 1.2 For 2.2.xx kernels and early 2.3.xx kernels, you must use a comma when specifying the diff --git a/Documentation/networking/driver.txt b/Documentation/networking/driver.txt index 4f7da5a2bf4f..ea72d2e66ca8 100644 --- a/Documentation/networking/driver.txt +++ b/Documentation/networking/driver.txt @@ -61,7 +61,10 @@ Transmit path guidelines: 2) Do not forget to update netdev->trans_start to jiffies after each new tx packet is given to the hardware. -3) Do not forget that once you return 0 from your hard_start_xmit +3) A hard_start_xmit method must not modify the shared parts of a + cloned SKB. + +4) Do not forget that once you return 0 from your hard_start_xmit method, it is your driver's responsibility to free up the SKB and in some finite amount of time. diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt index 6f7872ba1def..17a6e46fbd43 100644 --- a/Documentation/networking/ip-sysctl.txt +++ b/Documentation/networking/ip-sysctl.txt @@ -446,6 +446,33 @@ tcp_dma_copybreak - INTEGER and CONFIG_NET_DMA is enabled. 
Default: 4096 +UDP variables: + +udp_mem - vector of 3 INTEGERs: min, pressure, max + Number of pages allowed for queueing by all UDP sockets. + + min: Below this number of pages UDP is not bothered about its + memory appetite. When amount of memory allocated by UDP exceeds + this number, UDP starts to moderate memory usage. + + pressure: This value was introduced to follow format of tcp_mem. + + max: Number of pages allowed for queueing by all UDP sockets. + + Default is calculated at boot time from amount of available memory. + +udp_rmem_min - INTEGER + Minimal size of receive buffer used by UDP sockets in moderation. + Each UDP socket is able to use the size for receiving data, even if + total pages of UDP sockets exceed udp_mem pressure. The unit is byte. + Default: 4096 + +udp_wmem_min - INTEGER + Minimal size of send buffer used by UDP sockets in moderation. + Each UDP socket is able to use the size for sending data, even if + total pages of UDP sockets exceed udp_mem pressure. The unit is byte. + Default: 4096 + CIPSOv4 Variables: cipso_cache_enable - BOOLEAN diff --git a/Documentation/networking/shaper.txt b/Documentation/networking/shaper.txt deleted file mode 100644 index 6c4ebb66a906..000000000000 --- a/Documentation/networking/shaper.txt +++ /dev/null @@ -1,48 +0,0 @@ -Traffic Shaper For Linux - -This is the current BETA release of the traffic shaper for Linux. It works -within the following limits: - -o Minimum shaping speed is currently about 9600 baud (it can only -shape down to 1 byte per clock tick) - -o Maximum is about 256K, it will go above this but get a bit blocky. - -o If you ifconfig the master device that a shaper is attached to down -then your machine will follow. - -o The shaper must be a module. - - -Setup: - - A shaper device is configured using the shapeconfig program. 
-Typically you will do something like this - -shapecfg attach shaper0 eth1 -shapecfg speed shaper0 64000 -ifconfig shaper0 myhost netmask 255.255.255.240 broadcast 1.2.3.4.255 up -route add -net some.network netmask a.b.c.d dev shaper0 - -The shaper should have the same IP address as the device it is attached to -for normal use. - -Gotchas: - - The shaper shapes transmitted traffic. It's rather impossible to -shape received traffic except at the end (or a router) transmitting it. - - Gated/routed/rwhod/mrouted all see the shaper as an additional device -and will treat it as such unless patched. Note that for mrouted you can run -mrouted tunnels via a traffic shaper to control bandwidth usage. - - The shaper is device/route based. This makes it very easy to use -with any setup BUT less flexible. You may need to use iproute2 to set up -multiple route tables to get the flexibility. - - There is no "borrowing" or "sharing" scheme. This is a simple -traffic limiter. We implement Van Jacobson and Sally Floyd's CBQ -architecture into Linux 2.2. This is the preferred solution. Shaper is -for simple or back compatible setups. - -Alan diff --git a/Documentation/networking/udplite.txt b/Documentation/networking/udplite.txt index b6409cab075c..3870f280280b 100644 --- a/Documentation/networking/udplite.txt +++ b/Documentation/networking/udplite.txt @@ -236,7 +236,7 @@ This displays UDP-Lite statistics variables, whose meaning is as follows. - InDatagrams: Total number of received datagrams. + InDatagrams: The total number of datagrams delivered to users. NoPorts: Number of packets received to an unknown port. These cases are counted separately (not as InErrors). diff --git a/Documentation/networking/wavelan.txt b/Documentation/networking/wavelan.txt index c1acf5eb3712..afa6e521c685 100644 --- a/Documentation/networking/wavelan.txt +++ b/Documentation/networking/wavelan.txt @@ -12,8 +12,8 @@ and many Linux driver to support it. 
"wavelan" driver (old ISA Wavelan) ---------------- o Config : Network device -> Wireless LAN -> AT&T WaveLAN - o Location : .../drivers/net/wavelan* - o in-line doc : .../drivers/net/wavelan.p.h + o Location : .../drivers/net/wireless/wavelan* + o in-line doc : .../drivers/net/wireless/wavelan.p.h o on-line doc : http://www.hpl.hp.com/personal/Jean_Tourrilhes/Linux/Wavelan.html diff --git a/Documentation/networking/xfrm_proc.txt b/Documentation/networking/xfrm_proc.txt new file mode 100644 index 000000000000..d0d8bafa9016 --- /dev/null +++ b/Documentation/networking/xfrm_proc.txt @@ -0,0 +1,74 @@ +XFRM proc - /proc/net/xfrm_* files +================================== +Masahide NAKAMURA <nakam@linux-ipv6.org> + + +Transformation Statistics +------------------------- +xfrm_proc provides statistics showing why packets were dropped by the +transformation code; it is intended for developers. +The counters are derived from the current transformation source code +and are defined like a Linux private MIB. + +Inbound statistics +~~~~~~~~~~~~~~~~~~ +XfrmInError: + All errors which are not matched by others +XfrmInBufferError: + No buffer is left +XfrmInHdrError: + Header error +XfrmInNoStates: + No state is found + i.e. either the inbound SPI, address, or IPsec protocol at the SA is wrong +XfrmInStateProtoError: + Transformation protocol specific error + e.g. SA key is wrong +XfrmInStateModeError: + Transformation mode specific error +XfrmInStateSeqError: + Sequence error + i.e. Sequence number is out of window +XfrmInStateExpired: + State is expired +XfrmInStateMismatch: + State has a mismatched option + e.g. UDP encapsulation type is mismatched +XfrmInStateInvalid: + State is invalid +XfrmInTmplMismatch: + No matching template for states + e.g.
Inbound SAs are correct but no SP is found +XfrmInPolBlock: + Policy discards +XfrmInPolError: + Policy error + +Outbound errors +~~~~~~~~~~~~~~~ +XfrmOutError: + All errors which are not matched by others +XfrmOutBundleGenError: + Bundle generation error +XfrmOutBundleCheckError: + Bundle check error +XfrmOutNoStates: + No state is found +XfrmOutStateProtoError: + Transformation protocol specific error +XfrmOutStateModeError: + Transformation mode specific error +XfrmOutStateSeqError: + Sequence error + i.e. Sequence number overflow +XfrmOutStateExpired: + State is expired +XfrmOutPolBlock: + Policy discards +XfrmOutPolDead: + Policy is dead +XfrmOutPolError: + Policy error diff --git a/Documentation/nfsroot.txt b/Documentation/nfsroot.txt index 16a7cae2721d..31b329172343 100644 --- a/Documentation/nfsroot.txt +++ b/Documentation/nfsroot.txt @@ -92,8 +92,10 @@ ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf> autoconfiguration. The <autoconf> parameter can appear alone as the value to the `ip' - parameter (without all the ':' characters before) in which case auto- - configuration is used. + parameter (without all the ':' characters before). If the value is + "ip=off" or "ip=none", no autoconfiguration will take place, otherwise + autoconfiguration will take place. The most common way to use this + is "ip=dhcp". <client-ip> IP address of the client. @@ -142,8 +144,10 @@ ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf> into the kernel will be used, regardless of the value of this option.
-          off or none: don't use autoconfiguration (default)
+          off or none: don't use autoconfiguration
+                       (do static IP assignment instead)
           on or any:   use any protocol available in the kernel
+                       (default)
           dhcp:        use DHCP
           bootp:       use BOOTP
           rarp:        use RARP
diff --git a/Documentation/parport-lowlevel.txt b/Documentation/parport-lowlevel.txt
index 265fcdcb8e5f..120eb20dbb09 100644
--- a/Documentation/parport-lowlevel.txt
+++ b/Documentation/parport-lowlevel.txt
@@ -339,6 +339,10 @@ Use this function to register your device driver on a parallel port
 ('port'). Once you have done that, you will be able to use
 parport_claim and parport_release in order to use the port.
 
+The ('name') argument is the name of the device that appears in /proc
+filesystem. The string must be valid for the whole lifetime of the
+device (until parport_unregister_device is called).
+
 This function will register three callbacks into your driver:
 'preempt', 'wakeup' and 'irq'. Each of these may be NULL in order to
 indicate that you do not want a callback.
diff --git a/Documentation/pci.txt b/Documentation/pci.txt
index 7754f5aea4e9..72b20c639596 100644
--- a/Documentation/pci.txt
+++ b/Documentation/pci.txt
@@ -274,8 +274,6 @@ the PCI device by calling pci_enable_device(). This will:
     o allocate an IRQ (if BIOS did not).
 
 NOTE: pci_enable_device() can fail! Check the return value.
-NOTE2: Also see pci_enable_device_bars() below. Drivers can
-    attempt to enable only a subset of BARs they need.
 
 [ OS BUG: we don't check resource allocations before enabling those
   resources. The sequence would make more sense if we called
@@ -605,40 +603,7 @@ device lists. This is still possible but discouraged.
 
 
 
-10. pci_enable_device_bars() and Legacy I/O Port space
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-Large servers may not be able to provide I/O port resources to all PCI
-devices. I/O Port space is only 64KB on Intel Architecture[1] and is
-likely also fragmented since the I/O base register of PCI-to-PCI
-bridge will usually be aligned to a 4KB boundary[2]. On such systems,
-pci_enable_device() and pci_request_region() will fail when
-attempting to enable I/O Port regions that don't have I/O Port
-resources assigned.
-
-Fortunately, many PCI devices which request I/O Port resources also
-provide access to the same registers via MMIO BARs. These devices can
-be handled without using I/O port space and the drivers typically
-offer a CONFIG_ option to only use MMIO regions
-(e.g. CONFIG_TULIP_MMIO). PCI devices typically provide I/O port
-interface for legacy OSes and will work when I/O port resources are not
-assigned. The "PCI Local Bus Specification Revision 3.0" discusses
-this on p.44, "IMPLEMENTATION NOTE".
-
-If your PCI device driver doesn't need I/O port resources assigned to
-I/O Port BARs, you should use pci_enable_device_bars() instead of
-pci_enable_device() in order not to enable I/O port regions for the
-corresponding devices. In addition, you should use
-pci_request_selected_regions() and pci_release_selected_regions()
-instead of pci_request_regions()/pci_release_regions() in order not to
-request/release I/O port regions for the corresponding devices.
-
-[1] Some systems support 64KB I/O port space per PCI segment.
-[2] Some PCI-to-PCI bridges support optional 1KB aligned I/O base.
-
-
-
-11. MMIO Space and "Write Posting"
+10. MMIO Space and "Write Posting"
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 Converting a driver from using I/O Port space to using MMIO space
diff --git a/Documentation/pcmcia/driver-changes.txt b/Documentation/pcmcia/driver-changes.txt
index 4739c5c3face..96f155e68750 100644
--- a/Documentation/pcmcia/driver-changes.txt
+++ b/Documentation/pcmcia/driver-changes.txt
@@ -33,8 +33,8 @@ This file details changes in 2.6 which affect PCMCIA card driver authors:
    and can be used (e.g. for SET_NETDEV_DEV) by using
    handle_to_dev(client_handle_t * handle).
 
-* Convert internal I/O port addresses to unsigned long (as of 2.6.11)
-   ioaddr_t should be replaced by kio_addr_t in PCMCIA card drivers.
+* Convert internal I/O port addresses to unsigned int (as of 2.6.11)
+   ioaddr_t should be replaced by unsigned int in PCMCIA card drivers.
 
 * irq_mask and irq_list parameters (as of 2.6.11)
    The irq_mask and irq_list parameters should no longer be used in
diff --git a/Documentation/pm_qos_interface.txt b/Documentation/pm_qos_interface.txt
new file mode 100644
index 000000000000..49adb1a33514
--- /dev/null
+++ b/Documentation/pm_qos_interface.txt
@@ -0,0 +1,59 @@
+PM quality of Service interface.
+
+This interface provides a kernel and user mode interface for registering
+performance expectations by drivers, subsystems and user space applications on
+one of the parameters.
+
+Currently we have {cpu_dma_latency, network_latency, network_throughput} as the
+initial set of pm_qos parameters.
+
+The infrastructure exposes multiple misc device nodes one per implemented
+parameter. The set of parameters implemented is defined by pm_qos_power_init()
+and pm_qos_params.h. This is done because having the available parameters
+being runtime configurable or changeable from a driver was seen as too easy to
+abuse.
+
+For each parameter a list of performance requirements is maintained along with
+an aggregated target value. The aggregated target value is updated with
+changes to the requirement list or elements of the list. Typically the
+aggregated target value is simply the max or min of the requirement values held
+in the parameter list elements.
+
+From kernel mode the use of this interface is simple:
+pm_qos_add_requirement(param_id, name, target_value):
+Will insert a named element in the list for that identified PM_QOS parameter
+with the target value. Upon change to this list the new target is recomputed
+and any registered notifiers are called only if the target value is now
+different.
+
+pm_qos_update_requirement(param_id, name, new_target_value):
+Will search the list identified by the param_id for the named list element and
+then update its target value, calling the notification tree if the aggregated
+target is changed. with that name is already registered.
+
+pm_qos_remove_requirement(param_id, name):
+Will search the identified list for the named element and remove it, after
+removal it will update the aggregate target and call the notification tree if
+the target was changed as a result of removing the named requirement.
+
+
+From user mode:
+Only processes can register a pm_qos requirement. To provide for automatic
+cleanup for process the interface requires the process to register its
+parameter requirements in the following way:
+
+To register the default pm_qos target for the specific parameter, the process
+must open one of /dev/[cpu_dma_latency, network_latency, network_throughput]
+
+As long as the device node is held open that process has a registered
+requirement on the parameter. The name of the requirement is "process_<PID>"
+derived from the current->pid from within the open system call.
+
+To change the requested target value the process needs to write a s32 value to
+the open device node. This translates to a pm_qos_update_requirement call.
+
+To remove the user mode request for a target value simply close the device
+node.
+
+
+
diff --git a/Documentation/pnp.txt b/Documentation/pnp.txt
index 481faf515d53..a327db67782a 100644
--- a/Documentation/pnp.txt
+++ b/Documentation/pnp.txt
@@ -17,9 +17,9 @@ The User Interface
 ------------------
 The Linux Plug and Play user interface provides a means to activate PnP devices
 for legacy and user level drivers that do not support Linux Plug and Play. The
-user interface is integrated into driverfs.
+user interface is integrated into sysfs.
 
-In addition to the standard driverfs file the following are created in each
+In addition to the standard sysfs file the following are created in each
 device's directory:
 id - displays a list of support EISA IDs
 options - displays possible resource configurations
diff --git a/Documentation/power/basic-pm-debugging.txt b/Documentation/power/basic-pm-debugging.txt
index 57aef2f6e0de..1555001bc733 100644
--- a/Documentation/power/basic-pm-debugging.txt
+++ b/Documentation/power/basic-pm-debugging.txt
@@ -1,45 +1,111 @@
-Debugging suspend and resume
+Debugging hibernation and suspend
     (C) 2007 Rafael J. Wysocki <rjw@sisk.pl>, GPL
 
-1. Testing suspend to disk (STD)
+1. Testing hibernation (aka suspend to disk or STD)
 
-To verify that the STD works, you can try to suspend in the "reboot" mode:
+To check if hibernation works, you can try to hibernate in the "reboot" mode:
 
 # echo reboot > /sys/power/disk
 # echo disk > /sys/power/state
 
-and the system should suspend, reboot, resume and get back to the command prompt
-where you have started the transition. If that happens, the STD is most likely
-to work correctly, but you need to repeat the test at least a couple of times in
-a row for confidence. This is necessary, because some problems only show up on
-a second attempt at suspending and resuming the system. You should also test
-the "platform" and "shutdown" modes of suspend:
+and the system should create a hibernation image, reboot, resume and get back to
+the command prompt where you have started the transition. If that happens,
+hibernation is most likely to work correctly. Still, you need to repeat the
+test at least a couple of times in a row for confidence. [This is necessary,
+because some problems only show up on a second attempt at suspending and
+resuming the system.] Moreover, hibernating in the "reboot" and "shutdown"
+modes causes the PM core to skip some platform-related callbacks which on ACPI
+systems might be necessary to make hibernation work. Thus, if your machine fails
+to hibernate or resume in the "reboot" mode, you should try the "platform" mode:
 
 # echo platform > /sys/power/disk
 # echo disk > /sys/power/state
 
-or
+which is the default and recommended mode of hibernation.
+
+Unfortunately, the "platform" mode of hibernation does not work on some systems
+with broken BIOSes. In such cases the "shutdown" mode of hibernation might
+work:
 
 # echo shutdown > /sys/power/disk
 # echo disk > /sys/power/state
 
-in which cases you will have to press the power button to make the system
-resume. If that does not work, you will need to identify what goes wrong.
+(it is similar to the "reboot" mode, but it requires you to press the power
+button to make the system resume).
+
+If neither "platform" nor "shutdown" hibernation mode works, you will need to
+identify what goes wrong.
+
+a) Test modes of hibernation
+
+To find out why hibernation fails on your system, you can use a special testing
+facility available if the kernel is compiled with CONFIG_PM_DEBUG set. Then,
+there is the file /sys/power/pm_test that can be used to make the hibernation
+core run in a test mode. There are 5 test modes available:
+
+freezer
+- test the freezing of processes
+
+devices
+- test the freezing of processes and suspending of devices
 
-a) Test mode of STD
+platform
+- test the freezing of processes, suspending of devices and platform
+  global control methods(*)
 
-To verify if there are any drivers that cause problems you can run the STD
-in the test mode:
+processors
+- test the freezing of processes, suspending of devices, platform
+  global control methods(*) and the disabling of nonboot CPUs
 
-# echo test > /sys/power/disk
+core
+- test the freezing of processes, suspending of devices, platform global
+  control methods(*), the disabling of nonboot CPUs and suspending of
+  platform/system devices
+
+(*) the platform global control methods are only available on ACPI systems
+    and are only tested if the hibernation mode is set to "platform"
+
+To use one of them it is necessary to write the corresponding string to
+/sys/power/pm_test (eg. "devices" to test the freezing of processes and
+suspending devices) and issue the standard hibernation commands. For example,
+to use the "devices" test mode along with the "platform" mode of hibernation,
+you should do the following:
+
+# echo devices > /sys/power/pm_test
+# echo platform > /sys/power/disk
 # echo disk > /sys/power/state
 
-in which case the system should freeze tasks, suspend devices, disable nonboot
-CPUs (if any), wait for 5 seconds, enable nonboot CPUs, resume devices, thaw
-tasks and return to your command prompt. If that fails, most likely there is
-a driver that fails to either suspend or resume (in the latter case the system
-may hang or be unstable after the test, so please take that into consideration).
-To find this driver, you can carry out a binary search according to the rules:
+Then, the kernel will try to freeze processes, suspend devices, wait 5 seconds,
+resume devices and thaw processes. If "platform" is written to
+/sys/power/pm_test , then after suspending devices the kernel will additionally
+invoke the global control methods (eg. ACPI global control methods) used to
+prepare the platform firmware for hibernation. Next, it will wait 5 seconds and
+invoke the platform (eg. ACPI) global methods used to cancel hibernation etc.
+
+Writing "none" to /sys/power/pm_test causes the kernel to switch to the normal
+hibernation/suspend operations. Also, when open for reading, /sys/power/pm_test
+contains a space-separated list of all available tests (including "none" that
+represents the normal functionality) in which the current test level is
+indicated by square brackets.
+
+Generally, as you can see, each test level is more "invasive" than the previous
+one and the "core" level tests the hardware and drivers as deeply as possible
+without creating a hibernation image. Obviously, if the "devices" test fails,
+the "platform" test will fail as well and so on. Thus, as a rule of thumb, you
+should try the test modes starting from "freezer", through "devices", "platform"
+and "processors" up to "core" (repeat the test on each level a couple of times
+to make sure that any random factors are avoided).
+
+If the "freezer" test fails, there is a task that cannot be frozen (in that case
+it usually is possible to identify the offending task by analysing the output of
+dmesg obtained after the failing test). Failure at this level usually means
+that there is a problem with the tasks freezer subsystem that should be
+reported.
+
+If the "devices" test fails, most likely there is a driver that cannot suspend
+or resume its device (in the latter case the system may hang or become unstable
+after the test, so please take that into consideration). To find this driver,
+you can carry out a binary search according to the rules:
 - if the test fails, unload a half of the drivers currently loaded and repeat
 (that would probably involve rebooting the system, so always note what drivers
 have been loaded before the test),
@@ -47,23 +113,46 @@ have been loaded before the test),
 recently and repeat.
 
 Once you have found the failing driver (there can be more than just one of
-them), you have to unload it every time before the STD transition. In that case
-please make sure to report the problem with the driver.
-
-It is also possible that a cycle can still fail after you have unloaded
-all modules. In that case, you would want to look in your kernel configuration
-for the drivers that can be compiled as modules (testing again with them as
-modules), and possibly also try boot time options such as "noapic" or "noacpi".
+them), you have to unload it every time before hibernation. In that case please
+make sure to report the problem with the driver.
+
+It is also possible that the "devices" test will still fail after you have
+unloaded all modules. In that case, you may want to look in your kernel
+configuration for the drivers that can be compiled as modules (and test again
+with these drivers compiled as modules). You may also try to use some special
+kernel command line options such as "noapic", "noacpi" or even "acpi=off".
+
+If the "platform" test fails, there is a problem with the handling of the
+platform (eg. ACPI) firmware on your system. In that case the "platform" mode
+of hibernation is not likely to work. You can try the "shutdown" mode, but that
+is rather a poor man's workaround.
+
+If the "processors" test fails, the disabling/enabling of nonboot CPUs does not
+work (of course, this only may be an issue on SMP systems) and the problem
+should be reported. In that case you can also try to switch the nonboot CPUs
+off and on using the /sys/devices/system/cpu/cpu*/online sysfs attributes and
+see if that works.
+
+If the "core" test fails, which means that suspending of the system/platform
+devices has failed (these devices are suspended on one CPU with interrupts off),
+the problem is most probably hardware-related and serious, so it should be
+reported.
+
+A failure of any of the "platform", "processors" or "core" tests may cause your
+system to hang or become unstable, so please beware. Such a failure usually
+indicates a serious problem that very well may be related to the hardware, but
+please report it anyway.
 
 b) Testing minimal configuration
 
-If the test mode of STD works, you can boot the system with "init=/bin/bash"
-and attempt to suspend in the "reboot", "shutdown" and "platform" modes. If
-that does not work, there probably is a problem with a driver statically
-compiled into the kernel and you can try to compile more drivers as modules,
-so that they can be tested individually. Otherwise, there is a problem with a
-modular driver and you can find it by loading a half of the modules you normally
-use and binary searching in accordance with the algorithm:
+If all of the hibernation test modes work, you can boot the system with the
+"init=/bin/bash" command line parameter and attempt to hibernate in the
+"reboot", "shutdown" and "platform" modes. If that does not work, there
+probably is a problem with a driver statically compiled into the kernel and you
+can try to compile more drivers as modules, so that they can be tested
+individually. Otherwise, there is a problem with a modular driver and you can
+find it by loading a half of the modules you normally use and binary searching
+in accordance with the algorithm:
 - if there are n modules loaded and the attempt to suspend and resume fails,
 unload n/2 of the modules and try again (that would probably involve rebooting
 the system),
@@ -71,19 +160,19 @@ the system),
 load n/2 modules more and try again.
 
 Again, if you find the offending module(s), it(they) must be unloaded every time
-before the STD transition, and please report the problem with it(them).
+before hibernation, and please report the problem with it(them).
 
 c) Advanced debugging
 
-In case the STD does not work on your system even in the minimal configuration
-and compiling more drivers as modules is not practical or some modules cannot
-be unloaded, you can use one of the more advanced debugging techniques to find
-the problem. First, if there is a serial port in your box, you can boot the
-kernel with the 'no_console_suspend' parameter and try to log kernel
-messages using the serial console. This may provide you with some information
-about the reasons of the suspend (resume) failure. Alternatively, it may be
-possible to use a FireWire port for debugging with firescope
-(ftp://ftp.firstfloor.org/pub/ak/firescope/). On i386 it is also possible to
+In case that hibernation does not work on your system even in the minimal
+configuration and compiling more drivers as modules is not practical or some
+modules cannot be unloaded, you can use one of the more advanced debugging
+techniques to find the problem. First, if there is a serial port in your box,
+you can boot the kernel with the 'no_console_suspend' parameter and try to log
+kernel messages using the serial console. This may provide you with some
+information about the reasons of the suspend (resume) failure. Alternatively,
+it may be possible to use a FireWire port for debugging with firescope
+(ftp://ftp.firstfloor.org/pub/ak/firescope/). On x86 it is also possible to
 use the PM_TRACE mechanism documented in Documentation/s2ram.txt .
 
 2. Testing suspend to RAM (STR)
@@ -91,16 +180,25 @@ use the PM_TRACE mechanism documented in Documentation/s2ram.txt .
 To verify that the STR works, it is generally more convenient to use the s2ram
 tool available from http://suspend.sf.net and documented at
 http://en.opensuse.org/s2ram . However, before doing that it is recommended to
-carry out the procedure described in section 1.
-
-Assume you have resolved the problems with the STD and you have found some
-failing drivers. These drivers are also likely to fail during the STR or
-during the resume, so it is better to unload them every time before the STR
-transition. Now, you can follow the instructions at
-http://en.opensuse.org/s2ram to test the system, but if it does not work
-"out of the box", you may need to boot it with "init=/bin/bash" and test
-s2ram in the minimal configuration. In that case, you may be able to search
-for failing drivers by following the procedure analogous to the one described in
-1b). If you find some failing drivers, you will have to unload them every time
-before the STR transition (ie. before you run s2ram), and please report the
-problems with them.
+carry out STR testing using the facility described in section 1.
+
+Namely, after writing "freezer", "devices", "platform", "processors", or "core"
+into /sys/power/pm_test (available if the kernel is compiled with
+CONFIG_PM_DEBUG set) the suspend code will work in the test mode corresponding
+to given string. The STR test modes are defined in the same way as for
+hibernation, so please refer to Section 1 for more information about them. In
+particular, the "core" test allows you to test everything except for the actual
+invocation of the platform firmware in order to put the system into the sleep
+state.
+
+Among other things, the testing with the help of /sys/power/pm_test may allow
+you to identify drivers that fail to suspend or resume their devices. They
+should be unloaded every time before an STR transition.
+
+Next, you can follow the instructions at http://en.opensuse.org/s2ram to test
+the system, but if it does not work "out of the box", you may need to boot it
+with "init=/bin/bash" and test s2ram in the minimal configuration. In that
+case, you may be able to search for failing drivers by following the procedure
+analogous to the one described in section 1. If you find some failing drivers,
+you will have to unload them every time before an STR transition (ie. before
+you run s2ram), and please report the problems with them.
diff --git a/Documentation/power/devices.txt b/Documentation/power/devices.txt
index d0e79d5820a5..c53d26361919 100644
--- a/Documentation/power/devices.txt
+++ b/Documentation/power/devices.txt
@@ -502,52 +502,3 @@ If the CPU can have a "cpufreq" driver, there also may be opportunities
 to shift to lower voltage settings and reduce the power cost of executing
 a given number of instructions. (Without voltage adjustment, it's
 rare for cpufreq to save much power; the cost-per-instruction must go down.)
-
-
-/sys/devices/.../power/state files
-==================================
-For now you can also test some of this functionality using sysfs.
-
-    DEPRECATED: USE "power/state" ONLY FOR DRIVER TESTING, AND
-    AVOID USING dev->power.power_state IN DRIVERS.
-
-    THESE WILL BE REMOVED. IF THE "power/state" FILE GETS REPLACED,
-    IT WILL BECOME SOMETHING COUPLED TO THE BUS OR DRIVER.
-
-In each device's directory, there is a 'power' directory, which contains
-at least a 'state' file. The value of this field is effectively boolean,
-PM_EVENT_ON or PM_EVENT_SUSPEND.
-
-   * Reading from this file displays a value corresponding to
-     the power.power_state.event field. All nonzero values are
-     displayed as "2", corresponding to a low power state; zero
-     is displayed as "0", corresponding to normal operation.
-
-   * Writing to this file initiates a transition using the
-     specified event code number; only '0', '2', and '3' are
-     accepted (without a newline); '2' and '3' are both
-     mapped to PM_EVENT_SUSPEND.
-
-On writes, the PM core relies on that recorded event code and the device/bus
-capabilities to determine whether it uses a partial suspend() or resume()
-sequence to change things so that the recorded event corresponds to the
-numeric parameter.
-
-   - If the bus requires the irqs-disabled suspend_late()/resume_early()
-     phases, writes fail because those operations are not supported here.
-
-   - If the recorded value is the expected value, nothing is done.
-
-   - If the recorded value is nonzero, the device is partially resumed,
-     using the bus.resume() and/or class.resume() methods.
-
-   - If the target value is nonzero, the device is partially suspended,
-     using the class.suspend() and/or bus.suspend() methods and the
-     PM_EVENT_SUSPEND message.
-
-Drivers have no way to tell whether their suspend() and resume() calls
-have come through the sysfs power/state file or as part of entering a
-system sleep state, except that when accessed through sysfs the normal
-parent/child sequencing rules are ignored. Drivers (such as bus, bridge,
-or hub drivers) which expose child devices may need to enforce those rules
-on their own.
diff --git a/Documentation/power/drivers-testing.txt b/Documentation/power/drivers-testing.txt
index e4bdcaee24e4..7f7a737f7f9f 100644
--- a/Documentation/power/drivers-testing.txt
+++ b/Documentation/power/drivers-testing.txt
@@ -6,9 +6,9 @@ Testing suspend and resume support in device drivers
 Unfortunately, to effectively test the support for the system-wide suspend and
 resume transitions in a driver, it is necessary to suspend and resume a fully
 functional system with this driver loaded. Moreover, that should be done
-several times, preferably several times in a row, and separately for the suspend
-to disk (STD) and the suspend to RAM (STR) transitions, because each of these
-cases involves different ordering of operations and different interactions with
+several times, preferably several times in a row, and separately for hibernation
+(aka suspend to disk or STD) and suspend to RAM (STR), because each of these
+cases involves slightly different operations and different interactions with
 the machine's BIOS.
 
 Of course, for this purpose the test system has to be known to suspend and
@@ -22,20 +22,24 @@ for more information about the debugging of suspend/resume functionality.
 Once you have resolved the suspend/resume-related problems with your test system
 without the new driver, you are ready to test it:
 
-a) Build the driver as a module, load it and try the STD in the test mode (see:
-Documents/power/basic-pm-debugging.txt, 1a)).
+a) Build the driver as a module, load it and try the test modes of hibernation
+   (see: Documentation/power/basic-pm-debugging.txt, 1).
 
-b) Load the driver and attempt to suspend to disk in the "reboot", "shutdown"
-and "platform" modes (see: Documents/power/basic-pm-debugging.txt, 1).
+b) Load the driver and attempt to hibernate in the "reboot", "shutdown" and
+   "platform" modes (see: Documentation/power/basic-pm-debugging.txt, 1).
 
-c) Compile the driver directly into the kernel and try the STD in the test mode.
+c) Compile the driver directly into the kernel and try the test modes of
+   hibernation.
 
-d) Attempt to suspend to disk with the driver compiled directly into the kernel
-in the "reboot", "shutdown" and "platform" modes.
+d) Attempt to hibernate with the driver compiled directly into the kernel
+   in the "reboot", "shutdown" and "platform" modes.
 
-e) Attempt to suspend to RAM using the s2ram tool with the driver loaded (see:
-Documents/power/basic-pm-debugging.txt, 2). As far as the STR tests are
-concerned, it should not matter whether or not the driver is built as a module.
+e) Try the test modes of suspend (see: Documentation/power/basic-pm-debugging.txt,
+   2). [As far as the STR tests are concerned, it should not matter whether or
+   not the driver is built as a module.]
+
+f) Attempt to suspend to RAM using the s2ram tool with the driver loaded
+   (see: Documentation/power/basic-pm-debugging.txt, 2).
 
 Each of the above tests should be repeated several times and the STD tests
 should be mixed with the STR tests. If any of them fails, the driver cannot be
diff --git a/Documentation/power/notifiers.txt b/Documentation/power/notifiers.txt
index 9293e4bc857c..ae1b7ec07684 100644
--- a/Documentation/power/notifiers.txt
+++ b/Documentation/power/notifiers.txt
@@ -28,6 +28,14 @@ PM_POST_HIBERNATION The system memory state has been restored from a
                     hibernation. Device drivers' .resume() callbacks have
                     been executed and tasks have been thawed.
 
+PM_RESTORE_PREPARE  The system is going to restore a hibernation image.
+                    If all goes well the restored kernel will issue a
+                    PM_POST_HIBERNATION notification.
+
+PM_POST_RESTORE     An error occurred during the hibernation restore.
+                    Device drivers' .resume() callbacks have been executed
+                    and tasks have been thawed.
+
 PM_SUSPEND_PREPARE  The system is preparing for a suspend.
 
 PM_POST_SUSPEND     The system has just resumed or an error occured during
diff --git a/Documentation/power/userland-swsusp.txt b/Documentation/power/userland-swsusp.txt
index e00c6cf09e85..7b99636564c8 100644
--- a/Documentation/power/userland-swsusp.txt
+++ b/Documentation/power/userland-swsusp.txt
@@ -14,7 +14,7 @@ are going to develop your own suspend/resume utilities.
 
 The interface consists of a character device providing the open(),
 release(), read(), and write() operations as well as several ioctl()
-commands defined in kernel/power/power.h. The major and minor
+commands defined in include/linux/suspend_ioctls.h . The major and minor
 numbers of the device are, respectively, 10 and 231, and they can
 be read from /sys/class/misc/snapshot/dev.
 
@@ -27,17 +27,17 @@ once at a time.
 The ioctl() commands recognized by the device are:
 
 SNAPSHOT_FREEZE - freeze user space processes (the current process is
-    not frozen); this is required for SNAPSHOT_ATOMIC_SNAPSHOT
+    not frozen); this is required for SNAPSHOT_CREATE_IMAGE
     and SNAPSHOT_ATOMIC_RESTORE to succeed
 
 SNAPSHOT_UNFREEZE - thaw user space processes frozen by SNAPSHOT_FREEZE
 
-SNAPSHOT_ATOMIC_SNAPSHOT - create a snapshot of the system memory; the
+SNAPSHOT_CREATE_IMAGE - create a snapshot of the system memory; the
     last argument of ioctl() should be a pointer to an int variable,
     the value of which will indicate whether the call returned after
     creating the snapshot (1) or after restoring the system memory state
     from it (0) (after resume the system finds itself finishing the
-    SNAPSHOT_ATOMIC_SNAPSHOT ioctl() again); after the snapshot
+    SNAPSHOT_CREATE_IMAGE ioctl() again); after the snapshot
     has been created the read() operation can be used to transfer
     it out of the kernel
 
@@ -49,39 +49,37 @@ SNAPSHOT_ATOMIC_RESTORE - restore the system memory state from the
 
 SNAPSHOT_FREE - free memory allocated for the snapshot image
 
-SNAPSHOT_SET_IMAGE_SIZE - set the preferred maximum size of the image
+SNAPSHOT_PREF_IMAGE_SIZE - set the preferred maximum size of the image
     (the kernel will do its best to ensure the image size will not exceed
     this number, but if it turns out to be impossible, the kernel will
     create the smallest image possible)
 
-SNAPSHOT_AVAIL_SWAP - return the amount of available swap in bytes (the last
-    argument should be a pointer to an unsigned int variable that will
+SNAPSHOT_GET_IMAGE_SIZE - return the actual size of the hibernation image
+
+SNAPSHOT_AVAIL_SWAP_SIZE - return the amount of available swap in bytes (the
+    last argument should be a pointer to an unsigned int variable that will
     contain the result if the call is successful).
 
-SNAPSHOT_GET_SWAP_PAGE - allocate a swap page from the resume partition
+SNAPSHOT_ALLOC_SWAP_PAGE - allocate a swap page from the resume partition
     (the last argument should be a pointer to a loff_t variable that
     will contain the swap page offset if the call is successful)
 
-SNAPSHOT_FREE_SWAP_PAGES - free all swap pages allocated with
-    SNAPSHOT_GET_SWAP_PAGE
-
-SNAPSHOT_SET_SWAP_FILE - set the resume partition (the last ioctl() argument
-    should specify the device's major and minor numbers in the old
-    two-byte format, as returned by the stat() function in the .st_rdev
-    member of the stat structure)
+SNAPSHOT_FREE_SWAP_PAGES - free all swap pages allocated by
+    SNAPSHOT_ALLOC_SWAP_PAGE
 
 SNAPSHOT_SET_SWAP_AREA - set the resume partition and the offset (in <PAGE_SIZE>
     units) from the beginning of the partition at which the swap header is
     located (the last ioctl() argument should point to a struct
-    resume_swap_area, as defined in kernel/power/power.h, containing the
-    resume device specification, as for the SNAPSHOT_SET_SWAP_FILE ioctl(),
-    and the offset); for swap partitions the offset is always 0, but it is
-    different to zero for swap files (please see
-    Documentation/swsusp-and-swap-files.txt for details).
-    The SNAPSHOT_SET_SWAP_AREA ioctl() is considered as a replacement for
-    SNAPSHOT_SET_SWAP_FILE which is regarded as obsolete. It is
-    recommended to always use this call, because the code to set the resume
-    partition may be removed from future kernels
+    resume_swap_area, as defined in include/linux/suspend_ioctls.h,
+    containing the resume device specification and the offset); for swap
+    partitions the offset is always 0, but it is different from zero for
+    swap files (see Documentation/swsusp-and-swap-files.txt for details).
+
+SNAPSHOT_PLATFORM_SUPPORT - enable/disable the hibernation platform support,
+    depending on the argument value (enable, if the argument is nonzero)
+
+SNAPSHOT_POWER_OFF - make the kernel transition the system to the hibernation
+    state (eg. ACPI S4) using the platform (eg. ACPI) driver
 
 SNAPSHOT_S2RAM - suspend to RAM; using this call causes the kernel to
     immediately enter the suspend-to-RAM state, so this call must always
@@ -93,24 +91,6 @@ SNAPSHOT_S2RAM - suspend to RAM; using this call causes the kernel to
     to resume the system from RAM if there's enough battery power or restore
     its state on the basis of the saved suspend image otherwise)
 
-SNAPSHOT_PMOPS - enable the usage of the hibernation_ops->prepare,
-    hibernate_ops->enter and hibernation_ops->finish methods (the in-kernel
-    swsusp knows these as the "platform method") which are needed on many
-    machines to (among others) speed up the resume by letting the BIOS skip
-    some steps or to let the system recognise the correct state of the
-    hardware after the resume (in particular on many machines this ensures
-    that unplugged AC adapters get correctly detected and that kacpid does
-    not run wild after the resume). The last ioctl() argument can take one
-    of the three values, defined in kernel/power/power.h:
-    PMOPS_PREPARE - make the kernel carry out the
-        hibernation_ops->prepare() operation
-    PMOPS_ENTER - make the kernel power off the system by calling
-        hibernation_ops->enter()
-    PMOPS_FINISH - make the kernel carry out the
-        hibernation_ops->finish() operation
-    Note that the actual constants are misnamed because they surface
-    internal kernel implementation details that have changed.
-
 The device's read() operation can be used to transfer the snapshot image from
 the kernel. It has the following limitations:
 - you cannot read() more than one virtual memory page at a time
@@ -122,7 +102,7 @@ The device's write() operation is used for uploading the system memory
 snapshot into the kernel. It has the same limitations as the read() operation.
 
 The release() operation frees all memory allocated for the snapshot image
-and all swap pages allocated with SNAPSHOT_GET_SWAP_PAGE (if any).
+and all swap pages allocated with SNAPSHOT_ALLOC_SWAP_PAGE (if any).
 Thus it is not necessary to use either SNAPSHOT_FREE or
 SNAPSHOT_FREE_SWAP_PAGES before closing the device (in fact it will also
 unfreeze user space processes frozen by SNAPSHOT_UNFREEZE if they are
@@ -133,16 +113,12 @@ snapshot image from/to the kernel will use a swap parition, called the resume
 partition, or a swap file as storage space (if a swap file is used, the resume
 partition is the partition that holds this file). However, this is not really
 required, as they can use, for example, a special (blank) suspend partition or
-a file on a partition that is unmounted before SNAPSHOT_ATOMIC_SNAPSHOT and
+a file on a partition that is unmounted before SNAPSHOT_CREATE_IMAGE and
 mounted afterwards.
 
-These utilities SHOULD NOT make any assumptions regarding the ordering of
-data within the snapshot image, except for the image header that MAY be
-assumed to start with an swsusp_info structure, as specified in
-kernel/power/power.h. This structure MAY be used by the userland utilities
-to obtain some information about the snapshot image, such as the size
-of the snapshot image, including the metadata and the header itself,
-contained in the .size member of swsusp_info.
+These utilities MUST NOT make any assumptions regarding the ordering of
+data within the snapshot image. The contents of the image are entirely owned
+by the kernel and its structure may be changed in future kernel releases.
 
 The snapshot image MUST be written to the kernel unaltered (ie. all of the image
 data, metadata and header MUST be written in _exactly_ the same amount, form
@@ -159,7 +135,7 @@ means, such as checksums, to ensure the integrity of the snapshot image.
The suspending and resuming utilities MUST lock themselves in memory, preferably using mlockall(), before calling SNAPSHOT_FREEZE. -The suspending utility MUST check the value stored by SNAPSHOT_ATOMIC_SNAPSHOT +The suspending utility MUST check the value stored by SNAPSHOT_CREATE_IMAGE in the memory location pointed to by the last argument of ioctl() and proceed in accordance with it: 1. If the value is 1 (ie. the system memory snapshot has just been @@ -173,7 +149,7 @@ in accordance with it: image has been saved. (b) The suspending utility SHOULD NOT attempt to perform any file system operations (including reads) on the file systems - that were mounted before SNAPSHOT_ATOMIC_SNAPSHOT has been + that were mounted before SNAPSHOT_CREATE_IMAGE has been called. However, it MAY mount a file system that was not mounted at that time and perform some operations on it (eg. use it for saving the image). diff --git a/Documentation/power_supply_class.txt b/Documentation/power_supply_class.txt index 9758cf433c06..a8686e5a6857 100644 --- a/Documentation/power_supply_class.txt +++ b/Documentation/power_supply_class.txt @@ -87,6 +87,10 @@ batteries use voltage for very approximated calculation of capacity. Battery driver also can use this attribute just to inform userspace about maximal and minimal voltage thresholds of a given battery. +VOLTAGE_MAX, VOLTAGE_MIN - same as _DESIGN voltage values except that +these ones should be used if hardware could only guess (measure and +retain) the thresholds of a given power supply. + CHARGE_FULL_DESIGN, CHARGE_EMPTY_DESIGN - design charge values, when battery considered full/empty. @@ -100,8 +104,6 @@ age)". I.e. these attributes represents real thresholds, not design values. ENERGY_FULL, ENERGY_EMPTY - same as above but for energy. CAPACITY - capacity in percents. -CAPACITY_LEVEL - capacity level. This corresponds to -POWER_SUPPLY_CAPACITY_LEVEL_*. TEMP - temperature of the power supply. TEMP_AMBIENT - ambient temperature.
diff --git a/Documentation/powerpc/00-INDEX b/Documentation/powerpc/00-INDEX index 94a3c577b083..3be84aa38dfe 100644 --- a/Documentation/powerpc/00-INDEX +++ b/Documentation/powerpc/00-INDEX @@ -28,3 +28,6 @@ sound.txt - info on sound support under Linux/PPC zImage_layout.txt - info on the kernel images for Linux/PPC +qe_firmware.txt + - describes the layout of firmware binaries for the Freescale QUICC + Engine and the code that parses and uploads the microcode therein. diff --git a/Documentation/powerpc/booting-without-of.txt b/Documentation/powerpc/booting-without-of.txt index ac1be25c1e25..b5e46efeba84 100644 --- a/Documentation/powerpc/booting-without-of.txt +++ b/Documentation/powerpc/booting-without-of.txt @@ -52,7 +52,11 @@ Table of Contents i) Freescale QUICC Engine module (QE) j) CFI or JEDEC memory-mapped NOR flash k) Global Utilities Block - l) Xilinx IP cores + l) Freescale Communications Processor Module + m) Chipselect/Local Bus + n) 4xx/Axon EMAC ethernet nodes + o) Xilinx IP cores + p) Freescale Synchronous Serial Interface VII - Specifying interrupt information for devices 1) interrupts property @@ -671,10 +675,10 @@ device or bus to be described by the device tree. In general, the format of an address for a device is defined by the parent bus type, based on the #address-cells and #size-cells -property. In the absence of such a property, the parent's parent -values are used, etc... The kernel requires the root node to have -those properties defining addresses format for devices directly mapped -on the processor bus. +properties. Note that the parent's parent definitions of #address-cells +and #size-cells are not inherited, so every node with children must specify +them. The kernel requires the root node to have those properties defining +addresses format for devices directly mapped on the processor bus. Those 2 properties define 'cells' for representing an address and a size. A "cell" is a 32-bit number.
For example, if both contain 2 @@ -711,13 +715,14 @@ define a bus type with a more complex address format, including things like address space bits, you'll have to add a bus translator to the prom_parse.c file of the recent kernels for your bus type. -The "reg" property only defines addresses and sizes (if #size-cells -is non-0) within a given bus. In order to translate addresses upward +The "reg" property only defines addresses and sizes (if #size-cells is +non-0) within a given bus. In order to translate addresses upward (that is into parent bus addresses, and possibly into CPU physical addresses), all busses must contain a "ranges" property. If the "ranges" property is missing at a given level, it's assumed that -translation isn't possible. The format of the "ranges" property for a -bus is a list of: +translation isn't possible, i.e., the registers are not visible on the +parent bus. The format of the "ranges" property for a bus is a list +of: bus address, parent bus address, size @@ -735,6 +740,10 @@ fit in a single 32-bit word. New 32-bit powerpc boards should use a 1/1 format, unless the processor supports physical addresses greater than 32-bits, in which case a 2/1 format is recommended. +Alternatively, the "ranges" property may be empty, indicating that the +registers are visible on the parent bus using an identity mapping +translation. In other words, the parent bus address space is the same +as the child bus address space. 2) Note about "compatible" properties ------------------------------------- @@ -1218,16 +1227,14 @@ platforms are moved over to use the flattened-device-tree model. Required properties: - reg : Offset and length of the register set for the device - - device_type : Should be "mdio" - compatible : Should define the compatible device type for the - mdio. Currently, this is most likely to be "gianfar" + mdio. 
Currently, this is most likely to be "fsl,gianfar-mdio" Example: mdio@24520 { reg = <24520 20>; - device_type = "mdio"; - compatible = "gianfar"; + compatible = "fsl,gianfar-mdio"; ethernet-phy@0 { ...... @@ -1254,6 +1261,10 @@ platforms are moved over to use the flattened-device-tree model. services interrupts for this device. - phy-handle : The phandle for the PHY connected to this ethernet controller. + - fixed-link : <a b c d e> where a is emulated phy id - choose any, + but unique to the all specified fixed-links, b is duplex - 0 half, + 1 full, c is link speed - d#10/d#100/d#1000, d is pause - 0 no + pause, 1 pause, e is asym_pause - 0 no asym_pause, 1 asym_pause. Recommended properties: @@ -1408,7 +1419,6 @@ platforms are moved over to use the flattened-device-tree model. Example multi port host USB controller device node : usb@22000 { - device_type = "usb"; compatible = "fsl-usb2-mph"; reg = <22000 1000>; #address-cells = <1>; @@ -1422,7 +1432,6 @@ platforms are moved over to use the flattened-device-tree model. Example dual role USB controller device node : usb@23000 { - device_type = "usb"; compatible = "fsl-usb2-dr"; reg = <23000 1000>; #address-cells = <1>; @@ -1534,7 +1543,7 @@ platforms are moved over to use the flattened-device-tree model. i) Root QE device Required properties: - - device_type : should be "qe"; + - compatible : should be "fsl,qe"; - model : precise model of the QE, Can be "QE", "CPM", or "CPM2" - reg : offset and length of the device registers. - bus-frequency : the clock frequency for QUICC Engine. @@ -1548,8 +1557,7 @@ platforms are moved over to use the flattened-device-tree model. #address-cells = <1>; #size-cells = <1>; #interrupt-cells = <2>; - device_type = "qe"; - model = "QE"; + compatible = "fsl,qe"; ranges = <0 e0100000 00100000>; reg = <e0100000 480>; brg-frequency = <0>; @@ -1560,8 +1568,8 @@ platforms are moved over to use the flattened-device-tree model. 
ii) SPI (Serial Peripheral Interface) Required properties: - - device_type : should be "spi". - - compatible : should be "fsl_spi". + - cell-index : SPI controller index. + - compatible : should be "fsl,spi". - mode : the SPI operation mode, it can be "cpu" or "cpu-qe". - reg : Offset and length of the register set for the device - interrupts : <a b> where a is the interrupt number and b is a @@ -1574,8 +1582,8 @@ platforms are moved over to use the flattened-device-tree model. Example: spi@4c0 { - device_type = "spi"; - compatible = "fsl_spi"; + cell-index = <0>; + compatible = "fsl,spi"; reg = <4c0 40>; interrupts = <82 0>; interrupt-parent = <700>; @@ -1586,7 +1594,6 @@ platforms are moved over to use the flattened-device-tree model. iii) USB (Universal Serial Bus Controller) Required properties: - - device_type : should be "usb". - compatible : could be "qe_udc" or "fhci-hcd". - mode : the could be "host" or "slave". - reg : Offset and length of the register set for the device @@ -1600,7 +1607,6 @@ platforms are moved over to use the flattened-device-tree model. Example(slave): usb@6c0 { - device_type = "usb"; compatible = "qe_udc"; reg = <6c0 40>; interrupts = <8b 0>; @@ -1613,7 +1619,7 @@ platforms are moved over to use the flattened-device-tree model. Required properties: - device_type : should be "network", "hldc", "uart", "transparent" - "bisync" or "atm". + "bisync", "atm", or "serial". - compatible : could be "ucc_geth" or "fsl_atm" and so on. - model : should be "UCC". - device-id : the ucc number(1-8), corresponding to UCCx in UM. @@ -1626,6 +1632,26 @@ platforms are moved over to use the flattened-device-tree model. - interrupt-parent : the phandle for the interrupt controller that services interrupts for this device. - pio-handle : The phandle for the Parallel I/O port configuration. + - port-number : for UART drivers, the port number to use, between 0 and 3. + This usually corresponds to the /dev/ttyQE device, e.g. <0> = /dev/ttyQE0. 
+ The port number is added to the minor number of the device. Unlike the + CPM UART driver, the port-number is required for the QE UART driver. + - soft-uart : for UART drivers, if specified this means the QE UART device + driver should use "Soft-UART" mode, which is needed on some SOCs that have + broken UART hardware. Soft-UART is provided via a microcode upload. + - rx-clock-name: the UCC receive clock source + "none": clock source is disabled + "brg1" through "brg16": clock source is BRG1-BRG16, respectively + "clk1" through "clk24": clock source is CLK1-CLK24, respectively + - tx-clock-name: the UCC transmit clock source + "none": clock source is disabled + "brg1" through "brg16": clock source is BRG1-BRG16, respectively + "clk1" through "clk24": clock source is CLK1-CLK24, respectively + The following two properties are deprecated. rx-clock has been replaced + with rx-clock-name, and tx-clock has been replaced with tx-clock-name. + Drivers that currently use the deprecated properties should continue to + do so, in order to support older device trees, but they should be updated + to check for the new properties first. - rx-clock : represents the UCC receive clock source. 0x00 : clock source is disabled; 0x1~0x10 : clock source is BRG1~BRG16 respectively; @@ -1645,8 +1671,9 @@ platforms are moved over to use the flattened-device-tree model. MAC addresses passed by the firmware when no information other than indices is available to associate an address with a device. - phy-connection-type : a string naming the controller/PHY interface type, - i.e., "mii" (default), "rmii", "gmii", "rgmii", "rgmii-id", "tbi", - or "rtbi". + i.e., "mii" (default), "rmii", "gmii", "rgmii", "rgmii-id" (Internal + Delay), "rgmii-txid" (delay on TX only), "rgmii-rxid" (delay on RX only), + "tbi", or "rtbi". Example: ucc@2000 { @@ -1753,7 +1780,7 @@ platforms are moved over to use the flattened-device-tree model. 
vii) Multi-User RAM (MURAM) Required properties: - - device_type : should be "muram". + - compatible : should be "fsl,qe-muram", "fsl,cpm-muram". - mode : the could be "host" or "slave". - ranges : Should be defined as specified in 1) to describe the translation of MURAM addresses. @@ -1763,14 +1790,42 @@ platforms are moved over to use the flattened-device-tree model. Example: muram@10000 { - device_type = "muram"; + compatible = "fsl,qe-muram", "fsl,cpm-muram"; ranges = <0 00010000 0000c000>; data-only@0{ + compatible = "fsl,qe-muram-data", + "fsl,cpm-muram-data"; reg = <0 c000>; }; }; + viii) Uploaded QE firmware + + If a new firmware has been uploaded to the QE (usually by the + boot loader), then a 'firmware' child node should be added to the QE + node. This node provides information on the uploaded firmware that + device drivers may need. + + Required properties: + - id: The string name of the firmware. This is taken from the 'id' + member of the qe_firmware structure of the uploaded firmware. + Device drivers can search this string to determine if the + firmware they want is already present. + - extended-modes: The Extended Modes bitfield, taken from the + firmware binary. It is a 64-bit number represented + as an array of two 32-bit numbers. + - virtual-traps: The virtual traps, taken from the firmware binary. + It is an array of 8 32-bit numbers. + + Example: + + firmware { + id = "Soft-UART"; + extended-modes = <0 0>; + virtual-traps = <0 0 0 0 0 0 0 0>; + }; + j) CFI or JEDEC memory-mapped NOR flash Flash chips (Memory Technology Devices) are often used for solid state @@ -2074,8 +2129,7 @@ platforms are moved over to use the flattened-device-tree model. Example: localbus@f0010100 { - compatible = "fsl,mpc8272ads-localbus", - "fsl,mpc8272-localbus", + compatible = "fsl,mpc8272-localbus", "fsl,pq2-localbus"; #address-cells = <2>; #size-cells = <1>; @@ -2253,7 +2307,7 @@ platforms are moved over to use the flattened-device-tree model. available.
For Axon: 0x0000012a - l) Xilinx IP cores + o) Xilinx IP cores The Xilinx EDK toolchain ships with a set of IP cores (devices) for use in Xilinx Spartan and Virtex FPGAs. The devices cover the whole range @@ -2275,7 +2329,7 @@ platforms are moved over to use the flattened-device-tree model. properties of the device node. In general, device nodes for IP-cores will take the following form: - (name)@(base-address) { + (name): (generic-name)@(base-address) { compatible = "xlnx,(ip-core-name)-(HW_VER)" [, (list of compatible devices), ...]; reg = <(baseaddr) (size)>; @@ -2285,6 +2339,9 @@ platforms are moved over to use the flattened-device-tree model. xlnx,(parameter2) = <(int-value)>; }; + (generic-name): an open firmware-style name that describes the + generic class of device. Preferably, this is one word, such + as 'serial' or 'ethernet'. (ip-core-name): the name of the ip block (given after the BEGIN directive in system.mhs). Should be in lowercase and all underscores '_' converted to dashes '-'. @@ -2293,9 +2350,9 @@ platforms are moved over to use the flattened-device-tree model. dropped from the parameter name, the name is converted to lowercase and all underscore '_' characters are converted to dashes '-'. - (baseaddr): the C_BASEADDR parameter. + (baseaddr): the baseaddr parameter value (often named C_BASEADDR). (HW_VER): from the HW_VER parameter. - (size): equals C_HIGHADDR - C_BASEADDR + 1 + (size): the address range size (often C_HIGHADDR - C_BASEADDR + 1). Typically, the compatible list will include the exact IP core version followed by an older IP core version which implements the same @@ -2325,11 +2382,11 @@ platforms are moved over to use the flattened-device-tree model. 
becomes the following device tree node: - opb-uartlite-0@ec100000 { + opb_uartlite_0: serial@ec100000 { device_type = "serial"; compatible = "xlnx,opb-uartlite-1.00.b"; reg = <ec100000 10000>; - interrupt-parent = <&opb-intc>; + interrupt-parent = <&opb_intc_0>; interrupts = <1 0>; // got this from the opb_intc parameters current-speed = <d#115200>; // standard serial device prop clock-frequency = <d#50000000>; // standard serial device prop @@ -2338,16 +2395,19 @@ platforms are moved over to use the flattened-device-tree model. xlnx,use-parity = <0>; }; - Some IP cores actually implement 2 or more logical devices. In this case, - the device should still describe the whole IP core with a single node - and add a child node for each logical device. The ranges property can - be used to translate from parent IP-core to the registers of each device. - (Note: this makes the assumption that both logical devices have the same - bus binding. If this is not true, then separate nodes should be used for - each logical device). The 'cell-index' property can be used to enumerate - logical devices within an IP core. For example, the following is the - system.mhs entry for the dual ps2 controller found on the ml403 reference - design. + Some IP cores actually implement 2 or more logical devices. In + this case, the device should still describe the whole IP core with + a single node and add a child node for each logical device. The + ranges property can be used to translate from parent IP-core to the + registers of each device. In addition, the parent node should be + compatible with the bus type 'xlnx,compound', and should contain + #address-cells and #size-cells, as with any other bus. (Note: this + makes the assumption that both logical devices have the same bus + binding. If this is not true, then separate nodes should be used + for each logical device). The 'cell-index' property can be used to + enumerate logical devices within an IP core. 
For example, the + following is the system.mhs entry for the dual ps2 controller found + on the ml403 reference design. BEGIN opb_ps2_dual_ref PARAMETER INSTANCE = opb_ps2_dual_ref_0 @@ -2369,21 +2429,24 @@ platforms are moved over to use the flattened-device-tree model. It would result in the following device tree nodes: - opb_ps2_dual_ref_0@a9000000 { + opb_ps2_dual_ref_0: opb-ps2-dual-ref@a9000000 { + #address-cells = <1>; + #size-cells = <1>; + compatible = "xlnx,compound"; ranges = <0 a9000000 2000>; // If this device had extra parameters, then they would // go here. ps2@0 { compatible = "xlnx,opb-ps2-dual-ref-1.00.a"; reg = <0 40>; - interrupt-parent = <&opb-intc>; + interrupt-parent = <&opb_intc_0>; interrupts = <3 0>; cell-index = <0>; }; ps2@1000 { compatible = "xlnx,opb-ps2-dual-ref-1.00.a"; reg = <1000 40>; - interrupt-parent = <&opb-intc>; + interrupt-parent = <&opb_intc_0>; interrupts = <3 0>; cell-index = <0>; }; @@ -2446,17 +2509,18 @@ platforms are moved over to use the flattened-device-tree model. Gives this device tree (some properties removed for clarity): - plb-v34-0 { + plb@0 { #address-cells = <1>; #size-cells = <1>; + compatible = "xlnx,plb-v34-1.02.a"; device_type = "ibm,plb"; ranges; // 1:1 translation - plb-bram-if-cntrl-0@ffff0000 { + plb_bram_if_cntrl_0: bram@ffff0000 { reg = <ffff0000 10000>; } - opb-v20-0 { + opb@20000000 { #address-cells = <1>; #size-cells = <1>; ranges = <20000000 20000000 20000000 @@ -2464,11 +2528,11 @@ platforms are moved over to use the flattened-device-tree model. 80000000 80000000 40000000 c0000000 c0000000 20000000>; - opb-uart16550-0@a0000000 { + opb_uart16550_0: serial@a0000000 { reg = <a00000000 2000>; }; - opb-intc-0@d1000fc0 { + opb_intc_0: interrupt-controller@d1000fc0 { reg = <d1000fc0 20>; }; }; @@ -2513,6 +2577,204 @@ platforms are moved over to use the flattened-device-tree model. 
Required properties: - current-speed : Baud rate of uartlite + p) Freescale Synchronous Serial Interface + + The SSI is a serial device that communicates with audio codecs. It can + be programmed in AC97, I2S, left-justified, or right-justified modes. + + Required properties: + - compatible : compatible list, containing "fsl,ssi" + - cell-index : the SSI, <0> = SSI1, <1> = SSI2, and so on + - reg : offset and length of the register set for the device + - interrupts : <a b> where a is the interrupt number and b is a + field that represents an encoding of the sense and + level information for the interrupt. This should be + encoded based on the information in section 2) + depending on the type of interrupt controller you + have. + - interrupt-parent : the phandle for the interrupt controller that + services interrupts for this device. + - fsl,mode : the operating mode for the SSI interface + "i2s-slave" - I2S mode, SSI is clock slave + "i2s-master" - I2S mode, SSI is clock master + "lj-slave" - left-justified mode, SSI is clock slave + "lj-master" - left-justified mode, SSI is clock master + "rj-slave" - right-justified mode, SSI is clock slave + "rj-master" - right-justified mode, SSI is clock master + "ac97-slave" - AC97 mode, SSI is clock slave + "ac97-master" - AC97 mode, SSI is clock master + + Optional properties: + - codec-handle : phandle to a 'codec' node that defines an audio + codec connected to this SSI. This node is typically + a child of an I2C or other control node. + + Child 'codec' node required properties: + - compatible : compatible list, contains the name of the codec + + Child 'codec' node optional properties: + - clock-frequency : The frequency of the input clock, which typically + comes from an on-board dedicated oscillator. + + * Freescale 83xx DMA Controller + + Freescale PowerPC 83xx have on chip general purpose DMA controllers.
+ + Required properties: + + - compatible : compatible list, contains 2 entries, first is + "fsl,CHIP-dma", where CHIP is the processor + (mpc8349, mpc8360, etc.) and the second is + "fsl,elo-dma" + - reg : <registers mapping for DMA general status reg> + - ranges : Should be defined as specified in 1) to describe the + DMA controller channels. + - cell-index : controller index. 0 for controller @ 0x8100 + - interrupts : <interrupt mapping for DMA IRQ> + - interrupt-parent : optional, if needed for interrupt mapping + + + - DMA channel nodes: + - compatible : compatible list, contains 2 entries, first is + "fsl,CHIP-dma-channel", where CHIP is the processor + (mpc8349, mpc8350, etc.) and the second is + "fsl,elo-dma-channel" + - reg : <registers mapping for channel> + - cell-index : dma channel index starts at 0. + + Optional properties: + - interrupts : <interrupt mapping for DMA channel IRQ> + (on 83xx this is expected to be identical to + the interrupts property of the parent node) + - interrupt-parent : optional, if needed for interrupt mapping + + Example: + dma@82a8 { + #address-cells = <1>; + #size-cells = <1>; + compatible = "fsl,mpc8349-dma", "fsl,elo-dma"; + reg = <82a8 4>; + ranges = <0 8100 1a4>; + interrupt-parent = <&ipic>; + interrupts = <47 8>; + cell-index = <0>; + dma-channel@0 { + compatible = "fsl,mpc8349-dma-channel", "fsl,elo-dma-channel"; + cell-index = <0>; + reg = <0 80>; + }; + dma-channel@80 { + compatible = "fsl,mpc8349-dma-channel", "fsl,elo-dma-channel"; + cell-index = <1>; + reg = <80 80>; + }; + dma-channel@100 { + compatible = "fsl,mpc8349-dma-channel", "fsl,elo-dma-channel"; + cell-index = <2>; + reg = <100 80>; + }; + dma-channel@180 { + compatible = "fsl,mpc8349-dma-channel", "fsl,elo-dma-channel"; + cell-index = <3>; + reg = <180 80>; + }; + }; + + * Freescale 85xx/86xx DMA Controller + + Freescale PowerPC 85xx/86xx have on chip general purpose DMA controllers. 
+ + Required properties: + + - compatible : compatible list, contains 2 entries, first is + "fsl,CHIP-dma", where CHIP is the processor + (mpc8540, mpc8540, etc.) and the second is + "fsl,eloplus-dma" + - reg : <registers mapping for DMA general status reg> + - cell-index : controller index. 0 for controller @ 0x21000, + 1 for controller @ 0xc000 + - ranges : Should be defined as specified in 1) to describe the + DMA controller channels. + + - DMA channel nodes: + - compatible : compatible list, contains 2 entries, first is + "fsl,CHIP-dma-channel", where CHIP is the processor + (mpc8540, mpc8560, etc.) and the second is + "fsl,eloplus-dma-channel" + - cell-index : dma channel index starts at 0. + - reg : <registers mapping for channel> + - interrupts : <interrupt mapping for DMA channel IRQ> + - interrupt-parent : optional, if needed for interrupt mapping + + Example: + dma@21300 { + #address-cells = <1>; + #size-cells = <1>; + compatible = "fsl,mpc8540-dma", "fsl,eloplus-dma"; + reg = <21300 4>; + ranges = <0 21100 200>; + cell-index = <0>; + dma-channel@0 { + compatible = "fsl,mpc8540-dma-channel", "fsl,eloplus-dma-channel"; + reg = <0 80>; + cell-index = <0>; + interrupt-parent = <&mpic>; + interrupts = <14 2>; + }; + dma-channel@80 { + compatible = "fsl,mpc8540-dma-channel", "fsl,eloplus-dma-channel"; + reg = <80 80>; + cell-index = <1>; + interrupt-parent = <&mpic>; + interrupts = <15 2>; + }; + dma-channel@100 { + compatible = "fsl,mpc8540-dma-channel", "fsl,eloplus-dma-channel"; + reg = <100 80>; + cell-index = <2>; + interrupt-parent = <&mpic>; + interrupts = <16 2>; + }; + dma-channel@180 { + compatible = "fsl,mpc8540-dma-channel", "fsl,eloplus-dma-channel"; + reg = <180 80>; + cell-index = <3>; + interrupt-parent = <&mpic>; + interrupts = <17 2>; + }; + }; + + * Freescale 8xxx/3.0 Gb/s SATA nodes + + SATA nodes are defined to describe on-chip Serial ATA controllers. + Each SATA port should have its own node. 
+ + Required properties: + - compatible : compatible list, contains 2 entries, first is + "fsl,CHIP-sata", where CHIP is the processor + (mpc8315, mpc8379, etc.) and the second is + "fsl,pq-sata" + - interrupts : <interrupt mapping for SATA IRQ> + - cell-index : controller index. + 1 for controller @ 0x18000 + 2 for controller @ 0x19000 + 3 for controller @ 0x1a000 + 4 for controller @ 0x1b000 + + Optional properties: + - interrupt-parent : optional, if needed for interrupt mapping + - reg : <registers mapping> + + Example: + + sata@18000 { + compatible = "fsl,mpc8379-sata", "fsl,pq-sata"; + reg = <0x18000 0x1000>; + cell-index = <1>; + interrupts = <2c 8>; + interrupt-parent = < &ipic >; + }; + More devices will be defined as this spec matures. VII - Specifying interrupt information for devices diff --git a/Documentation/powerpc/qe_firmware.txt b/Documentation/powerpc/qe_firmware.txt new file mode 100644 index 000000000000..896266432d33 --- /dev/null +++ b/Documentation/powerpc/qe_firmware.txt @@ -0,0 +1,295 @@ + Freescale QUICC Engine Firmware Uploading + ----------------------------------------- + +(c) 2007 Timur Tabi <timur at freescale.com>, + Freescale Semiconductor + +Table of Contents +================= + + I - Software License for Firmware + + II - Microcode Availability + + III - Description and Terminology + + IV - Microcode Programming Details + + V - Firmware Structure Layout + + VI - Sample Code for Creating Firmware Files + +Revision Information +==================== + +November 30, 2007: Rev 1.0 - Initial version + +I - Software License for Firmware +================================= + +Each firmware file comes with its own software license. For information on +the particular license, please see the license text that is distributed with +the firmware. + +II - Microcode Availability +=========================== + +Firmware files are distributed through various channels. Some are available on +http://opensource.freescale.com. 
For other firmware files, please contact
+your Freescale representative or your operating system vendor.
+
+III - Description and Terminology
+================================
+
+In this document, the term 'microcode' refers to the sequence of 32-bit
+integers that compose the actual QE microcode.
+
+The term 'firmware' refers to a binary blob that contains the microcode as
+well as other data that
+
+  1) describes the microcode's purpose
+  2) describes how and where to upload the microcode
+  3) specifies the values of various registers
+  4) includes additional data for use by specific device drivers
+
+Firmware files are binary files that contain only a firmware.
+
+IV - Microcode Programming Details
+===================================
+
+The QE architecture allows for only one microcode present in I-RAM for each
+RISC processor. To replace any current microcode, a full QE reset (which
+disables the microcode) must be performed first.
+
+QE microcode is uploaded using the following procedure:
+
+1) The microcode is placed into I-RAM at a specific location, using the
+   IRAM.IADD and IRAM.IDATA registers.
+
+2) The CERCR.CIR bit is set to 0 or 1, depending on whether the firmware
+   needs split I-RAM. Split I-RAM is only meaningful for SOCs that have
+   QEs with multiple RISC processors, such as the 8360. Splitting the I-RAM
+   allows each processor to run a different microcode, effectively creating an
+   asymmetric multiprocessing (AMP) system.
+
+3) The TIBCR trap registers are loaded with the addresses of the trap handlers
+   in the microcode.
+
+4) The RSP.ECCR register is programmed with the value provided.
+
+5) If necessary, device drivers that need the virtual traps and extended mode
+   data will use them.
+
+Virtual Microcode Traps
+
+These virtual traps are conditional branches in the microcode. These are
+"soft" provisions introduced in the ROM code in order to enable higher
+flexibility and save h/w traps. If new features are activated or an issue is
+being fixed in the RAM package that utilizes them, they should be activated.
+This data structure signals the microcode which of these virtual traps is
+active.
+
+This structure contains 6 words that the application should copy to the
+specific destinations that have been defined. This table describes the
+structure.
+
+  ---------------------------------------------------------------
+  | Offset in |                  | Destination Offset | Size of |
+  | array     | Protocol         | within PRAM        | Operand |
+  ---------------------------------------------------------------
+  | 0         | Ethernet         | 0xF8               | 4 bytes |
+  |           | interworking     |                    |         |
+  ---------------------------------------------------------------
+  | 4         | ATM              | 0xF8               | 4 bytes |
+  |           | interworking     |                    |         |
+  ---------------------------------------------------------------
+  | 8         | PPP              | 0xF8               | 4 bytes |
+  |           | interworking     |                    |         |
+  ---------------------------------------------------------------
+  | 12        | Ethernet RX      | 0x22               | 1 byte  |
+  |           | Distributor Page |                    |         |
+  ---------------------------------------------------------------
+  | 16        | ATM Global       | 0x28               | 1 byte  |
+  |           | Params Table     |                    |         |
+  ---------------------------------------------------------------
+  | 20        | Insert Frame     | 0xF8               | 4 bytes |
+  ---------------------------------------------------------------
+
+
+Extended Modes
+
+This is a double word bit array (64 bits) that defines special functionality
+which has an impact on the software drivers. Each bit has its own impact
+and has special instructions for the s/w associated with it. This structure is
+described in this table:
+
+  -----------------------------------------------------------------------
+  | Bit #  | Name         | Description                                 |
+  -----------------------------------------------------------------------
+  | 0      | General      | Indicates that prior to each host command   |
+  |        | push command | given by the application, the software must |
+  |        |              | assert a special host command (push command)|
+  |        |              | CECDR = 0x00800000.                         |
+  |        |              | CECR = 0x01c1000f.                          |
+  -----------------------------------------------------------------------
+  | 1      | UCC ATM      | Indicates that after issuing ATM RX INIT    |
+  |        | RX INIT      | command, the host must issue another special|
+  |        | push command | command (push command) and immediately      |
+  |        |              | following that re-issue the ATM RX INIT     |
+  |        |              | command. (This makes the sequence of        |
+  |        |              | initializing the ATM receiver a sequence of |
+  |        |              | three host commands)                        |
+  |        |              | CECDR = 0x00800000.                         |
+  |        |              | CECR = 0x01c1000f.                          |
+  -----------------------------------------------------------------------
+  | 2      | Add/remove   | Indicates that following the specific host  |
+  |        | command      | command: "Add/Remove entry in Hash Lookup   |
+  |        | validation   | Table" used in Interworking setup, the user |
+  |        |              | must issue another command.                 |
+  |        |              | CECDR = 0xce000003.                         |
+  |        |              | CECR = 0x01c10f58.                          |
+  -----------------------------------------------------------------------
+  | 3      | General push | Indicates that the s/w has to initialize    |
+  |        | command      | some pointers in the Ethernet thread pages  |
+  |        |              | which are used when Header Compression is   |
+  |        |              | activated. The full details of these        |
+  |        |              | pointers is located in the software drivers.|
+  -----------------------------------------------------------------------
+  | 4      | General push | Indicates that after issuing Ethernet TX    |
+  |        | command      | INIT command, user must issue this command  |
+  |        |              | for each SNUM of Ethernet TX thread.        |
+  |        |              | CECDR = 0x00800003.                         |
+  |        |              | CECR = 0x7'b{0}, 8'b{Enet TX thread SNUM},  |
+  |        |              |        1'b{1}, 12'b{0}, 4'b{1}              |
+  -----------------------------------------------------------------------
+  | 5 - 31 | N/A          | Reserved, set to zero.                      |
+  -----------------------------------------------------------------------
+
+V - Firmware Structure Layout
+==============================
+
+QE microcode from Freescale is typically provided as a header file. This
+header file contains macros that define the microcode binary itself as well as
+some other data used in uploading that microcode. The format of these files
+does not lend itself to simple inclusion into other code. Hence,
+the need for a more portable format. This section defines that format.
+
+Instead of distributing a header file, the microcode and related data are
+embedded into a binary blob. This blob is passed to the qe_upload_firmware()
+function, which parses the blob and performs everything necessary to upload
+the microcode.
+
+All integers are big-endian. See the comments for function
+qe_upload_firmware() for up-to-date implementation information.
+
+This structure supports versioning, where the version of the structure is
+embedded into the structure itself. To ensure forward and backwards
+compatibility, all versions of the structure must use the same 'qe_header'
+structure at the beginning.
+
+'header' (type: struct qe_header):
+    The 'length' field is the size, in bytes, of the entire structure,
+    including all the microcode embedded in it, as well as the CRC (if
+    present).
+
+    The 'magic' field is an array of three bytes that contains the letters
+    'Q', 'E', and 'F'. This is an identifier that indicates that this
+    structure is a QE Firmware structure.
+
+    The 'version' field is a single byte that indicates the version of this
+    structure. If the layout of the structure should ever need to be
+    changed to add support for additional types of microcode, then the
+    version number should also be changed.
+
+The 'id' field is a null-terminated string (suitable for printing) that
+identifies the firmware.
+
+The 'count' field indicates the number of 'microcode' structures. There
+must be one and only one 'microcode' structure for each RISC processor.
+Therefore, this field also represents the number of RISC processors for this
+SOC.
+
+The 'soc' structure contains the SOC numbers and revisions used to match
+the microcode to the SOC itself. Normally, the microcode loader should
+check the data in this structure with the SOC number and revisions, and
+only upload the microcode if there's a match. However, this check is not
+made on all platforms.
+
+Although it is not recommended, you can specify '0' in the soc.model
+field to skip matching SOCs altogether.
+
+The 'model' field is a 16-bit number that matches the actual SOC. The
+'major' and 'minor' fields are the major and minor revision numbers,
+respectively, of the SOC.
+
+For example, to match the 8323, revision 1.0:
+    soc.model = 8323
+    soc.major = 1
+    soc.minor = 0
+
+'padding' is necessary for structure alignment. This field ensures that the
+'extended_modes' field is aligned on a 64-bit boundary.
+
+'extended_modes' is a bitfield that defines special functionality which has an
+impact on the device drivers. Each bit has its own impact and has special
+instructions for the driver associated with it. This field is stored in
+the QE library and available to any driver that calls qe_get_firmware_info().
+
+'vtraps' is an array of 8 words that contain virtual trap values for each
+virtual trap. As with 'extended_modes', this field is stored in the QE
+library and available to any driver that calls qe_get_firmware_info().
+
+'microcode' (type: struct qe_microcode):
+    For each RISC processor there is one 'microcode' structure. The first
+    'microcode' structure is for the first RISC, and so on.
+
+    The 'id' field is a null-terminated string suitable for printing that
+    identifies this particular microcode.
+
+    'traps' is an array of 16 words that contain hardware trap values
+    for each of the 16 traps. If traps[i] is 0, then this particular
+    trap is to be ignored (i.e. not written to TIBCR[i]). The entire value
+    is written as-is to the TIBCR[i] register, so be sure to set the EN
+    and T_IBP bits if necessary.
+
+    'eccr' is the value to program into the ECCR register.
+
+    'iram_offset' is the offset into IRAM to start writing the
+    microcode.
+
+    'count' is the number of 32-bit words in the microcode.
+
+    'code_offset' is the offset, in bytes, from the beginning of this
+    structure where the microcode itself can be found. The first
+    microcode binary should be located immediately after the 'microcode'
+    array.
+
+    'major', 'minor', and 'revision' are the major, minor, and revision
+    version numbers, respectively, of the microcode. If all values are 0,
+    then these fields are ignored.
+
+    'reserved' is necessary for structure alignment. Since 'microcode'
+    is an array, the 64-bit 'extended_modes' field needs to be aligned
+    on a 64-bit boundary, and this can only happen if the size of
+    'microcode' is a multiple of 8 bytes. To ensure that, we add
+    'reserved'.
+
+After the last microcode is a 32-bit CRC. It can be calculated using
+this algorithm:
+
+u32 crc32(const u8 *p, unsigned int len)
+{
+    unsigned int i;
+    u32 crc = 0;
+
+    while (len--) {
+        crc ^= *p++;
+        for (i = 0; i < 8; i++)
+            crc = (crc >> 1) ^ ((crc & 1) ? 0xedb88320 : 0);
+    }
+    return crc;
+}
+
+VI - Sample Code for Creating Firmware Files
+============================================
+
+A Python program that creates firmware binaries from the header files normally
+distributed by Freescale can be found on http://opensource.freescale.com.
diff --git a/Documentation/rtc.txt b/Documentation/rtc.txt
index e20b19c1b60d..8deffcd68cb8 100644
--- a/Documentation/rtc.txt
+++ b/Documentation/rtc.txt
@@ -182,8 +182,8 @@ driver returns ENOIOCTLCMD.
Some common examples: since the frequency is stored in the irq_freq member of the rtc_device structure. Your driver needs to initialize the irq_freq member during init. Make sure you check the requested frequency is in range of your - hardware in the irq_set_freq function. If you cannot actually change - the frequency, just return -ENOTTY. + hardware in the irq_set_freq function. If it isn't, return -EINVAL. If + you cannot actually change the frequency, do not define irq_set_freq. If all else fails, check out the rtc-test.c driver! @@ -268,8 +268,8 @@ int main(int argc, char **argv) /* This read will block */ retval = read(fd, &data, sizeof(unsigned long)); if (retval == -1) { - perror("read"); - exit(errno); + perror("read"); + exit(errno); } fprintf(stderr, " %d",i); fflush(stderr); @@ -326,11 +326,11 @@ test_READ: rtc_tm.tm_sec %= 60; rtc_tm.tm_min++; } - if (rtc_tm.tm_min == 60) { + if (rtc_tm.tm_min == 60) { rtc_tm.tm_min = 0; rtc_tm.tm_hour++; } - if (rtc_tm.tm_hour == 24) + if (rtc_tm.tm_hour == 24) rtc_tm.tm_hour = 0; retval = ioctl(fd, RTC_ALM_SET, &rtc_tm); @@ -407,8 +407,8 @@ test_PIE: "\n...Periodic IRQ rate is fixed\n"); goto done; } - perror("RTC_IRQP_SET ioctl"); - exit(errno); + perror("RTC_IRQP_SET ioctl"); + exit(errno); } fprintf(stderr, "\n%ldHz:\t", tmp); @@ -417,27 +417,27 @@ test_PIE: /* Enable periodic interrupts */ retval = ioctl(fd, RTC_PIE_ON, 0); if (retval == -1) { - perror("RTC_PIE_ON ioctl"); - exit(errno); + perror("RTC_PIE_ON ioctl"); + exit(errno); } for (i=1; i<21; i++) { - /* This blocks */ - retval = read(fd, &data, sizeof(unsigned long)); - if (retval == -1) { - perror("read"); - exit(errno); - } - fprintf(stderr, " %d",i); - fflush(stderr); - irqcount++; + /* This blocks */ + retval = read(fd, &data, sizeof(unsigned long)); + if (retval == -1) { + perror("read"); + exit(errno); + } + fprintf(stderr, " %d",i); + fflush(stderr); + irqcount++; } /* Disable periodic interrupts */ retval = ioctl(fd, RTC_PIE_OFF, 0); if (retval == 
-1) { - perror("RTC_PIE_OFF ioctl"); - exit(errno); + perror("RTC_PIE_OFF ioctl"); + exit(errno); } } diff --git a/Documentation/s390/CommonIO b/Documentation/s390/CommonIO index 86320aa3fb0b..8fbc0a852870 100644 --- a/Documentation/s390/CommonIO +++ b/Documentation/s390/CommonIO @@ -4,6 +4,11 @@ S/390 common I/O-Layer - command line parameters, procfs and debugfs entries Command line parameters ----------------------- +* ccw_timeout_log + + Enable logging of debug information in case of ccw device timeouts. + + * cio_msg = yes | no Determines whether information on found devices and sensed device diff --git a/Documentation/s390/cds.txt b/Documentation/s390/cds.txt index 3081927cc2d6..c4b7b2bd369a 100644 --- a/Documentation/s390/cds.txt +++ b/Documentation/s390/cds.txt @@ -133,7 +133,7 @@ During its startup the Linux/390 system checks for peripheral devices. Each of those devices is uniquely defined by a so called subchannel by the ESA/390 channel subsystem. While the subchannel numbers are system generated, each subchannel also takes a user defined attribute, the so called device number. -Both subchannel number and device number cannot exceed 65535. During driverfs +Both subchannel number and device number cannot exceed 65535. During sysfs initialisation, the information about control unit type and device types that imply specific I/O commands (channel command words - CCWs) in order to operate the device are gathered. 
Device drivers can retrieve this set of hardware diff --git a/Documentation/scsi/00-INDEX b/Documentation/scsi/00-INDEX index aa1f7e927834..c2e18e109858 100644 --- a/Documentation/scsi/00-INDEX +++ b/Documentation/scsi/00-INDEX @@ -64,8 +64,6 @@ lpfc.txt - LPFC driver release notes megaraid.txt - Common Management Module, shared code handling ioctls for LSI drivers -ncr53c7xx.txt - - info on driver for NCR53c7xx based adapters ncr53c8xx.txt - info on driver for NCR53c8xx based adapters osst.txt diff --git a/Documentation/scsi/ChangeLog.megaraid_sas b/Documentation/scsi/ChangeLog.megaraid_sas index 5eb927544990..91c81db0ba71 100644 --- a/Documentation/scsi/ChangeLog.megaraid_sas +++ b/Documentation/scsi/ChangeLog.megaraid_sas @@ -1,3 +1,162 @@ +1 Release Date : Thur. Nov. 07 16:30:43 PST 2007 - + (emaild-id:megaraidlinux@lsi.com) + Sumant Patro + Bo Yang + +2 Current Version : 00.00.03.16 +3 Older Version : 00.00.03.15 + +1. Increased MFI_POLL_TIMEOUT_SECS to 60 seconds from 10. FW may take + a max of 60 seconds to respond to the INIT cmd. + +1 Release Date : Fri. Sep. 07 16:30:43 PST 2007 - + (emaild-id:megaraidlinux@lsi.com) + Sumant Patro + Bo Yang + +2 Current Version : 00.00.03.15 +3 Older Version : 00.00.03.14 + +1. Added module parameter "poll_mode_io" to support for "polling" + (reduced interrupt operation). In this mode, IO completion + interrupts are delayed. At the end of initiating IOs, the + driver schedules for cmd completion if there are pending cmds + to be completed. A timer-based interrupt has also been added + to prevent IO completion processing from being delayed + indefinitely in the case that no new IOs are initiated. + +1 Release Date : Fri. Sep. 07 16:30:43 PST 2007 - + (emaild-id:megaraidlinux@lsi.com) + Sumant Patro + Bo Yang + +2 Current Version : 00.00.03.14 +3 Older Version : 00.00.03.13 + +1. Setting the max_sectors_per_req based on max SGL supported by the + FW. 
Prior versions calculated this value from controller info + (max_sectors_1, max_sectors_2). For certain controllers/FW, + this was resulting in a value greater than max SGL supported + by the FW. Issue was first reported by users running LUKS+XFS + with megaraid_sas. Thanks to RB for providing the logs and + duplication steps that helped to get to the root cause of the + issue. 2. Increased MFI_POLL_TIMEOUT_SECS to 60 seconds from + 10. FW may take a max of 60 seconds to respond to the INIT + cmd. + +1 Release Date : Fri. June. 15 16:30:43 PST 2007 - + (emaild-id:megaraidlinux@lsi.com) + Sumant Patro + Bo Yang + +2 Current Version : 00.00.03.13 +3 Older Version : 00.00.03.12 + +1. Added the megasas_reset_timer routine to intercept cmd timeout and throttle io. + +On Fri, 2007-03-16 at 16:44 -0600, James Bottomley wrote: +It looks like megaraid_sas at least needs this to throttle its commands +> as they begin to time out. The code keeps the existing transport +> template use of eh_timed_out (and allows the transport to override the +> host if they both have this callback). +> +> James + +1 Release Date : Sat May. 12 16:30:43 PST 2007 - + (emaild-id:megaraidlinux@lsi.com) + Sumant Patro + Bo Yang + +2 Current Version : 00.00.03.12 +3 Older Version : 00.00.03.11 + +1. When MegaSAS driver receives reset call from OS, driver waits in reset +routine for max 3 minutes for all pending command completion. Now driver will +call completion routine every 5 seconds from the reset routine instead of +waiting for depending on cmd completion from isr path. + +1 Release Date : Mon Apr. 30 10:25:52 PST 2007 - + (emaild-id:megaraidlinux@lsi.com) + Sumant Patro + Bo Yang + +2 Current Version : 00.00.03.11 +3 Older Version : 00.00.03.09 + + 1. Memory Manager for IOCTL removed for 2.6 kernels. + pci_alloc_consistent replaced by dma_alloc_coherent. 
With this + change there is no need of memory manager in the driver code + + On Wed, 2007-02-07 at 13:30 -0800, Andrew Morton wrote: + > I suspect all this horror is due to stupidity in the DMA API. + > + > pci_alloc_consistent() just goes and assumes GFP_ATOMIC, whereas + > the caller (megasas_mgmt_fw_ioctl) would have been perfectly happy + > to use GFP_KERNEL. + > + > I bet this fixes it + + It does, but the DMA API was expanded to cope with this exact case, so + use dma_alloc_coherent() directly in the megaraid code instead. The dev + is just &pci_dev->dev. + + James <James.Bottomley@SteelEye.com> + + 3. SYNCHRONIZE_CACHE is not supported by FW and thus blocked by driver. + 4. Hibernation support added + 5. Performing diskdump while running IO in RHEL 4 was failing. Fixed. + +1 Release Date : Fri Feb. 09 14:36:28 PST 2007 - + (emaild-id:megaraidlinux@lsi.com) + Sumant Patro + Bo Yang + +2 Current Version : 00.00.03.09 +3 Older Version : 00.00.03.08 + +i. Under heavy IO mid-layer prints "DRIVER_TIMEOUT" errors + + The driver now waits for 10 seconds to elapse instead of 5 (as in + previous release) to resume IO. + +1 Release Date : Mon Feb. 05 11:35:24 PST 2007 - + (emaild-id:megaraidlinux@lsi.com) + Sumant Patro + Bo Yang +2 Current Version : 00.00.03.08 +3 Older Version : 00.00.03.07 + +i. Under heavy IO mid-layer prints "DRIVER_TIMEOUT" errors + + Fix: The driver is now throttling IO. + Checks added in megasas_queue_command to know if FW is able to + process commands within timeout period. If number of retries + is 2 or greater,the driver stops sending cmd to FW temporarily. IO is + resumed if pending cmd count reduces to 16 or 5 seconds has elapsed + from the time cmds were last sent to FW. + +ii. FW enables WCE bit in Mode Sense cmd for drives that are configured + as WriteBack. The OS may send "SYNCHRONIZE_CACHE" cmd when Logical + Disks are exposed with WCE=1. User is advised to enable Write Back + mode only when the controller has battery backup. 
At this time
+    Synchronize cache is not supported by the FW. Driver will short-cycle
+    the cmd and return success without sending down to FW.
+
+1 Release Date : Sun Jan. 14 11:21:32 PDT 2007 -
+    Sumant Patro <Sumant.Patro@lsil.com>/Bo Yang
+2 Current Version : 00.00.03.07
+3 Older Version : 00.00.03.06
+
+i. bios_param entry added in scsi_host_template that returns disk geometry
+    information.
+
+1 Release Date : Fri Oct 20 11:21:32 PDT 2006 - Sumant Patro <Sumant.Patro@lsil.com>/Bo Yang
+2 Current Version : 00.00.03.06
+3 Older Version : 00.00.03.05
+
+1. Added new memory management module to support the IOCTL memory allocation. For IOCTL we try to allocate from the memory pool created during driver initialization. If mem pool is empty then we allocate at run time.
+2. Added check in megasas_queue_command and dpc/isr routine to see if we have already declared adapter dead
+    (hw_crit_error=1). If hw_crit_error==1, now we do not accept any processing of pending cmds/accept any cmd from OS
 1 Release Date : Mon Oct 02 11:21:32 PDT 2006 - Sumant Patro <Sumant.Patro@lsil.com>
 2 Current Version : 00.00.03.05
diff --git a/Documentation/scsi/aacraid.txt b/Documentation/scsi/aacraid.txt
index a8257840695a..d16011a8618e 100644
--- a/Documentation/scsi/aacraid.txt
+++ b/Documentation/scsi/aacraid.txt
@@ -56,6 +56,10 @@ Supported Cards/Chipsets
     9005:0285:9005:02d1    Adaptec 5405 (Voodoo40)
     9005:0285:15d9:02d2    SMC AOC-USAS-S8i-LP
     9005:0285:15d9:02d3    SMC AOC-USAS-S8iR-LP
+    9005:0285:9005:02d4    Adaptec 2045 (Voodoo04 Lite)
+    9005:0285:9005:02d5    Adaptec 2405 (Voodoo40 Lite)
+    9005:0285:9005:02d6    Adaptec 2445 (Voodoo44 Lite)
+    9005:0285:9005:02d7    Adaptec 2805 (Voodoo80 Lite)
     1011:0046:9005:0364    Adaptec 5400S (Mustang)
     9005:0287:9005:0800    Adaptec Themisto (Jupiter)
     9005:0200:9005:0200    Adaptec Themisto (Jupiter)
diff --git a/Documentation/scsi/hptiop.txt b/Documentation/scsi/hptiop.txt
index d28a31247d4c..a6eb4add1be6 100644
--- a/Documentation/scsi/hptiop.txt
+++
b/Documentation/scsi/hptiop.txt @@ -1,9 +1,9 @@ -HIGHPOINT ROCKETRAID 3xxx RAID DRIVER (hptiop) +HIGHPOINT ROCKETRAID 3xxx/4xxx ADAPTER DRIVER (hptiop) Controller Register Map ------------------------- -The controller IOP is accessed via PCI BAR0. +For Intel IOP based adapters, the controller IOP is accessed via PCI BAR0: BAR0 offset Register 0x10 Inbound Message Register 0 @@ -18,6 +18,24 @@ The controller IOP is accessed via PCI BAR0. 0x40 Inbound Queue Port 0x44 Outbound Queue Port +For Marvell IOP based adapters, the IOP is accessed via PCI BAR0 and BAR1: + + BAR0 offset Register + 0x20400 Inbound Doorbell Register + 0x20404 Inbound Interrupt Mask Register + 0x20408 Outbound Doorbell Register + 0x2040C Outbound Interrupt Mask Register + + BAR1 offset Register + 0x0 Inbound Queue Head Pointer + 0x4 Inbound Queue Tail Pointer + 0x8 Outbound Queue Head Pointer + 0xC Outbound Queue Tail Pointer + 0x10 Inbound Message Register + 0x14 Outbound Message Register + 0x40-0x1040 Inbound Queue + 0x1040-0x2040 Outbound Queue + I/O Request Workflow ---------------------- @@ -73,15 +91,9 @@ The driver exposes following sysfs attributes: driver-version R driver version string firmware-version R firmware version string -The driver registers char device "hptiop" to communicate with HighPoint RAID -management software. Its ioctl routine acts as a general binary interface -between the IOP firmware and HighPoint RAID management software. New management -functions can be implemented in application/firmware without modification -in driver code. - ----------------------------------------------------------------------------- -Copyright (C) 2006 HighPoint Technologies, Inc. All Rights Reserved. +Copyright (C) 2006-2007 HighPoint Technologies, Inc. All Rights Reserved. 
This file is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of diff --git a/Documentation/scsi/ncr53c7xx.txt b/Documentation/scsi/ncr53c7xx.txt deleted file mode 100644 index 91e9552d63e5..000000000000 --- a/Documentation/scsi/ncr53c7xx.txt +++ /dev/null @@ -1,40 +0,0 @@ -README for WarpEngine/A4000T/A4091 SCSI kernels. - -Use the following options to disable options in the SCSI driver. - -Using amiboot for example..... - -To disable Synchronous Negotiation.... - - amiboot -k kernel 53c7xx=nosync:0 - -To disable Disconnection.... - - amiboot -k kernel 53c7xx=nodisconnect:0 - -To disable certain SCSI devices... - - amiboot -k kernel 53c7xx=validids:0x3F - - this allows only device ID's 0,1,2,3,4 and 5 for linux to handle. - (this is a bitmasked field - i.e. each bit represents a SCSI ID) - -These commands work on a per controller basis and use the option 'next' to -move to the next controller in the system. - -e.g. - amiboot -k kernel 53c7xx=nodisconnect:0,next,nosync:0 - - this uses No Disconnection on the first controller and Asynchronous - SCSI on the second controller. - -Known Issues: - -Two devices are known not to function with the default settings of using -synchronous SCSI. These are the Archive Viper 150 Tape Drive and the -SyQuest SQ555 removeable hard drive. When using these devices on a controller -use the 'nosync:0' option. - -Please try these options and post any problems/successes to me. - -Alan Hourihane <alanh@fairlite.demon.co.uk> diff --git a/Documentation/smp.txt b/Documentation/smp.txt deleted file mode 100644 index 82fc50b6305d..000000000000 --- a/Documentation/smp.txt +++ /dev/null @@ -1,22 +0,0 @@ -To set up SMP - -Configure the kernel and answer Y to CONFIG_SMP. - -If you are using LILO, it is handy to have both SMP and non-SMP -kernel images on hand. Edit /etc/lilo.conf to create an entry -for another kernel image called "linux-smp" or something. 
- -The next time you compile the kernel, when running a SMP kernel, -edit linux/Makefile and change "MAKE=make" to "MAKE=make -jN" -(where N = number of CPU + 1, or if you have tons of memory/swap - you can just use "-j" without a number). Feel free to experiment -with this one. - -Of course you should time how long each build takes :-) -Example: - make config - time -v sh -c 'make clean install modules modules_install' - -If you are using some Compaq MP compliant machines you will need to set -the operating system in the BIOS settings to "Unixware" - don't ask me -why Compaqs don't work otherwise. diff --git a/Documentation/sound/alsa/ALSA-Configuration.txt b/Documentation/sound/alsa/ALSA-Configuration.txt index 4b48c2e82c3c..e985cf5e0410 100644 --- a/Documentation/sound/alsa/ALSA-Configuration.txt +++ b/Documentation/sound/alsa/ALSA-Configuration.txt @@ -57,7 +57,9 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. - Default: 1 - For auto-loading more than one card, specify this option together with snd-card-X aliases. - + slots - Reserve the slot index for the given driver. + This option takes multiple strings. + See "Module Autoloading Support" section for details. Module snd-pcm-oss ------------------ @@ -148,13 +150,6 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. Module for sound cards based on Analog Devices AD1816A/AD1815 ISA chips. - port - port # for AD1816A chip (PnP setup) - mpu_port - port # for MPU-401 UART (PnP setup) - fm_port - port # for OPL3 (PnP setup) - irq - IRQ # for AD1816A chip (PnP setup) - mpu_irq - IRQ # for MPU-401 UART (PnP setup) - dma1 - first DMA # for AD1816A chip (PnP setup) - dma2 - second DMA # for AD1816A chip (PnP setup) clockfreq - Clock frequency for AD1816A chip (default = 0, 33000Hz) This module supports multiple cards, autoprobe and PnP. @@ -201,14 +196,6 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. 
Module for sound cards based on Avance Logic ALS100/ALS120 ISA chips. - port - port # for ALS100 (SB16) chip (PnP setup) - irq - IRQ # for ALS100 (SB16) chip (PnP setup) - dma8 - 8-bit DMA # for ALS100 (SB16) chip (PnP setup) - dma16 - 16-bit DMA # for ALS100 (SB16) chip (PnP setup) - mpu_port - port # for MPU-401 UART (PnP setup) - mpu_irq - IRQ # for MPU-401 (PnP setup) - fm_port - port # for OPL3 FM (PnP setup) - This module supports multiple cards, autoprobe and PnP. The power-management is supported. @@ -302,15 +289,6 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. Module for sound cards based on Aztech System AZT2320 ISA chip (PnP only). - port - port # for AZT2320 chip (PnP setup) - wss_port - port # for WSS (PnP setup) - mpu_port - port # for MPU-401 UART (PnP setup) - fm_port - FM port # for AZT2320 chip (PnP setup) - irq - IRQ # for AZT2320 (WSS) chip (PnP setup) - mpu_irq - IRQ # for MPU-401 UART (PnP setup) - dma1 - 1st DMA # for AZT2320 (WSS) chip (PnP setup) - dma2 - 2nd DMA # for AZT2320 (WSS) chip (PnP setup) - This module supports multiple cards, PnP and autoprobe. The power-management is supported. @@ -350,6 +328,10 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. Module for sound cards based on C-Media CMI8330 ISA chips. + isapnp - ISA PnP detection - 0 = disable, 1 = enable (default) + + with isapnp=0, the following options are available: + wssport - port # for CMI8330 chip (WSS) wssirq - IRQ # for CMI8330 chip (WSS) wssdma - first DMA # for CMI8330 chip (WSS) @@ -404,6 +386,10 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. Module for sound cards based on CS4232/CS4232A ISA chips. 
+ isapnp - ISA PnP detection - 0 = disable, 1 = enable (default) + + with isapnp=0, the following options are available: + port - port # for CS4232 chip (PnP setup - 0x534) cport - control port # for CS4232 chip (PnP setup - 0x120,0x210,0xf00) mpu_port - port # for MPU-401 UART (PnP setup - 0x300), -1 = disable @@ -412,10 +398,10 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. mpu_irq - IRQ # for MPU-401 UART (9,11,12,15) dma1 - first DMA # for CS4232 chip (0,1,3) dma2 - second DMA # for Yamaha CS4232 chip (0,1,3), -1 = disable - isapnp - ISA PnP detection - 0 = disable, 1 = enable (default) This module supports multiple cards. This module does not support autoprobe - thus main port must be specified!!! Other ports are optional. + (if ISA PnP is not used) thus main port must be specified!!! Other ports are + optional. The power-management is supported. @@ -425,6 +411,10 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. Module for sound cards based on CS4235/CS4236/CS4236B/CS4237B/ CS4238B/CS4239 ISA chips. + isapnp - ISA PnP detection - 0 = disable, 1 = enable (default) + + with isapnp=0, the following options are available: + port - port # for CS4236 chip (PnP setup - 0x534) cport - control port # for CS4236 chip (PnP setup - 0x120,0x210,0xf00) mpu_port - port # for MPU-401 UART (PnP setup - 0x300), -1 = disable @@ -433,7 +423,6 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. mpu_irq - IRQ # for MPU-401 UART (9,11,12,15) dma1 - first DMA # for CS4236 chip (0,1,3) dma2 - second DMA # for CS4236 chip (0,1,3), -1 = disable - isapnp - ISA PnP detection - 0 = disable, 1 = enable (default) This module supports multiple cards. This module does not support autoprobe (if ISA PnP is not used) thus main port and control port must be @@ -503,13 +492,6 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. 
Module for Diamond Technologies DT-019X / Avance Logic ALS-007 (PnP only) - port - Port # (PnP setup) - mpu_port - Port # for MPU-401 (PnP setup) - fm_port - Port # for FM OPL-3 (PnP setup) - irq - IRQ # (PnP setup) - mpu_irq - IRQ # for MPU-401 (PnP setup) - dma8 - DMA # (PnP setup) - This module supports multiple cards. This module is enabled only with ISA PnP support. @@ -607,10 +589,6 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. Module for sound cards based on ESS ES968 chip (PnP only). - port - port # for ES968 (SB8) chip (PnP setup) - irq - IRQ # for ES968 (SB8) chip (PnP setup) - dma1 - DMA # for ES968 (SB8) chip (PnP setup) - This module supports multiple cards, PnP and autoprobe. The power-management is supported. @@ -633,13 +611,16 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. Module for ESS AudioDrive ES-18xx sound cards. + isapnp - ISA PnP detection - 0 = disable, 1 = enable (default) + + with isapnp=0, the following options are available: + port - port # for ES-18xx chip (0x220,0x240,0x260) mpu_port - port # for MPU-401 port (0x300,0x310,0x320,0x330), -1 = disable (default) fm_port - port # for FM (optional, not used) irq - IRQ # for ES-18xx chip (5,7,9,10) dma1 - first DMA # for ES-18xx chip (0,1,3) dma2 - first DMA # for ES-18xx chip (0,1,3) - isapnp - ISA PnP detection - 0 = disable, 1 = enable (default) This module supports multiple cards, ISA PnP and autoprobe (without MPU-401 port if native ISA PnP routines are not used). @@ -763,9 +744,12 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. 
VIA VT8251/VT8237A, SIS966, ULI M5461 + [Multiple options for each card instance] model - force the model name position_fix - Fix DMA pointer (0 = auto, 1 = none, 2 = POSBUF, 3 = FIFO size) probe_mask - Bitmask to probe codecs (default = -1, meaning all slots) + + [Single (global) options] single_cmd - Use single immediate commands to communicate with codecs (for debugging only) enable_msi - Enable Message Signaled Interrupt (MSI) (default = off) @@ -774,8 +758,8 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. power_save_controller - Reset HD-audio controller in power-saving mode (default = on) - This module supports one card and autoprobe. - + This module supports multiple cards and autoprobe. + Each codec may have a model table for different configurations. If your machine isn't listed there, the default (usually minimal) configuration is set up. You can pass "model=<name>" option to @@ -817,17 +801,23 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. will Will laptops (PB V7900) replacer Replacer 672V basic fixed pin assignment (old default model) + test for testing/debugging purpose, almost all controls can + adjusted. Appearing only when compiled with + $CONFIG_SND_DEBUG=y auto auto-config reading BIOS (default) ALC262 fujitsu Fujitsu Laptop hp-bpc HP xw4400/6400/8400/9400 laptops hp-bpc-d7000 HP BPC D7000 + hp-tc-t5735 HP Thin Client T5735 + hp-rp5700 HP RP5700 benq Benq ED8 benq-t31 Benq T31 hippo Hippo (ATI) with jack detection, Sony UX-90s hippo_1 Hippo (Benq) with jack detection sony-assamd Sony ASSAMD + ultra Samsung Q1 Ultra Vista model basic fixed pin assignment w/o SPDIF auto auto-config reading BIOS (default) @@ -835,6 +825,10 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. 3stack 3-stack model toshiba Toshiba A205 acer Acer laptops + dell Dell OEM laptops (Vostro 1200) + test for testing/debugging purpose, almost all controls can + adjusted. 
Appearing only when compiled with + $CONFIG_SND_DEBUG=y auto auto-config reading BIOS (default) ALC662 @@ -843,6 +837,8 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. 3stack-6ch-dig 3-stack (6-channel) with SPDIF 6stack-dig 6-stack with SPDIF lenovo-101e Lenovo laptop + eeepc-p701 ASUS Eeepc P701 + eeepc-ep20 ASUS Eeepc EP20 auto auto-config reading BIOS (default) ALC882/885 @@ -877,6 +873,8 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. haier-w66 Haier W66 6stack-hp HP machines with 6stack (Nettle boards) 3stack-hp HP machines with 3stack (Lucknow, Samba boards) + 6stack-dell Dell machines with 6stack (Inspiron 530) + mitac Mitac 8252D auto auto-config reading BIOS (default) ALC861/660 @@ -928,6 +926,7 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. AD1984 basic default configuration thinkpad Lenovo Thinkpad T61/X61 + dell Dell T3400 AD1986A 6stack 6-jack, separate surrounds (default) @@ -947,7 +946,10 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. auto auto-config reading BIOS (default) Conexant 5045 - laptop Laptop config + laptop-hpsense Laptop with HP sense (old model laptop) + laptop-micsense Laptop with Mic sense (old model fujitsu) + laptop-hpmicsense Laptop with HP and Mic senses + benq Benq R55E test for testing/debugging purpose, almost all controls can be adjusted. Appearing only when compiled with $CONFIG_SND_DEBUG=y @@ -960,6 +962,10 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. can be adjusted. Appearing only when compiled with $CONFIG_SND_DEBUG=y + Conexant 5051 + laptop Basic Laptop config (default) + hp HP Spartan laptop + STAC9200 ref Reference board dell-d21 Dell (unknown) @@ -1091,6 +1097,15 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. See hdspm.txt for details. + Module snd-hifier + ----------------- + + Module for the MediaTek/TempoTec HiFier Fantasia sound card. 
+ + This module supports autoprobe and multiple cards. + + Power management is _not_ supported. + Module snd-ice1712 ------------------ @@ -1156,11 +1171,14 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. * Chaintech 9CJS * Chaintech AV-710 * Shuttle SN25P + * Onkyo SE-90PCI + * Onkyo SE-200PCI model - Use the given board model, one of the following: revo51, revo71, amp2000, prodigy71, prodigy71lt, prodigy192, aureon51, aureon71, universe, ap192, - k8x800, phase22, phase28, ms300, av710 + k8x800, phase22, phase28, ms300, av710, se200pci, + se90pci This module supports multiple cards and autoprobe. @@ -1257,15 +1275,19 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. Module for Gravis UltraSound PnP, Dynasonic 3-D/Pro, STB Sound Rage 32 and other sound cards based on AMD InterWave (tm) chip. - port - port # for InterWave chip (0x210,0x220,0x230,0x240,0x250,0x260) - irq - IRQ # for InterWave chip (3,5,9,11,12,15) - dma1 - DMA # for InterWave chip (0,1,3,5,6,7) - dma2 - DMA # for InterWave chip (0,1,3,5,6,7,-1=disable) joystick_dac - 0 to 31, (0.59V-4.52V or 0.389V-2.98V) midi - 1 = MIDI UART enable, 0 = MIDI UART disable (default) pcm_voices - reserved PCM voices for the synthesizer (default 2) effect - 1 = InterWave effects enable (default 0); requires 8 voices + isapnp - ISA PnP detection - 0 = disable, 1 = enable (default) + + with isapnp=0, the following options are available: + + port - port # for InterWave chip (0x210,0x220,0x230,0x240,0x250,0x260) + irq - IRQ # for InterWave chip (3,5,9,11,12,15) + dma1 - DMA # for InterWave chip (0,1,3,5,6,7) + dma2 - DMA # for InterWave chip (0,1,3,5,6,7,-1=disable) This module supports multiple cards, autoprobe and ISA PnP. @@ -1276,16 +1298,20 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. and other sound cards based on AMD InterWave (tm) chip with TEA6330T circuit for extended control of bass, treble and master volume. 
- port - port # for InterWave chip (0x210,0x220,0x230,0x240,0x250,0x260) - port_tc - tone control (i2c bus) port # for TEA6330T chip (0x350,0x360,0x370,0x380) - irq - IRQ # for InterWave chip (3,5,9,11,12,15) - dma1 - DMA # for InterWave chip (0,1,3,5,6,7) - dma2 - DMA # for InterWave chip (0,1,3,5,6,7,-1=disable) joystick_dac - 0 to 31, (0.59V-4.52V or 0.389V-2.98V) midi - 1 = MIDI UART enable, 0 = MIDI UART disable (default) pcm_voices - reserved PCM voices for the synthesizer (default 2) effect - 1 = InterWave effects enable (default 0); requires 8 voices + isapnp - ISA PnP detection - 0 = disable, 1 = enable (default) + + with isapnp=0, the following options are available: + + port - port # for InterWave chip (0x210,0x220,0x230,0x240,0x250,0x260) + port_tc - tone control (i2c bus) port # for TEA6330T chip (0x350,0x360,0x370,0x380) + irq - IRQ # for InterWave chip (3,5,9,11,12,15) + dma1 - DMA # for InterWave chip (0,1,3,5,6,7) + dma2 - DMA # for InterWave chip (0,1,3,5,6,7,-1=disable) This module supports multiple cards, autoprobe and ISA PnP. @@ -1473,6 +1499,10 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. Module for Yamaha OPL3-SA2/SA3 sound cards. + isapnp - ISA PnP detection - 0 = disable, 1 = enable (default) + + with isapnp=0, the following options are available: + port - control port # for OPL3-SA chip (0x370) sb_port - SB port # for OPL3-SA chip (0x220,0x240) wss_port - WSS port # for OPL3-SA chip (0x530,0xe80,0xf40,0x604) @@ -1481,7 +1511,6 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. irq - IRQ # for OPL3-SA chip (5,7,9,10) dma1 - first DMA # for Yamaha OPL3-SA chip (0,1,3) dma2 - second DMA # for Yamaha OPL3-SA chip (0,1,3), -1 = disable - isapnp - ISA PnP detection - 0 = disable, 1 = enable (default) This module supports multiple cards and ISA PnP. It does not support autoprobe (if ISA PnP is not used) thus all ports must be specified!!! 
@@ -1494,6 +1523,10 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. Module for sound cards based on OPTi 82c92x and Analog Devices AD1848 chips. Module works with OAK Mozart cards as well. + isapnp - ISA PnP detection - 0 = disable, 1 = enable (default) + + with isapnp=0, the following options are available: + port - port # for WSS chip (0x530,0xe80,0xf40,0x604) mpu_port - port # for MPU-401 UART (0x300,0x310,0x320,0x330) fm_port - port # for OPL3 device (0x388) @@ -1508,6 +1541,10 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. Module for sound cards based on OPTi 82c92x and Crystal CS4231 chips. + isapnp - ISA PnP detection - 0 = disable, 1 = enable (default) + + with isapnp=0, the following options are available: + port - port # for WSS chip (0x530,0xe80,0xf40,0x604) mpu_port - port # for MPU-401 UART (0x300,0x310,0x320,0x330) fm_port - port # for OPL3 device (0x388) @@ -1523,6 +1560,10 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. Module for sound cards based on OPTi 82c93x chips. + isapnp - ISA PnP detection - 0 = disable, 1 = enable (default) + + with isapnp=0, the following options are available: + port - port # for WSS chip (0x530,0xe80,0xf40,0x604) mpu_port - port # for MPU-401 UART (0x300,0x310,0x320,0x330) fm_port - port # for OPL3 device (0x388) @@ -1533,6 +1574,22 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. This module supports only one card, autoprobe and PnP. + Module snd-oxygen + ----------------- + + Module for sound cards based on the C-Media CMI8788 chip: + * Asound A-8788 + * AuzenTech X-Meridian + * Bgears b-Enspirer + * Club3D Theatron DTS + * HT-Omega Claro + * Razer Barracuda AC-1 + * Sondigo Inferno + + This module supports autoprobe and multiple cards. + + Power management is _not_ supported. + Module snd-pcxhr ---------------- @@ -1647,6 +1704,12 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. 
SoundBlaster AWE 32 (PnP), SoundBlaster AWE 64 PnP + mic_agc - Mic Auto-Gain-Control - 0 = disable, 1 = enable (default) + csp - ASP/CSP chip support - 0 = disable (default), 1 = enable + isapnp - ISA PnP detection - 0 = disable, 1 = enable (default) + + with isapnp=0, the following options are available: + port - port # for SB DSP 4.x chip (0x220,0x240,0x260) mpu_port - port # for MPU-401 UART (0x300,0x330), -1 = disable awe_port - base port # for EMU8000 synthesizer (0x620,0x640,0x660) @@ -1654,9 +1717,6 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. irq - IRQ # for SB DSP 4.x chip (5,7,9,10) dma8 - 8-bit DMA # for SB DSP 4.x chip (0,1,3) dma16 - 16-bit DMA # for SB DSP 4.x chip (5,6,7) - mic_agc - Mic Auto-Gain-Control - 0 = disable, 1 = enable (default) - csp - ASP/CSP chip support - 0 = disable (default), 1 = enable - isapnp - ISA PnP detection - 0 = disable, 1 = enable (default) This module supports multiple cards, autoprobe and ISA PnP. @@ -1739,18 +1799,21 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. Module for Turtle Beach Maui, Tropez and Tropez+ sound cards. + use_cs4232_midi - Use CS4232 MPU-401 interface + (inaccessibly located inside your computer) + isapnp - ISA PnP detection - 0 = disable, 1 = enable (default) + + with isapnp=0, the following options are available: + cs4232_pcm_port - Port # for CS4232 PCM interface. cs4232_pcm_irq - IRQ # for CS4232 PCM interface (5,7,9,11,12,15). cs4232_mpu_port - Port # for CS4232 MPU-401 interface. cs4232_mpu_irq - IRQ # for CS4232 MPU-401 interface (9,11,12,15). - use_cs4232_midi - Use CS4232 MPU-401 interface - (inaccessibly located inside your computer) ics2115_port - Port # for ICS2115 ics2115_irq - IRQ # for ICS2115 fm_port - FM OPL-3 Port # dma1 - DMA1 # for CS4232 PCM interface. dma2 - DMA2 # for CS4232 PCM interface. 
- isapnp - ISA PnP detection - 0 = disable, 1 = enable (default) The below are options for wavefront_synth features: wf_raw - Assume that we need to boot the OS (default:no) @@ -1965,6 +2028,16 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. This module supports multiple cards. + Module snd-virtuoso + ------------------- + + Module for sound cards based on the Asus AV200 chip, i.e., + Xonar D2 and Xonar D2X. + + This module supports autoprobe and multiple cards. + + Power management is _not_ supported. + Module snd-vx222 ---------------- @@ -2135,6 +2208,23 @@ alias sound-slot-1 snd-ens1371 In this example, the interwave card is always loaded as the first card (index 0) and ens1371 as the second (index 1). +Alternative (and new) way to fixate the slot assignment is to use +"slots" option of snd module. In the case above, specify like the +following: + +options snd slots=snd-interwave,snd-ens1371 + +Then, the first slot (#0) is reserved for snd-interwave driver, and +the second (#1) for snd-ens1371. You can omit index option in each +driver if slots option is used (although you can still have them at +the same time as long as they don't conflict). + +The slots option is especially useful for avoiding the possible +hot-plugging and the resultant slot conflict. For example, in the +case above again, the first two slots are already reserved. If any +other driver (e.g. snd-usb-audio) is loaded before snd-interwave or +snd-ens1371, it will be assigned to the third or later slot. 
+ ALSA PCM devices to OSS devices mapping ======================================= diff --git a/Documentation/sound/alsa/DocBook/writing-an-alsa-driver.tmpl b/Documentation/sound/alsa/DocBook/writing-an-alsa-driver.tmpl index 2c3fc3cb3b6b..b03df4d4795c 100644 --- a/Documentation/sound/alsa/DocBook/writing-an-alsa-driver.tmpl +++ b/Documentation/sound/alsa/DocBook/writing-an-alsa-driver.tmpl @@ -18,7 +18,7 @@ </affiliation> </author> - <date>September 10, 2007</date> + <date>Oct 15, 2007</date> <edition>0.3.7</edition> <abstract> @@ -67,7 +67,7 @@ This document describes how to write an <ulink url="http://www.alsa-project.org/"><citetitle> ALSA (Advanced Linux Sound Architecture)</citetitle></ulink> - driver. The document focuses mainly on the PCI soundcard. + driver. The document focuses mainly on PCI soundcards. In the case of other device types, the API might be different, too. However, at least the ALSA kernel API is consistent, and therefore it would be still a bit help for @@ -75,23 +75,23 @@ </para> <para> - The target of this document is ones who already have enough - skill of C language and have the basic knowledge of linux - kernel programming. This document doesn't explain the general - topics of linux kernel codes and doesn't cover the detail of - implementation of each low-level driver. It describes only how is + This document targets people who already have enough + C language skills and have basic linux kernel programming + knowledge. This document doesn't explain the general + topic of linux kernel coding and doesn't cover low-level + driver implementation details. It only describes the standard way to write a PCI sound driver on ALSA. 
</para> <para> - If you are already familiar with the older ALSA ver.0.5.x, you - can check the drivers such as <filename>es1938.c</filename> or - <filename>maestro3.c</filename> which have also almost the same + If you are already familiar with the older ALSA ver.0.5.x API, you + can check the drivers such as <filename>sound/pci/es1938.c</filename> or + <filename>sound/pci/maestro3.c</filename> which have also almost the same code-base in the ALSA 0.5.x tree, so you can compare the differences. </para> <para> - This document is still a draft version. Any feedbacks and + This document is still a draft version. Any feedback and corrections, please!! </para> </preface> @@ -106,7 +106,7 @@ <section id="file-tree-general"> <title>General</title> <para> - The ALSA drivers are provided in the two ways. + The ALSA drivers are provided in two ways. </para> <para> @@ -114,15 +114,14 @@ ALSA's ftp site, and another is the 2.6 (or later) Linux kernel tree. To synchronize both, the ALSA driver tree is split into two different trees: alsa-kernel and alsa-driver. The former - contains purely the source codes for the Linux 2.6 (or later) + contains purely the source code for the Linux 2.6 (or later) tree. This tree is designed only for compilation on 2.6 or later environment. The latter, alsa-driver, contains many subtle - files for compiling the ALSA driver on the outside of Linux - kernel like configure script, the wrapper functions for older, - 2.2 and 2.4 kernels, to adapt the latest kernel API, + files for compiling ALSA drivers outside of the Linux kernel tree, + wrapper functions for older 2.2 and 2.4 kernels, to adapt the latest kernel API, and additional drivers which are still in development or in tests. The drivers in alsa-driver tree will be moved to - alsa-kernel (eventually 2.6 kernel tree) once when they are + alsa-kernel (and eventually to the 2.6 kernel tree) when they are finished and confirmed to work fine. 
</para> @@ -168,7 +167,7 @@ <section id="file-tree-core-directory"> <title>core directory</title> <para> - This directory contains the middle layer, that is, the heart + This directory contains the middle layer which is the heart of ALSA drivers. In this directory, the native ALSA modules are stored. The sub-directories contain different modules and are dependent upon the kernel config. @@ -181,7 +180,7 @@ The codes for PCM and mixer OSS emulation modules are stored in this directory. The rawmidi OSS emulation is included in the ALSA rawmidi code since it's quite small. The sequencer - code is stored in core/seq/oss directory (see + code is stored in <filename>core/seq/oss</filename> directory (see <link linkend="file-tree-core-directory-seq-oss"><citetitle> below</citetitle></link>). </para> @@ -200,7 +199,7 @@ <section id="file-tree-core-directory-seq"> <title>core/seq</title> <para> - This and its sub-directories are for the ALSA + This directory and its sub-directories are for the ALSA sequencer. This directory contains the sequencer core and primary sequencer modules such like snd-seq-midi, snd-seq-virmidi, etc. They are compiled only when @@ -229,22 +228,22 @@ <title>include directory</title> <para> This is the place for the public header files of ALSA drivers, - which are to be exported to the user-space, or included by + which are to be exported to user-space, or included by several files at different directories. Basically, the private header files should not be placed in this directory, but you may - still find files there, due to historical reason :) + still find files there, due to historical reasons :) </para> </section> <section id="file-tree-drivers-directory"> <title>drivers directory</title> <para> - This directory contains the codes shared among different drivers - on the different architectures. They are hence supposed not to be + This directory contains code shared among different drivers + on different architectures. 
They are hence supposed not to be architecture-specific. For example, the dummy pcm driver and the serial MIDI driver are found in this directory. In the sub-directories, - there are the codes for components which are independent from + there is code for components which are independent from bus and cpu architectures. </para> @@ -271,7 +270,7 @@ <para> Although there is a standard i2c layer on Linux, ALSA has its - own i2c codes for some cards, because the soundcard needs only a + own i2c code for some cards, because the soundcard needs only a simple operation and the standard i2c API is too complicated for such a purpose. </para> @@ -292,28 +291,28 @@ <para> So far, there is only Emu8000/Emu10k1 synth driver under - synth/emux sub-directory. + the <filename>synth/emux</filename> sub-directory. </para> </section> <section id="file-tree-pci-directory"> <title>pci directory</title> <para> - This and its sub-directories hold the top-level card modules - for PCI soundcards and the codes specific to the PCI BUS. + This directory and its sub-directories hold the top-level card modules + for PCI soundcards and the code specific to the PCI BUS. </para> <para> - The drivers compiled from a single file is stored directly on - pci directory, while the drivers with several source files are - stored on its own sub-directory (e.g. emu10k1, ice1712). + The drivers compiled from a single file are stored directly + in the pci directory, while the drivers with several source files are + stored on their own sub-directory (e.g. emu10k1, ice1712). </para> </section> <section id="file-tree-isa-directory"> <title>isa directory</title> <para> - This and its sub-directories hold the top-level card modules + This directory and its sub-directories hold the top-level card modules for ISA soundcards. 
</para> </section> @@ -321,16 +320,16 @@ <section id="file-tree-arm-ppc-sparc-directories"> <title>arm, ppc, and sparc directories</title> <para> - These are for the top-level card modules which are - specific to each given architecture. + They are used for top-level card modules which are + specific to one of these architectures. </para> </section> <section id="file-tree-usb-directory"> <title>usb directory</title> <para> - This contains the USB-audio driver. On the latest version, the - USB MIDI driver is integrated together with usb-audio driver. + This directory contains the USB-audio driver. In the latest version, the + USB MIDI driver is integrated in the usb-audio driver. </para> </section> @@ -338,16 +337,17 @@ <title>pcmcia directory</title> <para> The PCMCIA, especially PCCard drivers will go here. CardBus - drivers will be on pci directory, because its API is identical - with the standard PCI cards. + drivers will be in the pci directory, because their API is identical + to that of standard PCI cards. </para> </section> <section id="file-tree-oss-directory"> <title>oss directory</title> <para> - The OSS/Lite source files are stored here on Linux 2.6 (or - later) tree. (In the ALSA driver tarball, it's empty, of course :) + The OSS/Lite source files are stored here in Linux 2.6 (or + later) tree. 
In the ALSA driver tarball, this directory is empty, + of course :) </para> </section> </chapter> @@ -362,7 +362,7 @@ <section id="basic-flow-outline"> <title>Outline</title> <para> - The minimum flow of PCI soundcard is like the following: + The minimum flow for PCI soundcards is as follows: <itemizedlist> <listitem><para>define the PCI ID table (see the section @@ -370,9 +370,13 @@ </citetitle></link>).</para></listitem> <listitem><para>create <function>probe()</function> callback.</para></listitem> <listitem><para>create <function>remove()</function> callback.</para></listitem> - <listitem><para>create pci_driver table which contains the three pointers above.</para></listitem> - <listitem><para>create <function>init()</function> function just calling <function>pci_register_driver()</function> to register the pci_driver table defined above.</para></listitem> - <listitem><para>create <function>exit()</function> function to call <function>pci_unregister_driver()</function> function.</para></listitem> + <listitem><para>create a <structname>pci_driver</structname> structure + containing the three pointers above.</para></listitem> + <listitem><para>create an <function>init()</function> function just calling + the <function>pci_register_driver()</function> to register the pci_driver table + defined above.</para></listitem> + <listitem><para>create an <function>exit()</function> function to call + the <function>pci_unregister_driver()</function> function.</para></listitem> </itemizedlist> </para> </section> @@ -382,15 +386,14 @@ <para> The code example is shown below. Some parts are kept unimplemented at this moment but will be filled in the - succeeding sections. The numbers in comment lines of - <function>snd_mychip_probe()</function> function are the - markers. + next sections. The numbers in the comment lines of the + <function>snd_mychip_probe()</function> function + refer to details explained in the following section. 
<example> - <title>Basic Flow for PCI Drivers Example</title> + <title>Basic Flow for PCI Drivers - Example</title> <programlisting> <![CDATA[ - #include <sound/driver.h> #include <linux/init.h> #include <linux/pci.h> #include <linux/slab.h> @@ -398,6 +401,7 @@ #include <sound/initval.h> /* module parameters (see "Module Parameters") */ + /* SNDRV_CARDS: maximum number of cards supported by this module */ static int index[SNDRV_CARDS] = SNDRV_DEFAULT_IDX; static char *id[SNDRV_CARDS] = SNDRV_DEFAULT_STR; static int enable[SNDRV_CARDS] = SNDRV_DEFAULT_ENABLE_PNP; @@ -405,13 +409,13 @@ /* definition of the chip-specific record */ struct mychip { struct snd_card *card; - /* rest of implementation will be in the section - * "PCI Resource Managements" + /* the rest of the implementation will be in section + * "PCI Resource Management" */ }; /* chip-specific destructor - * (see "PCI Resource Managements") + * (see "PCI Resource Management") */ static int snd_mychip_free(struct mychip *chip) { @@ -442,7 +446,7 @@ *rchip = NULL; /* check PCI availability here - * (see "PCI Resource Managements") + * (see "PCI Resource Management") */ .... @@ -454,7 +458,7 @@ chip->card = card; /* rest of initialization here; will be implemented - * later, see "PCI Resource Managements" + * later, see "PCI Resource Management" */ .... @@ -521,7 +525,7 @@ return 0; } - /* destructor -- see "Destructor" sub-section */ + /* destructor -- see the "Destructor" sub-section */ static void __devexit snd_mychip_remove(struct pci_dev *pci) { snd_card_free(pci_get_drvdata(pci)); @@ -536,16 +540,16 @@ <section id="basic-flow-constructor"> <title>Constructor</title> <para> - The real constructor of PCI drivers is probe callback. The - probe callback and other component-constructors which are called - from probe callback should be defined with - <parameter>__devinit</parameter> prefix. 
You - cannot use <parameter>__init</parameter> prefix for them, + The real constructor of PCI drivers is the <function>probe</function> callback. + The <function>probe</function> callback and other component-constructors which are called + from the <function>probe</function> callback should be defined with + the <parameter>__devinit</parameter> prefix. You + cannot use the <parameter>__init</parameter> prefix for them, because any PCI device could be a hotplug device. </para> <para> - In the probe callback, the following scheme is often used. + In the <function>probe</function> callback, the following scheme is often used. </para> <section id="basic-flow-constructor-device-index"> @@ -570,7 +574,7 @@ </para> <para> - At each time probe callback is called, check the + Each time the <function>probe</function> callback is called, check the availability of the device. If not available, simply increment the device index and returns. dev will be incremented also later (<link @@ -594,7 +598,7 @@ </para> <para> - The detail will be explained in the section + The details will be explained in the section <link linkend="card-management-card-instance"><citetitle> Management of Cards and Components</citetitle></link>. </para> @@ -619,9 +623,9 @@ </programlisting> </informalexample> - The detail will be explained in the section <link + The details will be explained in the section <link linkend="pci-resource"><citetitle>PCI Resource - Managements</citetitle></link>. + Management</citetitle></link>. </para> </section> @@ -640,7 +644,7 @@ </informalexample> The driver field holds the minimal ID string of the - chip. This is referred by alsa-lib's configurator, so keep it + chip. This is used by alsa-lib's configurator, so keep it simple but unique. Even the same driver can have different driver IDs to distinguish the functionality of each chip type. @@ -648,7 +652,7 @@ <para> The shortname field is a string shown as more verbose - name. 
The longname field contains the information which is + name. The longname field contains the information shown in <filename>/proc/asound/cards</filename>. </para> </section> @@ -703,7 +707,7 @@ </informalexample> In the above, the card record is stored. This pointer is - referred in the remove callback and power-management + used in the remove callback and power-management callbacks, too. </para> </section> @@ -746,7 +750,6 @@ <informalexample> <programlisting> <![CDATA[ - #include <sound/driver.h> #include <linux/init.h> #include <linux/pci.h> #include <linux/slab.h> @@ -757,22 +760,22 @@ </informalexample> where the last one is necessary only when module options are - defined in the source file. If the codes are split to several - files, the file without module options don't need them. + defined in the source file. If the code is split into several + files, the files without module options don't need them. </para> <para> - In addition to them, you'll need - <filename><linux/interrupt.h></filename> for the interrupt - handling, and <filename><asm/io.h></filename> for the i/o - access. If you use <function>mdelay()</function> or + In addition to these headers, you'll need + <filename><linux/interrupt.h></filename> for interrupt + handling, and <filename><asm/io.h></filename> for I/O + access. If you use the <function>mdelay()</function> or <function>udelay()</function> functions, you'll need to include - <filename><linux/delay.h></filename>, too. + <filename><linux/delay.h></filename> too. </para> <para> - The ALSA interfaces like PCM or control API are defined in other - header files as <filename><sound/xxx.h></filename>. + The ALSA interfaces like the PCM and control APIs are defined in other + <filename><sound/xxx.h></filename> header files. They have to be included after <filename><sound/core.h></filename>. </para> @@ -795,12 +798,12 @@ <para> A card record is the headquarters of the soundcard. 
It manages - the list of whole devices (components) on the soundcard, such as + the whole list of devices (components) on the soundcard, such as PCM, mixers, MIDI, synthesizer, and so on. Also, the card record holds the ID and the name strings of the card, manages the root of proc files, and controls the power-management states and hotplug disconnections. The component list on the card - record is used to manage the proper releases of resources at + record is used to manage the correct release of resources at destruction. </para> @@ -824,9 +827,8 @@ <constant>THIS_MODULE</constant>), and the size of extra-data space. The last argument is used to allocate card->private_data for the - chip-specific data. Note that this data - <emphasis>is</emphasis> allocated by - <function>snd_card_new()</function>. + chip-specific data. Note that these data + are allocated by <function>snd_card_new()</function>. </para> </section> @@ -834,10 +836,10 @@ <title>Components</title> <para> After the card is created, you can attach the components - (devices) to the card instance. On ALSA driver, a component is + (devices) to the card instance. In an ALSA driver, a component is represented as a struct <structname>snd_device</structname> object. A component can be a PCM instance, a control interface, a raw - MIDI interface, etc. Each of such instances has one component + MIDI interface, etc. Each such instance has one component entry. </para> @@ -859,7 +861,7 @@ (<constant>SNDRV_DEV_XXX</constant>), the data pointer, and the callback pointers (<parameter>&ops</parameter>). The device-level defines the type of components and the order of - registration and de-registration. For most of components, the + registration and de-registration. For most components, the device-level is already defined. For a user-defined component, you can use <constant>SNDRV_DEV_LOWLEVEL</constant>. </para> @@ -867,13 +869,13 @@ <para> This function itself doesn't allocate the data space. 
The data must be allocated manually beforehand, and its pointer is passed - as the argument. This pointer is used as the identifier - (<parameter>chip</parameter> in the above example) for the - instance. + as the argument. This pointer is used as the + (<parameter>chip</parameter> identifier in the above example) + for the instance. </para> <para> - Each ALSA pre-defined component such as ac97 or pcm calls + Each pre-defined ALSA component such as ac97 and pcm calls <function>snd_device_new()</function> inside its constructor. The destructor for each component is defined in the callback pointers. Hence, you don't need to take care of @@ -881,19 +883,19 @@ </para> <para> - If you would like to create your own component, you need to - set the destructor function to dev_free callback in - <parameter>ops</parameter>, so that it can be released - automatically via <function>snd_card_free()</function>. The - example will be shown later as an implementation of a - chip-specific data. + If you wish to create your own component, you need to + set the destructor function to the dev_free callback in + the <parameter>ops</parameter>, so that it can be released + automatically via <function>snd_card_free()</function>. + The next example will show an implementation of chip-specific + data. </para> </section> <section id="card-management-chip-specific"> <title>Chip-Specific Data</title> <para> - The chip-specific information, e.g. the i/o port address, its + Chip-specific information, e.g. the I/O port address, its resource pointer, or the irq number, is stored in the chip-specific record. @@ -909,13 +911,14 @@ </para> <para> - In general, there are two ways to allocate the chip record. + In general, there are two ways of allocating the chip record. </para> <section id="card-management-chip-specific-snd-card-new"> <title>1. 
Allocating via <function>snd_card_new()</function>.</title> <para> - As mentioned above, you can pass the extra-data-length to the 4th argument of <function>snd_card_new()</function>, i.e. + As mentioned above, you can pass the extra-data-length + to the 4th argument of <function>snd_card_new()</function>, i.e. <informalexample> <programlisting> @@ -925,7 +928,7 @@ </programlisting> </informalexample> - whether struct <structname>mychip</structname> is the type of the chip record. + struct <structname>mychip</structname> is the type of the chip record. </para> <para> @@ -1037,8 +1040,8 @@ <title>Registration and Release</title> <para> After all components are assigned, register the card instance - by calling <function>snd_card_register()</function>. The access - to the device files are enabled at this point. That is, before + by calling <function>snd_card_register()</function>. Access + to the device files is enabled at this point. That is, before <function>snd_card_register()</function> is called, the components are safely inaccessible from external side. If this call fails, exit the probe function after releasing the card via @@ -1047,7 +1050,7 @@ <para> For releasing the card instance, you can call simply - <function>snd_card_free()</function>. As already mentioned, all + <function>snd_card_free()</function>. As mentioned earlier, all components are released automatically by this call. </para> @@ -1055,7 +1058,7 @@ As further notes, the destructors (both <function>snd_mychip_dev_free</function> and <function>snd_mychip_free</function>) cannot be defined with - <parameter>__devexit</parameter> prefix, because they may be + the <parameter>__devexit</parameter> prefix, because they may be called from the constructor, too, at the false path. 
</para> @@ -1071,20 +1074,20 @@ <!-- ****************************************************** --> -<!-- PCI Resource Managements --> +<!-- PCI Resource Management --> <!-- ****************************************************** --> <chapter id="pci-resource"> - <title>PCI Resource Managements</title> + <title>PCI Resource Management</title> <section id="pci-resource-example"> <title>Full Code Example</title> <para> - In this section, we'll finish the chip-specific constructor, - destructor and PCI entries. The example code is shown first, + In this section, we'll complete the chip-specific constructor, + destructor and PCI entries. Example code is shown first, below. <example> - <title>PCI Resource Managements Example</title> + <title>PCI Resource Management Example</title> <programlisting> <![CDATA[ struct mychip { @@ -1103,7 +1106,7 @@ /* release the irq */ if (chip->irq >= 0) free_irq(chip->irq, chip); - /* release the i/o ports & memory */ + /* release the I/O ports & memory */ pci_release_regions(chip->pci); /* disable the PCI entry */ pci_disable_device(chip->pci); @@ -1196,13 +1199,13 @@ .remove = __devexit_p(snd_mychip_remove), }; - /* initialization of the module */ + /* module initialization */ static int __init alsa_card_mychip_init(void) { return pci_register_driver(&driver); } - /* clean up the module */ + /* module clean up */ static void __exit alsa_card_mychip_exit(void) { pci_unregister_driver(&driver); @@ -1228,10 +1231,10 @@ </para> <para> - In the case of PCI devices, you have to call at first - <function>pci_enable_device()</function> function before + In the case of PCI devices, you first have to call + the <function>pci_enable_device()</function> function before allocating resources. Also, you need to set the proper PCI DMA - mask to limit the accessed i/o range. In some cases, you might + mask to limit the accessed I/O range. In some cases, you might need to call <function>pci_set_master()</function> function, too. 
</para> @@ -1261,15 +1264,15 @@ <section id="pci-resource-resource-allocation"> <title>Resource Allocation</title> <para> - The allocation of I/O ports and irqs are done via standard kernel + The allocation of I/O ports and irqs is done via standard kernel functions. Unlike ALSA ver.0.5.x., there are no helpers for that. And these resources must be released in the destructor function (see below). Also, on ALSA 0.9.x, you don't need to - allocate (pseudo-)DMA for PCI like ALSA 0.5.x. + allocate (pseudo-)DMA for PCI like in ALSA 0.5.x. </para> <para> - Now assume that this PCI device has an I/O port with 8 bytes + Now assume that the PCI device has an I/O port with 8 bytes and an interrupt. Then struct <structname>mychip</structname> will have the following fields: @@ -1288,7 +1291,7 @@ </para> <para> - For an i/o port (and also a memory region), you need to have + For an I/O port (and also a memory region), you need to have the resource pointer for the standard resource management. For an irq, you have to keep only the irq number (integer). But you need to initialize this number as -1 before actual allocation, @@ -1299,7 +1302,7 @@ </para> <para> - The allocation of an i/o port is done like this: + The allocation of an I/O port is done like this: <informalexample> <programlisting> @@ -1318,12 +1321,12 @@ <para> <!-- obsolete --> - It will reserve the i/o port region of 8 bytes of the given + It will reserve the I/O port region of 8 bytes of the given PCI device. The returned value, chip->res_port, is allocated via <function>kmalloc()</function> by <function>request_region()</function>. The pointer must be - released via <function>kfree()</function>, but there is some - problem regarding this. This issue will be explained more below. + released via <function>kfree()</function>, but there is a + problem with this. This issue will be explained later. </para> <para> @@ -1351,8 +1354,8 @@ </para> <para> - On the PCI bus, the interrupts can be shared. 
Thus, - <constant>IRQF_SHARED</constant> is given as the interrupt flag of + On the PCI bus, interrupts can be shared. Thus, + <constant>IRQF_SHARED</constant> is used as the interrupt flag of <function>request_irq()</function>. </para> @@ -1364,7 +1367,7 @@ </para> <para> - I won't define the detail of the interrupt handler at this + I won't give details about the interrupt handler at this point, but at least its appearance can be explained now. The interrupt handler looks usually like the following: @@ -1386,11 +1389,11 @@ Now let's write the corresponding destructor for the resources above. The role of destructor is simple: disable the hardware (if already activated) and release the resources. So far, we - have no hardware part, so the disabling is not written here. + have no hardware part, so the disabling code is not written here. </para> <para> - For releasing the resources, <quote>check-and-release</quote> + To release the resources, the <quote>check-and-release</quote> method is a safer way. For the interrupt, do like this: <informalexample> @@ -1410,7 +1413,7 @@ <para> When you requested I/O ports or memory regions via <function>pci_request_region()</function> or - <function>pci_request_regions()</function> like this example, + <function>pci_request_regions()</function> like in this example, release the resource(s) using the corresponding function, <function>pci_release_region()</function> or <function>pci_release_regions()</function>. @@ -1429,7 +1432,7 @@ or <function>request_mem_region</function>, you can release it via <function>release_resource()</function>. Suppose that you keep the resource pointer returned from <function>request_region()</function> - in chip->res_port, the release procedure looks like below: + in chip->res_port, the release procedure looks like: <informalexample> <programlisting> @@ -1442,7 +1445,7 @@ <para> Don't forget to call <function>pci_disable_device()</function> - before all finished. + before the end. 
</para> <para> @@ -1459,14 +1462,14 @@ <para> Again, remember that you cannot - set <parameter>__devexit</parameter> prefix for this destructor. + use the <parameter>__devexit</parameter> prefix for this destructor. </para> <para> - We didn't implement the hardware-disabling part in the above. + We didn't implement the hardware disabling part in the above. If you need to do this, please note that the destructor may be called even before the initialization of the chip is completed. - It would be better to have a flag to skip the hardware-disabling + It would be better to have a flag to skip hardware disabling if the hardware was not initialized yet. </para> @@ -1475,14 +1478,14 @@ <function>snd_device_new()</function> with <constant>SNDRV_DEV_LOWLELVEL</constant> , its destructor is called at the last. That is, it is assured that all other - components like PCMs and controls have been already released. - You don't have to call stopping PCMs, etc. explicitly, but just - stop the hardware in the low-level. + components like PCMs and controls have already been released. + You don't have to stop PCMs, etc. explicitly, but just + call low-level hardware stopping. </para> <para> The management of a memory-mapped region is almost as same as - the management of an i/o port. You'll need three fields like + the management of an I/O port. You'll need three fields like the following: <informalexample> @@ -1561,8 +1564,8 @@ <section id="pci-resource-entries"> <title>PCI Entries</title> <para> - So far, so good. Let's finish the rest of missing PCI - stuffs. At first, we need a + So far, so good. Let's finish the missing PCI + stuff. At first, we need a <structname>pci_device_id</structname> table for this chipset. It's a table of PCI vendor/device ID number, and some masks. @@ -1588,13 +1591,13 @@ <para> The first and second fields of - <structname>pci_device_id</structname> struct are the vendor and - device IDs. 
If you have nothing special to filter the matching - devices, you can use the rest of fields like above. The last - field of <structname>pci_device_id</structname> struct is a + the <structname>pci_device_id</structname> structure are the vendor and + device IDs. If you have no reason to filter the matching + devices, you can leave the remaining fields as above. The last + field of the <structname>pci_device_id</structname> struct contains private data for this entry. You can specify any value here, for - example, to tell the type of different operations per each - device IDs. Such an example is found in intel8x0 driver. + example, to define specific operations for supported device IDs. + Such an example is found in the intel8x0 driver. </para> <para> @@ -1621,10 +1624,10 @@ <para> The <structfield>probe</structfield> and - <structfield>remove</structfield> functions are what we already - defined in - the previous sections. The <structfield>remove</structfield> should - be defined with + <structfield>remove</structfield> functions have already + been defined in the previous sections. + The <structfield>remove</structfield> function should + be defined with the <function>__devexit_p()</function> macro, so that it's not defined for built-in (and non-hot-pluggable) case. The <structfield>name</structfield> @@ -1665,8 +1668,7 @@ <para> Oh, one thing was forgotten. If you have no exported symbols, - you need to declare it on 2.2 or 2.4 kernels (on 2.6 kernels - it's not necessary, though). + you need to declare it in 2.2 or 2.4 kernels (it's not necessary in 2.6 kernels). <informalexample> <programlisting> @@ -1698,7 +1700,7 @@ <para> For accessing to the PCM layer, you need to include - <filename><sound/pcm.h></filename> above all. In addition, + <filename><sound/pcm.h></filename> first. In addition, <filename><sound/pcm_params.h></filename> might be needed if you access to some functions related with hw_param. 
</para> @@ -1707,21 +1709,21 @@ Each card device can have up to four pcm instances. A pcm instance corresponds to a pcm device file. The limitation of number of instances comes only from the available bit size of - the linux's device number. Once when 64bit device number is - used, we'll have more available pcm instances. + the Linux's device numbers. Once when 64bit device number is + used, we'll have more pcm instances available. </para> <para> A pcm instance consists of pcm playback and capture streams, and each pcm stream consists of one or more pcm substreams. Some - soundcard supports the multiple-playback function. For example, + soundcards support multiple playback functions. For example, emu10k1 has a PCM playback of 32 stereo substreams. In this case, at each open, a free substream is (usually) automatically chosen and opened. Meanwhile, when only one substream exists and it was - already opened, the succeeding open will result in the blocking - or the error with <constant>EAGAIN</constant> according to the - file open mode. But you don't have to know the detail in your - driver. The PCM middle layer will take all such jobs. + already opened, the successful open will either block + or error with <constant>EAGAIN</constant> according to the + file open mode. But you don't have to care about such details in your + driver. The PCM middle layer will take care of such work. </para> </section> @@ -1944,7 +1946,7 @@ <section id="pcm-interface-constructor"> <title>Constructor</title> <para> - A pcm instance is allocated by <function>snd_pcm_new()</function> + A pcm instance is allocated by the <function>snd_pcm_new()</function> function. It would be better to create a constructor for pcm, namely, @@ -1971,23 +1973,23 @@ </para> <para> - The <function>snd_pcm_new()</function> function takes the four + The <function>snd_pcm_new()</function> function takes four arguments. 
The first argument is the card pointer to which this pcm is assigned, and the second is the ID string. </para> <para> The third argument (<parameter>index</parameter>, 0 in the - above) is the index of this new pcm. It begins from zero. When - you will create more than one pcm instances, specify the + above) is the index of this new pcm. It begins from zero. If + you create more than one pcm instances, specify the different numbers in this argument. For example, <parameter>index</parameter> = 1 for the second PCM device. </para> <para> The fourth and fifth arguments are the number of substreams - for playback and capture, respectively. Here both 1 are given in - the above example. When no playback or no capture is available, + for playback and capture, respectively. Here 1 is used for + both arguments. When no playback or capture substreams are available, pass 0 to the corresponding argument. </para> @@ -2045,13 +2047,13 @@ </programlisting> </informalexample> - Each of callbacks is explained in the subsection + All the callbacks are described in the <link linkend="pcm-interface-operators"><citetitle> - Operators</citetitle></link>. + Operators</citetitle></link> subsection. </para> <para> - After setting the operators, most likely you'd like to + After setting the operators, you probably will want to pre-allocate the buffer. For the pre-allocation, simply call the following: @@ -2065,8 +2067,8 @@ </programlisting> </informalexample> - It will allocate up to 64kB buffer as default. The details of - buffer management will be described in the later section <link + It will allocate a buffer up to 64kB as default. + Buffer management details will be described in the later section <link linkend="buffer-and-memory"><citetitle>Buffer and Memory Management</citetitle></link>. </para> @@ -2095,13 +2097,13 @@ <para> The destructor for a pcm instance is not always necessary. 
Since the pcm device will be released by the middle - layer code automatically, you don't have to call destructor + layer code automatically, you don't have to call the destructor explicitly. </para> <para> - The destructor would be necessary when you created some - special records internally and need to release them. In such a + The destructor would be necessary if you created + special records internally and needed to release them. In such a case, set the destructor function to pcm->private_free: @@ -2141,16 +2143,15 @@ When the PCM substream is opened, a PCM runtime instance is allocated and assigned to the substream. This pointer is accessible via <constant>substream->runtime</constant>. - This runtime pointer holds the various information; it holds - the copy of hw_params and sw_params configurations, the buffer - pointers, mmap records, spinlocks, etc. Almost everything you - need for controlling the PCM can be found there. + This runtime pointer holds most information you need + to control the PCM: the copy of hw_params and sw_params configurations, the buffer + pointers, mmap records, spinlocks, etc. </para> <para> The definition of runtime instance is found in - <filename><sound/pcm.h></filename>. Here is the - copy from the file. + <filename><sound/pcm.h></filename>. Here are + the contents of this file: <informalexample> <programlisting> <![CDATA[ @@ -2185,7 +2186,6 @@ struct _snd_pcm_runtime { struct timespec tstamp_mode; /* mmap timestamp is updated */ unsigned int period_step; unsigned int sleep_min; /* min ticks to sleep */ - snd_pcm_uframes_t xfer_align; /* xfer size need to be a multiple */ snd_pcm_uframes_t start_threshold; snd_pcm_uframes_t stop_threshold; snd_pcm_uframes_t silence_threshold; /* Silence filling happens when @@ -2244,7 +2244,7 @@ struct _snd_pcm_runtime { <para> For the operators (callbacks) of each sound driver, most of these records are supposed to be read-only. Only the PCM - middle-layer changes / updates these info. 
The exceptions are + middle-layer changes / updates them. The exceptions are the hardware description (hw), interrupt callbacks (transfer_ack_xxx), DMA buffer information, and the private data. Besides, if you use the standard buffer allocation @@ -2285,7 +2285,7 @@ struct _snd_pcm_runtime { </para> <para> - Typically, you'll have a hardware descriptor like below: + Typically, you'll have a hardware descriptor as below: <informalexample> <programlisting> <![CDATA[ @@ -2320,10 +2320,10 @@ struct _snd_pcm_runtime { <constant>SNDRV_PCM_INFO_XXX</constant>. Here, at least, you have to specify whether the mmap is supported and which interleaved format is supported. - When the mmap is supported, add + When the is supported, add the <constant>SNDRV_PCM_INFO_MMAP</constant> flag here. When the hardware supports the interleaved or the non-interleaved - format, <constant>SNDRV_PCM_INFO_INTERLEAVED</constant> or + formats, <constant>SNDRV_PCM_INFO_INTERLEAVED</constant> or <constant>SNDRV_PCM_INFO_NONINTERLEAVED</constant> flag must be set, respectively. If both are supported, you can set both, too. @@ -2331,7 +2331,7 @@ struct _snd_pcm_runtime { <para> In the above example, <constant>MMAP_VALID</constant> and - <constant>BLOCK_TRANSFER</constant> are specified for OSS mmap + <constant>BLOCK_TRANSFER</constant> are specified for the OSS mmap mode. Usually both are set. Of course, <constant>MMAP_VALID</constant> is set only if the mmap is really supported. @@ -2345,11 +2345,11 @@ struct _snd_pcm_runtime { <quote>pause</quote> operation, while the <constant>RESUME</constant> bit means that the pcm supports the full <quote>suspend/resume</quote> operation. - If <constant>PAUSE</constant> flag is set, + If the <constant>PAUSE</constant> flag is set, the <structfield>trigger</structfield> callback below must handle the corresponding (pause push/release) commands. The suspend/resume trigger commands can be defined even without - <constant>RESUME</constant> flag. 
See <link + the <constant>RESUME</constant> flag. See <link linkend="power-management"><citetitle> Power Management</citetitle></link> section for details. </para> @@ -2382,7 +2382,7 @@ struct _snd_pcm_runtime { <constant>CONTINUOUS</constant> bit additionally. The pre-defined rate bits are provided only for typical rates. If your chip supports unconventional rates, you need to add - <constant>KNOT</constant> bit and set up the hardware + the <constant>KNOT</constant> bit and set up the hardware constraint manually (explained later). </para> </listitem> @@ -2390,8 +2390,8 @@ struct _snd_pcm_runtime { <listitem> <para> <structfield>rate_min</structfield> and - <structfield>rate_max</structfield> define the minimal and - maximal sample rate. This should correspond somehow to + <structfield>rate_max</structfield> define the minimum and + maximum sample rate. This should correspond somehow to <structfield>rates</structfield> bits. </para> </listitem> @@ -2400,7 +2400,7 @@ struct _snd_pcm_runtime { <para> <structfield>channel_min</structfield> and <structfield>channel_max</structfield> - define, as you might already expected, the minimal and maximal + define, as you might already expected, the minimum and maximum number of channels. </para> </listitem> @@ -2408,21 +2408,21 @@ struct _snd_pcm_runtime { <listitem> <para> <structfield>buffer_bytes_max</structfield> defines the - maximal buffer size in bytes. There is no + maximum buffer size in bytes. There is no <structfield>buffer_bytes_min</structfield> field, since - it can be calculated from the minimal period size and the - minimal number of periods. + it can be calculated from the minimum period size and the + minimum number of periods. Meanwhile, <structfield>period_bytes_min</structfield> and - define the minimal and maximal size of the period in bytes. + define the minimum and maximum size of the period in bytes. 
<structfield>periods_max</structfield> and - <structfield>periods_min</structfield> define the maximal and - minimal number of periods in the buffer. + <structfield>periods_min</structfield> define the maximum and + minimum number of periods in the buffer. </para> <para> - The <quote>period</quote> is a term, that corresponds to - fragment in the OSS world. The period defines the size at - which the PCM interrupt is generated. This size strongly + The <quote>period</quote> is a term that corresponds to + a fragment in the OSS world. The period defines the size at + which a PCM interrupt is generated. This size strongly depends on the hardware. Generally, the smaller period size will give you more interrupts, that is, more controls. @@ -2435,8 +2435,8 @@ struct _snd_pcm_runtime { <listitem> <para> There is also a field <structfield>fifo_size</structfield>. - This specifies the size of the hardware FIFO, but it's not - used currently in the driver nor in the alsa-lib. So, you + This specifies the size of the hardware FIFO, but currently it + is neither used in the driver nor in the alsa-lib. So, you can ignore this field. </para> </listitem> @@ -2450,7 +2450,7 @@ struct _snd_pcm_runtime { Ok, let's go back again to the PCM runtime records. The most frequently referred records in the runtime instance are the PCM configurations. - The PCM configurations are stored on runtime instance + The PCM configurations are stored in the runtime instance after the application sends <type>hw_params</type> data via alsa-lib. There are many fields copied from hw_params and sw_params structs. For example, @@ -2461,11 +2461,11 @@ struct _snd_pcm_runtime { <para> One thing to be noted is that the configured buffer and period - sizes are stored in <quote>frames</quote> in the runtime + sizes are stored in <quote>frames</quote> in the runtime. In the ALSA world, 1 frame = channels * samples-size. 
For conversion between frames and bytes, you can use the - helper functions, <function>frames_to_bytes()</function> and - <function>bytes_to_frames()</function>. + <function>frames_to_bytes()</function> and + <function>bytes_to_frames()</function> helper functions. <informalexample> <programlisting> <![CDATA[ @@ -2515,7 +2515,7 @@ struct _snd_pcm_runtime { <structfield>dma_area</structfield> is necessary when the buffer is mmapped. If your driver doesn't support mmap, this field is not necessary. <structfield>dma_addr</structfield> - is also not mandatory. You can use + is also optional. You can use <structfield>dma_private</structfield> as you like, too. </para> </section> @@ -2524,14 +2524,14 @@ struct _snd_pcm_runtime { <title>Running Status</title> <para> The running status can be referred via <constant>runtime->status</constant>. - This is the pointer to struct <structname>snd_pcm_mmap_status</structname> + This is the pointer to the struct <structname>snd_pcm_mmap_status</structname> record. For example, you can get the current DMA hardware pointer via <constant>runtime->status->hw_ptr</constant>. </para> <para> The DMA application pointer can be referred via - <constant>runtime->control</constant>, which points + <constant>runtime->control</constant>, which points to the struct <structname>snd_pcm_mmap_control</structname> record. However, accessing directly to this value is not recommended. </para> @@ -2542,14 +2542,14 @@ struct _snd_pcm_runtime { <para> You can allocate a record for the substream and store it in <constant>runtime->private_data</constant>. Usually, this - done in + is done in <link linkend="pcm-interface-operators-open-callback"><citetitle> the open callback</citetitle></link>. Don't mix this with <constant>pcm->private_data</constant>. 
- The <constant>pcm->private_data</constant> usually points the + The <constant>pcm->private_data</constant> usually points to the chip instance assigned statically at the creation of PCM, while the - <constant>runtime->private_data</constant> points a dynamic - data created at the PCM open callback. + <constant>runtime->private_data</constant> points to a dynamic + data structure created at the PCM open callback. <informalexample> <programlisting> @@ -2579,7 +2579,7 @@ struct _snd_pcm_runtime { <para> The field <structfield>transfer_ack_begin</structfield> and <structfield>transfer_ack_end</structfield> are called at - the beginning and the end of + the beginning and at the end of <function>snd_pcm_period_elapsed()</function>, respectively. </para> </section> @@ -2589,17 +2589,18 @@ struct _snd_pcm_runtime { <section id="pcm-interface-operators"> <title>Operators</title> <para> - OK, now let me explain the detail of each pcm callback + OK, now let me give details about each pcm callback (<parameter>ops</parameter>). In general, every callback must - return 0 if successful, or a negative number with the error - number such as <constant>-EINVAL</constant> at any - error. + return 0 if successful, or a negative error number + such as <constant>-EINVAL</constant>. To choose an appropriate + error number, it is advised to check what value other parts of + the kernel return when the same kind of request fails. </para> <para> The callback function takes at least the argument with - <structname>snd_pcm_substream</structname> pointer. For retrieving the - chip record from the given substream instance, you can use the + <structname>snd_pcm_substream</structname> pointer. To retrieve + the chip record from the given substream instance, you can use the following macro. <informalexample> @@ -2616,7 +2617,7 @@ struct _snd_pcm_runtime { The macro reads <constant>substream->private_data</constant>, which is a copy of <constant>pcm->private_data</constant>. 
You can override the former if you need to assign different data - records per PCM substream. For example, cmi8330 driver assigns + records per PCM substream. For example, the cmi8330 driver assigns different private_data for playback and capture directions, because it uses two different codecs (SB- and AD-compatible) for different directions. @@ -2709,7 +2710,7 @@ struct _snd_pcm_runtime { <section id="pcm-interface-operators-ioctl-callback"> <title>ioctl callback</title> <para> - This is used for any special action to pcm ioctls. But + This is used for any special call to pcm ioctls. But usually you can pass a generic ioctl callback, <function>snd_pcm_lib_ioctl</function>. </para> @@ -2726,9 +2727,6 @@ struct _snd_pcm_runtime { ]]> </programlisting> </informalexample> - - This and <structfield>hw_free</structfield> callbacks exist - only on ALSA 0.9.x. </para> <para> @@ -2740,13 +2738,13 @@ struct _snd_pcm_runtime { </para> <para> - Many hardware set-up should be done in this callback, + Many hardware setups should be done in this callback, including the allocation of buffers. </para> <para> Parameters to be initialized are retrieved by - <function>params_xxx()</function> macros. For allocating a + <function>params_xxx()</function> macros. To allocate buffer, you can call a helper function, <informalexample> @@ -2772,8 +2770,8 @@ struct _snd_pcm_runtime { </para> <para> - Thus, you need to take care not to allocate the same buffers - many times, which will lead to memory leak! Calling the + Thus, you need to be careful not to allocate the same buffers + many times, which will lead to memory leaks! Calling the helper function above many times is OK. It will release the previous buffer automatically when it was already allocated. </para> @@ -2782,7 +2780,7 @@ struct _snd_pcm_runtime { Another note is that this callback is non-atomic (schedulable). This is important, because the <structfield>trigger</structfield> callback - is atomic (non-schedulable). 
That is, mutex or any + is atomic (non-schedulable). That is, mutexes or any schedule-related functions are not available in <structfield>trigger</structfield> callback. Please see the subsection @@ -2843,15 +2841,15 @@ struct _snd_pcm_runtime { <quote>prepared</quote>. You can set the format type, sample rate, etc. here. The difference from <structfield>hw_params</structfield> is that the - <structfield>prepare</structfield> callback will be called at each + <structfield>prepare</structfield> callback will be called each time <function>snd_pcm_prepare()</function> is called, i.e. when - recovered after underruns, etc. + recovering after underruns, etc. </para> <para> - Note that this callback became non-atomic since the recent version. - You can use schedule-related functions safely in this callback now. + Note that this callback is now non-atomic. + You can use schedule-related functions safely in this callback. </para> <para> @@ -2871,7 +2869,7 @@ struct _snd_pcm_runtime { <para> Be careful that this callback will be called many times at - each set up, too. + each setup, too. </para> </section> @@ -2893,7 +2891,7 @@ struct _snd_pcm_runtime { Which action is specified in the second argument, <constant>SNDRV_PCM_TRIGGER_XXX</constant> in <filename><sound/pcm.h></filename>. At least, - <constant>START</constant> and <constant>STOP</constant> + the <constant>START</constant> and <constant>STOP</constant> commands must be defined in this callback. <informalexample> @@ -2915,8 +2913,8 @@ struct _snd_pcm_runtime { </para> <para> - When the pcm supports the pause operation (given in info - field of the hardware table), <constant>PAUSE_PUSE</constant> + When the pcm supports the pause operation (given in the info + field of the hardware table), the <constant>PAUSE_PUSE</constant> and <constant>PAUSE_RELEASE</constant> commands must be handled here, too. The former is the command to pause the pcm, and the latter to restart the pcm again. 
@@ -2925,21 +2923,21 @@ struct _snd_pcm_runtime { <para> When the pcm supports the suspend/resume operation, regardless of full or partial suspend/resume support, - <constant>SUSPEND</constant> and <constant>RESUME</constant> + the <constant>SUSPEND</constant> and <constant>RESUME</constant> commands must be handled, too. These commands are issued when the power-management status is changed. Obviously, the <constant>SUSPEND</constant> and - <constant>RESUME</constant> - do suspend and resume of the pcm substream, and usually, they - are identical with <constant>STOP</constant> and + <constant>RESUME</constant> commands + suspend and resume the pcm substream, and usually, they + are identical to the <constant>STOP</constant> and <constant>START</constant> commands, respectively. - See <link linkend="power-management"><citetitle> + See the <link linkend="power-management"><citetitle> Power Management</citetitle></link> section for details. </para> <para> As mentioned, this callback is atomic. You cannot call - the function going to sleep. + functions which may sleep. The trigger callback should be as minimal as possible, just really triggering the DMA. The other stuff should be initialized hw_params and prepare callbacks properly @@ -2960,8 +2958,8 @@ struct _snd_pcm_runtime { This callback is called when the PCM middle layer inquires the current hardware position on the buffer. The position must - be returned in frames (which was in bytes on ALSA 0.5.x), - ranged from 0 to buffer_size - 1. + be returned in frames, + ranging from 0 to buffer_size - 1. </para> <para> @@ -2983,7 +2981,7 @@ struct _snd_pcm_runtime { <para> These callbacks are not mandatory, and can be omitted in most cases. These callbacks are used when the hardware buffer - cannot be on the normal memory space. Some chips have their + cannot be in the normal memory space. Some chips have their own buffer on the hardware which is not mappable. 
In such a case, you have to transfer the data manually from the memory buffer to the hardware buffer. Or, if the buffer is @@ -3018,8 +3016,8 @@ struct _snd_pcm_runtime { <title>page callback</title> <para> - This callback is also not mandatory. This callback is used - mainly for the non-contiguous buffer. The mmap calls this + This callback is optional too. This callback is used + mainly for non-contiguous buffers. The mmap calls this callback to get the page address. Some examples will be explained in the later section <link linkend="buffer-and-memory"><citetitle>Buffer and Memory @@ -3035,7 +3033,7 @@ struct _snd_pcm_runtime { role of PCM interrupt handler in the sound driver is to update the buffer position and to tell the PCM middle layer when the buffer position goes across the prescribed period size. To - inform this, call <function>snd_pcm_period_elapsed()</function> + inform this, call the <function>snd_pcm_period_elapsed()</function> function. </para> @@ -3072,7 +3070,7 @@ struct _snd_pcm_runtime { </para> <para> - A typical coding would be like: + Typical code would be like: <example> <title>Interrupt Handler Case #1</title> @@ -3101,21 +3099,21 @@ struct _snd_pcm_runtime { </section> <section id="pcm-interface-interrupt-handler-timer"> - <title>High-frequent timer interrupts</title> + <title>High frequency timer interrupts</title> <para> - This is the case when the hardware doesn't generate interrupts - at the period boundary but do timer-interrupts at the fixed + This happense when the hardware doesn't generate interrupts + at the period boundary but issues timer interrupts at a fixed timer rate (e.g. es1968 or ymfpci drivers). In this case, you need to check the current hardware - position and accumulates the processed sample length at each - interrupt. When the accumulated size overcomes the period + position and accumulate the processed sample length at each + interrupt. 
When the accumulated size exceeds the period size, call <function>snd_pcm_period_elapsed()</function> and reset the accumulator. </para> <para> - A typical coding would be like the following. + Typical code would be like the following. <example> <title>Interrupt Handler Case #2</title> @@ -3178,32 +3176,33 @@ struct _snd_pcm_runtime { <section id="pcm-interface-atomicity"> <title>Atomicity</title> <para> - One of the most important (and thus difficult to debug) problem - on the kernel programming is the race condition. - On linux kernel, usually it's solved via spin-locks or - semaphores. In general, if the race condition may - happen in the interrupt handler, it's handled as atomic, and you - have to use spinlock for protecting the critical session. If it - never happens in the interrupt and it may take relatively long - time, you should use semaphore. + One of the most important (and thus difficult to debug) problems + in kernel programming are race conditions. + In the Linux kernel, they are usually avoided via spin-locks, mutexes + or semaphores. In general, if a race condition can happen + in an interrupt handler, it has to be managed atomically, and you + have to use a spinlock to protect the critical session. If the + critical section is not in interrupt handler code and + if taking a relatively long time to execute is acceptable, you + should use mutexes or semaphores instead. </para> <para> As already seen, some pcm callbacks are atomic and some are - not. For example, <parameter>hw_params</parameter> callback is + not. For example, the <parameter>hw_params</parameter> callback is non-atomic, while <parameter>trigger</parameter> callback is atomic. This means, the latter is called already in a spinlock held by the PCM middle layer. Please take this atomicity into - account when you use a spinlock or a semaphore in the callbacks. + account when you choose a locking scheme in the callbacks. 
</para> <para> In the atomic callbacks, you cannot use functions which may call <function>schedule</function> or go to - <function>sleep</function>. The semaphore and mutex do sleep, + <function>sleep</function>. Semaphores and mutexes can sleep, and hence they cannot be used inside the atomic callbacks (e.g. <parameter>trigger</parameter> callback). - For taking a certain delay in such a callback, please use + To implement some delay in such a callback, please use <function>udelay()</function> or <function>mdelay()</function>. </para> @@ -3257,7 +3256,7 @@ struct _snd_pcm_runtime { <para> There are many different constraints. - Look in <filename>sound/pcm.h</filename> for a complete list. + Look at <filename>sound/pcm.h</filename> for a complete list. You can even define your own constraint rules. For example, let's suppose my_chip can manage a substream of 1 channel if and only if the format is S16_LE, otherwise it supports any format @@ -3346,7 +3345,7 @@ struct _snd_pcm_runtime { </para> <para> - I won't explain more details here, rather I + I won't give more details here, rather I would like to say, <quote>Luke, use the source.</quote> </para> </section> @@ -3364,10 +3363,9 @@ struct _snd_pcm_runtime { <title>General</title> <para> The control interface is used widely for many switches, - sliders, etc. which are accessed from the user-space. Its most - important use is the mixer interface. In other words, on ALSA - 0.9.x, all the mixer stuff is implemented on the control kernel - API (while there was an independent mixer kernel API on 0.5.x). + sliders, etc. which are accessed from user-space. Its most + important use is the mixer interface. In other words, since ALSA + 0.9.x, all the mixer stuff is implemented on the control kernel API. </para> <para> @@ -3379,14 +3377,15 @@ struct _snd_pcm_runtime { <para> The control API is defined in <filename><sound/control.h></filename>. - Include this file if you add your own controls. 
+ Include this file if you want to add your own controls. </para> </section> <section id="control-interface-definition"> <title>Definition of Controls</title> <para> - For creating a new control, you need to define the three + To create a new control, you need to define the + following three callbacks: <structfield>info</structfield>, <structfield>get</structfield> and <structfield>put</structfield>. Then, define a @@ -3414,13 +3413,13 @@ struct _snd_pcm_runtime { <para> Most likely the control is created via <function>snd_ctl_new1()</function>, and in such a case, you can - add <parameter>__devinitdata</parameter> prefix to the - definition like above. + add the <parameter>__devinitdata</parameter> prefix to the + definition as above. </para> <para> - The <structfield>iface</structfield> field specifies the type of - the control, <constant>SNDRV_CTL_ELEM_IFACE_XXX</constant>, which + The <structfield>iface</structfield> field specifies the control + type, <constant>SNDRV_CTL_ELEM_IFACE_XXX</constant>, which is usually <constant>MIXER</constant>. Use <constant>CARD</constant> for global controls that are not logically part of the mixer. @@ -3435,12 +3434,11 @@ struct _snd_pcm_runtime { <para> The <structfield>name</structfield> is the name identifier - string. On ALSA 0.9.x, the control name is very important, + string. Since ALSA 0.9.x, the control name is very important, because its role is classified from its name. There are pre-defined standard control names. The details are described in - the subsection - <link linkend="control-interface-control-names"><citetitle> - Control Names</citetitle></link>. + the <link linkend="control-interface-control-names"><citetitle> + Control Names</citetitle></link> subsection. </para> <para> @@ -3456,15 +3454,15 @@ struct _snd_pcm_runtime { The <structfield>access</structfield> field contains the access type of this control. Give the combination of bit masks, <constant>SNDRV_CTL_ELEM_ACCESS_XXX</constant>, there. 
- The detailed will be explained in the subsection - <link linkend="control-interface-access-flags"><citetitle> - Access Flags</citetitle></link>. + The details will be explained in + the <link linkend="control-interface-access-flags"><citetitle> + Access Flags</citetitle></link> subsection. </para> <para> The <structfield>private_value</structfield> field contains an arbitrary long integer value for this record. When using - generic <structfield>info</structfield>, + the generic <structfield>info</structfield>, <structfield>get</structfield> and <structfield>put</structfield> callbacks, you can pass a value through this field. If several small numbers are necessary, you can @@ -3489,7 +3487,7 @@ struct _snd_pcm_runtime { <section id="control-interface-control-names"> <title>Control Names</title> <para> - There are some standards for defining the control names. A + There are some standards to define the control names. A control is usually defined from the three parts as <quote>SOURCE DIRECTION FUNCTION</quote>. </para> @@ -3497,7 +3495,7 @@ struct _snd_pcm_runtime { <para> The first, <constant>SOURCE</constant>, specifies the source of the control, and is a string such as <quote>Master</quote>, - <quote>PCM</quote>, <quote>CD</quote> or + <quote>PCM</quote>, <quote>CD</quote> and <quote>Line</quote>. There are many pre-defined sources. </para> @@ -3575,22 +3573,22 @@ struct _snd_pcm_runtime { <title>Access Flags</title> <para> - The access flag is the bit-flags which specifies the access type + The access flag is the bitmask which specifies the access type of the given control. The default access type is <constant>SNDRV_CTL_ELEM_ACCESS_READWRITE</constant>, which means both read and write are allowed to this control. When the access flag is omitted (i.e. = 0), it is - regarded as <constant>READWRITE</constant> access as default. + considered as <constant>READWRITE</constant> access as default. 
</para> <para> When the control is read-only, pass <constant>SNDRV_CTL_ELEM_ACCESS_READ</constant> instead. In this case, you don't have to define - <structfield>put</structfield> callback. + the <structfield>put</structfield> callback. Similarly, when the control is write-only (although it's a rare - case), you can use <constant>WRITE</constant> flag instead, and - you don't need <structfield>get</structfield> callback. + case), you can use the <constant>WRITE</constant> flag instead, and + you don't need the <structfield>get</structfield> callback. </para> <para> @@ -3598,15 +3596,15 @@ struct _snd_pcm_runtime { <constant>VOLATILE</constant> flag should be given. This means that the control may be changed without <link linkend="control-interface-change-notification"><citetitle> - notification</citetitle></link>. Applications should poll such + notification</citetitle></link>. Applications should poll such a control constantly. </para> <para> When the control is inactive, set - <constant>INACTIVE</constant> flag, too. + the <constant>INACTIVE</constant> flag, too. There are <constant>LOCK</constant> and - <constant>OWNER</constant> flags for changing the write + <constant>OWNER</constant> flags to change the write permissions. </para> @@ -3619,10 +3617,10 @@ struct _snd_pcm_runtime { <title>info callback</title> <para> The <structfield>info</structfield> callback is used to get - the detailed information of this control. This must store the + detailed information on this control. This must store the values of the given struct <structname>snd_ctl_elem_info</structname> object. For example, for a boolean control with a single - element will be: + element: <example> <title>Example of info callback</title> @@ -3653,7 +3651,7 @@ struct _snd_pcm_runtime { volume would have count = 2. The <structfield>value</structfield> field is a union, and the values stored are depending on the type. The boolean and - integer are identical. + integer types are identical. 
</para> <para> @@ -3684,7 +3682,7 @@ struct _snd_pcm_runtime { </para> <para> - Some common info callbacks are prepared for easy use: + Some common info callbacks are available for your convenience: <function>snd_ctl_boolean_mono_info()</function> and <function>snd_ctl_boolean_stereo_info()</function>. Obviously, the former is an info callback for a mono channel @@ -3699,7 +3697,7 @@ struct _snd_pcm_runtime { <para> This callback is used to read the current value of the - control and to return to the user-space. + control and to return to user-space. </para> <para> @@ -3722,11 +3720,11 @@ struct _snd_pcm_runtime { </para> <para> - The <structfield>value</structfield> field is depending on - the type of control as well as on info callback. For example, + The <structfield>value</structfield> field depends on + the type of control as well as on the info callback. For example, the sb driver uses this field to store the register offset, the bit-shift and the bit-mask. The - <structfield>private_value</structfield> is set like + <structfield>private_value</structfield> field is set as follows: <informalexample> <programlisting> <![CDATA[ @@ -3752,7 +3750,8 @@ struct _snd_pcm_runtime { </para> <para> - In <structfield>get</structfield> callback, you have to fill all the elements if the + In the <structfield>get</structfield> callback, + you have to fill all the elements if the control has more than one elements, i.e. <structfield>count</structfield> > 1. In the example above, we filled only one element @@ -3765,7 +3764,7 @@ struct _snd_pcm_runtime { <title>put callback</title> <para> - This callback is used to write a value from the user-space. + This callback is used to write a value from user-space. 
</para> <para> @@ -3799,7 +3798,7 @@ struct _snd_pcm_runtime { </para> <para> - Like <structfield>get</structfield> callback, + As in the <structfield>get</structfield> callback, when the control has more than one elements, all elements must be evaluated in this callback, too. </para> @@ -3817,7 +3816,7 @@ struct _snd_pcm_runtime { <title>Constructor</title> <para> When everything is ready, finally we can create a new - control. For creating a control, there are two functions to be + control. To create a control, there are two functions to be called, <function>snd_ctl_new1()</function> and <function>snd_ctl_add()</function>. </para> @@ -3839,14 +3838,14 @@ struct _snd_pcm_runtime { struct <structname>snd_kcontrol_new</structname> object defined above, and chip is the object pointer to be passed to kcontrol->private_data - which can be referred in callbacks. + which can be referred to in callbacks. </para> <para> <function>snd_ctl_new1()</function> allocates a new <structname>snd_kcontrol</structname> instance (that's why the definition of <parameter>my_control</parameter> can be with - <parameter>__devinitdata</parameter> + the <parameter>__devinitdata</parameter> prefix), and <function>snd_ctl_add</function> assigns the given control component to the card. </para> @@ -3941,7 +3940,7 @@ struct _snd_pcm_runtime { <title>General</title> <para> The ALSA AC97 codec layer is a well-defined one, and you don't - have to write many codes to control it. Only low-level control + have to write much code to control it. Only low-level control routines are necessary. The AC97 codec API is defined in <filename><sound/ac97_codec.h></filename>. 
</para> @@ -4004,7 +4003,7 @@ struct _snd_pcm_runtime { <section id="api-ac97-constructor"> <title>Constructor</title> <para> - For creating an ac97 instance, first call <function>snd_ac97_bus</function> + To create an ac97 instance, first call <function>snd_ac97_bus</function> with an <type>ac97_bus_ops_t</type> record with callback functions. <informalexample> @@ -4042,12 +4041,12 @@ struct _snd_pcm_runtime { </programlisting> </informalexample> - where chip->ac97 is the pointer of a newly created + where chip->ac97 is a pointer to a newly created <type>ac97_t</type> instance. In this case, the chip pointer is set as the private data, so that the read/write callback functions can refer to this chip instance. This instance is not necessarily stored in the chip - record. When you need to change the register values from the + record. If you need to change the register values from the driver, or need the suspend/resume of ac97 codecs, keep this pointer to pass to the corresponding functions. </para> @@ -4098,7 +4097,7 @@ struct _snd_pcm_runtime { </para> <para> - These callbacks are non-atomic like the callbacks of control API. + These callbacks are non-atomic like the control API callbacks. </para> <para> @@ -4110,14 +4109,14 @@ struct _snd_pcm_runtime { <para> The <structfield>reset</structfield> callback is used to reset - the codec. If the chip requires a special way of reset, you can + the codec. If the chip requires a special kind of reset, you can define this callback. </para> <para> - The <structfield>wait</structfield> callback is used for a - certain wait at the standard initialization of the codec. If the - chip requires the extra wait-time, define this callback. + The <structfield>wait</structfield> callback is used to + add some waiting time in the standard initialization of the codec. If the + chip requires the extra waiting time, define this callback. 
</para> <para> @@ -4172,7 +4171,7 @@ struct _snd_pcm_runtime { <para> <function>snd_ac97_update_bits()</function> is used to update - some bits of the given register. + some bits in the given register. <informalexample> <programlisting> @@ -4185,7 +4184,7 @@ struct _snd_pcm_runtime { <para> Also, there is a function to change the sample rate (of a - certain register such as + given register such as <constant>AC97_PCM_FRONT_DAC_RATE</constant>) when VRA or DRA is supported by the codec: <function>snd_ac97_set_rate()</function>. @@ -4200,11 +4199,11 @@ struct _snd_pcm_runtime { </para> <para> - The following registers are available for setting the rate: + The following registers are available to set the rate: <constant>AC97_PCM_MIC_ADC_RATE</constant>, <constant>AC97_PCM_FRONT_DAC_RATE</constant>, <constant>AC97_PCM_LR_ADC_RATE</constant>, - <constant>AC97_SPDIF</constant>. When the + <constant>AC97_SPDIF</constant>. When <constant>AC97_SPDIF</constant> is specified, the register is not really changed but the corresponding IEC958 status bits will be updated. @@ -4214,12 +4213,11 @@ struct _snd_pcm_runtime { <section id="api-ac97-clock-adjustment"> <title>Clock Adjustment</title> <para> - On some chip, the clock of the codec isn't 48000 but using a + In some chips, the clock of the codec isn't 48000 but using a PCI clock (to save a quartz!). In this case, change the field bus->clock to the corresponding value. For example, intel8x0 - and es1968 drivers have the auto-measurement function of the - clock. + and es1968 drivers have their own function to read from the clock. </para> </section> @@ -4239,15 +4237,13 @@ struct _snd_pcm_runtime { When there are several codecs on the same card, you need to call <function>snd_ac97_mixer()</function> multiple times with ac97.num=1 or greater. The <structfield>num</structfield> field - specifies the codec - number. + specifies the codec number. 
</para> <para> - If you have set up multiple codecs, you need to either write + If you set up multiple codecs, you either need to write different callbacks for each codec or check - ac97->num in the - callback routines. + ac97->num in the callback routines. </para> </section> @@ -4271,7 +4267,7 @@ struct _snd_pcm_runtime { </para> <para> - Some soundchips have similar but a little bit different + Some soundchips have a similar but slightly different implementation of mpu401 stuff. For example, emu10k1 has its own mpu401 routines. </para> @@ -4280,7 +4276,7 @@ struct _snd_pcm_runtime { <section id="midi-interface-constructor"> <title>Constructor</title> <para> - For creating a rawmidi object, call + To create a rawmidi object, call <function>snd_mpu401_uart_new()</function>. <informalexample> @@ -4307,25 +4303,24 @@ struct _snd_pcm_runtime { </para> <para> - The 4th argument is the i/o port address. Many - backward-compatible MPU401 has an i/o port such as 0x330. Or, it - might be a part of its own PCI i/o region. It depends on the + The 4th argument is the I/O port address. Many + backward-compatible MPU401 have an I/O port such as 0x330. Or, it + might be a part of its own PCI I/O region. It depends on the chip design. </para> <para> - The 5th argument is bitflags for additional information. - When the i/o port address above is a part of the PCI i/o - region, the MPU401 i/o port might have been already allocated + The 5th argument is a bitflag for additional information. + When the I/O port address above is part of the PCI I/O + region, the MPU401 I/O port might have been already allocated (reserved) by the driver itself. In such a case, pass a bit flag <constant>MPU401_INFO_INTEGRATED</constant>, - and - the mpu401-uart layer will allocate the i/o ports by itself. + and the mpu401-uart layer will allocate the I/O ports by itself. 
</para> <para> When the controller supports only the input or output MIDI stream, - pass <constant>MPU401_INFO_INPUT</constant> or + pass the <constant>MPU401_INFO_INPUT</constant> or <constant>MPU401_INFO_OUTPUT</constant> bitflag, respectively. Then the rawmidi instance is created as a single stream. </para> @@ -4333,7 +4328,7 @@ struct _snd_pcm_runtime { <para> <constant>MPU401_INFO_MMIO</constant> bitflag is used to change the access method to MMIO (via readb and writeb) instead of - iob and outb. In this case, you have to pass the iomapped address + iob and outb. In this case, you have to pass the iomapped address to <function>snd_mpu401_uart_new()</function>. </para> @@ -4341,7 +4336,7 @@ struct _snd_pcm_runtime { When <constant>MPU401_INFO_TX_IRQ</constant> is set, the output stream isn't checked in the default interrupt handler. The driver needs to call <function>snd_mpu401_uart_interrupt_tx()</function> - by itself to start processing the output stream in irq handler. + by itself to start processing the output stream in the irq handler. </para> <para> @@ -4381,7 +4376,7 @@ struct _snd_pcm_runtime { (<parameter>irq_flags</parameter>). Otherwise, pass the flags for irq allocation (<constant>SA_XXX</constant> bits) to it, and the irq will be - reserved by the mpu401-uart layer. If the card doesn't generates + reserved by the mpu401-uart layer. If the card doesn't generate UART interrupts, pass -1 as the irq number. Then a timer interrupt will be invoked for polling. </para> @@ -4392,8 +4387,8 @@ struct _snd_pcm_runtime { <para> When the interrupt is allocated in <function>snd_mpu401_uart_new()</function>, the private - interrupt handler is used, hence you don't have to do nothing - else than creating the mpu401 stuff. Otherwise, you have to call + interrupt handler is used, hence you don't have anything else to do + than creating the mpu401 stuff. 
Otherwise, you have to call <function>snd_mpu401_uart_interrupt()</function> explicitly when a UART interrupt is invoked and checked in your own interrupt handler. @@ -4480,8 +4475,8 @@ struct _snd_pcm_runtime { <para> The fourth and fifth arguments are the number of output and - input substreams, respectively, of this device. (A substream is - the equivalent of a MIDI port.) + input substreams, respectively, of this device (a substream is + the equivalent of a MIDI port). </para> <para> @@ -4498,7 +4493,7 @@ struct _snd_pcm_runtime { <para> After the rawmidi device is created, you need to set the operators (callbacks) for each substream. There are helper - functions to set the operators for all substream of a device: + functions to set the operators for all the substreams of a device: <informalexample> <programlisting> <![CDATA[ @@ -4528,8 +4523,8 @@ struct _snd_pcm_runtime { </para> <para> - If there is more than one substream, you should give each one a - unique name: + If there are more than one substream, you should give a + unique name to each of them: <informalexample> <programlisting> <![CDATA[ @@ -4550,7 +4545,7 @@ struct _snd_pcm_runtime { <title>Callbacks</title> <para> - In all callbacks, the private data that you've set for the + In all the callbacks, the private data that you've set for the rawmidi device can be accessed as substream->rmidi->private_data. <!-- <code> isn't available before DocBook 4.3 --> @@ -4583,8 +4578,8 @@ struct _snd_pcm_runtime { <para> This is called when a substream is opened. - You can initialize the hardware here, but you should not yet - start transmitting/receiving data. + You can initialize the hardware here, but you shouldn't + start transmitting/receiving data yet. </para> </section> @@ -4632,9 +4627,9 @@ struct _snd_pcm_runtime { To read data from the buffer, call <function>snd_rawmidi_transmit_peek</function>. 
It will return the number of bytes that have been read; this will be - less than the number of bytes requested when there is no more + less than the number of bytes requested when there are no more data in the buffer. - After the data has been transmitted successfully, call + After the data have been transmitted successfully, call <function>snd_rawmidi_transmit_ack</function> to remove the data from the substream buffer: <informalexample> @@ -4655,7 +4650,7 @@ struct _snd_pcm_runtime { <para> If you know beforehand that the hardware will accept data, you can use the <function>snd_rawmidi_transmit</function> function - which reads some data and removes it from the buffer at once: + which reads some data and removes them from the buffer at once: <informalexample> <programlisting> <![CDATA[ @@ -4749,13 +4744,13 @@ struct _snd_pcm_runtime { <para> This is only used with output substreams. This function should wait - until all data read from the substream buffer has been transmitted. + until all data read from the substream buffer have been transmitted. This ensures that the device can be closed and the driver unloaded without losing data. </para> <para> - This callback is optional. If you do not set + This callback is optional. If you do not set <structfield>drain</structfield> in the struct snd_rawmidi_ops structure, ALSA will simply wait for 50 milliseconds instead. @@ -4775,24 +4770,24 @@ struct _snd_pcm_runtime { <section id="misc-devices-opl3"> <title>FM OPL3</title> <para> - The FM OPL3 is still used on many chips (mainly for backward + The FM OPL3 is still used in many chips (mainly for backward compatibility). ALSA has a nice OPL3 FM control layer, too. The OPL3 API is defined in <filename><sound/opl3.h></filename>. </para> <para> - FM registers can be directly accessed through direct-FM API, + FM registers can be directly accessed through the direct-FM API, defined in <filename><sound/asound_fm.h></filename>. 
In ALSA native mode, FM registers are accessed through - Hardware-Dependant Device direct-FM extension API, whereas in - OSS compatible mode, FM registers can be accessed with OSS - direct-FM compatible API on <filename>/dev/dmfmX</filename> device. + the Hardware-Dependant Device direct-FM extension API, whereas in + OSS compatible mode, FM registers can be accessed with the OSS + direct-FM compatible API in <filename>/dev/dmfmX</filename> device. </para> <para> - For creating the OPL3 component, you have two functions to - call. The first one is a constructor for <type>opl3_t</type> + To create the OPL3 component, you have two functions to + call. The first one is a constructor for the <type>opl3_t</type> instance. <informalexample> @@ -4819,12 +4814,12 @@ struct _snd_pcm_runtime { <para> When the left and right ports have been already allocated by the card driver, pass non-zero to the fifth argument - (<parameter>integrated</parameter>). Otherwise, opl3 module will + (<parameter>integrated</parameter>). Otherwise, the opl3 module will allocate the specified ports by itself. </para> <para> - When the accessing to the hardware requires special method + When the accessing the hardware requires special method instead of the standard I/O access, you can create opl3 instance separately with <function>snd_opl3_new()</function>. @@ -4845,13 +4840,13 @@ struct _snd_pcm_runtime { access function, the private data and the destructor. The l_port and r_port are not necessarily set. Only the command must be set properly. You can retrieve the data - from opl3->private_data field. + from the opl3->private_data field. </para> <para> After creating the opl3 instance via <function>snd_opl3_new()</function>, call <function>snd_opl3_init()</function> to initialize the chip to the - proper state. Note that <function>snd_opl3_create()</function> always + proper state. Note that <function>snd_opl3_create()</function> always calls it internally. 
</para> @@ -4884,7 +4879,7 @@ struct _snd_pcm_runtime { <section id="misc-devices-hardware-dependent"> <title>Hardware-Dependent Devices</title> <para> - Some chips need the access from the user-space for special + Some chips need user-space access for special controls or for loading the micro code. In such a case, you can create a hwdep (hardware-dependent) device. The hwdep API is defined in <filename><sound/hwdep.h></filename>. You can @@ -4893,7 +4888,7 @@ struct _snd_pcm_runtime { </para> <para> - Creation of the <type>hwdep</type> instance is done via + The creation of the <type>hwdep</type> instance is done via <function>snd_hwdep_new()</function>. <informalexample> @@ -4912,8 +4907,8 @@ struct _snd_pcm_runtime { You can then pass any pointer value to the <parameter>private_data</parameter>. If you assign a private data, you should define the - destructor, too. The destructor function is set to - <structfield>private_free</structfield> field. + destructor, too. The destructor function is set in + the <structfield>private_free</structfield> field. <informalexample> <programlisting> @@ -4925,7 +4920,7 @@ struct _snd_pcm_runtime { </programlisting> </informalexample> - and the implementation of destructor would be: + and the implementation of the destructor would be: <informalexample> <programlisting> @@ -4943,7 +4938,7 @@ struct _snd_pcm_runtime { <para> The arbitrary file operations can be defined for this instance. The file operators are defined in - <parameter>ops</parameter> table. For example, assume that + the <parameter>ops</parameter> table. For example, assume that this chip needs an ioctl. <informalexample> @@ -4964,7 +4959,7 @@ struct _snd_pcm_runtime { <title>IEC958 (S/PDIF)</title> <para> Usually the controls for IEC958 devices are implemented via - control interface. There is a macro to compose a name string for + the control interface. 
There is a macro to compose a name string for IEC958 controls, <function>SNDRV_CTL_NAME_IEC958()</function> defined in <filename><include/asound.h></filename>. </para> @@ -4973,7 +4968,7 @@ struct _snd_pcm_runtime { There are some standard controls for IEC958 status bits. These controls use the type <type>SNDRV_CTL_ELEM_TYPE_IEC958</type>, and the size of element is fixed as 4 bytes array - (value.iec958.status[x]). For <structfield>info</structfield> + (value.iec958.status[x]). For the <structfield>info</structfield> callback, you don't specify the value field for this type (the count field must be set, though). @@ -5001,7 +4996,7 @@ struct _snd_pcm_runtime { enable/disable or to set the raw bit mode. The implementation will depend on the chip, but the control should be named as <quote>IEC958 xxx</quote>, preferably using - <function>SNDRV_CTL_NAME_IEC958()</function> macro. + the <function>SNDRV_CTL_NAME_IEC958()</function> macro. </para> <para> @@ -5036,12 +5031,12 @@ struct _snd_pcm_runtime { The allocation of pages with fallback is <function>snd_malloc_xxx_pages_fallback()</function>. This function tries to allocate the specified pages but if the pages - are not available, it tries to reduce the page sizes until the + are not available, it tries to reduce the page sizes until enough space is found. </para> <para> - For releasing the space, call + The release the pages, call <function>snd_free_xxx_pages()</function> function. </para> @@ -5050,8 +5045,8 @@ struct _snd_pcm_runtime { a large contiguous physical space at the time the module is loaded for the later use. This is called <quote>pre-allocation</quote>. - As already written, you can call the following function at the - construction of pcm instance (in the case of PCI bus). + As already written, you can call the following function at + pcm instance construction time (in the case of PCI bus). 
<informalexample> <programlisting> @@ -5063,34 +5058,34 @@ struct _snd_pcm_runtime { </informalexample> where <parameter>size</parameter> is the byte size to be - pre-allocated and the <parameter>max</parameter> is the maximal - size to be changed via <filename>prealloc</filename> proc file. - The allocator will try to get as large area as possible + pre-allocated and the <parameter>max</parameter> is the maximum + size to be changed via the <filename>prealloc</filename> proc file. + The allocator will try to get an area as large as possible within the given size. </para> <para> The second argument (type) and the third argument (device pointer) are dependent on the bus. - In the case of ISA bus, pass <function>snd_dma_isa_data()</function> + In the case of the ISA bus, pass <function>snd_dma_isa_data()</function> as the third argument with <constant>SNDRV_DMA_TYPE_DEV</constant> type. For the continuous buffer unrelated to the bus can be pre-allocated with <constant>SNDRV_DMA_TYPE_CONTINUOUS</constant> type and the <function>snd_dma_continuous_data(GFP_KERNEL)</function> device pointer, - whereh <constant>GFP_KERNEL</constant> is the kernel allocation flag to + where <constant>GFP_KERNEL</constant> is the kernel allocation flag to use. For the SBUS, <constant>SNDRV_DMA_TYPE_SBUS</constant> and <function>snd_dma_sbus_data(sbus_dev)</function> are used instead. For the PCI scatter-gather buffers, use <constant>SNDRV_DMA_TYPE_DEV_SG</constant> with <function>snd_dma_pci_data(pci)</function> - (see the section + (see the <link linkend="buffer-and-memory-non-contiguous"><citetitle>Non-Contiguous Buffers - </citetitle></link>). + </citetitle></link> section). 
</para> <para> - Once when the buffer is pre-allocated, you can use the - allocator in the <structfield>hw_params</structfield> callback + Once the buffer is pre-allocated, you can use the + allocator in the <structfield>hw_params</structfield> callback: <informalexample> <programlisting> @@ -5116,8 +5111,8 @@ struct _snd_pcm_runtime { </para> <para> - The first case works fine if the external hardware buffer is enough - large. This method doesn't need any extra buffers and thus is + The first case works fine if the external hardware buffer is large + enough. This method doesn't need any extra buffers and thus is more effective. You need to define the <structfield>copy</structfield> and <structfield>silence</structfield> callbacks for @@ -5127,25 +5122,25 @@ struct _snd_pcm_runtime { </para> <para> - The second case allows the mmap of the buffer, although you have - to handle an interrupt or a tasklet for transferring the data + The second case allows for mmap on the buffer, although you have + to handle an interrupt or a tasklet to transfer the data from the intermediate buffer to the hardware buffer. You can find an - example in vxpocket driver. + example in the vxpocket driver. </para> <para> - Another case is that the chip uses a PCI memory-map + Another case is when the chip uses a PCI memory-map region for the buffer instead of the host memory. In this case, - mmap is available only on certain architectures like intel. In - non-mmap mode, the data cannot be transferred as the normal - way. Thus you need to define <structfield>copy</structfield> and - <structfield>silence</structfield> callbacks as well + mmap is available only on certain architectures like the Intel one. + In non-mmap mode, the data cannot be transferred as in the normal + way. Thus you need to define the <structfield>copy</structfield> and + <structfield>silence</structfield> callbacks as well, as in the cases above. 
The examples are found in <filename>rme32.c</filename> and <filename>rme96.c</filename>. </para> <para> - The implementation of <structfield>copy</structfield> and + The implementation of the <structfield>copy</structfield> and <structfield>silence</structfield> callbacks depends upon whether the hardware supports interleaved or non-interleaved samples. The <structfield>copy</structfield> callback is @@ -5184,8 +5179,8 @@ struct _snd_pcm_runtime { <para> What you have to do in this callback is again different - between playback and capture directions. In the case of - playback, you do: copy the given amount of data + between playback and capture directions. In the + playback case, you copy the given amount of data (<parameter>count</parameter>) at the specified pointer (<parameter>src</parameter>) to the specified offset (<parameter>pos</parameter>) on the hardware buffer. When @@ -5202,7 +5197,7 @@ struct _snd_pcm_runtime { </para> <para> - For the capture direction, you do: copy the given amount of + For the capture direction, you copy the given amount of data (<parameter>count</parameter>) at the specified offset (<parameter>pos</parameter>) on the hardware buffer to the specified pointer (<parameter>dst</parameter>). @@ -5216,7 +5211,7 @@ struct _snd_pcm_runtime { </programlisting> </informalexample> - Note that both of the position and the data amount are given + Note that both the position and the amount of data are given in frames. </para> @@ -5247,7 +5242,7 @@ struct _snd_pcm_runtime { </para> <para> - The meanings of arguments are identical with the + The meanings of arguments are the same as in the <structfield>copy</structfield> callback, although there is no <parameter>src/dst</parameter> argument. 
In the case of interleaved samples, the channel @@ -5284,8 +5279,8 @@ struct _snd_pcm_runtime { <section id="buffer-and-memory-non-contiguous"> <title>Non-Contiguous Buffers</title> <para> - If your hardware supports the page table like emu10k1 or the - buffer descriptors like via82xx, you can use the scatter-gather + If your hardware supports the page table as in emu10k1 or the + buffer descriptors as in via82xx, you can use the scatter-gather (SG) DMA. ALSA provides an interface for handling SG-buffers. The API is provided in <filename><sound/pcm.h></filename>. </para> @@ -5296,7 +5291,7 @@ struct _snd_pcm_runtime { <function>snd_pcm_lib_preallocate_pages_for_all()</function> with <constant>SNDRV_DMA_TYPE_DEV_SG</constant> in the PCM constructor like other PCI pre-allocator. - You need to pass the <function>snd_dma_pci_data(pci)</function>, + You need to pass <function>snd_dma_pci_data(pci)</function>, where pci is the struct <structname>pci_dev</structname> pointer of the chip as well. The <type>struct snd_sg_buf</type> instance is created as @@ -5314,7 +5309,7 @@ struct _snd_pcm_runtime { <para> Then call <function>snd_pcm_lib_malloc_pages()</function> - in <structfield>hw_params</structfield> callback + in the <structfield>hw_params</structfield> callback as well as in the case of normal PCI buffer. The SG-buffer handler will allocate the non-contiguous kernel pages of the given size and map them onto the virtually contiguous @@ -5335,7 +5330,7 @@ struct _snd_pcm_runtime { </para> <para> - For releasing the data, call + To release the data, call <function>snd_pcm_lib_free_pages()</function> in the <structfield>hw_free</structfield> callback as usual. </para> @@ -5390,7 +5385,7 @@ struct _snd_pcm_runtime { </para> <para> - For creating a proc file, call + To create a proc file, call <function>snd_card_proc_new()</function>. 
<informalexample> @@ -5402,7 +5397,7 @@ struct _snd_pcm_runtime { </programlisting> </informalexample> - where the second argument specifies the proc-file name to be + where the second argument specifies the name of the proc file to be created. The above example will create a file <filename>my-file</filename> under the card directory, e.g. <filename>/proc/asound/card0/my-file</filename>. @@ -5417,8 +5412,8 @@ struct _snd_pcm_runtime { <para> When the creation is successful, the function stores a new - instance at the pointer given in the third argument. - It is initialized as a text proc file for read only. For using + instance in the pointer given in the third argument. + It is initialized as a text proc file for read only. To use this proc file as a read-only text file as it is, set the read callback with a private data via <function>snd_info_set_text_ops()</function>. @@ -5470,9 +5465,9 @@ struct _snd_pcm_runtime { </para> <para> - The file permission can be changed afterwards. As default, it's - set as read only for all users. If you want to add the write - permission to the user (root as default), set like below: + The file permissions can be changed afterwards. As default, it's + set as read only for all users. If you want to add write + permission for the user (root as default), do as follows: <informalexample> <programlisting> @@ -5503,7 +5498,7 @@ struct _snd_pcm_runtime { </para> <para> - For a raw-data proc-file, set the attributes like the following: + For a raw-data proc-file, set the attributes as follows: <informalexample> <programlisting> @@ -5524,7 +5519,7 @@ struct _snd_pcm_runtime { <para> The callback is much more complicated than the text-file - version. You need to use a low-level i/o functions such as + version. You need to use a low-level I/O functions such as <function>copy_from/to_user()</function> to transfer the data. 
@@ -5560,28 +5555,28 @@ struct _snd_pcm_runtime { <title>Power Management</title> <para> If the chip is supposed to work with suspend/resume - functions, you need to add the power-management codes to the - driver. The additional codes for the power-management should be + functions, you need to add power-management code to the + driver. The additional code for power-management should be <function>ifdef</function>'ed with <constant>CONFIG_PM</constant>. </para> <para> - If the driver supports the suspend/resume - <emphasis>fully</emphasis>, that is, the device can be - properly resumed to the status at the suspend is called, - you can set <constant>SNDRV_PCM_INFO_RESUME</constant> flag - to pcm info field. Usually, this is possible when the - registers of ths chip can be safely saved and restored to the - RAM. If this is set, the trigger callback is called with - <constant>SNDRV_PCM_TRIGGER_RESUME</constant> after resume - callback is finished. + If the driver <emphasis>fully</emphasis> supports suspend/resume + that is, the device can be + properly resumed to its state when suspend was called, + you can set the <constant>SNDRV_PCM_INFO_RESUME</constant> flag + in the pcm info field. Usually, this is possible when the + registers of the chip can be safely saved and restored to + RAM. If this is set, the trigger callback is called with + <constant>SNDRV_PCM_TRIGGER_RESUME</constant> after the resume + callback completes. </para> <para> - Even if the driver doesn't support PM fully but only the - partial suspend/resume is possible, it's still worthy to - implement suspend/resume callbacks. In such a case, applications + Even if the driver doesn't support PM fully but + partial suspend/resume is still possible, it's still worthy to + implement suspend/resume callbacks. In such a case, applications would reset the status by calling <function>snd_pcm_prepare()</function> and restart the stream appropriately. 
Hence, you can define suspend/resume callbacks @@ -5590,22 +5585,22 @@ struct _snd_pcm_runtime { </para> <para> - Note that the trigger with SUSPEND can be always called when + Note that the trigger with SUSPEND can always be called when <function>snd_pcm_suspend_all</function> is called, - regardless of <constant>SNDRV_PCM_INFO_RESUME</constant> flag. + regardless of the <constant>SNDRV_PCM_INFO_RESUME</constant> flag. The <constant>RESUME</constant> flag affects only the behavior of <function>snd_pcm_resume()</function>. (Thus, in theory, <constant>SNDRV_PCM_TRIGGER_RESUME</constant> isn't needed to be handled in the trigger callback when no <constant>SNDRV_PCM_INFO_RESUME</constant> flag is set. But, - it's better to keep it for compatibility reason.) + it's better to keep it for compatibility reasons.) </para> <para> In the earlier version of ALSA drivers, a common power-management layer was provided, but it has been removed. The driver needs to define the suspend/resume hooks according to - the bus the device is assigned. In the case of PCI driver, the + the bus the device is connected to. In the case of PCI drivers, the callbacks look like below: <informalexample> @@ -5629,7 +5624,7 @@ struct _snd_pcm_runtime { </para> <para> - The scheme of the real suspend job is as following. + The scheme of the real suspend job is as follows. <orderedlist> <listitem><para>Retrieve the card and the chip data.</para></listitem> @@ -5679,11 +5674,11 @@ struct _snd_pcm_runtime { </para> <para> - The scheme of the real resume job is as following. + The scheme of the real resume job is as follows. <orderedlist> <listitem><para>Retrieve the card and the chip data.</para></listitem> - <listitem><para>Set up PCI. First, call <function>pci_restore_state()</function>. + <listitem><para>Set up PCI. First, call <function>pci_restore_state()</function>. Then enable the pci device again by calling <function>pci_enable_device()</function>. 
Call <function>pci_set_master()</function> if necessary, too.</para></listitem> <listitem><para>Re-initialize the chip.</para></listitem> @@ -5734,7 +5729,7 @@ struct _snd_pcm_runtime { <function>snd_pcm_suspend_all()</function> or <function>snd_pcm_suspend()</function>. It means that the PCM streams are already stoppped when the register snapshot is - taken. But, remind that you don't have to restart the PCM + taken. But, remember that you don't have to restart the PCM stream in the resume callback. It'll be restarted via trigger call with <constant>SNDRV_PCM_TRIGGER_RESUME</constant> when necessary. @@ -5795,7 +5790,7 @@ struct _snd_pcm_runtime { </para> <para> - If you need a space for saving the registers, allocate the + If you need a space to save the registers, allocate the buffer for it here, too, since it would be fatal if you cannot allocate a memory in the suspend phase. The allocated buffer should be released in the corresponding @@ -5833,7 +5828,7 @@ struct _snd_pcm_runtime { <title>Module Parameters</title> <para> There are standard module options for ALSA. At least, each - module should have <parameter>index</parameter>, + module should have the <parameter>index</parameter>, <parameter>id</parameter> and <parameter>enable</parameter> options. </para> @@ -5841,8 +5836,8 @@ struct _snd_pcm_runtime { <para> If the module supports multiple cards (usually up to 8 = <constant>SNDRV_CARDS</constant> cards), they should be - arrays. The default initial values are defined already as - constants for ease of programming: + arrays. The default initial values are defined already as + constants for easier programming: <informalexample> <programlisting> @@ -5858,7 +5853,7 @@ struct _snd_pcm_runtime { <para> If the module supports only a single card, they could be single variables, instead. 
<parameter>enable</parameter> option is not - always necessary in this case, but it wouldn't be so bad to have a + always necessary in this case, but it would be better to have a dummy option for compatibility. </para> @@ -5923,22 +5918,22 @@ struct _snd_pcm_runtime { </para> <para> - Suppose that you'll create a new PCI driver for the card + Suppose that you create a new PCI driver for the card <quote>xyz</quote>. The card module name would be - snd-xyz. The new driver is usually put into alsa-driver + snd-xyz. The new driver is usually put into the alsa-driver tree, <filename>alsa-driver/pci</filename> directory in the case of PCI cards. Then the driver is evaluated, audited and tested by developers and users. After a certain time, the driver - will go to alsa-kernel tree (to the corresponding directory, + will go to the alsa-kernel tree (to the corresponding directory, such as <filename>alsa-kernel/pci</filename>) and eventually - integrated into Linux 2.6 tree (the directory would be + will be integrated into the Linux 2.6 tree (the directory would be <filename>linux/sound/pci</filename>). </para> <para> In the following sections, the driver code is supposed - to be put into alsa-driver tree. The two cases are assumed: + to be put into alsa-driver tree. The two cases are covered: a driver consisting of a single source file and one consisting of several source files. </para> @@ -6033,7 +6028,7 @@ struct _snd_pcm_runtime { <listitem> <para> Add a new directory (<filename>xyz</filename>) in - <filename>alsa-driver/pci/Makefile</filename> like below + <filename>alsa-driver/pci/Makefile</filename> as below <informalexample> <programlisting> @@ -6102,7 +6097,7 @@ struct _snd_pcm_runtime { <section id="useful-functions-snd-printk"> <title><function>snd_printk()</function> and friends</title> <para> - ALSA provides a verbose version of + ALSA provides a verbose version of the <function>printk()</function> function. 
If a kernel config <constant>CONFIG_SND_VERBOSE_PRINTK</constant> is set, this function prints the given message together with the file name @@ -6170,7 +6165,7 @@ struct _snd_pcm_runtime { <section id="useful-functions-snd-bug"> <title><function>snd_BUG()</function></title> <para> - It shows <computeroutput>BUG?</computeroutput> message and + It shows the <computeroutput>BUG?</computeroutput> message and stack trace as well as <function>snd_assert</function> at the point. It's useful to show that a fatal error happens there. </para> @@ -6199,6 +6194,4 @@ struct _snd_pcm_runtime { in the hardware constraints section. </para> </chapter> - - </book> diff --git a/Documentation/sound/alsa/soc/DAI.txt b/Documentation/sound/alsa/soc/DAI.txt index 3feeb9ecdec4..0ebd7ea9706c 100644 --- a/Documentation/sound/alsa/soc/DAI.txt +++ b/Documentation/sound/alsa/soc/DAI.txt @@ -1,5 +1,5 @@ ASoC currently supports the three main Digital Audio Interfaces (DAI) found on -SoC controllers and portable audio CODECS today, namely AC97, I2S and PCM. +SoC controllers and portable audio CODECs today, namely AC97, I2S and PCM. AC97 @@ -25,7 +25,7 @@ left/right clock (LRC) synchronise the link. I2S is flexible in that either the controller or CODEC can drive (master) the BCLK and LRC clock lines. Bit clock usually varies depending on the sample rate and the master system clock (SYSCLK). LRCLK is the same as the sample rate. A few devices support separate -ADC and DAC LRCLK's, this allows for simultaneous capture and playback at +ADC and DAC LRCLKs, this allows for simultaneous capture and playback at different sample rates. I2S has several different operating modes:- @@ -35,7 +35,7 @@ I2S has several different operating modes:- o Left Justified - MSB is transmitted on transition of LRC. - o Right Justified - MSB is transmitted sample size BCLK's before LRC + o Right Justified - MSB is transmitted sample size BCLKs before LRC transition. 
PCM diff --git a/Documentation/sound/alsa/soc/clocking.txt b/Documentation/sound/alsa/soc/clocking.txt index 14930887c25f..b1300162e01c 100644 --- a/Documentation/sound/alsa/soc/clocking.txt +++ b/Documentation/sound/alsa/soc/clocking.txt @@ -13,7 +13,7 @@ or SYSCLK). This audio master clock can be derived from a number of sources (e.g. crystal, PLL, CPU clock) and is responsible for producing the correct audio playback and capture sample rates. -Some master clocks (e.g. PLL's and CPU based clocks) are configurable in that +Some master clocks (e.g. PLLs and CPU based clocks) are configurable in that their speed can be altered by software (depending on the system use and to save power). Other master clocks are fixed at a set frequency (i.e. crystals). @@ -41,11 +41,11 @@ BCLK = LRC * x BCLK = LRC * Channels * Word Size This relationship depends on the codec or SoC CPU in particular. In general -it's best to configure BCLK to the lowest possible speed (depending on your -rate, number of channels and wordsize) to save on power. +it is best to configure BCLK to the lowest possible speed (depending on your +rate, number of channels and word size) to save on power. -It's also desirable to use the codec (if possible) to drive (or master) the -audio clocks as it's usually gives more accurate sample rates than the CPU. +It is also desirable to use the codec (if possible) to drive (or master) the +audio clocks as it usually gives more accurate sample rates than the CPU. diff --git a/Documentation/sound/alsa/soc/codec.txt b/Documentation/sound/alsa/soc/codec.txt index 1e766ad0ebd1..1e95342ed72e 100644 --- a/Documentation/sound/alsa/soc/codec.txt +++ b/Documentation/sound/alsa/soc/codec.txt @@ -9,7 +9,7 @@ code should be added to the platform and machine drivers respectively. 
Each codec driver *must* provide the following features:- 1) Codec DAI and PCM configuration - 2) Codec control IO - using I2C, 3 Wire(SPI) or both API's + 2) Codec control IO - using I2C, 3 Wire(SPI) or both APIs 3) Mixers and audio controls 4) Codec audio operations @@ -19,7 +19,7 @@ Optionally, codec drivers can also provide:- 6) DAPM event handler. 7) DAC Digital mute control. -It's probably best to use this guide in conjunction with the existing codec +Its probably best to use this guide in conjunction with the existing codec driver code in sound/soc/codecs/ ASoC Codec driver breakdown @@ -27,8 +27,8 @@ ASoC Codec driver breakdown 1 - Codec DAI and PCM configuration ----------------------------------- -Each codec driver must have a struct snd_soc_codec_dai to define it's DAI and -PCM's capabilities and operations. This struct is exported so that it can be +Each codec driver must have a struct snd_soc_codec_dai to define its DAI and +PCM capabilities and operations. This struct is exported so that it can be registered with the core by your machine driver. e.g. @@ -67,18 +67,18 @@ EXPORT_SYMBOL_GPL(wm8731_dai); 2 - Codec control IO -------------------- -The codec can usually be controlled via an I2C or SPI style interface (AC97 -combines control with data in the DAI). The codec drivers will have to provide -functions to read and write the codec registers along with supplying a register -cache:- +The codec can usually be controlled via an I2C or SPI style interface +(AC97 combines control with data in the DAI). The codec drivers provide +functions to read and write the codec registers along with supplying a +register cache:- /* IO control data and register cache */ - void *control_data; /* codec control (i2c/3wire) data */ - void *reg_cache; + void *control_data; /* codec control (i2c/3wire) data */ + void *reg_cache; -Codec read/write should do any data formatting and call the hardware read write -below to perform the IO. 
These functions are called by the core and alsa when -performing DAPM or changing the mixer:- +Codec read/write should do any data formatting and call the hardware +read write below to perform the IO. These functions are called by the +core and ALSA when performing DAPM or changing the mixer:- unsigned int (*read)(struct snd_soc_codec *, unsigned int); int (*write)(struct snd_soc_codec *, unsigned int, unsigned int); @@ -131,7 +131,7 @@ Defines a stereo enumerated control 4 - Codec Audio Operations -------------------------- -The codec driver also supports the following alsa operations:- +The codec driver also supports the following ALSA operations:- /* SoC audio ops */ struct snd_soc_ops { @@ -142,15 +142,15 @@ struct snd_soc_ops { int (*prepare)(struct snd_pcm_substream *); }; -Please refer to the alsa driver PCM documentation for details. +Please refer to the ALSA driver PCM documentation for details. http://www.alsa-project.org/~iwai/writing-an-alsa-driver/c436.htm 5 - DAPM description. --------------------- -The Dynamic Audio Power Management description describes the codec's power -components, their relationships and registers to the ASoC core. Please read -dapm.txt for details of building the description. +The Dynamic Audio Power Management description describes the codec power +components and their relationships and registers to the ASoC core. +Please read dapm.txt for details of building the description. Please also see the examples in other codec drivers. @@ -158,8 +158,8 @@ Please also see the examples in other codec drivers. 6 - DAPM event handler ---------------------- This function is a callback that handles codec domain PM calls and system -domain PM calls (e.g. suspend and resume). It's used to put the codec to sleep -when not in use. +domain PM calls (e.g. suspend and resume). It is used to put the codec +to sleep when not in use. 
Power states:- @@ -175,13 +175,14 @@ Power states:- SNDRV_CTL_POWER_D3cold: /* Everything Off, without power */ -7 - Codec DAC digital mute control. ------------------------------------- -Most codecs have a digital mute before the DAC's that can be used to minimise -any system noise. The mute stops any digital data from entering the DAC. +7 - Codec DAC digital mute control +---------------------------------- +Most codecs have a digital mute before the DACs that can be used to +minimise any system noise. The mute stops any digital data from +entering the DAC. -A callback can be created that is called by the core for each codec DAI when the -mute is applied or freed. +A callback can be created that is called by the core for each codec DAI +when the mute is applied or freed. i.e. diff --git a/Documentation/sound/alsa/soc/dapm.txt b/Documentation/sound/alsa/soc/dapm.txt index ab0766fd7869..c784a18b94dc 100644 --- a/Documentation/sound/alsa/soc/dapm.txt +++ b/Documentation/sound/alsa/soc/dapm.txt @@ -4,20 +4,20 @@ Dynamic Audio Power Management for Portable Devices 1. Description ============== -Dynamic Audio Power Management (DAPM) is designed to allow portable Linux devices -to use the minimum amount of power within the audio subsystem at all times. It -is independent of other kernel PM and as such, can easily co-exist with the -other PM systems. +Dynamic Audio Power Management (DAPM) is designed to allow portable +Linux devices to use the minimum amount of power within the audio +subsystem at all times. It is independent of other kernel PM and as +such, can easily co-exist with the other PM systems. -DAPM is also completely transparent to all user space applications as all power -switching is done within the ASoC core. No code changes or recompiling are -required for user space applications. DAPM makes power switching decisions based -upon any audio stream (capture/playback) activity and audio mixer settings -within the device. 
+DAPM is also completely transparent to all user space applications as +all power switching is done within the ASoC core. No code changes or +recompiling are required for user space applications. DAPM makes power +switching decisions based upon any audio stream (capture/playback) +activity and audio mixer settings within the device. -DAPM spans the whole machine. It covers power control within the entire audio -subsystem, this includes internal codec power blocks and machine level power -systems. +DAPM spans the whole machine. It covers power control within the entire +audio subsystem, this includes internal codec power blocks and machine +level power systems. There are 4 power domains within DAPM @@ -34,7 +34,7 @@ There are 4 power domains within DAPM Automatically set when mixer and mux settings are changed by the user. e.g. alsamixer, amixer. - 4. Stream domain - DAC's and ADC's. + 4. Stream domain - DACs and ADCs. Enabled and disabled when stream playback/capture is started and stopped respectively. e.g. aplay, arecord. @@ -51,7 +51,7 @@ widgets hereafter. Audio DAPM widgets fall into a number of types:- o Mixer - Mixes several analog signals into a single analog signal. - o Mux - An analog switch that outputs only 1 of it's inputs. + o Mux - An analog switch that outputs only one of many inputs. o PGA - A programmable gain amplifier or attenuation widget. o ADC - Analog to Digital Converter o DAC - Digital to Analog Converter @@ -78,14 +78,14 @@ parameters for stream name and kcontrols. 2.1 Stream Domain Widgets ------------------------- -Stream Widgets relate to the stream power domain and only consist of ADC's -(analog to digital converters) and DAC's (digital to analog converters). +Stream Widgets relate to the stream power domain and only consist of ADCs +(analog to digital converters) and DACs (digital to analog converters). 
Stream widgets have the following format:- SND_SOC_DAPM_DAC(name, stream name, reg, shift, invert), -NOTE: the stream name must match the corresponding stream name in your codecs +NOTE: the stream name must match the corresponding stream name in your codec snd_soc_codec_dai. e.g. stream widgets for HiFi playback and capture @@ -97,7 +97,7 @@ SND_SOC_DAPM_ADC("HiFi ADC", "HiFi Capture", REG, 2, 1), 2.2 Path Domain Widgets ----------------------- -Path domain widgets have a ability to control or effect the audio signal or +Path domain widgets have a ability to control or affect the audio signal or audio paths within the audio subsystem. They have the following form:- SND_SOC_DAPM_PGA(name, reg, shift, invert, controls, num_controls) @@ -149,7 +149,7 @@ SND_SOC_DAPM_MIC("Mic Jack", spitz_mic_bias), 2.4 Codec Domain ---------------- -The Codec power domain has no widgets and is handled by the codecs DAPM event +The codec power domain has no widgets and is handled by the codecs DAPM event handler. This handler is called when the codec powerstate is changed wrt to any stream event or by kernel PM events. @@ -158,8 +158,8 @@ stream event or by kernel PM events. ------------------- Sometimes widgets exist in the codec or machine audio map that don't have any -corresponding register bit for power control. In this case it's necessary to -create a virtual widget - a widget with no control bits e.g. +corresponding soft power control. In this case it is necessary to create +a virtual widget - a widget with no control bits e.g. SND_SOC_DAPM_MIXER("AC97 Mixer", SND_SOC_DAPM_NOPM, 0, 0, NULL, 0), @@ -172,13 +172,14 @@ subsystem individually with a call to snd_soc_dapm_new_control(). 3. Codec Widget Interconnections ================================ -Widgets are connected to each other within the codec and machine by audio -paths (called interconnections). Each interconnection must be defined in order -to create a map of all audio paths between widgets. 
+Widgets are connected to each other within the codec and machine by audio paths +(called interconnections). Each interconnection must be defined in order to +create a map of all audio paths between widgets. + This is easiest with a diagram of the codec (and schematic of the machine audio system), as it requires joining widgets together via their audio signal paths. -i.e. from the WM8731 codec's output mixer (wm8731.c) +e.g., from the WM8731 output mixer (wm8731.c) The WM8731 output mixer has 3 inputs (sources) diff --git a/Documentation/sound/alsa/soc/machine.txt b/Documentation/sound/alsa/soc/machine.txt index 72bd222f2a21..f370e7db86af 100644 --- a/Documentation/sound/alsa/soc/machine.txt +++ b/Documentation/sound/alsa/soc/machine.txt @@ -16,7 +16,7 @@ struct snd_soc_machine { int (*remove)(struct platform_device *pdev); /* the pre and post PM functions are used to do any PM work before and - * after the codec and DAI's do any PM work. */ + * after the codec and DAIs do any PM work. */ int (*suspend_pre)(struct platform_device *pdev, pm_message_t state); int (*suspend_post)(struct platform_device *pdev, pm_message_t state); int (*resume_pre)(struct platform_device *pdev); @@ -38,7 +38,7 @@ probe/remove are optional. Do any machine specific probe here. suspend()/resume() ------------------ The machine driver has pre and post versions of suspend and resume to take care -of any machine audio tasks that have to be done before or after the codec, DAI's +of any machine audio tasks that have to be done before or after the codec, DAIs and DMA is suspended and resumed. Optional. @@ -49,10 +49,10 @@ The machine specific audio operations can be set here. Again this is optional. Machine DAI Configuration ------------------------- -The machine DAI configuration glues all the codec and CPU DAI's together. It can +The machine DAI configuration glues all the codec and CPU DAIs together. 
It can also be used to set up the DAI system clock and for any machine related DAI initialisation e.g. the machine audio map can be connected to the codec audio -map, unconnnected codec pins can be set as such. Please see corgi.c, spitz.c +map, unconnected codec pins can be set as such. Please see corgi.c, spitz.c for examples. struct snd_soc_dai_link is used to set up each DAI in your machine. e.g. @@ -67,7 +67,7 @@ static struct snd_soc_dai_link corgi_dai = { .ops = &corgi_ops, }; -struct snd_soc_machine then sets up the machine with it's DAI's. e.g. +struct snd_soc_machine then sets up the machine with it's DAIs. e.g. /* corgi audio machine driver */ static struct snd_soc_machine snd_soc_machine_corgi = { @@ -110,4 +110,4 @@ details. Machine Controls ---------------- -Machine specific audio mixer controls can be added in the dai init function.
\ No newline at end of file +Machine specific audio mixer controls can be added in the DAI init function. diff --git a/Documentation/sound/alsa/soc/overview.txt b/Documentation/sound/alsa/soc/overview.txt index c47ce9530677..1e4c6d3655f2 100644 --- a/Documentation/sound/alsa/soc/overview.txt +++ b/Documentation/sound/alsa/soc/overview.txt @@ -1,25 +1,26 @@ ALSA SoC Layer ============== -The overall project goal of the ALSA System on Chip (ASoC) layer is to provide -better ALSA support for embedded system-on-chip processors (e.g. pxa2xx, au1x00, -iMX, etc) and portable audio codecs. Currently there is some support in the -kernel for SoC audio, however it has some limitations:- +The overall project goal of the ALSA System on Chip (ASoC) layer is to +provide better ALSA support for embedded system-on-chip processors (e.g. +pxa2xx, au1x00, iMX, etc) and portable audio codecs. Prior to the ASoC +subsystem there was some support in the kernel for SoC audio, however it +had some limitations:- - * Currently, codec drivers are often tightly coupled to the underlying SoC - CPU. This is not ideal and leads to code duplication i.e. Linux now has 4 - different wm8731 drivers for 4 different SoC platforms. + * Codec drivers were often tightly coupled to the underlying SoC + CPU. This is not ideal and leads to code duplication - for example, + Linux had different wm8731 drivers for 4 different SoC platforms. - * There is no standard method to signal user initiated audio events (e.g. + * There was no standard method to signal user initiated audio events (e.g. Headphone/Mic insertion, Headphone/Mic detection after an insertion event). These are quite common events on portable devices and often require machine specific code to re-route audio, enable amps, etc., after such an event. - * Current drivers tend to power up the entire codec when playing - (or recording) audio. This is fine for a PC, but tends to waste a lot of - power on portable devices. 
There is also no support for saving power via - changing codec oversampling rates, bias currents, etc. + * Drivers tended to power up the entire codec when playing (or + recording) audio. This is fine for a PC, but tends to waste a lot of + power on portable devices. There was also no support for saving + power via changing codec oversampling rates, bias currents, etc. ASoC Design @@ -31,12 +32,13 @@ features :- * Codec independence. Allows reuse of codec drivers on other platforms and machines. - * Easy I2S/PCM audio interface setup between codec and SoC. Each SoC interface - and codec registers it's audio interface capabilities with the core and are - subsequently matched and configured when the application hw params are known. + * Easy I2S/PCM audio interface setup between codec and SoC. Each SoC + interface and codec registers it's audio interface capabilities with the + core and are subsequently matched and configured when the application + hardware parameters are known. * Dynamic Audio Power Management (DAPM). DAPM automatically sets the codec to - it's minimum power state at all times. This includes powering up/down + its minimum power state at all times. This includes powering up/down internal power blocks depending on the internal codec audio routing and any active streams. @@ -45,16 +47,16 @@ features :- signals the codec when to change power states. * Machine specific controls: Allow machines to add controls to the sound card - (e.g. volume control for speaker amp). + (e.g. volume control for speaker amplifier). To achieve all this, ASoC basically splits an embedded audio system into 3 components :- * Codec driver: The codec driver is platform independent and contains audio - controls, audio interface capabilities, codec dapm definition and codec IO + controls, audio interface capabilities, codec DAPM definition and codec IO functions. 
- * Platform driver: The platform driver contains the audio dma engine and audio + * Platform driver: The platform driver contains the audio DMA engine and audio interface drivers (e.g. I2S, AC97, PCM) for that platform. * Machine driver: The machine driver handles any machine specific controls and @@ -81,4 +83,4 @@ machine.txt: Machine driver internals. pop_clicks.txt: How to minimise audio artifacts. -clocking.txt: ASoC clocking for best power performance.
\ No newline at end of file +clocking.txt: ASoC clocking for best power performance. diff --git a/Documentation/sound/alsa/soc/platform.txt b/Documentation/sound/alsa/soc/platform.txt index d4678b4dc6c6..b681d17fc388 100644 --- a/Documentation/sound/alsa/soc/platform.txt +++ b/Documentation/sound/alsa/soc/platform.txt @@ -8,7 +8,7 @@ specific code. Audio DMA ========= -The platform DMA driver optionally supports the following alsa operations:- +The platform DMA driver optionally supports the following ALSA operations:- /* SoC audio ops */ struct snd_soc_ops { @@ -38,7 +38,7 @@ struct snd_soc_platform { struct snd_pcm_ops *pcm_ops; }; -Please refer to the alsa driver documentation for details of audio DMA. +Please refer to the ALSA driver documentation for details of audio DMA. http://www.alsa-project.org/~iwai/writing-an-alsa-driver/c436.htm An example DMA driver is soc/pxa/pxa2xx-pcm.c @@ -52,7 +52,7 @@ Each SoC DAI driver must provide the following features:- 1) Digital audio interface (DAI) description 2) Digital audio interface configuration 3) PCM's description - 4) Sysclk configuration + 4) SYSCLK configuration 5) Suspend and resume (optional) Please see codec.txt for a description of items 1 - 4. diff --git a/Documentation/sound/alsa/soc/pops_clicks.txt b/Documentation/sound/alsa/soc/pops_clicks.txt index 3371bd9d7cfa..e1e74daa4497 100644 --- a/Documentation/sound/alsa/soc/pops_clicks.txt +++ b/Documentation/sound/alsa/soc/pops_clicks.txt @@ -15,11 +15,11 @@ click every time a component power state is changed. Minimising Playback Pops and Clicks =================================== -Playback pops in portable audio subsystems cannot be completely eliminated atm, -however future audio codec hardware will have better pop and click suppression. -Pops can be reduced within playback by powering the audio components in a -specific order. 
This order is different for startup and shutdown and follows -some basic rules:- +Playback pops in portable audio subsystems cannot be completely eliminated +currently, however future audio codec hardware will have better pop and click +suppression. Pops can be reduced within playback by powering the audio +components in a specific order. This order is different for startup and +shutdown and follows some basic rules:- Startup Order :- DAC --> Mixers --> Output PGA --> Digital Unmute diff --git a/Documentation/sysctl/fs.txt b/Documentation/sysctl/fs.txt index aa986a35e994..f99254327ae5 100644 --- a/Documentation/sysctl/fs.txt +++ b/Documentation/sysctl/fs.txt @@ -23,6 +23,7 @@ Currently, these files are in /proc/sys/fs: - inode-max - inode-nr - inode-state +- nr_open - overflowuid - overflowgid - suid_dumpable @@ -91,6 +92,15 @@ usage of file handles and you don't need to increase the maximum. ============================================================== +nr_open: + +This denotes the maximum number of file-handles a process can +allocate. Default value is 1024*1024 (1048576) which should be +enough for most machines. Actual limit depends on RLIMIT_NOFILE +resource limit. 
+ +============================================================== + inode-max, inode-nr & inode-state: As with file handles, the kernel allocates the inode structures diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt index b89570c30434..24eac1bc735d 100644 --- a/Documentation/sysctl/vm.txt +++ b/Documentation/sysctl/vm.txt @@ -22,6 +22,7 @@ Currently, these files are in /proc/sys/vm: - dirty_background_ratio - dirty_expire_centisecs - dirty_writeback_centisecs +- highmem_is_dirtyable (only if CONFIG_HIGHMEM set) - max_map_count - min_free_kbytes - laptop_mode @@ -34,13 +35,15 @@ Currently, these files are in /proc/sys/vm: - oom_kill_allocating_task - mmap_min_address - numa_zonelist_order +- nr_hugepages +- nr_overcommit_hugepages ============================================================== dirty_ratio, dirty_background_ratio, dirty_expire_centisecs, -dirty_writeback_centisecs, vfs_cache_pressure, laptop_mode, -block_dump, swap_token_timeout, drop-caches, -hugepages_treat_as_movable: +dirty_writeback_centisecs, highmem_is_dirtyable, +vfs_cache_pressure, laptop_mode, block_dump, swap_token_timeout, +drop-caches, hugepages_treat_as_movable: See Documentation/filesystems/proc.txt @@ -305,3 +308,20 @@ will select "node" order in following case. Otherwise, "zone" order will be selected. Default order is recommended unless this is causing problems for your system/application. + +============================================================== + +nr_hugepages + +Change the minimum size of the hugepage pool. + +See Documentation/vm/hugetlbpage.txt + +============================================================== + +nr_overcommit_hugepages + +Change the maximum size of the hugepage pool. The maximum is +nr_hugepages + nr_overcommit_hugepages. 
+ +See Documentation/vm/hugetlbpage.txt diff --git a/Documentation/thinkpad-acpi.txt b/Documentation/thinkpad-acpi.txt index ec499265deca..6c2477754a2a 100644 --- a/Documentation/thinkpad-acpi.txt +++ b/Documentation/thinkpad-acpi.txt @@ -1,7 +1,7 @@ ThinkPad ACPI Extras Driver - Version 0.16 - August 2nd, 2007 + Version 0.19 + January 06th, 2008 Borislav Deianov <borislav@users.sf.net> Henrique de Moraes Holschuh <hmh@hmh.eng.br> @@ -215,6 +215,11 @@ The following commands can be written to the /proc/acpi/ibm/hotkey file: ... any other 8-hex-digit mask ... echo reset > /proc/acpi/ibm/hotkey -- restore the original mask +The procfs interface does not support NVRAM polling control. So as to +maintain maximum bug-to-bug compatibility, it does not report any masks, +nor does it allow one to manipulate the hot key mask when the firmware +does not support masks at all, even if NVRAM polling is in use. + sysfs notes: hotkey_bios_enabled: @@ -231,17 +236,26 @@ sysfs notes: to this value. hotkey_enable: - Enables/disables the hot keys feature, and reports - current status of the hot keys feature. + Enables/disables the hot keys feature in the ACPI + firmware, and reports current status of the hot keys + feature. Has no effect on the NVRAM hot key polling + functionality. 0: disables the hot keys feature / feature disabled 1: enables the hot keys feature / feature enabled hotkey_mask: - bit mask to enable driver-handling and ACPI event - generation for each hot key (see above). Returns the - current status of the hot keys mask, and allows one to - modify it. + bit mask to enable driver-handling (and depending on + the firmware, ACPI event generation) for each hot key + (see above). Returns the current status of the hot keys + mask, and allows one to modify it. + + Note: when NVRAM polling is active, the firmware mask + will be different from the value returned by + hotkey_mask. 
The driver will retain enabled bits for + hotkeys that are under NVRAM polling even if the + firmware refuses them, and will not set these bits on + the firmware hot key mask. hotkey_all_mask: bit mask that should enable event reporting for all @@ -257,12 +271,48 @@ sysfs notes: handled by the firmware anyway. Echo it to hotkey_mask above, to use. + hotkey_source_mask: + bit mask that selects which hot keys will the driver + poll the NVRAM for. This is auto-detected by the driver + based on the capabilities reported by the ACPI firmware, + but it can be overridden at runtime. + + Hot keys whose bits are set in both hotkey_source_mask + and also on hotkey_mask are polled for in NVRAM. Only a + few hot keys are available through CMOS NVRAM polling. + + Warning: when in NVRAM mode, the volume up/down/mute + keys are synthesized according to changes in the mixer, + so you have to use volume up or volume down to unmute, + as per the ThinkPad volume mixer user interface. When + in ACPI event mode, volume up/down/mute are reported as + separate events, but this behaviour may be corrected in + future releases of this driver, in which case the + ThinkPad volume mixer user interface semantics will be + enforced. + + hotkey_poll_freq: + frequency in Hz for hot key polling. It must be between + 0 and 25 Hz. Polling is only carried out when strictly + needed. + + Setting hotkey_poll_freq to zero disables polling, and + will cause hot key presses that require NVRAM polling + to never be reported. + + Setting hotkey_poll_freq too low will cause repeated + pressings of the same hot key to be misreported as a + single key press, or to not even be detected at all. + The recommended polling frequency is 10Hz. + hotkey_radio_sw: if the ThinkPad has a hardware radio switch, this attribute will read 0 if the switch is in the "radios disabled" postition, and 1 if the switch is in the "radios enabled" position. + This attribute has poll()/select() support. 
+ hotkey_report_mode: Returns the state of the procfs ACPI event report mode filter for hot keys. If it is set to 1 (the default), @@ -277,6 +327,25 @@ sysfs notes: May return -EPERM (write access locked out by module parameter) or -EACCES (read-only). + wakeup_reason: + Set to 1 if the system is waking up because the user + requested a bay ejection. Set to 2 if the system is + waking up because the user requested the system to + undock. Set to zero for normal wake-ups or wake-ups + due to unknown reasons. + + This attribute has poll()/select() support. + + wakeup_hotunplug_complete: + Set to 1 if the system was woken up because of an + undock or bay ejection request, and that request + was successfully completed. At this point, it might + be useful to send the system back to sleep, at the + user's choice. Refer to HKEY events 0x4003 and + 0x3003, below. + + This attribute has poll()/select() support. + input layer notes: A Hot key is mapped to a single input layer EV_KEY event, possibly @@ -427,6 +496,23 @@ Non hot-key ACPI HKEY event map: The above events are not propagated by the driver, except for legacy compatibility purposes when hotkey_report_mode is set to 1. +0x2304 System is waking up from suspend to undock +0x2305 System is waking up from suspend to eject bay +0x2404 System is waking up from hibernation to undock +0x2405 System is waking up from hibernation to eject bay + +The above events are never propagated by the driver. + +0x3003 Bay ejection (see 0x2x05) complete, can sleep again +0x4003 Undocked (see 0x2x04), can sleep again +0x5009 Tablet swivel: switched to tablet mode +0x500A Tablet swivel: switched to normal mode +0x500B Tablet pen inserted into its storage bay +0x500C Tablet pen removed from its storage bay +0x5010 Brightness level changed (newer Lenovo BIOSes) + +The above events are propagated by the driver. 
+ Compatibility notes: ibm-acpi and thinkpad-acpi 0.15 (mainline kernels before 2.6.23) never @@ -923,19 +1009,34 @@ sysfs backlight device "thinkpad_screen" This feature allows software control of the LCD brightness on ThinkPad models which don't have a hardware brightness slider. -It has some limitations: the LCD backlight cannot be actually turned on or off -by this interface, and in many ThinkPad models, the "dim while on battery" -functionality will be enabled by the BIOS when this interface is used, and -cannot be controlled. - -The backlight control has eight levels, ranging from 0 to 7. Some of the -levels may not be distinct. - -There are two interfaces to the firmware for brightness control, EC and CMOS. -To select which one should be used, use the brightness_mode module parameter: -brightness_mode=1 selects EC mode, brightness_mode=2 selects CMOS mode, -brightness_mode=3 selects both EC and CMOS. The driver tries to autodetect -which interface to use. +It has some limitations: the LCD backlight cannot be actually turned on or +off by this interface, and in many ThinkPad models, the "dim while on +battery" functionality will be enabled by the BIOS when this interface is +used, and cannot be controlled. + +On IBM (and some of the earlier Lenovo) ThinkPads, the backlight control +has eight brightness levels, ranging from 0 to 7. Some of the levels +may not be distinct. Later Lenovo models that implement the ACPI +display backlight brightness control methods have 16 levels, ranging +from 0 to 15. + +There are two interfaces to the firmware for direct brightness control, +EC and CMOS. To select which one should be used, use the +brightness_mode module parameter: brightness_mode=1 selects EC mode, +brightness_mode=2 selects CMOS mode, brightness_mode=3 selects both EC +and CMOS. The driver tries to autodetect which interface to use. 
+ +When display backlight brightness controls are available through the +standard ACPI interface, it is best to use it instead of this direct +ThinkPad-specific interface. The driver will disable its native +backlight brightness control interface if it detects that the standard +ACPI interface is available in the ThinkPad. + +The brightness_enable module parameter can be used to control whether +the LCD brightness control feature will be enabled when available. +brightness_enable=0 forces it to be disabled. brightness_enable=1 +forces it to be enabled when available, even if the standard ACPI +interface is also available. Procfs notes: @@ -947,11 +1048,11 @@ Procfs notes: Sysfs notes: -The interface is implemented through the backlight sysfs class, which is poorly -documented at this time. +The interface is implemented through the backlight sysfs class, which is +poorly documented at this time. -Locate the thinkpad_screen device under /sys/class/backlight, and inside it -there will be the following attributes: +Locate the thinkpad_screen device under /sys/class/backlight, and inside +it there will be the following attributes: max_brightness: Reads the maximum brightness the hardware can be set to. @@ -961,17 +1062,19 @@ there will be the following attributes: Reads what brightness the screen is set to at this instant. brightness: - Writes request the driver to change brightness to the given - value. Reads will tell you what brightness the driver is trying - to set the display to when "power" is set to zero and the display - has not been dimmed by a kernel power management event. + Writes request the driver to change brightness to the + given value. Reads will tell you what brightness the + driver is trying to set the display to when "power" is set + to zero and the display has not been dimmed by a kernel + power management event. 
 power: - power management mode, where 0 is "display on", and 1 to 3 will - dim the display backlight to brightness level 0 because - thinkpad-acpi cannot really turn the backlight off. Kernel - power management events can temporarily increase the current - power management level, i.e. they can dim the display. + power management mode, where 0 is "display on", and 1 to 3 + will dim the display backlight to brightness level 0 + because thinkpad-acpi cannot really turn the backlight + off. Kernel power management events can temporarily + increase the current power management level, i.e. they can + dim the display. Volume control -- /proc/acpi/ibm/volume @@ -1246,3 +1349,17 @@ Sysfs interface changelog: and the hwmon class for libsensors4 (lm-sensors 3) compatibility. Moved all hwmon attributes to this new platform device. + +0x020100: Marker for thinkpad-acpi with hot key NVRAM polling + support. If you must, use it to know you should not + start a userspace NVRAM poller (allows detecting when + NVRAM is compiled out by the user because it is + unneeded/undesired in the first place). +0x020101: Marker for thinkpad-acpi with hot key NVRAM polling + and proper hotkey_mask semantics (version 8 of the + NVRAM polling patch). Some development snapshots of + 0.18 had an earlier version that did strange things + to hotkey_mask. 
+ +0x020200: Add poll()/select() support to the following attributes: + hotkey_radio_sw, wakeup_hotunplug_complete, wakeup_reason diff --git a/Documentation/tipar.txt b/Documentation/tipar.txt deleted file mode 100644 index 67133baef6ef..000000000000 --- a/Documentation/tipar.txt +++ /dev/null @@ -1,93 +0,0 @@ - - Parallel link cable for Texas Instruments handhelds - =================================================== - - -Author: Romain Lievin -Homepage: http://lpg.ticalc.org/prj_tidev/index.html - - -INTRODUCTION: - -This is a driver for the very common home-made parallel link cable, a cable -designed for connecting TI8x/9x graphing calculators (handhelds) to a computer -or workstation (Alpha, Sparc). Given that driver is built on parport, the -parallel port abstraction layer, this driver is architecture-independent. - -It can also be used with another device plugged on the same port (such as a -ZIP drive). I have a 100MB ZIP and both of them work fine! - -If you need more information, please visit the 'TI drivers' homepage at the URL -above. - -WHAT YOU NEED: - -A TI calculator and a program capable of communicating with your calculator. - -TiLP will work for sure (since I am its developer!). yal92 may be able to use -it by changing tidev for tipar (may require some hacking...). - -HOW TO USE IT: - -You must have first compiled parport support (CONFIG_PARPORT_DEV): either -compiled in your kernel, either as a module. - -Next, (as root): - - modprobe parport - modprobe tipar - -If it is not already there (it usually is), create the device: - - mknod /dev/tipar0 c 115 0 - mknod /dev/tipar1 c 115 1 - mknod /dev/tipar2 c 115 2 - -You will have to set permissions on this device to allow you to read/write -from it: - - chmod 666 /dev/tipar[0..2] - -Now you are ready to run a linking program such as TiLP. Be sure to configure -it properly (RTFM). 
- -MODULE PARAMETERS: - - You can set these with: modprobe tipar NAME=VALUE - There is currently no way to set these on a per-cable basis. - - NAME: timeout - TYPE: integer - DEFAULT: 15 - DESC: Timeout value in tenth of seconds. If no data is available once this - time has expired then the driver will return with a timeout error. - - NAME: delay - TYPE: integer - DEFAULT: 10 - DESC: Inter-bit delay in micro-seconds. A lower value gives an higher data - rate but makes transmission less reliable. - -These parameters can be changed at run time by any program via ioctl(2) calls -as listed in ./include/linux/ticable.h. - -Rather than write 50 pages describing the ioctl() and so on, it is -perhaps more useful you look at ticables library (dev_link.c) that demonstrates -how to use them, and demonstrates the features of the driver. This is -probably a lot more useful to people interested in writing applications -that will be using this driver. - -QUIRKS/BUGS: - -None. - -HOW TO CONTACT US: - -You can email me at roms@lpg.ticalc.org. Please prefix the subject line -with "TIPAR: " so that I am certain to notice your message. -You can also mail JB at jb@jblache.org. He packaged these drivers for Debian. - -CREDITS: - -The code is based on tidev.c & parport.c. -The driver has been developed independently of Texas Instruments. diff --git a/Documentation/tty.txt b/Documentation/tty.txt index 048a8762cfb5..8e65c4498c52 100644 --- a/Documentation/tty.txt +++ b/Documentation/tty.txt @@ -132,6 +132,14 @@ set_termios() Notify the tty driver that the device's termios tty->termios. Previous settings should be passed in the "old" argument. + The API is defined such that the driver should return + the actual modes selected. This means that the + driver function is responsible for modifying any + bits in the request it cannot fulfill to indicate + the actual modes being used. 
A device with no + hardware capability for change (eg a USB dongle or + virtual port) can provide NULL for this method. + throttle() Notify the tty driver that input buffers for the line discipline are close to full, and it should somehow signal that no more characters should be diff --git a/Documentation/unaligned-memory-access.txt b/Documentation/unaligned-memory-access.txt new file mode 100644 index 000000000000..6223eace3c09 --- /dev/null +++ b/Documentation/unaligned-memory-access.txt @@ -0,0 +1,226 @@ +UNALIGNED MEMORY ACCESSES +========================= + +Linux runs on a wide variety of architectures which have varying behaviour +when it comes to memory access. This document presents some details about +unaligned accesses, why you need to write code that doesn't cause them, +and how to write such code! + + +The definition of an unaligned access +===================================== + +Unaligned memory accesses occur when you try to read N bytes of data starting +from an address that is not evenly divisible by N (i.e. addr % N != 0). +For example, reading 4 bytes of data from address 0x10004 is fine, but +reading 4 bytes of data from address 0x10005 would be an unaligned memory +access. + +The above may seem a little vague, as memory access can happen in different +ways. The context here is at the machine code level: certain instructions read +or write a number of bytes to or from memory (e.g. movb, movw, movl in x86 +assembly). As will become clear, it is relatively easy to spot C statements +which will compile to multiple-byte memory access instructions, namely when +dealing with types such as u16, u32 and u64. + + +Natural alignment +================= + +The rule mentioned above forms what we refer to as natural alignment: +When accessing N bytes of memory, the base memory address must be evenly +divisible by N, i.e. addr % N == 0. + +When writing code, assume the target architecture has natural alignment +requirements. 
+ +In reality, only a few architectures require natural alignment on all sizes +of memory access. However, we must consider ALL supported architectures; +writing code that satisfies natural alignment requirements is the easiest way +to achieve full portability. + + +Why unaligned access is bad +=========================== + +The effects of performing an unaligned memory access vary from architecture +to architecture. It would be easy to write a whole document on the differences +here; a summary of the common scenarios is presented below: + + - Some architectures are able to perform unaligned memory accesses + transparently, but there is usually a significant performance cost. + - Some architectures raise processor exceptions when unaligned accesses + happen. The exception handler is able to correct the unaligned access, + at significant cost to performance. + - Some architectures raise processor exceptions when unaligned accesses + happen, but the exceptions do not contain enough information for the + unaligned access to be corrected. + - Some architectures are not capable of unaligned memory access, but will + silently perform a different memory access to the one that was requested, + resulting in a subtle code bug that is hard to detect! + +It should be obvious from the above that if your code causes unaligned +memory accesses to happen, your code will not work correctly on certain +platforms and will cause performance problems on others. + + +Code that does not cause unaligned access +========================================= + +At first, the concepts above may seem a little hard to relate to actual +coding practice. After all, you don't have a great deal of control over +memory addresses of certain variables, etc. + +Fortunately things are not too complex, as in most cases, the compiler +ensures that things will work for you. 
For example, take the following +structure: + + struct foo { + u16 field1; + u32 field2; + u8 field3; + }; + +Let us assume that an instance of the above structure resides in memory +starting at address 0x10000. With a basic level of understanding, it would +not be unreasonable to expect that accessing field2 would cause an unaligned +access. You'd be expecting field2 to be located at offset 2 bytes into the +structure, i.e. address 0x10002, but that address is not evenly divisible +by 4 (remember, we're reading a 4 byte value here). + +Fortunately, the compiler understands the alignment constraints, so in the +above case it would insert 2 bytes of padding in between field1 and field2. +Therefore, for standard structure types you can always rely on the compiler +to pad structures so that accesses to fields are suitably aligned (assuming +you do not cast the field to a type of different length). + +Similarly, you can also rely on the compiler to align variables and function +parameters to a naturally aligned scheme, based on the size of the type of +the variable. + +At this point, it should be clear that accessing a single byte (u8 or char) +will never cause an unaligned access, because all memory addresses are evenly +divisible by one. + +On a related topic, with the above considerations in mind you may observe +that you could reorder the fields in the structure in order to place fields +where padding would otherwise be inserted, and hence reduce the overall +resident memory size of structure instances. The optimal layout of the +above example is: + + struct foo { + u32 field2; + u16 field1; + u8 field3; + }; + +For a natural alignment scheme, the compiler would only have to add a single +byte of padding at the end of the structure. This padding is added in order +to satisfy alignment constraints for arrays of these structures. + +Another point worth mentioning is the use of __attribute__((packed)) on a +structure type. 
This GCC-specific attribute tells the compiler never to +insert any padding within structures, useful when you want to use a C struct +to represent some data that comes in a fixed arrangement 'off the wire'. + +You might be inclined to believe that usage of this attribute can easily +lead to unaligned accesses when accessing fields that do not satisfy +architectural alignment requirements. However, again, the compiler is aware +of the alignment constraints and will generate extra instructions to perform +the memory access in a way that does not cause unaligned access. Of course, +the extra instructions obviously cause a loss in performance compared to the +non-packed case, so the packed attribute should only be used when avoiding +structure padding is of importance. + + +Code that causes unaligned access +================================= + +With the above in mind, let's move onto a real life example of a function +that can cause an unaligned memory access. The following function adapted +from include/linux/etherdevice.h is an optimized routine to compare two +ethernet MAC addresses for equality. + +unsigned int compare_ether_addr(const u8 *addr1, const u8 *addr2) +{ + const u16 *a = (const u16 *) addr1; + const u16 *b = (const u16 *) addr2; + return ((a[0] ^ b[0]) | (a[1] ^ b[1]) | (a[2] ^ b[2])) != 0; +} + +In the above function, the reference to a[0] causes 2 bytes (16 bits) to +be read from memory starting at address addr1. Think about what would happen +if addr1 was an odd address such as 0x10003. (Hint: it'd be an unaligned +access.) + +Despite the potential unaligned access problems with the above function, it +is included in the kernel anyway but is understood to only work on +16-bit-aligned addresses. It is up to the caller to ensure this alignment or +not use this function at all. 
This alignment-unsafe function is still useful +as it is a decent optimization for the cases when you can ensure alignment, +which is true almost all of the time in ethernet networking context. + + +Here is another example of some code that could cause unaligned accesses: + void myfunc(u8 *data, u32 value) + { + [...] + *((u32 *) data) = cpu_to_le32(value); + [...] + } + +This code will cause unaligned accesses every time the data parameter points +to an address that is not evenly divisible by 4. + +In summary, the 2 main scenarios where you may run into unaligned access +problems involve: + 1. Casting variables to types of different lengths + 2. Pointer arithmetic followed by access to at least 2 bytes of data + + +Avoiding unaligned accesses +=========================== + +The easiest way to avoid unaligned access is to use the get_unaligned() and +put_unaligned() macros provided by the <asm/unaligned.h> header file. + +Going back to an earlier example of code that potentially causes unaligned +access: + + void myfunc(u8 *data, u32 value) + { + [...] + *((u32 *) data) = cpu_to_le32(value); + [...] + } + +To avoid the unaligned memory access, you would rewrite it as follows: + + void myfunc(u8 *data, u32 value) + { + [...] + value = cpu_to_le32(value); + put_unaligned(value, (u32 *) data); + [...] + } + +The get_unaligned() macro works similarly. Assuming 'data' is a pointer to +memory and you wish to avoid unaligned access, its usage is as follows: + + u32 value = get_unaligned((u32 *) data); + +These macros work for memory accesses of any length (not just 32 bits as +in the examples above). Be aware that when compared to standard access of +aligned memory, using these macros to access unaligned memory can be costly in +terms of performance. + +If use of such macros is not convenient, another option is to use memcpy(), +where the source or destination (or both) are of type u8* or unsigned char*. 
+Due to the byte-wise nature of this operation, unaligned accesses are avoided. + +-- +Author: Daniel Drake <dsd@gentoo.org> +With help from: Alan Cox, Avuton Olrich, Heikki Orsila, Jan Engelhardt, +Johannes Berg, Kyle McMartin, Kyle Moffett, Randy Dunlap, Robert Hancock, +Uli Kunitz, Vadim Lobanov + diff --git a/Documentation/usb/gadget_printer.txt b/Documentation/usb/gadget_printer.txt new file mode 100644 index 000000000000..ad995bf0db41 --- /dev/null +++ b/Documentation/usb/gadget_printer.txt @@ -0,0 +1,510 @@ + + Linux USB Printer Gadget Driver + 06/04/2007 + + Copyright (C) 2007 Craig W. Nadler <craig@nadler.us> + + + +GENERAL +======= + +This driver may be used if you are writing printer firmware using Linux as +the embedded OS. This driver has nothing to do with using a printer with +your Linux host system. + +You will need a USB device controller and a Linux driver for it that accepts +a gadget / "device class" driver using the Linux USB Gadget API. After the +USB device controller driver is loaded then load the printer gadget driver. +This will present a printer interface to the USB Host that your USB Device +port is connected to. + +This driver is structured for printer firmware that runs in user mode. The +user mode printer firmware will read and write data from the kernel mode +printer gadget driver using a device file. The printer returns a printer status +byte when the USB HOST sends a device request to get the printer status. The +user space firmware can read or write this status byte using a device file +/dev/g_printer . Both blocking and non-blocking read/write calls are supported. + + + + +HOWTO USE THIS DRIVER +===================== + +To load the USB device controller driver and the printer gadget driver. 
The +following example uses the Netchip 2280 USB device controller driver: + +modprobe net2280 +modprobe g_printer + + +The following command line parameters can be used when loading the printer gadget +(ex: modprobe g_printer idVendor=0x0525 idProduct=0xa4a8 ): + +idVendor - This is the Vendor ID used in the device descriptor. The default is + the Netchip vendor id 0x0525. YOU MUST CHANGE TO YOUR OWN VENDOR ID + BEFORE RELEASING A PRODUCT. If you plan to release a product and don't + already have a Vendor ID please see www.usb.org for details on how to + get one. + +idProduct - This is the Product ID used in the device descriptor. The default + is 0xa4a8; you should change this to an ID that's not used by any of + your other USB products if you have any. It would be a good idea to + start numbering your products starting with say 0x0001. + +bcdDevice - This is the version number of your product. It would be a good idea + to put your firmware version here. + +iManufacturer - A string containing the name of the Vendor. + +iProduct - A string containing the Product Name. + +iSerialNum - A string containing the Serial Number. This should be changed for + each unit of your product. + +iPNPstring - The PNP ID string used for this printer. You will want to set + either on the command line or hard code the PNP ID string used for + your printer product. + +qlen - The number of 8k buffers to use per endpoint. The default is 10; you + should tune this for your product. You may also want to tune the + size of each buffer for your product. + + + + +USING THE EXAMPLE CODE +====================== + +This example code talks to stdout, instead of a print engine. 
+ +To compile the test code below: + +1) save it to a file called prn_example.c +2) compile the code with the following command: + gcc prn_example.c -o prn_example + + + +To read printer data from the host to stdout: + + # prn_example -read_data + + +To write printer data from a file (data_file) to the host: + + # cat data_file | prn_example -write_data + + +To get the current printer status for the gadget driver: + + # prn_example -get_status + + Printer status is: + Printer is NOT Selected + Paper is Out + Printer OK + + +To set printer to Selected/On-line: + + # prn_example -selected + + +To set printer to Not Selected/Off-line: + + # prn_example -not_selected + + +To set paper status to paper out: + + # prn_example -paper_out + + +To set paper status to paper loaded: + + # prn_example -paper_loaded + + +To set error status to printer OK: + + # prn_example -no_error + + +To set error status to ERROR: + + # prn_example -error + + + + +EXAMPLE CODE +============ + + +#include <stdio.h> +#include <stdlib.h> +#include <fcntl.h> +#include <linux/poll.h> +#include <sys/ioctl.h> +#include <linux/usb/g_printer.h> + +#define PRINTER_FILE "/dev/g_printer" +#define BUF_SIZE 512 + + +/* + * 'usage()' - Show program usage. 
+ */ + +static void +usage(const char *option) /* I - Option string or NULL */ +{ + if (option) { + fprintf(stderr,"prn_example: Unknown option \"%s\"!\n", + option); + } + + fputs("\n", stderr); + fputs("Usage: prn_example -[options]\n", stderr); + fputs("Options:\n", stderr); + fputs("\n", stderr); + fputs("-get_status Get the current printer status.\n", stderr); + fputs("-selected Set the selected status to selected.\n", stderr); + fputs("-not_selected Set the selected status to NOT selected.\n", + stderr); + fputs("-error Set the error status to error.\n", stderr); + fputs("-no_error Set the error status to NO error.\n", stderr); + fputs("-paper_out Set the paper status to paper out.\n", stderr); + fputs("-paper_loaded Set the paper status to paper loaded.\n", + stderr); + fputs("-read_data Read printer data from driver.\n", stderr); + fputs("-write_data Write printer data to driver.\n", stderr); + fputs("-NB_read_data (Non-Blocking) Read printer data from driver.\n", + stderr); + fputs("\n\n", stderr); + + exit(1); +} + + +static int +read_printer_data() +{ + struct pollfd fd[1]; + + /* Open device file for printer gadget. */ + fd[0].fd = open(PRINTER_FILE, O_RDWR); + if (fd[0].fd < 0) { + printf("Error %d opening %s\n", fd[0].fd, PRINTER_FILE); + close(fd[0].fd); + return(-1); + } + + fd[0].events = POLLIN | POLLRDNORM; + + while (1) { + static char buf[BUF_SIZE]; + int bytes_read; + int retval; + + /* Wait for up to 1 second for data. */ + retval = poll(fd, 1, 1000); + + if (retval && (fd[0].revents & POLLRDNORM)) { + + /* Read data from printer gadget driver. */ + bytes_read = read(fd[0].fd, buf, BUF_SIZE); + + if (bytes_read < 0) { + printf("Error %d reading from %s\n", + fd[0].fd, PRINTER_FILE); + close(fd[0].fd); + return(-1); + } else if (bytes_read > 0) { + /* Write data to standard OUTPUT (stdout). */ + fwrite(buf, 1, bytes_read, stdout); + fflush(stdout); + } + + } + + } + + /* Close the device file. 
*/ + close(fd[0].fd); + + return 0; +} + + +static int +write_printer_data() +{ + struct pollfd fd[1]; + + /* Open device file for printer gadget. */ + fd[0].fd = open (PRINTER_FILE, O_RDWR); + if (fd[0].fd < 0) { + printf("Error %d opening %s\n", fd[0].fd, PRINTER_FILE); + close(fd[0].fd); + return(-1); + } + + fd[0].events = POLLOUT | POLLWRNORM; + + while (1) { + int retval; + static char buf[BUF_SIZE]; + /* Read data from standard INPUT (stdin). */ + int bytes_read = fread(buf, 1, BUF_SIZE, stdin); + + if (!bytes_read) { + break; + } + + while (bytes_read) { + + /* Wait for up to 1 second to send data. */ + retval = poll(fd, 1, 1000); + + /* Write data to printer gadget driver. */ + if (retval && (fd[0].revents & POLLWRNORM)) { + retval = write(fd[0].fd, buf, bytes_read); + if (retval < 0) { + printf("Error %d writing to %s\n", + fd[0].fd, + PRINTER_FILE); + close(fd[0].fd); + return(-1); + } else { + bytes_read -= retval; + } + + } + + } + + } + + /* Wait until the data has been sent. */ + fsync(fd[0].fd); + + /* Close the device file. */ + close(fd[0].fd); + + return 0; +} + + +static int +read_NB_printer_data() +{ + int fd; + static char buf[BUF_SIZE]; + int bytes_read; + + /* Open device file for printer gadget. */ + fd = open(PRINTER_FILE, O_RDWR|O_NONBLOCK); + if (fd < 0) { + printf("Error %d opening %s\n", fd, PRINTER_FILE); + close(fd); + return(-1); + } + + while (1) { + /* Read data from printer gadget driver. */ + bytes_read = read(fd, buf, BUF_SIZE); + if (bytes_read <= 0) { + break; + } + + /* Write data to standard OUTPUT (stdout). */ + fwrite(buf, 1, bytes_read, stdout); + fflush(stdout); + } + + /* Close the device file. */ + close(fd); + + return 0; +} + + +static int +get_printer_status() +{ + int retval; + int fd; + + /* Open device file for printer gadget. */ + fd = open(PRINTER_FILE, O_RDWR); + if (fd < 0) { + printf("Error %d opening %s\n", fd, PRINTER_FILE); + close(fd); + return(-1); + } + + /* Make the IOCTL call.
*/ + retval = ioctl(fd, GADGET_GET_PRINTER_STATUS); + if (retval < 0) { + fprintf(stderr, "ERROR: Failed to get printer status\n"); + return(-1); + } + + /* Close the device file. */ + close(fd); + + return(retval); +} + + +static int +set_printer_status(unsigned char buf, int clear_printer_status_bit) +{ + int retval; + int fd; + + retval = get_printer_status(); + if (retval < 0) { + fprintf(stderr, "ERROR: Failed to get printer status\n"); + return(-1); + } + + /* Open device file for printer gadget. */ + fd = open(PRINTER_FILE, O_RDWR); + + if (fd < 0) { + printf("Error %d opening %s\n", fd, PRINTER_FILE); + close(fd); + return(-1); + } + + if (clear_printer_status_bit) { + retval &= ~buf; + } else { + retval |= buf; + } + + /* Make the IOCTL call. */ + if (ioctl(fd, GADGET_SET_PRINTER_STATUS, (unsigned char)retval)) { + fprintf(stderr, "ERROR: Failed to set printer status\n"); + return(-1); + } + + /* Close the device file. */ + close(fd); + + return 0; +} + + +static int +display_printer_status() +{ + char printer_status; + + printer_status = get_printer_status(); + if (printer_status < 0) { + fprintf(stderr, "ERROR: Failed to get printer status\n"); + return(-1); + } + + printf("Printer status is:\n"); + if (printer_status & PRINTER_SELECTED) { + printf(" Printer is Selected\n"); + } else { + printf(" Printer is NOT Selected\n"); + } + if (printer_status & PRINTER_PAPER_EMPTY) { + printf(" Paper is Out\n"); + } else { + printf(" Paper is Loaded\n"); + } + if (printer_status & PRINTER_NOT_ERROR) { + printf(" Printer OK\n"); + } else { + printf(" Printer ERROR\n"); + } + + return(0); +} + + +int +main(int argc, char *argv[]) +{ + int i; /* Looping var */ + int retval = 0; + + /* No Args */ + if (argc == 1) { + usage(0); + exit(0); + } + + for (i = 1; i < argc && !retval; i ++) { + + if (argv[i][0] != '-') { + continue; + } + + if (!strcmp(argv[i], "-get_status")) { + if (display_printer_status()) { + retval = 1; + } + + } else if (!strcmp(argv[i],
"-paper_loaded")) { + if (set_printer_status(PRINTER_PAPER_EMPTY, 1)) { + retval = 1; + } + + } else if (!strcmp(argv[i], "-paper_out")) { + if (set_printer_status(PRINTER_PAPER_EMPTY, 0)) { + retval = 1; + } + + } else if (!strcmp(argv[i], "-selected")) { + if (set_printer_status(PRINTER_SELECTED, 0)) { + retval = 1; + } + + } else if (!strcmp(argv[i], "-not_selected")) { + if (set_printer_status(PRINTER_SELECTED, 1)) { + retval = 1; + } + + } else if (!strcmp(argv[i], "-error")) { + if (set_printer_status(PRINTER_NOT_ERROR, 1)) { + retval = 1; + } + + } else if (!strcmp(argv[i], "-no_error")) { + if (set_printer_status(PRINTER_NOT_ERROR, 0)) { + retval = 1; + } + + } else if (!strcmp(argv[i], "-read_data")) { + if (read_printer_data()) { + retval = 1; + } + + } else if (!strcmp(argv[i], "-write_data")) { + if (write_printer_data()) { + retval = 1; + } + + } else if (!strcmp(argv[i], "-NB_read_data")) { + if (read_NB_printer_data()) { + retval = 1; + } + + } else { + usage(argv[i]); + retval = 1; + } + } + + exit(retval); +} diff --git a/Documentation/usb/iuu_phoenix.txt b/Documentation/usb/iuu_phoenix.txt new file mode 100644 index 000000000000..e5f048067da4 --- /dev/null +++ b/Documentation/usb/iuu_phoenix.txt @@ -0,0 +1,84 @@ +Infinity Usb Unlimited Readme +----------------------------- + +Hi all, + + +This module provide a serial interface to use your +IUU unit in phoenix mode. Loading this module will +bring a ttyUSB[0-x] interface. This driver must be +used by your favorite application to pilot the IUU + +This driver is still in beta stage, so bugs can +occur and your system may freeze. As far I now, +I never had any problem with it, but I'm not a real +guru, so don't blame me if your system is unstable + +You can plug more than one IUU. Every unit will +have his own device file(/dev/ttyUSB0,/dev/ttyUSB1,...) + + + +How to tune the reader speed ? 
+ + A few parameters can be used at load time. + To use parameters, just unload the module if it is + already loaded and use modprobe iuu_phoenix param=value. + In the case of a prebuilt module, use the command + insmod iuu_phoenix param=value. + + Example: + + modprobe iuu_phoenix clockmode=3 + + The parameters are: + + parm: clockmode:1=3Mhz579,2=3Mhz680,3=6Mhz (int) + parm: boost:overclock boost percent 100 to 500 (int) + parm: cdmode:Card detect mode 0=none, 1=CD, 2=!CD, 3=DSR, 4=!DSR, 5=CTS, 6=!CTS, 7=RING, 8=!RING (int) + parm: xmas:xmas color enabled or not (bool) + parm: debug:Debug enabled or not (bool) + +- clockmode provides 3 different base settings commonly adopted by + different software: + 1. 3Mhz579 + 2. 3Mhz680 + 3. 6Mhz + +- boost provides a way to overclock the reader ( my favorite :-) ) + For example, to get better performance than a simple clockmode=3, try this: + + modprobe iuu_phoenix boost=195 + + This will put the reader on a base of 3Mhz579 but boosted to 195 % ! + The real clock will now be 6979050 Hz ( 6Mhz979 ) and will increase + the speed to a score 10 to 20% better than the simple clockmode=3 !!! + + +- cdmode sets the signal used to inform userland ( ioctl answer ) + whether the card is present or not. Eight signals are possible. + +- xmas is completely useless except for your eyes. One of my friends was + so sad to have a nice device like the iuu without seeing the whole color range available. + So I have added this option to let him see a lot of colors ( each activity changes the color + and the frequency randomly ) + +- debug will produce a lot of debugging messages... + + + Last notes: + + Don't worry about the serial settings, the serial emulation + is an abstraction, so any speed or parity setting will + work. ( This will not change anything ). Later I will perhaps + use these settings to deduce the boost, but is that feature + really necessary ? + The autodetect feature used is the serial CD.
If that doesn't + work for your software, disable detection mechanism in it. + + + Have fun ! + + Alain Degreffe + + eczema(at)ecze.com diff --git a/Documentation/usb/power-management.txt b/Documentation/usb/power-management.txt index 97842deec471..b2fc4d4a9917 100644 --- a/Documentation/usb/power-management.txt +++ b/Documentation/usb/power-management.txt @@ -278,6 +278,14 @@ optional. The methods' jobs are quite simple: (although the interfaces will be in the same altsettings as before the suspend). +If the device is disconnected or powered down while it is suspended, +the disconnect method will be called instead of the resume or +reset_resume method. This is also quite likely to happen when +waking up from hibernation, as many systems do not maintain suspend +current to the USB host controllers during hibernation. (It's +possible to work around the hibernation-forces-disconnect problem by +using the USB Persist facility.) + The reset_resume method is used by the USB Persist facility (see Documentation/usb/persist.txt) and it can also be used under certain circumstances when CONFIG_USB_PERSIST is not enabled. 
Currently, if a diff --git a/Documentation/video4linux/CARDLIST.cx23885 b/Documentation/video4linux/CARDLIST.cx23885 index 00cb646a4bde..0924e6e142c4 100644 --- a/Documentation/video4linux/CARDLIST.cx23885 +++ b/Documentation/video4linux/CARDLIST.cx23885 @@ -1,5 +1,7 @@ 0 -> UNKNOWN/GENERIC [0070:3400] 1 -> Hauppauge WinTV-HVR1800lp [0070:7600] - 2 -> Hauppauge WinTV-HVR1800 [0070:7800,0070:7801] + 2 -> Hauppauge WinTV-HVR1800 [0070:7800,0070:7801,0070:7809] 3 -> Hauppauge WinTV-HVR1250 [0070:7911] 4 -> DViCO FusionHDTV5 Express [18ac:d500] + 5 -> Hauppauge WinTV-HVR1500Q [0070:7790,0070:7797] + 6 -> Hauppauge WinTV-HVR1500 [0070:7710,0070:7717] diff --git a/Documentation/video4linux/CARDLIST.cx88 b/Documentation/video4linux/CARDLIST.cx88 index 82ac8250e978..bc5593bd9704 100644 --- a/Documentation/video4linux/CARDLIST.cx88 +++ b/Documentation/video4linux/CARDLIST.cx88 @@ -56,3 +56,4 @@ 55 -> Shenzhen Tungsten Ages Tech TE-DTV-250 / Swann OEM [c180:c980] 56 -> Hauppauge WinTV-HVR1300 DVB-T/Hybrid MPEG Encoder [0070:9600,0070:9601,0070:9602] 57 -> ADS Tech Instant Video PCI [1421:0390] + 58 -> Pinnacle PCTV HD 800i [11bd:0051] diff --git a/Documentation/video4linux/CARDLIST.em28xx b/Documentation/video4linux/CARDLIST.em28xx index 37f0e3cedf43..6a8469f2bcae 100644 --- a/Documentation/video4linux/CARDLIST.em28xx +++ b/Documentation/video4linux/CARDLIST.em28xx @@ -1,14 +1,17 @@ 0 -> Unknown EM2800 video grabber (em2800) [eb1a:2800] - 1 -> Unknown EM2820/2840 video grabber (em2820/em2840) + 1 -> Unknown EM2750/28xx video grabber (em2820/em2840) [eb1a:2750,eb1a:2820,eb1a:2821,eb1a:2860,eb1a:2861,eb1a:2870,eb1a:2881,eb1a:2883] 2 -> Terratec Cinergy 250 USB (em2820/em2840) [0ccd:0036] 3 -> Pinnacle PCTV USB 2 (em2820/em2840) [2304:0208] - 4 -> Hauppauge WinTV USB 2 (em2820/em2840) [2040:4200] - 5 -> MSI VOX USB 2.0 (em2820/em2840) [eb1a:2820] + 4 -> Hauppauge WinTV USB 2 (em2820/em2840) [2040:4200,2040:4201] + 5 -> MSI VOX USB 2.0 (em2820/em2840) 6 -> Terratec Cinergy 200 
USB (em2800) 7 -> Leadtek Winfast USB II (em2800) 8 -> Kworld USB2800 (em2800) - 9 -> Pinnacle Dazzle DVC 90 (em2820/em2840) [2304:0207] - 10 -> Hauppauge WinTV HVR 900 (em2880) - 11 -> Terratec Hybrid XS (em2880) + 9 -> Pinnacle Dazzle DVC 90/DVC 100 (em2820/em2840) [2304:0207,2304:021a] + 10 -> Hauppauge WinTV HVR 900 (em2880) [2040:6500] + 11 -> Terratec Hybrid XS (em2880) [0ccd:0042] 12 -> Kworld PVR TV 2800 RF (em2820/em2840) - 13 -> Terratec Prodigy XS (em2880) + 13 -> Terratec Prodigy XS (em2880) [0ccd:0047] + 14 -> Pixelview Prolink PlayTV USB 2.0 (em2820/em2840) + 15 -> V-Gear PocketTV (em2800) + 16 -> Hauppauge WinTV HVR 950 (em2880) [2040:6513] diff --git a/Documentation/video4linux/CARDLIST.ivtv b/Documentation/video4linux/CARDLIST.ivtv index ddd76a0eb100..a019e27e42b3 100644 --- a/Documentation/video4linux/CARDLIST.ivtv +++ b/Documentation/video4linux/CARDLIST.ivtv @@ -16,3 +16,9 @@ 16 -> GOTVIEW PCI DVD2 Deluxe [ffac:0600] 17 -> Yuan MPC622 [ff01:d998] 18 -> Digital Cowboy DCT-MTVP1 [1461:bfff] +19 -> Yuan PG600V2/GotView PCI DVD Lite [ffab:0600,ffad:0600] +20 -> Club3D ZAP-TV1x01 [ffab:0600] +21 -> AverTV MCE 116 Plus [1461:c439] +22 -> ASUS Falcon2 [1043:4b66,1043:462e,1043:4b2e] +23 -> AverMedia PVR-150 Plus [1461:c035] +24 -> AverMedia EZMaker PCI Deluxe [1461:c03f] diff --git a/Documentation/video4linux/CARDLIST.saa7134 b/Documentation/video4linux/CARDLIST.saa7134 index a14545300e4c..5d3b6b4d2515 100644 --- a/Documentation/video4linux/CARDLIST.saa7134 +++ b/Documentation/video4linux/CARDLIST.saa7134 @@ -80,7 +80,7 @@ 79 -> Sedna/MuchTV PC TV Cardbus TV/Radio (ITO25 Rev:2B) 80 -> ASUS Digimatrix TV [1043:0210] 81 -> Philips Tiger reference design [1131:2018] - 82 -> MSI TV@Anywhere plus [1462:6231] + 82 -> MSI TV@Anywhere plus [1462:6231,1462:8624] 83 -> Terratec Cinergy 250 PCI TV [153b:1160] 84 -> LifeView FlyDVB Trio [5168:0319] 85 -> AverTV DVB-T 777 [1461:2c05,1461:2c05] @@ -102,7 +102,7 @@ 101 -> Pinnacle PCTV 310i [11bd:002f] 102 -> 
Avermedia AVerTV Studio 507 [1461:9715] 103 -> Compro Videomate DVB-T200A -104 -> Hauppauge WinTV-HVR1110 DVB-T/Hybrid [0070:6701] +104 -> Hauppauge WinTV-HVR1110 DVB-T/Hybrid [0070:6700,0070:6701,0070:6702,0070:6703,0070:6704,0070:6705] 105 -> Terratec Cinergy HT PCMCIA [153b:1172] 106 -> Encore ENLTV [1131:2342,1131:2341,3016:2344] 107 -> Encore ENLTV-FM [1131:230f] @@ -116,3 +116,16 @@ 115 -> Sabrent PCMCIA TV-PCB05 [0919:2003] 116 -> 10MOONS TM300 TV Card [1131:2304] 117 -> Avermedia Super 007 [1461:f01d] +118 -> Beholder BeholdTV 401 [0000:4016] +119 -> Beholder BeholdTV 403 [0000:4036] +120 -> Beholder BeholdTV 403 FM [0000:4037] +121 -> Beholder BeholdTV 405 [0000:4050] +122 -> Beholder BeholdTV 405 FM [0000:4051] +123 -> Beholder BeholdTV 407 [0000:4070] +124 -> Beholder BeholdTV 407 FM [0000:4071] +125 -> Beholder BeholdTV 409 [0000:4090] +126 -> Beholder BeholdTV 505 FM/RDS [0000:5051,0000:505B,5ace:5050] +127 -> Beholder BeholdTV 507 FM/RDS / BeholdTV 509 FM [0000:5071,0000:507B,5ace:5070,5ace:5090] +128 -> Beholder BeholdTV Columbus TVFM [0000:5201] +129 -> Beholder BeholdTV 607 / BeholdTV 609 [5ace:6070,5ace:6071,5ace:6072,5ace:6073,5ace:6090,5ace:6091,5ace:6092,5ace:6093] +130 -> Beholder BeholdTV M6 / BeholdTV M6 Extra [5ace:6190,5ace:6193] diff --git a/Documentation/video4linux/CARDLIST.tuner b/Documentation/video4linux/CARDLIST.tuner index a88c02d23805..0e2394695bb8 100644 --- a/Documentation/video4linux/CARDLIST.tuner +++ b/Documentation/video4linux/CARDLIST.tuner @@ -52,7 +52,7 @@ tuner=50 - TCL 2002N tuner=51 - Philips PAL/SECAM_D (FM 1256 I-H3) tuner=52 - Thomson DTT 7610 (ATSC/NTSC) tuner=53 - Philips FQ1286 -tuner=54 - tda8290+75 +tuner=54 - Philips/NXP TDA 8290/8295 + 8275/8275A/18271 tuner=55 - TCL 2002MB tuner=56 - Philips PAL/SECAM multi (FQ1216AME MK4) tuner=57 - Philips FQ1236A MK4 @@ -69,7 +69,8 @@ tuner=67 - Philips TD1316 Hybrid Tuner tuner=68 - Philips TUV1236D ATSC/NTSC dual in tuner=69 - Tena TNF 5335 and similar models tuner=70 - 
Samsung TCPN 2121P30A -tuner=71 - Xceive xc3028 +tuner=71 - Xceive xc2028/xc3028 tuner tuner=72 - Thomson FE6600 tuner=73 - Samsung TCPG 6121P30A tuner=75 - Philips TEA5761 FM Radio +tuner=76 - Xceive 5000 tuner diff --git a/Documentation/video4linux/CARDLIST.usbvision b/Documentation/video4linux/CARDLIST.usbvision index 3d6850ef0245..0b72d3fee17e 100644 --- a/Documentation/video4linux/CARDLIST.usbvision +++ b/Documentation/video4linux/CARDLIST.usbvision @@ -62,3 +62,4 @@ 61 -> Pinnacle Studio Linx Video input cable (PAL) [2304:0301] 62 -> Pinnacle PCTV Bungee USB (PAL) FM [2304:0419] 63 -> Hauppauge WinTv-USB [2400:4200] + 64 -> Pinnacle Studio PCTV USB (NTSC) FM V3 [2304:0113] diff --git a/Documentation/video4linux/extract_xc3028.pl b/Documentation/video4linux/extract_xc3028.pl new file mode 100644 index 000000000000..cced8ac5c543 --- /dev/null +++ b/Documentation/video4linux/extract_xc3028.pl @@ -0,0 +1,926 @@ +#!/usr/bin/perl + +# Copyright (c) Mauro Carvalho Chehab <mchehab@infradead.org> +# Released under GPLv2 +# +# In order to use, you need to: +# 1) Download the windows driver with something like: +# wget http://www.steventoth.net/linux/xc5000/HVR-12x0-14x0-17x0_1_25_25271_WHQL.zip +# 2) Extract the file hcw85bda.sys from the zip into the current dir: +# unzip -j HVR-12x0-14x0-17x0_1_25_25271_WHQL.zip Driver85/hcw85bda.sys +# 3) run the script: +# ./extract_xc3028.pl +# 4) copy the generated file: +# cp xc3028-v27.fw /lib/firmware + +#use strict; +use IO::Handle; + +my $debug=0; + +sub verify ($$) +{ + my ($filename, $hash) = @_; + my ($testhash); + + if (system("which md5sum > /dev/null 2>&1")) { + die "This firmware requires the md5sum command - see http://www.gnu.org/software/coreutils/\n"; + } + + open(CMD, "md5sum ".$filename."|"); + $testhash = <CMD>; + $testhash =~ /([a-zA-Z0-9]*)/; + $testhash = $1; + close CMD; + die "Hash of extracted file does not match (found $testhash, expected $hash!\n" if ($testhash ne $hash); +} + +sub get_hunk ($$) +{ + my 
($offset, $length) = @_; + my ($chunklength, $buf, $rcount, $out); + + sysseek(INFILE, $offset, SEEK_SET); + while ($length > 0) { + # Calc chunk size + $chunklength = 2048; + $chunklength = $length if ($chunklength > $length); + + $rcount = sysread(INFILE, $buf, $chunklength); + die "Ran out of data\n" if ($rcount != $chunklength); + $out .= $buf; + $length -= $rcount; + } + return $out; +} + +sub write_le16($) +{ + my $val = shift; + my $msb = ($val >> 8) &0xff; + my $lsb = $val & 0xff; + + syswrite(OUTFILE, chr($lsb).chr($msb)); +} + +sub write_le32($) +{ + my $val = shift; + my $l3 = ($val >> 24) & 0xff; + my $l2 = ($val >> 16) & 0xff; + my $l1 = ($val >> 8) & 0xff; + my $l0 = $val & 0xff; + + syswrite(OUTFILE, chr($l0).chr($l1).chr($l2).chr($l3)); +} + +sub write_le64($$) +{ + my $msb_val = shift; + my $lsb_val = shift; + my $l7 = ($msb_val >> 24) & 0xff; + my $l6 = ($msb_val >> 16) & 0xff; + my $l5 = ($msb_val >> 8) & 0xff; + my $l4 = $msb_val & 0xff; + + my $l3 = ($lsb_val >> 24) & 0xff; + my $l2 = ($lsb_val >> 16) & 0xff; + my $l1 = ($lsb_val >> 8) & 0xff; + my $l0 = $lsb_val & 0xff; + + syswrite(OUTFILE, + chr($l0).chr($l1).chr($l2).chr($l3). 
+ chr($l4).chr($l5).chr($l6).chr($l7)); +} + +sub write_hunk($$) +{ + my ($offset, $length) = @_; + my $out = get_hunk($offset, $length); + + printf "(len %d) ",$length if ($debug); + + for (my $i=0;$i<$length;$i++) { + printf "%02x ",ord(substr($out,$i,1)) if ($debug); + } + printf "\n" if ($debug); + + syswrite(OUTFILE, $out); +} + +sub write_hunk_fix_endian($$) +{ + my ($offset, $length) = @_; + my $out = get_hunk($offset, $length); + + printf "(len_fix %d) ",$length if ($debug); + + for (my $i=0;$i<$length;$i++) { + printf "%02x ",ord(substr($out,$i,1)) if ($debug); + } + printf "\n" if ($debug); + + my $i=0; + while ($i<$length) { + my $size = ord(substr($out,$i,1))*256+ord(substr($out,$i+1,1)); + syswrite(OUTFILE, substr($out,$i+1,1)); + syswrite(OUTFILE, substr($out,$i,1)); + $i+=2; + if ($size>0 && $size <0x8000) { + for (my $j=0;$j<$size;$j++) { + syswrite(OUTFILE, substr($out,$j+$i,1)); + } + $i+=$size; + } + } +} + +sub main_firmware($$$$) +{ + my $out; + my $j=0; + my $outfile = shift; + my $name = shift; + my $version = shift; + my $nr_desc = shift; + + for ($j = length($name); $j <32; $j++) { + $name = $name.chr(0); +} + + open OUTFILE, ">$outfile"; + syswrite(OUTFILE, $name); + write_le16($version); + write_le16($nr_desc); + + # + # Firmware 0, type: BASE FW F8MHZ (0x00000003), id: (0000000000000000), size: 8718 + # + + write_le32(0x00000003); # Type + write_le64(0x00000000, 0x00000000); # ID + write_le32(8718); # Size + write_hunk_fix_endian(813432, 8718); + + # + # Firmware 1, type: BASE FW F8MHZ MTS (0x00000007), id: (0000000000000000), size: 8712 + # + + write_le32(0x00000007); # Type + write_le64(0x00000000, 0x00000000); # ID + write_le32(8712); # Size + write_hunk_fix_endian(822152, 8712); + + # + # Firmware 2, type: BASE FW FM (0x00000401), id: (0000000000000000), size: 8562 + # + + write_le32(0x00000401); # Type + write_le64(0x00000000, 0x00000000); # ID + write_le32(8562); # Size + write_hunk_fix_endian(830872, 8562); + + # + # Firmware 3, 
type: BASE FW FM INPUT1 (0x00000c01), id: (0000000000000000), size: 8576 + # + + write_le32(0x00000c01); # Type + write_le64(0x00000000, 0x00000000); # ID + write_le32(8576); # Size + write_hunk_fix_endian(839440, 8576); + + # + # Firmware 4, type: BASE FW (0x00000001), id: (0000000000000000), size: 8706 + # + + write_le32(0x00000001); # Type + write_le64(0x00000000, 0x00000000); # ID + write_le32(8706); # Size + write_hunk_fix_endian(848024, 8706); + + # + # Firmware 5, type: BASE FW MTS (0x00000005), id: (0000000000000000), size: 8682 + # + + write_le32(0x00000005); # Type + write_le64(0x00000000, 0x00000000); # ID + write_le32(8682); # Size + write_hunk_fix_endian(856736, 8682); + + # + # Firmware 6, type: STD FW (0x00000000), id: PAL/BG A2/A (0000000100000007), size: 161 + # + + write_le32(0x00000000); # Type + write_le64(0x00000001, 0x00000007); # ID + write_le32(161); # Size + write_hunk_fix_endian(865424, 161); + + # + # Firmware 7, type: STD FW MTS (0x00000004), id: PAL/BG A2/A (0000000100000007), size: 169 + # + + write_le32(0x00000004); # Type + write_le64(0x00000001, 0x00000007); # ID + write_le32(169); # Size + write_hunk_fix_endian(865592, 169); + + # + # Firmware 8, type: STD FW (0x00000000), id: PAL/BG A2/B (0000000200000007), size: 161 + # + + write_le32(0x00000000); # Type + write_le64(0x00000002, 0x00000007); # ID + write_le32(161); # Size + write_hunk_fix_endian(865424, 161); + + # + # Firmware 9, type: STD FW MTS (0x00000004), id: PAL/BG A2/B (0000000200000007), size: 169 + # + + write_le32(0x00000004); # Type + write_le64(0x00000002, 0x00000007); # ID + write_le32(169); # Size + write_hunk_fix_endian(865592, 169); + + # + # Firmware 10, type: STD FW (0x00000000), id: PAL/BG NICAM/A (0000000400000007), size: 161 + # + + write_le32(0x00000000); # Type + write_le64(0x00000004, 0x00000007); # ID + write_le32(161); # Size + write_hunk_fix_endian(866112, 161); + + # + # Firmware 11, type: STD FW MTS (0x00000004), id: PAL/BG NICAM/A 
(0000000400000007), size: 169 + # + + write_le32(0x00000004); # Type + write_le64(0x00000004, 0x00000007); # ID + write_le32(169); # Size + write_hunk_fix_endian(866280, 169); + + # + # Firmware 12, type: STD FW (0x00000000), id: PAL/BG NICAM/B (0000000800000007), size: 161 + # + + write_le32(0x00000000); # Type + write_le64(0x00000008, 0x00000007); # ID + write_le32(161); # Size + write_hunk_fix_endian(866112, 161); + + # + # Firmware 13, type: STD FW MTS (0x00000004), id: PAL/BG NICAM/B (0000000800000007), size: 169 + # + + write_le32(0x00000004); # Type + write_le64(0x00000008, 0x00000007); # ID + write_le32(169); # Size + write_hunk_fix_endian(866280, 169); + + # + # Firmware 14, type: STD FW (0x00000000), id: PAL/DK A2 (00000003000000e0), size: 161 + # + + write_le32(0x00000000); # Type + write_le64(0x00000003, 0x000000e0); # ID + write_le32(161); # Size + write_hunk_fix_endian(866800, 161); + + # + # Firmware 15, type: STD FW MTS (0x00000004), id: PAL/DK A2 (00000003000000e0), size: 169 + # + + write_le32(0x00000004); # Type + write_le64(0x00000003, 0x000000e0); # ID + write_le32(169); # Size + write_hunk_fix_endian(866968, 169); + + # + # Firmware 16, type: STD FW (0x00000000), id: PAL/DK NICAM (0000000c000000e0), size: 161 + # + + write_le32(0x00000000); # Type + write_le64(0x0000000c, 0x000000e0); # ID + write_le32(161); # Size + write_hunk_fix_endian(867144, 161); + + # + # Firmware 17, type: STD FW MTS (0x00000004), id: PAL/DK NICAM (0000000c000000e0), size: 169 + # + + write_le32(0x00000004); # Type + write_le64(0x0000000c, 0x000000e0); # ID + write_le32(169); # Size + write_hunk_fix_endian(867312, 169); + + # + # Firmware 18, type: STD FW (0x00000000), id: SECAM/K1 (0000000000200000), size: 161 + # + + write_le32(0x00000000); # Type + write_le64(0x00000000, 0x00200000); # ID + write_le32(161); # Size + write_hunk_fix_endian(867488, 161); + + # + # Firmware 19, type: STD FW MTS (0x00000004), id: SECAM/K1 (0000000000200000), size: 169 + # + + 
write_le32(0x00000004); # Type + write_le64(0x00000000, 0x00200000); # ID + write_le32(169); # Size + write_hunk_fix_endian(867656, 169); + + # + # Firmware 20, type: STD FW (0x00000000), id: SECAM/K3 (0000000004000000), size: 161 + # + + write_le32(0x00000000); # Type + write_le64(0x00000000, 0x04000000); # ID + write_le32(161); # Size + write_hunk_fix_endian(867832, 161); + + # + # Firmware 21, type: STD FW MTS (0x00000004), id: SECAM/K3 (0000000004000000), size: 169 + # + + write_le32(0x00000004); # Type + write_le64(0x00000000, 0x04000000); # ID + write_le32(169); # Size + write_hunk_fix_endian(868000, 169); + + # + # Firmware 22, type: STD FW D2633 DTV6 ATSC (0x00010030), id: (0000000000000000), size: 149 + # + + write_le32(0x00010030); # Type + write_le64(0x00000000, 0x00000000); # ID + write_le32(149); # Size + write_hunk_fix_endian(868176, 149); + + # + # Firmware 23, type: STD FW D2620 DTV6 QAM (0x00000068), id: (0000000000000000), size: 149 + # + + write_le32(0x00000068); # Type + write_le64(0x00000000, 0x00000000); # ID + write_le32(149); # Size + write_hunk_fix_endian(868336, 149); + + # + # Firmware 24, type: STD FW D2633 DTV6 QAM (0x00000070), id: (0000000000000000), size: 149 + # + + write_le32(0x00000070); # Type + write_le64(0x00000000, 0x00000000); # ID + write_le32(149); # Size + write_hunk_fix_endian(868488, 149); + + # + # Firmware 25, type: STD FW D2620 DTV7 (0x00000088), id: (0000000000000000), size: 149 + # + + write_le32(0x00000088); # Type + write_le64(0x00000000, 0x00000000); # ID + write_le32(149); # Size + write_hunk_fix_endian(868648, 149); + + # + # Firmware 26, type: STD FW D2633 DTV7 (0x00000090), id: (0000000000000000), size: 149 + # + + write_le32(0x00000090); # Type + write_le64(0x00000000, 0x00000000); # ID + write_le32(149); # Size + write_hunk_fix_endian(868800, 149); + + # + # Firmware 27, type: STD FW D2620 DTV78 (0x00000108), id: (0000000000000000), size: 149 + # + + write_le32(0x00000108); # Type + write_le64(0x00000000, 
0x00000000); # ID + write_le32(149); # Size + write_hunk_fix_endian(868960, 149); + + # + # Firmware 28, type: STD FW D2633 DTV78 (0x00000110), id: (0000000000000000), size: 149 + # + + write_le32(0x00000110); # Type + write_le64(0x00000000, 0x00000000); # ID + write_le32(149); # Size + write_hunk_fix_endian(869112, 149); + + # + # Firmware 29, type: STD FW D2620 DTV8 (0x00000208), id: (0000000000000000), size: 149 + # + + write_le32(0x00000208); # Type + write_le64(0x00000000, 0x00000000); # ID + write_le32(149); # Size + write_hunk_fix_endian(868648, 149); + + # + # Firmware 30, type: STD FW D2633 DTV8 (0x00000210), id: (0000000000000000), size: 149 + # + + write_le32(0x00000210); # Type + write_le64(0x00000000, 0x00000000); # ID + write_le32(149); # Size + write_hunk_fix_endian(868800, 149); + + # + # Firmware 31, type: STD FW FM (0x00000400), id: (0000000000000000), size: 135 + # + + write_le32(0x00000400); # Type + write_le64(0x00000000, 0x00000000); # ID + write_le32(135); # Size + write_hunk_fix_endian(869584, 135); + + # + # Firmware 32, type: STD FW (0x00000000), id: PAL/I (0000000000000010), size: 161 + # + + write_le32(0x00000000); # Type + write_le64(0x00000000, 0x00000010); # ID + write_le32(161); # Size + write_hunk_fix_endian(869728, 161); + + # + # Firmware 33, type: STD FW MTS (0x00000004), id: PAL/I (0000000000000010), size: 169 + # + + write_le32(0x00000004); # Type + write_le64(0x00000000, 0x00000010); # ID + write_le32(169); # Size + write_hunk_fix_endian(869896, 169); + + # + # Firmware 34, type: STD FW (0x00000000), id: SECAM/L AM (0000001000400000), size: 169 + # + + write_le32(0x00000000); # Type + write_le64(0x00000010, 0x00400000); # ID + write_le32(169); # Size + write_hunk_fix_endian(870072, 169); + + # + # Firmware 35, type: STD FW (0x00000000), id: SECAM/L NICAM (0000000c00400000), size: 161 + # + + write_le32(0x00000000); # Type + write_le64(0x0000000c, 0x00400000); # ID + write_le32(161); # Size + write_hunk_fix_endian(870248, 161); 
+ + # + # Firmware 36, type: STD FW (0x00000000), id: SECAM/Lc (0000000000800000), size: 161 + # + + write_le32(0x00000000); # Type + write_le64(0x00000000, 0x00800000); # ID + write_le32(161); # Size + write_hunk_fix_endian(870416, 161); + + # + # Firmware 37, type: STD FW (0x00000000), id: NTSC/M Kr (0000000000008000), size: 161 + # + + write_le32(0x00000000); # Type + write_le64(0x00000000, 0x00008000); # ID + write_le32(161); # Size + write_hunk_fix_endian(870584, 161); + + # + # Firmware 38, type: STD FW LCD (0x00001000), id: NTSC/M Kr (0000000000008000), size: 161 + # + + write_le32(0x00001000); # Type + write_le64(0x00000000, 0x00008000); # ID + write_le32(161); # Size + write_hunk_fix_endian(870752, 161); + + # + # Firmware 39, type: STD FW LCD NOGD (0x00003000), id: NTSC/M Kr (0000000000008000), size: 161 + # + + write_le32(0x00003000); # Type + write_le64(0x00000000, 0x00008000); # ID + write_le32(161); # Size + write_hunk_fix_endian(870920, 161); + + # + # Firmware 40, type: STD FW MTS (0x00000004), id: NTSC/M Kr (0000000000008000), size: 169 + # + + write_le32(0x00000004); # Type + write_le64(0x00000000, 0x00008000); # ID + write_le32(169); # Size + write_hunk_fix_endian(871088, 169); + + # + # Firmware 41, type: STD FW (0x00000000), id: NTSC PAL/M PAL/N (000000000000b700), size: 161 + # + + write_le32(0x00000000); # Type + write_le64(0x00000000, 0x0000b700); # ID + write_le32(161); # Size + write_hunk_fix_endian(871264, 161); + + # + # Firmware 42, type: STD FW LCD (0x00001000), id: NTSC PAL/M PAL/N (000000000000b700), size: 161 + # + + write_le32(0x00001000); # Type + write_le64(0x00000000, 0x0000b700); # ID + write_le32(161); # Size + write_hunk_fix_endian(871432, 161); + + # + # Firmware 43, type: STD FW LCD NOGD (0x00003000), id: NTSC PAL/M PAL/N (000000000000b700), size: 161 + # + + write_le32(0x00003000); # Type + write_le64(0x00000000, 0x0000b700); # ID + write_le32(161); # Size + write_hunk_fix_endian(871600, 161); + + # + # Firmware 44, type: 
STD FW (0x00000000), id: NTSC/M Jp (0000000000002000), size: 161 + # + + write_le32(0x00000000); # Type + write_le64(0x00000000, 0x00002000); # ID + write_le32(161); # Size + write_hunk_fix_endian(871264, 161); + + # + # Firmware 45, type: STD FW MTS (0x00000004), id: NTSC PAL/M PAL/N (000000000000b700), size: 169 + # + + write_le32(0x00000004); # Type + write_le64(0x00000000, 0x0000b700); # ID + write_le32(169); # Size + write_hunk_fix_endian(871936, 169); + + # + # Firmware 46, type: STD FW MTS LCD (0x00001004), id: NTSC PAL/M PAL/N (000000000000b700), size: 169 + # + + write_le32(0x00001004); # Type + write_le64(0x00000000, 0x0000b700); # ID + write_le32(169); # Size + write_hunk_fix_endian(872112, 169); + + # + # Firmware 47, type: STD FW MTS LCD NOGD (0x00003004), id: NTSC PAL/M PAL/N (000000000000b700), size: 169 + # + + write_le32(0x00003004); # Type + write_le64(0x00000000, 0x0000b700); # ID + write_le32(169); # Size + write_hunk_fix_endian(872288, 169); + + # + # Firmware 48, type: SCODE FW HAS IF (0x60000000), IF = 3.28 MHz id: (0000000000000000), size: 192 + # + + write_le32(0x60000000); # Type + write_le64(0x00000000, 0x00000000); # ID + write_le16(3280); # IF + write_le32(192); # Size + write_hunk(811896, 192); + + # + # Firmware 49, type: SCODE FW HAS IF (0x60000000), IF = 3.30 MHz id: (0000000000000000), size: 192 + # + + write_le32(0x60000000); # Type + write_le64(0x00000000, 0x00000000); # ID + write_le16(3300); # IF + write_le32(192); # Size + write_hunk(813048, 192); + + # + # Firmware 50, type: SCODE FW HAS IF (0x60000000), IF = 3.44 MHz id: (0000000000000000), size: 192 + # + + write_le32(0x60000000); # Type + write_le64(0x00000000, 0x00000000); # ID + write_le16(3440); # IF + write_le32(192); # Size + write_hunk(812280, 192); + + # + # Firmware 51, type: SCODE FW HAS IF (0x60000000), IF = 3.46 MHz id: (0000000000000000), size: 192 + # + + write_le32(0x60000000); # Type + write_le64(0x00000000, 0x00000000); # ID + write_le16(3460); # IF + 
write_le32(192); # Size + write_hunk(812472, 192); + + # + # Firmware 52, type: SCODE FW DTV6 ATSC OREN36 HAS IF (0x60210020), IF = 3.80 MHz id: (0000000000000000), size: 192 + # + + write_le32(0x60210020); # Type + write_le64(0x00000000, 0x00000000); # ID + write_le16(3800); # IF + write_le32(192); # Size + write_hunk(809784, 192); + + # + # Firmware 53, type: SCODE FW HAS IF (0x60000000), IF = 4.00 MHz id: (0000000000000000), size: 192 + # + + write_le32(0x60000000); # Type + write_le64(0x00000000, 0x00000000); # ID + write_le16(4000); # IF + write_le32(192); # Size + write_hunk(812088, 192); + + # + # Firmware 54, type: SCODE FW DTV6 ATSC TOYOTA388 HAS IF (0x60410020), IF = 4.08 MHz id: (0000000000000000), size: 192 + # + + write_le32(0x60410020); # Type + write_le64(0x00000000, 0x00000000); # ID + write_le16(4080); # IF + write_le32(192); # Size + write_hunk(809976, 192); + + # + # Firmware 55, type: SCODE FW HAS IF (0x60000000), IF = 4.20 MHz id: (0000000000000000), size: 192 + # + + write_le32(0x60000000); # Type + write_le64(0x00000000, 0x00000000); # ID + write_le16(4200); # IF + write_le32(192); # Size + write_hunk(811704, 192); + + # + # Firmware 56, type: SCODE FW MONO HAS IF (0x60008000), IF = 4.32 MHz id: NTSC/M Kr (0000000000008000), size: 192 + # + + write_le32(0x60008000); # Type + write_le64(0x00000000, 0x00008000); # ID + write_le16(4320); # IF + write_le32(192); # Size + write_hunk(808056, 192); + + # + # Firmware 57, type: SCODE FW HAS IF (0x60000000), IF = 4.45 MHz id: (0000000000000000), size: 192 + # + + write_le32(0x60000000); # Type + write_le64(0x00000000, 0x00000000); # ID + write_le16(4450); # IF + write_le32(192); # Size + write_hunk(812664, 192); + + # + # Firmware 58, type: SCODE FW HAS IF (0x60000000), IF = 4.50 MHz id: NTSC/M Jp (0000000000002000), size: 192 + # + + write_le32(0x60000000); # Type + write_le64(0x00000000, 0x00002000); # ID + write_le16(4500); # IF + write_le32(192); # Size + write_hunk(807672, 192); + + # + # 
Firmware 59, type: SCODE FW LCD NOGD IF HAS IF (0x60023000), IF = 4.60 MHz id: NTSC/M Kr (0000000000008000), size: 192 + # + + write_le32(0x60023000); # Type + write_le64(0x00000000, 0x00008000); # ID + write_le16(4600); # IF + write_le32(192); # Size + write_hunk(807864, 192); + + # + # Firmware 60, type: SCODE FW DTV78 ZARLINK456 HAS IF (0x62000100), IF = 4.76 MHz id: (0000000000000000), size: 192 + # + + write_le32(0x62000100); # Type + write_le64(0x00000000, 0x00000000); # ID + write_le16(4760); # IF + write_le32(192); # Size + write_hunk(807288, 192); + + # + # Firmware 61, type: SCODE FW HAS IF (0x60000000), IF = 4.94 MHz id: (0000000000000000), size: 192 + # + + write_le32(0x60000000); # Type + write_le64(0x00000000, 0x00000000); # ID + write_le16(4940); # IF + write_le32(192); # Size + write_hunk(811512, 192); + + # + # Firmware 62, type: SCODE FW DTV7 ZARLINK456 HAS IF (0x62000080), IF = 5.26 MHz id: (0000000000000000), size: 192 + # + + write_le32(0x62000080); # Type + write_le64(0x00000000, 0x00000000); # ID + write_le16(5260); # IF + write_le32(192); # Size + write_hunk(810552, 192); + + # + # Firmware 63, type: SCODE FW MONO HAS IF (0x60008000), IF = 5.32 MHz id: PAL/BG NICAM/B (0000000800000007), size: 192 + # + + write_le32(0x60008000); # Type + write_le64(0x00000008, 0x00000007); # ID + write_le16(5320); # IF + write_le32(192); # Size + write_hunk(810744, 192); + + # + # Firmware 64, type: SCODE FW DTV8 CHINA HAS IF (0x64000200), IF = 5.40 MHz id: (0000000000000000), size: 192 + # + + write_le32(0x64000200); # Type + write_le64(0x00000000, 0x00000000); # ID + write_le16(5400); # IF + write_le32(192); # Size + write_hunk(807096, 192); + + # + # Firmware 65, type: SCODE FW DTV6 ATSC OREN538 HAS IF (0x60110020), IF = 5.58 MHz id: (0000000000000000), size: 192 + # + + write_le32(0x60110020); # Type + write_le64(0x00000000, 0x00000000); # ID + write_le16(5580); # IF + write_le32(192); # Size + write_hunk(809592, 192); + + # + # Firmware 66, type: SCODE 
FW HAS IF (0x60000000), IF = 5.64 MHz id: PAL/BG A2/B (0000000200000007), size: 192 + # + + write_le32(0x60000000); # Type + write_le64(0x00000002, 0x00000007); # ID + write_le16(5640); # IF + write_le32(192); # Size + write_hunk(808440, 192); + + # + # Firmware 67, type: SCODE FW HAS IF (0x60000000), IF = 5.74 MHz id: PAL/BG NICAM/B (0000000800000007), size: 192 + # + + write_le32(0x60000000); # Type + write_le64(0x00000008, 0x00000007); # ID + write_le16(5740); # IF + write_le32(192); # Size + write_hunk(808632, 192); + + # + # Firmware 68, type: SCODE FW DTV7 DIBCOM52 HAS IF (0x61000080), IF = 5.90 MHz id: (0000000000000000), size: 192 + # + + write_le32(0x61000080); # Type + write_le64(0x00000000, 0x00000000); # ID + write_le16(5900); # IF + write_le32(192); # Size + write_hunk(810360, 192); + + # + # Firmware 69, type: SCODE FW MONO HAS IF (0x60008000), IF = 6.00 MHz id: PAL/I (0000000000000010), size: 192 + # + + write_le32(0x60008000); # Type + write_le64(0x00000000, 0x00000010); # ID + write_le16(6000); # IF + write_le32(192); # Size + write_hunk(808824, 192); + + # + # Firmware 70, type: SCODE FW DTV6 QAM F6MHZ HAS IF (0x68000060), IF = 6.20 MHz id: (0000000000000000), size: 192 + # + + write_le32(0x68000060); # Type + write_le64(0x00000000, 0x00000000); # ID + write_le16(6200); # IF + write_le32(192); # Size + write_hunk(809400, 192); + + # + # Firmware 71, type: SCODE FW HAS IF (0x60000000), IF = 6.24 MHz id: PAL/I (0000000000000010), size: 192 + # + + write_le32(0x60000000); # Type + write_le64(0x00000000, 0x00000010); # ID + write_le16(6240); # IF + write_le32(192); # Size + write_hunk(808248, 192); + + # + # Firmware 72, type: SCODE FW MONO HAS IF (0x60008000), IF = 6.32 MHz id: SECAM/K1 (0000000000200000), size: 192 + # + + write_le32(0x60008000); # Type + write_le64(0x00000000, 0x00200000); # ID + write_le16(6320); # IF + write_le32(192); # Size + write_hunk(811320, 192); + + # + # Firmware 73, type: SCODE FW HAS IF (0x60000000), IF = 6.34 MHz id: 
SECAM/K1 (0000000000200000), size: 192 + # + + write_le32(0x60000000); # Type + write_le64(0x00000000, 0x00200000); # ID + write_le16(6340); # IF + write_le32(192); # Size + write_hunk(809208, 192); + + # + # Firmware 74, type: SCODE FW MONO HAS IF (0x60008000), IF = 6.50 MHz id: SECAM/K3 (0000000004000000), size: 192 + # + + write_le32(0x60008000); # Type + write_le64(0x00000000, 0x04000000); # ID + write_le16(6500); # IF + write_le32(192); # Size + write_hunk(811128, 192); + + # + # Firmware 75, type: SCODE FW DTV6 ATSC ATI638 HAS IF (0x60090020), IF = 6.58 MHz id: (0000000000000000), size: 192 + # + + write_le32(0x60090020); # Type + write_le64(0x00000000, 0x00000000); # ID + write_le16(6580); # IF + write_le32(192); # Size + write_hunk(807480, 192); + + # + # Firmware 76, type: SCODE FW HAS IF (0x60000000), IF = 6.60 MHz id: PAL/DK A2 (00000003000000e0), size: 192 + # + + write_le32(0x60000000); # Type + write_le64(0x00000003, 0x000000e0); # ID + write_le16(6600); # IF + write_le32(192); # Size + write_hunk(809016, 192); + + # + # Firmware 77, type: SCODE FW MONO HAS IF (0x60008000), IF = 6.68 MHz id: PAL/DK A2 (00000003000000e0), size: 192 + # + + write_le32(0x60008000); # Type + write_le64(0x00000003, 0x000000e0); # ID + write_le16(6680); # IF + write_le32(192); # Size + write_hunk(810936, 192); + + # + # Firmware 78, type: SCODE FW DTV6 ATSC TOYOTA794 HAS IF (0x60810020), IF = 8.14 MHz id: (0000000000000000), size: 192 + # + + write_le32(0x60810020); # Type + write_le64(0x00000000, 0x00000000); # ID + write_le16(8140); # IF + write_le32(192); # Size + write_hunk(810168, 192); + + # + # Firmware 79, type: SCODE FW HAS IF (0x60000000), IF = 8.20 MHz id: (0000000000000000), size: 192 + # + + write_le32(0x60000000); # Type + write_le64(0x00000000, 0x00000000); # ID + write_le16(8200); # IF + write_le32(192); # Size + write_hunk(812856, 192); +} + +sub extract_firmware { + my $sourcefile = "hcw85bda.sys"; + my $hash = "0e44dbf63bb0169d57446aec21881ff2"; + my 
$outfile = "xc3028-v27.fw"; + my $name = "xc2028 firmware"; + my $version = 519; + my $nr_desc = 80; + my $out; + + verify($sourcefile, $hash); + + open INFILE, "<$sourcefile"; + main_firmware($outfile, $name, $version, $nr_desc); + close INFILE; +} + +extract_firmware; +printf "Firmwares generated.\n"; diff --git a/Documentation/video4linux/sn9c102.txt b/Documentation/video4linux/sn9c102.txt index 1ffad19ce891..b26f5195af51 100644 --- a/Documentation/video4linux/sn9c102.txt +++ b/Documentation/video4linux/sn9c102.txt @@ -568,6 +568,7 @@ the fingerprint is: '88E8 F32F 7244 68BA 3958 5D40 99DA 5D2A FCE6 35A4'. Many thanks to following persons for their contribute (listed in alphabetical order): +- David Anderson for the donation of a webcam; - Luca Capello for the donation of a webcam; - Philippe Coval for having helped testing the PAS202BCA image sensor; - Joao Rodrigo Fuzaro, Joao Limirio, Claudio Filho and Caio Begotti for the diff --git a/Documentation/vm/hugetlbpage.txt b/Documentation/vm/hugetlbpage.txt index 51ccc48aa763..f962d01bea2a 100644 --- a/Documentation/vm/hugetlbpage.txt +++ b/Documentation/vm/hugetlbpage.txt @@ -30,9 +30,10 @@ alignment and size of the arguments to the above system calls. The output of "cat /proc/meminfo" will have lines like: ..... -HugePages_Total: xxx -HugePages_Free: yyy -HugePages_Rsvd: www +HugePages_Total: vvv +HugePages_Free: www +HugePages_Rsvd: xxx +HugePages_Surp: yyy Hugepagesize: zzz kB where: @@ -42,6 +43,10 @@ allocated. HugePages_Rsvd is short for "reserved," and is the number of hugepages for which a commitment to allocate from the pool has been made, but no allocation has yet been made. It's vaguely analogous to overcommit. +HugePages_Surp is short for "surplus," and is the number of hugepages in +the pool above the value in /proc/sys/vm/nr_hugepages. The maximum +number of surplus hugepages is controlled by +/proc/sys/vm/nr_overcommit_hugepages. 
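A rough illustration of how a tool might read the HugePages_* fields described above. This is only a sketch: the helper name meminfo_field() is made up for this example and is not part of the kernel tree, and the text passed in is assumed to have the "Key: value" layout shown for /proc/meminfo.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: find one "Key:   value" field, for example
 * "HugePages_Surp", in meminfo-style text and store its number in *val.
 * Returns 0 on success, -1 if the key is absent or has no number.
 */
int meminfo_field(const char *text, const char *key, unsigned long *val)
{
	size_t klen = strlen(key);
	const char *p = text;

	while (p && *p) {
		/* A field only matches at the start of a line, before ':' */
		if (strncmp(p, key, klen) == 0 && p[klen] == ':')
			return sscanf(p + klen + 1, "%lu", val) == 1 ? 0 : -1;

		/* Advance to the character after the next newline, if any */
		p = strchr(p, '\n');
		if (p)
			p++;
	}
	return -1;
}
```

A caller would read /proc/meminfo into a buffer and ask for "HugePages_Total", "HugePages_Free", "HugePages_Rsvd" and "HugePages_Surp" in turn.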
/proc/filesystems should also show a filesystem of type "hugetlbfs" configured in the kernel. @@ -71,7 +76,25 @@ or failure of allocation depends on the amount of physically contiguous memory that is present in the system at this time. System administrators may want to put this command in one of the local rc init files. This will enable the kernel to request huge pages early in the boot process (when the possibility -of getting physical contiguous pages is still very high). +of getting physical contiguous pages is still very high). In either +case, administrators will want to verify the number of hugepages actually +allocated by checking the sysctl or meminfo. + +/proc/sys/vm/nr_overcommit_hugepages indicates how large the pool of +hugepages can grow, if more hugepages than /proc/sys/vm/nr_hugepages are +requested by applications. echo'ing any non-zero value into this file +indicates that the hugetlb subsystem is allowed to try to obtain +hugepages from the buddy allocator, if the normal pool is exhausted. As +these surplus hugepages go out of use, they are freed back to the buddy +allocator. + +Caveat: Shrinking the pool via nr_hugepages while a surplus is in effect +will allow the number of surplus huge pages to exceed the overcommit +value, as the pool hugepages (which must have been in use for surplus +hugepages to be allocated) will become surplus hugepages. As long as +this condition holds, however, no more surplus huge pages will be +allowed on the system until one of the two sysctls is increased +sufficiently, or the surplus huge pages go out of use and are freed. If the user applications are going to request hugepages using mmap system call, then it is required that system administrator mount a file system of @@ -94,8 +117,8 @@ provided on command line then no limits are set. For size and nr_inodes options, you can use [G|g]/[M|m]/[K|k] to represent giga/mega/kilo. For example, size=2K has the same meaning as size=2048. 
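The suffix rule for size and nr_inodes can be sketched in a few lines of userspace C. The function name expand_size() is an assumption made for this example; the logic is loosely modelled on the kernel's memparse() helper and is not claimed to be the exact parser hugetlbfs uses.

```c
#include <stdlib.h>

/*
 * Illustrative only: expand a string such as "2K", "16m" or "1G" into a
 * plain number.  Each suffix step multiplies by 1024, so the cases fall
 * through from giga to mega to kilo.
 */
unsigned long long expand_size(const char *s)
{
	char *end;
	unsigned long long n = strtoull(s, &end, 10);

	switch (*end) {
	case 'G':
	case 'g':
		n <<= 10;
		/* fall through */
	case 'M':
	case 'm':
		n <<= 10;
		/* fall through */
	case 'K':
	case 'k':
		n <<= 10;
		break;
	default:
		break;
	}
	return n;
}
```

With this sketch, expand_size("2K") and expand_size("2048") yield the same value, which matches the size=2K example in the text.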
-read and write system calls are not supported on files that reside on hugetlb -file systems. +While read system calls are supported on files that reside on hugetlb +file systems, write system calls are not. Regular chown, chgrp, and chmod commands (with right permissions) could be used to change the file attributes on hugetlbfs. diff --git a/Documentation/vm/slabinfo.c b/Documentation/vm/slabinfo.c index 7047696c47a1..488c1f31b992 100644 --- a/Documentation/vm/slabinfo.c +++ b/Documentation/vm/slabinfo.c @@ -1021,7 +1021,7 @@ void read_slab_dir(void) char *t; int count; - if (chdir("/sys/slab")) + if (chdir("/sys/kernel/slab")) fatal("SYSFS support for SLUB not active\n"); dir = opendir("."); diff --git a/Documentation/vm/slub.txt b/Documentation/vm/slub.txt index d17f324db9f5..dcf8bcf846d6 100644 --- a/Documentation/vm/slub.txt +++ b/Documentation/vm/slub.txt @@ -63,7 +63,7 @@ In case you forgot to enable debugging on the kernel command line: It is possible to enable debugging manually when the kernel is up. Look at the contents of: -/sys/slab/<slab name>/ +/sys/kernel/slab/<slab name>/ Look at the writable files. Writing 1 to them will enable the corresponding debug option. All options can be set on a slab that does diff --git a/Documentation/w1/masters/00-INDEX b/Documentation/w1/masters/00-INDEX index 752613c4cea2..7b0ceaaad7af 100644 --- a/Documentation/w1/masters/00-INDEX +++ b/Documentation/w1/masters/00-INDEX @@ -4,3 +4,5 @@ ds2482 - The Maxim/Dallas Semiconductor DS2482 provides 1-wire busses. ds2490 - The Maxim/Dallas Semiconductor DS2490 builds USB <-> W1 bridges. +w1-gpio + - GPIO 1-wire bus master driver. 
diff --git a/Documentation/w1/masters/w1-gpio b/Documentation/w1/masters/w1-gpio new file mode 100644 index 000000000000..af5d3b4aa851 --- /dev/null +++ b/Documentation/w1/masters/w1-gpio @@ -0,0 +1,33 @@ +Kernel driver w1-gpio +===================== + +Author: Ville Syrjala <syrjala@sci.fi> + + +Description +----------- + +GPIO 1-wire bus master driver. The driver uses the GPIO API to control the +wire and the GPIO pin can be specified using platform data. + + +Example (mach-at91) +------------------- + +#include <linux/w1-gpio.h> + +static struct w1_gpio_platform_data foo_w1_gpio_pdata = { + .pin = AT91_PIN_PB20, + .is_open_drain = 1, +}; + +static struct platform_device foo_w1_device = { + .name = "w1-gpio", + .id = -1, + .dev.platform_data = &foo_w1_gpio_pdata, +}; + +... + at91_set_GPIO_periph(foo_w1_gpio_pdata.pin, 1); + at91_set_multi_drive(foo_w1_gpio_pdata.pin, 1); + platform_device_register(&foo_w1_device); diff --git a/Documentation/watchdog/watchdog-api.txt b/Documentation/watchdog/watchdog-api.txt index bb7cb1d31ec7..4cc4ba9d7150 100644 --- a/Documentation/watchdog/watchdog-api.txt +++ b/Documentation/watchdog/watchdog-api.txt @@ -42,23 +42,27 @@ like this source file: see Documentation/watchdog/src/watchdog-simple.c A more advanced driver could for example check that a HTTP server is still responding before doing the write call to ping the watchdog. -When the device is closed, the watchdog is disabled. This is not -always such a good idea, since if there is a bug in the watchdog -daemon and it crashes the system will not reboot. Because of this, -some of the drivers support the configuration option "Disable watchdog -shutdown on close", CONFIG_WATCHDOG_NOWAYOUT. If it is set to Y when -compiling the kernel, there is no way of disabling the watchdog once -it has been started. So, if the watchdog daemon crashes, the system -will reboot after the timeout has passed. 
Watchdog devices also usually -support the nowayout module parameter so that this option can be controlled -at runtime. - -Drivers will not disable the watchdog, unless a specific magic character 'V' -has been sent /dev/watchdog just before closing the file. If the userspace -daemon closes the file without sending this special character, the driver -will assume that the daemon (and userspace in general) died, and will stop -pinging the watchdog without disabling it first. This will then cause a -reboot if the watchdog is not re-opened in sufficient time. +When the device is closed, the watchdog is disabled, unless the "Magic +Close" feature is supported (see below). This is not always such a +good idea, since if there is a bug in the watchdog daemon and it +crashes the system will not reboot. Because of this, some of the +drivers support the configuration option "Disable watchdog shutdown on +close", CONFIG_WATCHDOG_NOWAYOUT. If it is set to Y when compiling +the kernel, there is no way of disabling the watchdog once it has been +started. So, if the watchdog daemon crashes, the system will reboot +after the timeout has passed. Watchdog devices also usually support +the nowayout module parameter so that this option can be controlled at +runtime. + +Magic Close feature: + +If a driver supports "Magic Close", the driver will not disable the +watchdog unless a specific magic character 'V' has been sent to +/dev/watchdog just before closing the file. If the userspace daemon +closes the file without sending this special character, the driver +will assume that the daemon (and userspace in general) died, and will +stop pinging the watchdog without disabling it first. This will then +cause a reboot if the watchdog is not re-opened in sufficient time. 
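As a minimal sketch of the Magic Close sequence described above: the daemon writes the character 'V' and then closes the file. The function name watchdog_magic_close() is invented for this example, and fd is assumed to be a descriptor the daemon obtained earlier by opening /dev/watchdog.

```c
#include <unistd.h>

/*
 * Orderly shutdown for a driver that supports "Magic Close": send the
 * magic character 'V' immediately before closing the descriptor.
 * Returns 0 on success, -1 if the write or the close failed.
 */
int watchdog_magic_close(int fd)
{
	int ok = (write(fd, "V", 1) == 1);

	/* Close regardless, so the descriptor is not leaked on error. */
	if (close(fd) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

If the write fails, the descriptor is still closed, and per the text the driver then treats the close as a daemon crash and leaves the watchdog armed. With nowayout in effect, the text above says the watchdog cannot be disabled once started, so the magic character would not be expected to stop it.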
The ioctl API: diff --git a/Documentation/x86_64/00-INDEX b/Documentation/x86_64/00-INDEX new file mode 100644 index 000000000000..92fc20ab5f0e --- /dev/null +++ b/Documentation/x86_64/00-INDEX @@ -0,0 +1,16 @@ +00-INDEX + - This file +boot-options.txt + - AMD64-specific boot options. +cpu-hotplug-spec + - Firmware support for CPU hotplug under Linux/x86-64 +fake-numa-for-cpusets + - Using numa=fake and CPUSets for Resource Management +kernel-stacks + - Context-specific per-processor interrupt stacks. +machinecheck + - Configurable sysfs parameters for the x86-64 machine check code. +mm.txt + - Memory layout of x86-64 (4 level page tables, 46 bits physical). +uefi.txt + - Booting Linux via Unified Extensible Firmware Interface. diff --git a/Documentation/x86_64/boot-options.txt b/Documentation/x86_64/boot-options.txt index 945311840a10..34abae4e9442 100644 --- a/Documentation/x86_64/boot-options.txt +++ b/Documentation/x86_64/boot-options.txt @@ -110,12 +110,18 @@ Idle loop Rebooting - reboot=b[ios] | t[riple] | k[bd] [, [w]arm | [c]old] + reboot=b[ios] | t[riple] | k[bd] | a[cpi] | e[fi] [, [w]arm | [c]old] bios Use the CPU reboot vector for warm reset warm Don't set the cold reboot flag cold Set the cold reboot flag triple Force a triple fault (init) kbd Use the keyboard controller. cold reset (default) + acpi Use the ACPI RESET_REG in the FADT. If ACPI is not configured or the + ACPI reset does not work, the reboot path attempts the reset using + the keyboard controller. + efi Use efi reset_system runtime service. If EFI is not configured or the + EFI reset does not work, the reboot path attempts the reset using + the keyboard controller. Using warm reset will be much faster especially on big memory systems because the BIOS will not go through the memory check. 
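The reboot= syntax above can be read as a comma-separated list in which only the first letter of each word matters. The following userspace sketch shows one way to interpret such a string. The names struct reboot_choice and parse_reboot_option() are hypothetical and this is not the kernel's own parser; the defaults (keyboard controller, cold reset) are taken from the option list above.

```c
#include <string.h>

struct reboot_choice {
	char method;	/* 'b', 't', 'k', 'a' or 'e' */
	int warm;	/* 1 = warm reset, 0 = cold reset */
};

/*
 * Hypothetical parser for the value of reboot=, e.g. "a,w" or "efi".
 * Returns 0 on success, -1 if a word starts with an unknown letter.
 */
int parse_reboot_option(const char *arg, struct reboot_choice *out)
{
	out->method = 'k';	/* documented default: keyboard controller */
	out->warm = 0;		/* documented default: cold reset */

	while (*arg) {
		switch (*arg) {
		case 'w':
			out->warm = 1;
			break;
		case 'c':
			out->warm = 0;
			break;
		case 'b':
		case 't':
		case 'k':
		case 'a':
		case 'e':
			out->method = *arg;
			break;
		default:
			return -1;
		}

		/* Skip to the word after the next comma, if there is one */
		arg = strchr(arg, ',');
		if (!arg)
			break;
		arg++;
	}
	return 0;
}
```

For example, "a,w" selects the ACPI method with a warm reset, while "efi" selects the EFI method and keeps the cold reset default.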
diff --git a/Documentation/x86_64/uefi.txt b/Documentation/x86_64/uefi.txt new file mode 100644 index 000000000000..7d77120a5184 --- /dev/null +++ b/Documentation/x86_64/uefi.txt @@ -0,0 +1,38 @@ +General note on [U]EFI x86_64 support +------------------------------------- + +The nomenclature EFI and UEFI are used interchangeably in this document. + +Although the tools below are _not_ needed for building the kernel, +the needed bootloader support and associated tools for x86_64 platforms +with EFI firmware and specifications are listed below. + +1. UEFI specification: http://www.uefi.org + +2. Booting Linux kernel on UEFI x86_64 platform requires bootloader + support. Elilo with x86_64 support can be used. + +3. x86_64 platform with EFI/UEFI firmware. + +Mechanics: +--------- +- Build the kernel with the following configuration. + CONFIG_FB_EFI=y + CONFIG_FRAMEBUFFER_CONSOLE=y + If EFI runtime services are expected, the following configuration should + be selected. + CONFIG_EFI=y + CONFIG_EFI_VARS=y or m # optional +- Create a VFAT partition on the disk +- Copy the following to the VFAT partition: + elilo bootloader with x86_64 support, elilo configuration file, + kernel image built in first step and corresponding + initrd. Instructions on building elilo and its dependencies + can be found in the elilo sourceforge project. +- Boot to EFI shell and invoke elilo choosing the kernel image built + in first step. +- If some or all EFI runtime services don't work, you can try following + kernel command line parameters to turn off some or all EFI runtime + services. 
+ noefi turn off all EFI runtime services + reboot_type=k turn off EFI reboot runtime service diff --git a/Documentation/zh_CN/CodingStyle b/Documentation/zh_CN/CodingStyle new file mode 100644 index 000000000000..ecd9307a641f --- /dev/null +++ b/Documentation/zh_CN/CodingStyle @@ -0,0 +1,701 @@ +Chinese translated version of Documentation/CodingStyle + +If you have any comment or update to the content, please post to LKML directly. +However, if you have problem communicating in English you can also ask the +Chinese maintainer for help. Contact the Chinese maintainer, if this +translation is outdated or there is problem with translation. + +Chinese maintainer: Zhang Le <r0bertz@gentoo.org> +--------------------------------------------------------------------- +Documentation/CodingStyle的中文翻译 + +如果想评论或更新本文的内容,请直接发信到LKML。如果你使用英文交流有困难的话,也可 +以向中文版维护者求助。如果本翻译更新不及时或者翻译存在问题,请联系中文版维护者。 + +中文版维护者: 张乐 Zhang Le <r0bertz@gentoo.org> +中文版翻译者: 张乐 Zhang Le <r0bertz@gentoo.org> +中文版校译者: 王聪 Wang Cong <xiyou.wangcong@gmail.com> + wheelz <kernel.zeng@gmail.com> + 管旭东 Xudong Guan <xudong.guan@gmail.com> + Li Zefan <lizf@cn.fujitsu.com> + Wang Chen <wangchen@cn.fujitsu.com> +以下为正文 +--------------------------------------------------------------------- + + Linux内核代码风格 + +这是一个简短的文档,描述了linux内核的首选代码风格。代码风格是因人而异的,而且我 +不愿意把我的观点强加给任何人,不过这里所讲述的是我必须要维护的代码所遵守的风格, +并且我也希望绝大多数其他代码也能遵守这个风格。请在写代码时至少考虑一下本文所述的 +风格。 + +首先,我建议你打印一份GNU代码规范,然后不要读它。烧了它,这是一个具有重大象征性 +意义的动作。 + +不管怎样,现在我们开始: + + + 第一章:缩进 + +制表符是8个字符,所以缩进也是8个字符。有些异端运动试图将缩进变为4(乃至2)个字符 +深,这几乎相当于尝试将圆周率的值定义为3。 + +理由:缩进的全部意义就在于清楚的定义一个控制块起止于何处。尤其是当你盯着你的屏幕 +连续看了20小时之后,你将会发现大一点的缩进会使你更容易分辨缩进。 + +现在,有些人会抱怨8个字符的缩进会使代码向右边移动的太远,在80个字符的终端屏幕上 +就很难读这样的代码。这个问题的答案是,如果你需要3级以上的缩进,不管用何种方式你 +的代码已经有问题了,应该修正你的程序。 + +简而言之,8个字符的缩进可以让代码更容易阅读,还有一个好处是当你的函数嵌套太深的 +时候可以给你警告。留心这个警告。 + +在switch语句中消除多级缩进的首选的方式是让“switch”和从属于它的“case”标签对齐于同 +一列,而不要“两次缩进”“case”标签。比如: + + switch (suffix) { + case 'G': + case 'g': + mem <<= 30; + break; + case 'M': + case 'm': + mem <<= 20; + break; + 
case 'K': + case 'k': + mem <<= 10; + /* fall through */ + default: + break; + } + + +不要把多个语句放在一行里,除非你有什么东西要隐藏: + + if (condition) do_this; + do_something_everytime; + +也不要在一行里放多个赋值语句。内核代码风格超级简单。就是避免可能导致别人误读的表 +达式。 + +除了注释、文档和Kconfig之外,不要使用空格来缩进,前面的例子是例外,是有意为之。 + +选用一个好的编辑器,不要在行尾留空格。 + + + 第二章:把长的行和字符串打散 + +代码风格的意义就在于使用平常使用的工具来维持代码的可读性和可维护性。 + +每一行的长度的限制是80列,我们强烈建议您遵守这个惯例。 + +长于80列的语句要打散成有意义的片段。每个片段要明显短于原来的语句,而且放置的位置 +也明显的靠右。同样的规则也适用于有很长参数列表的函数头。长字符串也要打散成较短的 +字符串。唯一的例外是超过80列可以大幅度提高可读性并且不会隐藏信息的情况。 + +void fun(int a, int b, int c) +{ + if (condition) + printk(KERN_WARNING "Warning this is a long printk with " + "3 parameters a: %u b: %u " + "c: %u \n", a, b, c); + else + next_statement; +} + + 第三章:大括号和空格的放置 + +C语言风格中另外一个常见问题是大括号的放置。和缩进大小不同,选择或弃用某种放置策 +略并没有多少技术上的原因,不过首选的方式,就像Kernighan和Ritchie展示给我们的,是 +把起始大括号放在行尾,而把结束大括号放在行首,所以: + + if (x is true) { + we do y + } + +这适用于所有的非函数语句块(if、switch、for、while、do)。比如: + + switch (action) { + case KOBJ_ADD: + return "add"; + case KOBJ_REMOVE: + return "remove"; + case KOBJ_CHANGE: + return "change"; + default: + return NULL; + } + +不过,有一个例外,那就是函数:函数的起始大括号放置于下一行的开头,所以: + + int function(int x) + { + body of function + } + +全世界的异端可能会抱怨这个不一致性是……呃……不一致的,不过所有思维健全的人都知道( +a)K&R是_正确的_,并且(b)K&R是正确的。此外,不管怎样函数都是特殊的(在C语言中 +,函数是不能嵌套的)。 + +注意结束大括号独自占据一行,除非它后面跟着同一个语句的剩余部分,也就是do语句中的 +“while”或者if语句中的“else”,像这样: + + do { + body of do-loop + } while (condition); + +和 + + if (x == y) { + .. + } else if (x > y) { + ... + } else { + .... 
+ } + +理由:K&R。 + +也请注意这种大括号的放置方式也能使空(或者差不多空的)行的数量最小化,同时不失可 +读性。因此,由于你的屏幕上的新行是不可再生资源(想想25行的终端屏幕),你将会有更 +多的空行来放置注释。 + +当只有一个单独的语句的时候,不用加不必要的大括号。 + +if (condition) + action(); + +这点不适用于本身为某个条件语句的一个分支的单独语句。这时需要在两个分支里都使用大 +括号。 + +if (condition) { + do_this(); + do_that(); +} else { + otherwise(); +} + + 3.1:空格 + +Linux内核的空格使用方式(主要)取决于它是用于函数还是关键字。(大多数)关键字后 +要加一个空格。值得注意的例外是sizeof、typeof、alignof和__attribute__,这些关键字 +某些程度上看起来更像函数(它们在Linux里也常常伴随小括号而使用,尽管在C语言里这样 +的小括号不是必需的,就像“struct fileinfo info”声明过后的“sizeof info”)。 + +所以在这些关键字之后放一个空格: + if, switch, case, for, do, while +但是不要在sizeof、typeof、alignof或者__attribute__这些关键字之后放空格。例如, + s = sizeof(struct file); + +不要在小括号里的表达式两侧加空格。这是一个反例: + + s = sizeof( struct file ); + +当声明指针类型或者返回指针类型的函数时,“*”的首选使用方式是使之靠近变量名或者函 +数名,而不是靠近类型名。例子: + + char *linux_banner; + unsigned long long memparse(char *ptr, char **retptr); + char *match_strdup(substring_t *s); + +在大多数二元和三元操作符两侧使用一个空格,例如下面所有这些操作符: + + = + - < > * / % | & ^ <= >= == != ? : + +但是一元操作符后不要加空格: + & * + - ~ ! 
sizeof typeof alignof __attribute__ defined + +后缀自加和自减一元操作符前不加空格: + ++ -- + +前缀自加和自减一元操作符后不加空格: + ++ -- + +“.”和“->”结构体成员操作符前后不加空格。 + +不要在行尾留空白。有些可以自动缩进的编辑器会在新行的行首加入适量的空白,然后你 +就可以直接在那一行输入代码。不过假如你最后没有在那一行输入代码,有些编辑器就不 +会移除已经加入的空白,就像你故意留下一个只有空白的行。包含行尾空白的行就这样产 +生了。 + +当git发现补丁包含了行尾空白的时候会警告你,并且可以应你的要求去掉行尾空白;不过 +如果你是正在打一系列补丁,这样做会导致后面的补丁失败,因为你改变了补丁的上下文。 + + + 第四章:命名 + +C是一个简朴的语言,你的命名也应该这样。和Modula-2和Pascal程序员不同,C程序员不使 +用类似ThisVariableIsATemporaryCounter这样华丽的名字。C程序员会称那个变量为“tmp” +,这样写起来会更容易,而且至少不会令其难于理解。 + +不过,虽然混用大小写的名字是不提倡使用的,但是全局变量还是需要一个具描述性的名字 +。称一个全局函数为“foo”是一个难以饶恕的错误。 + +全局变量(只有当你真正需要它们的时候再用它)需要有一个具描述性的名字,就像全局函 +数。如果你有一个可以计算活动用户数量的函数,你应该叫它“count_active_users()”或者 +类似的名字,你不应该叫它“cntuser()”。 + +在函数名中包含函数类型(所谓的匈牙利命名法)是脑子出了问题——编译器知道那些类型而 +且能够检查那些类型,这样做只能把程序员弄糊涂了。难怪微软总是制造出有问题的程序。 + +本地变量名应该简短,而且能够表达相关的含义。如果你有一些随机的整数型的循环计数器 +,它应该被称为“i”。叫它“loop_counter”并无益处,如果它没有被误解的可能的话。类似 +的,“tmp”可以用来称呼任意类型的临时变量。 + +如果你怕混淆了你的本地变量名,你就遇到另一个问题了,叫做函数增长荷尔蒙失衡综合症 +。请看第六章(函数)。 + + + 第五章:Typedef + +不要使用类似“vps_t”之类的东西。 + +对结构体和指针使用typedef是一个错误。当你在代码里看到: + + vps_t a; + +这代表什么意思呢? 
+ +相反,如果是这样 + + struct virtual_container *a; + +你就知道“a”是什么了。 + +很多人认为typedef“能提高可读性”。实际不是这样的。它们只在下列情况下有用: + + (a) 完全不透明的对象(这种情况下要主动使用typedef来隐藏这个对象实际上是什么)。 + + 例如:“pte_t”等不透明对象,你只能用合适的访问函数来访问它们。 + + 注意!不透明性和“访问函数”本身是不好的。我们使用pte_t等类型的原因在于真的是 + 完全没有任何共用的可访问信息。 + + (b) 清楚的整数类型,如此,这层抽象就可以帮助消除到底是“int”还是“long”的混淆。 + + u8/u16/u32是完全没有问题的typedef,不过它们更符合类别(d)而不是这里。 + + 再次注意!要这样做,必须事出有因。如果某个变量是“unsigned long“,那么没有必要 + + typedef unsigned long myflags_t; + + 不过如果有一个明确的原因,比如它在某种情况下可能会是一个“unsigned int”而在 + 其他情况下可能为“unsigned long”,那么就不要犹豫,请务必使用typedef。 + + (c) 当你使用sparse按字面的创建一个新类型来做类型检查的时候。 + + (d) 和标准C99类型相同的类型,在某些例外的情况下。 + + 虽然让眼睛和脑筋来适应新的标准类型比如“uint32_t”不需要花很多时间,可是有些 + 人仍然拒绝使用它们。 + + 因此,Linux特有的等同于标准类型的“u8/u16/u32/u64”类型和它们的有符号类型是被 + 允许的——尽管在你自己的新代码中,它们不是强制要求要使用的。 + + 当编辑已经使用了某个类型集的已有代码时,你应该遵循那些代码中已经做出的选择。 + + (e) 可以在用户空间安全使用的类型。 + + 在某些用户空间可见的结构体里,我们不能要求C99类型而且不能用上面提到的“u32” + 类型。因此,我们在与用户空间共享的所有结构体中使用__u32和类似的类型。 + +可能还有其他的情况,不过基本的规则是永远不要使用typedef,除非你可以明确的应用上 +述某个规则中的一个。 + +总的来说,如果一个指针或者一个结构体里的元素可以合理的被直接访问到,那么它们就不 +应该是一个typedef。 + + + 第六章:函数 + +函数应该简短而漂亮,并且只完成一件事情。函数应该可以一屏或者两屏显示完(我们都知 +道ISO/ANSI屏幕大小是80x24),只做一件事情,而且把它做好。 + +一个函数的最大长度是和该函数的复杂度和缩进级数成反比的。所以,如果你有一个理论上 +很简单的只有一个很长(但是简单)的case语句的函数,而且你需要在每个case里做很多很 +小的事情,这样的函数尽管很长,但也是可以的。 + +不过,如果你有一个复杂的函数,而且你怀疑一个天分不是很高的高中一年级学生可能甚至 +搞不清楚这个函数的目的,你应该严格的遵守前面提到的长度限制。使用辅助函数,并为之 +取个具描述性的名字(如果你觉得它们的性能很重要的话,可以让编译器内联它们,这样的 +效果往往会比你写一个复杂函数的效果要好。) + +函数的另外一个衡量标准是本地变量的数量。此数量不应超过5-10个,否则你的函数就有 +问题了。重新考虑一下你的函数,把它分拆成更小的函数。人的大脑一般可以轻松的同时跟 +踪7个不同的事物,如果再增多的话,就会糊涂了。即便你聪颖过人,你也可能会记不清你2 +个星期前做过的事情。 + +在源文件里,使用空行隔开不同的函数。如果该函数需要被导出,它的EXPORT*宏应该紧贴 +在它的结束大括号之下。比如: + +int system_is_up(void) +{ + return system_state == SYSTEM_RUNNING; +} +EXPORT_SYMBOL(system_is_up); + +在函数原型中,包含函数名和它们的数据类型。虽然C语言里没有这样的要求,在Linux里这 +是提倡的做法,因为这样可以很简单的给读者提供更多的有价值的信息。 + + + 第七章:集中的函数退出途径 + +虽然被某些人声称已经过时,但是goto语句的等价物还是经常被编译器所使用,具体形式是 +无条件跳转指令。 + +当一个函数从多个位置退出并且需要做一些通用的清理工作的时候,goto的好处就显现出来 +了。 + +理由是: + +- 无条件语句容易理解和跟踪 +- 嵌套程度减小 +- 可以避免由于修改时忘记更新某个单独的退出点而导致的错误 +- 减轻了编译器的工作,无需删除冗余代码;) 
+ +int fun(int a) +{ + int result = 0; + char *buffer = kmalloc(SIZE); + + if (buffer == NULL) + return -ENOMEM; + + if (condition1) { + while (loop1) { + ... + } + result = 1; + goto out; + } + ... +out: + kfree(buffer); + return result; +} + + 第八章:注释 + +注释是好的,不过有过度注释的危险。永远不要在注释里解释你的代码是如何运作的:更好 +的做法是让别人一看你的代码就可以明白,解释写的很差的代码是浪费时间。 + +一般的,你想要你的注释告诉别人你的代码做了什么,而不是怎么做的。也请你不要把注释 +放在一个函数体内部:如果函数复杂到你需要独立的注释其中的一部分,你很可能需要回到 +第六章看一看。你可以做一些小注释来注明或警告某些很聪明(或者槽糕)的做法,但不要 +加太多。你应该做的,是把注释放在函数的头部,告诉人们它做了什么,也可以加上它做这 +些事情的原因。 + +当注释内核API函数时,请使用kernel-doc格式。请看 +Documentation/kernel-doc-nano-HOWTO.txt和scripts/kernel-doc以获得详细信息。 + +Linux的注释风格是C89“/* ... */”风格。不要使用C99风格“// ...”注释。 + +长(多行)的首选注释风格是: + + /* + * This is the preferred style for multi-line + * comments in the Linux kernel source code. + * Please use it consistently. + * + * Description: A column of asterisks on the left side, + * with beginning and ending almost-blank lines. + */ + +注释数据也是很重要的,不管是基本类型还是衍生类型。为了方便实现这一点,每一行应只 +声明一个数据(不要使用逗号来一次声明多个数据)。这样你就有空间来为每个数据写一段 +小注释来解释它们的用途了。 + + + 第九章:你已经把事情弄糟了 + +这没什么,我们都是这样。可能你的使用了很长时间Unix的朋友已经告诉你“GNU emacs”能 +自动帮你格式化C源代码,而且你也注意到了,确实是这样,不过它所使用的默认值和我们 +想要的相去甚远(实际上,甚至比随机打的还要差——无数个猴子在GNU emacs里打字永远不 +会创造出一个好程序)(译注:请参考Infinite Monkey Theorem) + +所以你要么放弃GNU emacs,要么改变它让它使用更合理的设定。要采用后一个方案,你可 +以把下面这段粘贴到你的.emacs文件里。 + +(defun linux-c-mode () + "C mode with adjusted defaults for use with the Linux kernel." + (interactive) + (c-mode) + (c-set-style "K&R") + (setq tab-width 8) + (setq indent-tabs-mode t) + (setq c-basic-offset 8)) + +这样就定义了M-x linux-c-mode命令。当你hack一个模块的时候,如果你把字符串 +-*- linux-c -*-放在头两行的某个位置,这个模式将会被自动调用。如果你希望在你修改 +/usr/src/linux里的文件时魔术般自动打开linux-c-mode的话,你也可能需要添加 + +(setq auto-mode-alist (cons '("/usr/src/linux.*/.*\\.[ch]$" . 
linux-c-mode) + auto-mode-alist)) + +到你的.emacs文件里。 + +不过就算你尝试让emacs正确的格式化代码失败了,也并不意味着你失去了一切:还可以用“ +indent”。 + +不过,GNU indent也有和GNU emacs一样有问题的设定,所以你需要给它一些命令选项。不 +过,这还不算太糟糕,因为就算是GNU indent的作者也认同K&R的权威性(GNU的人并不是坏 +人,他们只是在这个问题上被严重的误导了),所以你只要给indent指定选项“-kr -i8” +(代表“K&R,8个字符缩进”),或者使用“scripts/Lindent”,这样就可以以最时髦的方式 +缩进源代码。 + +“indent”有很多选项,特别是重新格式化注释的时候,你可能需要看一下它的手册页。不过 +记住:“indent”不能修正坏的编程习惯。 + + + 第十章:Kconfig配置文件 + +对于遍布源码树的所有Kconfig*配置文件来说,它们缩进方式与C代码相比有所不同。紧挨 +在“config”定义下面的行缩进一个制表符,帮助信息则再多缩进2个空格。比如: + +config AUDIT + bool "Auditing support" + depends on NET + help + Enable auditing infrastructure that can be used with another + kernel subsystem, such as SELinux (which requires this for + logging of avc messages output). Does not do system-call + auditing without CONFIG_AUDITSYSCALL. + +仍然被认为不够稳定的功能应该被定义为依赖于“EXPERIMENTAL”: + +config SLUB + depends on EXPERIMENTAL && !ARCH_USES_SLAB_PAGE_STRUCT + bool "SLUB (Unqueued Allocator)" + ... + +而那些危险的功能(比如某些文件系统的写支持)应该在它们的提示字符串里显著的声明这 +一点: + +config ADFS_FS_RW + bool "ADFS write support (DANGEROUS)" + depends on ADFS_FS + ... 
+ +要查看配置文件的完整文档,请看Documentation/kbuild/kconfig-language.txt。 + + + 第十一章:数据结构 + +如果一个数据结构,在创建和销毁它的单线执行环境之外可见,那么它必须要有一个引用计 +数器。内核里没有垃圾收集(并且内核之外的垃圾收集慢且效率低下),这意味着你绝对需 +要记录你对这种数据结构的使用情况。 + +引用计数意味着你能够避免上锁,并且允许多个用户并行访问这个数据结构——而不需要担心 +这个数据结构仅仅因为暂时不被使用就消失了,那些用户可能不过是沉睡了一阵或者做了一 +些其他事情而已。 + +注意上锁不能取代引用计数。上锁是为了保持数据结构的一致性,而引用计数是一个内存管 +理技巧。通常二者都需要,不要把两个搞混了。 + +很多数据结构实际上有2级引用计数,它们通常有不同“类”的用户。子类计数器统计子类用 +户的数量,每当子类计数器减至零时,全局计数器减一。 + +这种“多级引用计数”的例子可以在内存管理(“struct mm_struct”:mm_users和mm_count) +和文件系统(“struct super_block”:s_count和s_active)中找到。 + +记住:如果另一个执行线索可以找到你的数据结构,但是这个数据结构没有引用计数器,这 +里几乎肯定是一个bug。 + + + 第十二章:宏,枚举和RTL + +用于定义常量的宏的名字及枚举里的标签需要大写。 + +#define CONSTANT 0x12345 + +在定义几个相关的常量时,最好用枚举。 + +宏的名字请用大写字母,不过形如函数的宏的名字可以用小写字母。 + +一般的,如果能写成内联函数就不要写成像函数的宏。 + +含有多个语句的宏应该被包含在一个do-while代码块里: + +#define macrofun(a, b, c) \ + do { \ + if (a == 5) \ + do_this(b, c); \ + } while (0) + +使用宏的时候应避免的事情: + +1) 影响控制流程的宏: + +#define FOO(x) \ + do { \ + if (blah(x) < 0) \ + return -EBUGGERED; \ + } while(0) + +非常不好。它看起来像一个函数,不过却能导致“调用”它的函数退出;不要打乱读者大脑里 +的语法分析器。 + +2) 依赖于一个固定名字的本地变量的宏: + +#define FOO(val) bar(index, val) + +可能看起来像是个不错的东西,不过它非常容易把读代码的人搞糊涂,而且容易导致看起来 +不相关的改动带来错误。 + +3) 作为左值的带参数的宏: FOO(x) = y;如果有人把FOO变成一个内联函数的话,这种用 +法就会出错了。 + +4) 忘记了优先级:使用表达式定义常量的宏必须将表达式置于一对小括号之内。带参数的 +宏也要注意此类问题。 + +#define CONSTANT 0x4000 +#define CONSTEXP (CONSTANT | 3) + +cpp手册对宏的讲解很详细。Gcc internals手册也详细讲解了RTL(译注:register +transfer language),内核里的汇编语言经常用到它。 + + + 第十三章:打印内核消息 + +内核开发者应该是受过良好教育的。请一定注意内核信息的拼写,以给人以好的印象。不要 +用不规范的单词比如“dont”,而要用“do not”或者“don't”。保证这些信息简单、明了、无 +歧义。 + +内核信息不必以句号(译注:英文句号,即点)结束。 + +在小括号里打印数字(%d)没有任何价值,应该避免这样做。 + +<linux/device.h>里有一些驱动模型诊断宏,你应该使用它们,以确保信息对应于正确的 +设备和驱动,并且被标记了正确的消息级别。这些宏有:dev_err(), dev_warn(), +dev_info()等等。对于那些不和某个特定设备相关连的信息,<linux/kernel.h>定义了 +pr_debug()和pr_info()。 + +写出好的调试信息可以是一个很大的挑战;当你写出来之后,这些信息在远程除错的时候 +就会成为极大的帮助。当DEBUG符号没有被定义的时候,这些信息不应该被编译进内核里 +(也就是说,默认地,它们不应该被包含在内)。如果你使用dev_dbg()或者pr_debug(), +就能自动达到这个效果。很多子系统拥有Kconfig选项来启用-DDEBUG。还有一个相关的惯例 
+是使用VERBOSE_DEBUG来添加dev_vdbg()消息到那些已经由DEBUG启用的消息之上。 + + + 第十四章:分配内存 + +内核提供了下面的一般用途的内存分配函数:kmalloc(),kzalloc(),kcalloc()和 +vmalloc()。请参考API文档以获取有关它们的详细信息。 + +传递结构体大小的首选形式是这样的: + + p = kmalloc(sizeof(*p), ...); + +另外一种传递方式中,sizeof的操作数是结构体的名字,这样会降低可读性,并且可能会引 +入bug。有可能指针变量类型被改变时,而对应的传递给内存分配函数的sizeof的结果不变。 + +强制转换一个void指针返回值是多余的。C语言本身保证了从void指针到其他任何指针类型 +的转换是没有问题的。 + + + 第十五章:内联弊病 + +有一个常见的误解是内联函数是gcc提供的可以让代码运行更快的一个选项。虽然使用内联 +函数有时候是恰当的(比如作为一种替代宏的方式,请看第十二章),不过很多情况下不是 +这样。inline关键字的过度使用会使内核变大,从而使整个系统运行速度变慢。因为大内核 +会占用更多的指令高速缓存(译注:一级缓存通常是指令缓存和数据缓存分开的)而且会导 +致pagecache的可用内存减少。想象一下,一次pagecache未命中就会导致一次磁盘寻址,将 +耗时5毫秒。5毫秒的时间内CPU能执行很多很多指令。 + +一个基本的原则是如果一个函数有3行以上,就不要把它变成内联函数。这个原则的一个例 +外是,如果你知道某个参数是一个编译时常量,而且因为这个常量你确定编译器在编译时能 +优化掉你的函数的大部分代码,那仍然可以给它加上inline关键字。kmalloc()内联函数就 +是一个很好的例子。 + +人们经常主张给static的而且只用了一次的函数加上inline,如此不会有任何损失,因为没 +有什么好权衡的。虽然从技术上说这是正确的,但是实际上这种情况下即使不加inline gcc +也可以自动使其内联。而且其他用户可能会要求移除inline,由此而来的争论会抵消inline +自身的潜在价值,得不偿失。 + + + 第十六章:函数返回值及命名 + +函数可以返回很多种不同类型的值,最常见的一种是表明函数执行成功或者失败的值。这样 +的一个值可以表示为一个错误代码整数(-Exxx=失败,0=成功)或者一个“成功”布尔值( +0=失败,非0=成功)。 + +混合使用这两种表达方式是难于发现的bug的来源。如果C语言本身严格区分整形和布尔型变 +量,那么编译器就能够帮我们发现这些错误……不过C语言不区分。为了避免产生这种bug,请 +遵循下面的惯例: + + 如果函数的名字是一个动作或者强制性的命令,那么这个函数应该返回错误代码整 + 数。如果是一个判断,那么函数应该返回一个“成功”布尔值。 + +比如,“add work”是一个命令,所以add_work()函数在成功时返回0,在失败时返回-EBUSY。 +类似的,因为“PCI device present”是一个判断,所以pci_dev_present()函数在成功找到 +一个匹配的设备时应该返回1,如果找不到时应该返回0。 + +所有导出(译注:EXPORT)的函数都必须遵守这个惯例,所有的公共函数也都应该如此。私 +有(static)函数不需要如此,但是我们也推荐这样做。 + +返回值是实际计算结果而不是计算是否成功的标志的函数不受此惯例的限制。一般的,他们 +通过返回一些正常值范围之外的结果来表示出错。典型的例子是返回指针的函数,他们使用 +NULL或者ERR_PTR机制来报告错误。 + + + 第十七章:不要重新发明内核宏 + +头文件include/linux/kernel.h包含了一些宏,你应该使用它们,而不要自己写一些它们的 +变种。比如,如果你需要计算一个数组的长度,使用这个宏 + + #define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0])) + +类似的,如果你要计算某结构体成员的大小,使用 + + #define FIELD_SIZEOF(t, f) (sizeof(((t*)0)->f)) + +还有可以做严格的类型检查的min()和max()宏,如果你需要可以使用它们。你可以自己看看 +那个头文件里还定义了什么你可以拿来用的东西,如果有定义的话,你就不应在你的代码里 +自己重新定义。 + + + 第十八章:编辑器模式行和其他需要罗嗦的事情 + +有一些编辑器可以解释嵌入在源文件里的由一些特殊标记标明的配置信息。比如,emacs +能够解释被标记成这样的行: + +-*- 
mode: c -*- + +或者这样的: + +/* +Local Variables: +compile-command: "gcc -DMAGIC_DEBUG_FLAG foo.c" +End: +*/ + +Vim能够解释这样的标记: + +/* vim:set sw=8 noet */ + +不要在源代码中包含任何这样的内容。每个人都有他自己的编辑器配置,你的源文件不应 +该覆盖别人的配置。这包括有关缩进和模式配置的标记。人们可以使用他们自己定制的模 +式,或者使用其他可以产生正确的缩进的巧妙方法。 + + + + 附录 I:参考 + +The C Programming Language, 第二版, 作者Brian W. Kernighan和Dennis +M. Ritchie. Prentice Hall, Inc., 1988. ISBN 0-13-110362-8 (软皮), +0-13-110370-9 (硬皮). URL: http://cm.bell-labs.com/cm/cs/cbook/ + +The Practice of Programming 作者Brian W. Kernighan和Rob Pike. Addison-Wesley, +Inc., 1999. ISBN 0-201-61586-X. URL: http://cm.bell-labs.com/cm/cs/tpop/ + +cpp,gcc,gcc internals和indent的GNU手册——和K&R及本文相符合的部分,全部可以在 +http://www.gnu.org/manual/找到 + +WG14是C语言的国际标准化工作组,URL: http://www.open-std.org/JTC1/SC22/WG14/ + +Kernel CodingStyle,作者greg@kroah.com发表于OLS 2002: +http://www.kroah.com/linux/talks/ols_2002_kernel_codingstyle_talk/html/ + +-- +最后更新于2007年7月13日。 diff --git a/Documentation/zh_CN/HOWTO b/Documentation/zh_CN/HOWTO index 48fc67bfbe3d..3d80e8af36ec 100644 --- a/Documentation/zh_CN/HOWTO +++ b/Documentation/zh_CN/HOWTO @@ -1,10 +1,10 @@ Chinese translated version of Documentation/HOWTO If you have any comment or update to the content, please contact the -original document maintainer directly. However, if you have problem +original document maintainer directly. However, if you have a problem communicating in English you can also ask the Chinese maintainer for -help. Contact the Chinese maintainer, if this translation is outdated -or there is problem with translation. +help. Contact the Chinese maintainer if this translation is outdated +or if there is a problem with the translation. 
Maintainer: Greg Kroah-Hartman <greg@kroah.com> Chinese maintainer: Li Yang <leoli@freescale.com> @@ -85,7 +85,7 @@ Linux内核源代码都是在GPL(通用公共许可证)的保护下发布的 Linux内核代码中包含有大量的文档。这些文档对于学习如何与内核社区互动有着 不可估量的价值。当一个新的功能被加入内核,最好把解释如何使用这个功能的文 档也放进内核。当内核的改动导致面向用户空间的接口发生变化时,最好将相关信 -息或手册页(manpages)的补丁发到mtk-manpages@gmx.net,以向手册页(manpages) +息或手册页(manpages)的补丁发到mtk.manpages@gmail.com,以向手册页(manpages) 的维护者解释这些变化。 以下是内核代码中需要阅读的文档: @@ -218,6 +218,8 @@ kernel.org网站的pub/linux/kernel/v2.6/目录下找到它。它的开发遵循 时,一个新的-rc版本就会被发布。计划是每周都发布新的-rc版本。 - 这个过程一直持续下去直到内核被认为达到足够稳定的状态,持续时间大概是 6个星期。 + - 以下地址跟踪了在每个-rc发布中发现的退步列表: + http://kernelnewbies.org/known_regressions 关于内核发布,值得一提的是Andrew Morton在linux-kernel邮件列表中如是说: “没有人知道新内核何时会被发布,因为发布是根据已知bug的情况来决定 diff --git a/Documentation/zh_CN/SubmittingDrivers b/Documentation/zh_CN/SubmittingDrivers new file mode 100644 index 000000000000..5f4815c63ec7 --- /dev/null +++ b/Documentation/zh_CN/SubmittingDrivers @@ -0,0 +1,168 @@ +Chinese translated version of Documentation/SubmittingDrivers + +If you have any comment or update to the content, please contact the +original document maintainer directly. However, if you have a problem +communicating in English you can also ask the Chinese maintainer for +help. Contact the Chinese maintainer if this translation is outdated +or if there is a problem with the translation. 
+ +Chinese maintainer: Li Yang <leo@zh-kernel.org> +--------------------------------------------------------------------- +Documentation/SubmittingDrivers 的中文翻译 + +如果想评论或更新本文的内容,请直接联系原文档的维护者。如果你使用英文 +交流有困难的话,也可以向中文版维护者求助。如果本翻译更新不及时或者翻 +译存在问题,请联系中文版维护者。 + +中文版维护者: 李阳 Li Yang <leo@zh-kernel.org> +中文版翻译者: 李阳 Li Yang <leo@zh-kernel.org> +中文版校译者: 陈琦 Maggie Chen <chenqi@beyondsoft.com> + 王聪 Wang Cong <xiyou.wangcong@gmail.com> + 张巍 Zhang Wei <Wei.Zhang@freescale.com> + +以下为正文 +--------------------------------------------------------------------- + +如何向 Linux 内核提交驱动程序 +----------------------------- + +这篇文档将会解释如何向不同的内核源码树提交设备驱动程序。请注意,如果你感 +兴趣的是显卡驱动程序,你也许应该访问 XFree86 项目(http://www.xfree86.org/) +和/或 X.org 项目 (http://x.org)。 + +另请参阅 Documentation/SubmittingPatches 文档。 + + +分配设备号 +---------- + +块设备和字符设备的主设备号与从设备号是由 Linux 命名编号分配权威 LANANA( +现在是 Torben Mathiasen)负责分配。申请的网址是 http://www.lanana.org/。 +即使不准备提交到主流内核的设备驱动也需要在这里分配设备号。有关详细信息, +请参阅 Documentation/devices.txt。 + +如果你使用的不是已经分配的设备号,那么当你提交设备驱动的时候,它将会被强 +制分配一个新的设备号,即便这个设备号和你之前发给客户的截然不同。 + +设备驱动的提交对象 +------------------ + +Linux 2.0: + 此内核源码树不接受新的驱动程序。 + +Linux 2.2: + 此内核源码树不接受新的驱动程序。 + +Linux 2.4: + 如果所属的代码领域在内核的 MAINTAINERS 文件中列有一个总维护者, + 那么请将驱动程序提交给他。如果此维护者没有回应或者你找不到恰当的 + 维护者,那么请联系 Willy Tarreau <w@1wt.eu>。 + +Linux 2.6: + 除了遵循和 2.4 版内核同样的规则外,你还需要在 linux-kernel 邮件 + 列表上跟踪最新的 API 变化。向 Linux 2.6 内核提交驱动的顶级联系人 + 是 Andrew Morton <akpm@osdl.org>。 + +决定设备驱动能否被接受的条件 +---------------------------- + +许可: 代码必须使用 GNU 通用公开许可证 (GPL) 提交给 Linux,但是 + 我们并不要求 GPL 是唯一的许可。你或许会希望同时使用多种 + 许可证发布,如果希望驱动程序可以被其他开源社区(比如BSD) + 使用。请参考 include/linux/module.h 文件中所列出的可被 + 接受共存的许可。 + +版权: 版权所有者必须同意使用 GPL 许可。最好提交者和版权所有者 + 是相同个人或实体。否则,必需列出授权使用 GPL 的版权所有 + 人或实体,以备验证之需。 + +接口: 如果你的驱动程序使用现成的接口并且和其他同类的驱动程序行 + 为相似,而不是去发明无谓的新接口,那么它将会更容易被接受。 + 如果你需要一个 Linux 和 NT 的通用驱动接口,那么请在用 + 户空间实现它。 + +代码: 请使用 Documentation/CodingStyle 中所描述的 Linux 代码风 + 格。如果你的某些代码段(例如那些与 Windows 驱动程序包共 + 享的代码段)需要使用其他格式,而你却只希望维护一份代码, + 那么请将它们很好地区分出来,并且注明原因。 + +可移植性: 请注意,指针并不永远是 32 位的,不是所有的计算机都使用小 + 
尾模式 (little endian) 存储数据,不是所有的人都拥有浮点 + 单元,不要随便在你的驱动程序里嵌入 x86 汇编指令。只能在 + x86 上运行的驱动程序一般是不受欢迎的。虽然你可能只有 x86 + 硬件,很难测试驱动程序在其他平台上是否可用,但是确保代码 + 可以被轻松地移植却是很简单的。 + +清晰度: 做到所有人都能修补这个驱动程序将会很有好处,因为这样你将 + 会直接收到修复的补丁而不是 bug 报告。如果你提交一个试图 + 隐藏硬件工作机理的驱动程序,那么它将会被扔进废纸篓。 + +电源管理: 因为 Linux 正在被很多移动设备和桌面系统使用,所以你的驱 + 动程序也很有可能被使用在这些设备上。它应该支持最基本的电 + 源管理,即在需要的情况下实现系统级休眠和唤醒要用到的 + .suspend 和 .resume 函数。你应该检查你的驱动程序是否能正 + 确地处理休眠与唤醒,如果实在无法确认,请至少把 .suspend + 函数定义成返回 -ENOSYS(功能未实现)错误。你还应该尝试确 + 保你的驱动在什么都不干的情况下将耗电降到最低。要获得驱动 + 程序测试的指导,请参阅 + Documentation/power/drivers-testing.txt。有关驱动程序电 + 源管理问题相对全面的概述,请参阅 + Documentation/power/devices.txt。 + +管理: 如果一个驱动程序的作者还在进行有效的维护,那么通常除了那 + 些明显正确且不需要任何检查的补丁以外,其他所有的补丁都会 + 被转发给作者。如果你希望成为驱动程序的联系人和更新者,最 + 好在代码注释中写明并且在 MAINTAINERS 文件中加入这个驱动 + 程序的条目。 + +不影响设备驱动能否被接受的条件 +------------------------------ + +供应商: 由硬件供应商来维护驱动程序通常是一件好事。不过,如果源码 + 树里已经有其他人提供了可稳定工作的驱动程序,那么请不要期 + 望“我是供应商”会成为内核改用你的驱动程序的理由。理想的情 + 况是:供应商与现有驱动程序的作者合作,构建一个统一完美的 + 驱动程序。 + +作者: 驱动程序是由大的 Linux 公司研发还是由你个人编写,并不影 + 响其是否能被内核接受。没有人对内核源码树享有特权。只要你 + 充分了解内核社区,你就会发现这一点。 + + +资源列表 +-------- + +Linux 内核主源码树: + ftp.??.kernel.org:/pub/linux/kernel/... + ?? 
== 你的国家代码,例如 "cn"、"us"、"uk"、"fr" 等等 + +Linux 内核邮件列表: + linux-kernel@vger.kernel.org + [可通过向majordomo@vger.kernel.org发邮件来订阅] + +Linux 设备驱动程序,第三版(探讨 2.6.10 版内核): + http://lwn.net/Kernel/LDD3/ (免费版) + +LWN.net: + 每周内核开发活动摘要 - http://lwn.net/ + 2.6 版中 API 的变更: + http://lwn.net/Articles/2.6-kernel-api/ + 将旧版内核的驱动程序移植到 2.6 版: + http://lwn.net/Articles/driver-porting/ + +KernelTrap: + Linux 内核的最新动态以及开发者访谈 + http://kerneltrap.org/ + +内核新手(KernelNewbies): + 为新的内核开发者提供文档和帮助 + http://kernelnewbies.org/ + +Linux USB项目: + http://www.linux-usb.org/ + +写内核驱动的“不要”(Arjan van de Ven著): + http://www.fenrus.org/how-to-not-write-a-device-driver-paper.pdf + +内核清洁工 (Kernel Janitor): + http://janitor.kernelnewbies.org/ diff --git a/Documentation/zh_CN/SubmittingPatches b/Documentation/zh_CN/SubmittingPatches new file mode 100644 index 000000000000..985c92e20b73 --- /dev/null +++ b/Documentation/zh_CN/SubmittingPatches @@ -0,0 +1,416 @@ +Chinese translated version of Documentation/SubmittingPatches + +If you have any comment or update to the content, please contact the +original document maintainer directly. However, if you have a problem +communicating in English you can also ask the Chinese maintainer for +help. Contact the Chinese maintainer if this translation is outdated +or if there is a problem with the translation. 
+ +Chinese maintainer: TripleX Chung <triplex@zh-kernel.org> +--------------------------------------------------------------------- +Documentation/SubmittingPatches 的中文翻译 + +如果想评论或更新本文的内容,请直接联系原文档的维护者。如果你使用英文 +交流有困难的话,也可以向中文版维护者求助。如果本翻译更新不及时或者翻 +译存在问题,请联系中文版维护者。 + +中文版维护者: 钟宇 TripleX Chung <triplex@zh-kernel.org> +中文版翻译者: 钟宇 TripleX Chung <triplex@zh-kernel.org> +中文版校译者: 李阳 Li Yang <leo@zh-kernel.org> + 王聪 Wang Cong <xiyou.wangcong@gmail.com> + +以下为正文 +--------------------------------------------------------------------- + + 如何让你的改动进入内核 + 或者 + 获得亲爱的 Linus Torvalds 的关注和处理 +---------------------------------- + +对于想要将改动提交到 Linux 内核的个人或者公司来说,如果不熟悉“规矩”, +提交的流程会让人畏惧。本文档收集了一系列建议,这些建议可以大大的提高你 +的改动被接受的机会。 +阅读 Documentation/SubmitChecklist 来获得在提交代码前需要检查的项目的列 +表。如果你在提交一个驱动程序,那么同时阅读一下 +Documentation/SubmittingDrivers 。 + + +-------------------------- +第一节 - 创建并发送你的改动 +-------------------------- + +1) "diff -up" +----------- + +使用 "diff -up" 或者 "diff -uprN" 来创建补丁。 + +所有内核的改动,都是以补丁的形式呈现的,补丁由 diff(1) 生成。创建补丁的 +时候,要确认它是以 "unified diff" 格式创建的,这种格式由 diff(1) 的 '-u' +参数生成。而且,请使用 '-p' 参数,那样会显示每个改动所在的C函数,使得 +产生的补丁容易读得多。补丁应该基于内核源代码树的根目录,而不是里边的任 +何子目录。 +为一个单独的文件创建补丁,一般来说这样做就够了: + + SRCTREE= linux-2.6 + MYFILE= drivers/net/mydriver.c + + cd $SRCTREE + cp $MYFILE $MYFILE.orig + vi $MYFILE # make your change + cd .. 
+ diff -up $SRCTREE/$MYFILE{.orig,} > /tmp/patch + +为多个文件创建补丁,你可以解开一个没有修改过的内核源代码树,然后和你自 +己的代码树之间做 diff 。例如: + + MYSRC= /devel/linux-2.6 + + tar xvfz linux-2.6.12.tar.gz + mv linux-2.6.12 linux-2.6.12-vanilla + diff -uprN -X linux-2.6.12-vanilla/Documentation/dontdiff \ + linux-2.6.12-vanilla $MYSRC > /tmp/patch + +"dontdiff" 是内核在编译的时候产生的文件的列表,列表中的文件在 diff(1) +产生的补丁里会被跳过。"dontdiff" 文件被包含在2.6.12和之后版本的内核源代 +码树中。对于更早的内核版本,你可以从 +<http://www.xenotime.net/linux/doc/dontdiff> 获取它。 +确定你的补丁里没有包含任何不属于这次补丁提交的额外文件。记得在用diff(1) +生成补丁之后,审阅一次补丁,以确保准确。 +如果你的改动很散乱,你应该研究一下如何将补丁分割成独立的部分,将改动分 +割成一系列合乎逻辑的步骤。这样更容易让其他内核开发者审核,如果你想你的 +补丁被接受,这是很重要的。下面这些脚本能够帮助你做这件事情: +Quilt: +http://savannah.nongnu.org/projects/quilt + +Andrew Morton 的补丁脚本: +http://www.zip.com.au/~akpm/linux/patches/ +作为这些脚本的替代,quilt 是值得推荐的补丁管理工具(看上面的链接)。 + +2)描述你的改动。 +描述你的改动包含的技术细节。 + +要多具体就写多具体。最糟糕的描述可能是像下面这些语句:“更新了某驱动程 +序”,“修正了某驱动程序的bug”,或者“这个补丁包含了某子系统的修改,请 +使用。” + +如果你的描述开始变长,这表示你也许需要拆分你的补丁了,请看第3小节, +继续。 + +3)拆分你的改动 + +将改动拆分,逻辑类似的放到同一个补丁文件里。 + +例如,如果你的改动里同时有bug修正和性能优化,那么把这些改动拆分到两个或 +者更多的补丁文件中。如果你的改动包含对API的修改,并且修改了驱动程序来适 +应这些新的API,那么把这些修改分成两个补丁。 + +另一方面,如果你将一个单独的改动做成多个补丁文件,那么将它们合并成一个 +单独的补丁文件。这样一个逻辑上单独的改动只被包含在一个补丁文件里。 + +如果有一个补丁依赖另外一个补丁来完成它的改动,那没问题。简单的在你的补 +丁描述里指出“这个补丁依赖某补丁”就好了。 + +如果你不能将补丁浓缩成更少的文件,那么每次大约发送出15个,然后等待审查 +和整合。 + +4)选择 e-mail 的收件人 + +看一遍 MAINTAINERS 文件和源代码,看看你的改动所在的内核子系统有没有指 +定的维护者。如果有,给他们发e-mail。 + +如果没有找到维护者,或者维护者没有反馈,将你的补丁发送到内核开发者主邮 +件列表 linux-kernel@vger.kernel.org。大部分的内核开发者都跟踪这个邮件列 +表,可以评价你的改动。 + +每次不要发送超过15个补丁到 vger 邮件列表!!! 
+ +Linus Torvalds 是决定改动能否进入 Linux 内核的最终裁决者。他的 e-mail +地址是 <torvalds@linux-foundation.org> 。他收到的 e-mail 很多,所以一般 +的说,最好别给他发 e-mail。 + +那些修正bug,“显而易见”的修改或者是类似的只需要很少讨论的补丁可以直接 +发送或者CC给Linus。那些需要讨论或者没有很清楚的好处的补丁,一般先发送到 +linux-kernel邮件列表。只有当补丁被讨论得差不多了,才提交给Linus。 + +5)选择CC( e-mail 抄送)列表 + +除非你有理由不这样做,否则CC linux-kernel@vger.kernel.org。 + +除了 Linus 之外,其他内核开发者也需要注意到你的改动,这样他们才能评论你 +的改动并提供代码审查和建议。linux-kernel 是 Linux 内核开发者主邮件列表 +。其它的邮件列表为特定的子系统提供服务,比如 USB,framebuffer 设备,虚 +拟文件系统,SCSI 子系统,等等。查看 MAINTAINERS 文件来获得和你的改动有 +关的邮件列表。 + +Majordomo lists of VGER.KERNEL.ORG at: + <http://vger.kernel.org/vger-lists.html> + +如果改动影响了用户空间和内核之间的接口,请给 MAN-PAGES 的维护者(列在 +MAINTAINERS 文件里的)发送一个手册页(man-pages)补丁,或者至少通知一下改 +变,让一些信息有途径进入手册页。 + +即使在第四步的时候,维护者没有作出回应,也要确认在修改他们的代码的时候 +,一直将维护者拷贝到CC列表中。 + +对于小的补丁,你也许会CC到 Adrian Bunk 管理的搜集琐碎补丁的邮件列表 +(Trivial Patch Monkey)trivial@kernel.org,那里专门收集琐碎的补丁。下面这样 +的补丁会被看作“琐碎的”补丁: + 文档的拼写修正。 + 修正会影响到 grep(1) 的拼写。 + 警告信息修正(频繁的打印无用的警告是不好的。) + 编译错误修正(代码逻辑的确是对的,只是编译有问题。) + 运行时修正(只要真的修正了错误。) + 移除使用了被废弃的函数/宏的代码(例如 check_region。) + 联系方式和文档修正。 + 用可移植的代码替换不可移植的代码(即使在体系结构相关的代码中,既然有 + 人拷贝,只要它是琐碎的) + 任何文件的作者/维护者对该文件的改动(例如 patch monkey 在重传模式下) + +URL: <http://www.kernel.org/pub/linux/kernel/people/bunk/trivial/> + +(译注,关于“琐碎补丁”的一些说明:因为原文的这一部分写得比较简单,所以不得不 +违例写一下译注。"trivial"这个英文单词的本意是“琐碎的,不重要的。”但是在这里 +稍微有一些变化,例如对一些明显的NULL指针的修正,属于运行时修正,会被归类 +到琐碎补丁里。虽然NULL指针的修正很重要,但是这样的修正往往很小而且很容易得到 +检验,所以也被归入琐碎补丁。琐碎补丁更精确的归类应该是 +“simple, localized & easy to verify”,也就是说简单的,局部的和易于检验的。 +trivial@kernel.org邮件列表的目的是针对这样的补丁,为提交者提供一个中心,来 +降低提交的门槛。) + +6)没有 MIME 编码,没有链接,没有压缩,没有附件,只有纯文本。 + +Linus 和其他的内核开发者需要阅读和评论你提交的改动。对于内核开发者来说 +,可以“引用”你的改动很重要,使用一般的 e-mail 工具,他们就可以在你的 +代码的任何位置添加评论。 + +因为这个原因,所有的提交的补丁都是 e-mail 中“内嵌”的。 +警告:如果你使用剪切-粘贴你的补丁,小心你的编辑器的自动换行功能破坏你的 +补丁。 + +不要将补丁作为 MIME 编码的附件,不管是否压缩。很多流行的 e-mail 软件不 +是任何时候都将 MIME 编码的附件当作纯文本发送的,这会使得别人无法在你的 +代码中加评论。另外,MIME 编码的附件会让 Linus 多花一点时间来处理,这就 +降低了你的改动被接受的可能性。 + +警告:一些邮件软件,比如 Mozilla 会将你的信息以如下格式发送: +---- 邮件头 ---- +Content-Type: text/plain; charset=us-ascii; 
format=flowed +---- 邮件头 ---- +问题在于 “format=flowed” 会让接收端的某些邮件软件将邮件中的制表符替换 +成空格以及做一些类似的替换。这样,你发送的时候看起来没问题的补丁就被破 +坏了。 + +要修正这个问题,只需要在你的 mozilla 的 defaults/pref/mailnews.js 文件 +里设置 +pref("mailnews.send_plaintext_flowed", false); // RFC 2646 +和 +pref("mailnews.display.disable_format_flowed_support", true); +就可以了。 + +7) e-mail 的大小 + +给 Linus 发送补丁的时候,永远按照第6小节说的做。 + +大的改动对邮件列表不合适,对某些维护者也不合适。如果你的补丁,在不压缩 +的情况下,超过了40kB,那么你最好将补丁放在一个能通过 internet 访问的服 +务器上,然后用指向你的补丁的 URL 替代。 + +8) 指出你的内核版本 + +在标题和在补丁的描述中,指出补丁对应的内核的版本,是很重要的。 + +如果补丁不能干净的在最新版本的内核上打上,Linus 是不会接受它的。 + +9) 不要气馁,继续提交。 + +当你提交了改动以后,耐心地等待。如果 Linus 喜欢你的改动并且同意它,那么 +它将在下一个内核发布版本中出现。 + +然而,如果你的改动没有出现在下一个版本的内核中,可能有若干原因。减少那 +些原因,修正错误,重新提交更新后的改动,是你自己的工作。 + +Linus不给出任何评论就“丢弃”你的补丁是常见的事情。在系统中这样的事情很 +平常。如果他没有接受你的补丁,也许是由于以下原因: +* 你的补丁不能在最新版本的内核上干净的打上。 +* 你的补丁在 linux-kernel 邮件列表中没有得到充分的讨论。 +* 风格问题(参照第2小节) +* 邮件格式问题(重读本节) +* 你的改动有技术问题。 +* 他收到了成吨的 e-mail,而你的在混乱中丢失了。 +* 你让人为难。 + +有疑问的时候,在 linux-kernel 邮件列表上请求评论。 + +10) 在标题上加上 PATCH 的字样 + +Linus 和 linux-kernel 邮件列表的 e-mail 流量都很高,一个通常的约定是标 +题行以 [PATCH] 开头。这样可以让 Linus 和其他内核开发人员从 e-mail +的讨论中很轻易的将补丁分辨出来。 + +11)为你的工作签名 + +为了加强对谁做了何事的追踪,尤其是对那些透过好几层的维护者的补丁,我们 +建议在发送出去的补丁上加一个 “sign-off” 的过程。 + +"sign-off" 是在补丁的注释的最后的简单的一行文字,认证你编写了它或者其他 +人有权力将它作为开放源代码的补丁传递。规则很简单:如果你能认证如下信息 +: + 开发者来源证书 1.1 + 对于本项目的贡献,我认证如下信息: + (a)这些贡献是完全或者部分的由我创建,我有权利以文件中指出 + 的开放源代码许可证提交它;或者 + (b)这些贡献基于以前的工作,据我所知,这些以前的工作受恰当的开放 + 源代码许可证保护,而且,根据许可证,我有权提交修改后的贡献, + 无论是完全还是部分由我创造,这些贡献都使用同一个开放源代码许可证 + (除非我被允许用其它的许可证),正如文件中指出的;或者 + (c)这些贡献由认证(a),(b)或者(c)的人直接提供给我,而 + 且我没有修改它。 + (d)我理解并同意这个项目和贡献是公开的,贡献的记录(包括我 + 一起提交的个人记录,包括 sign-off )被永久维护并且可以和这个项目 + 或者开放源代码的许可证同步地再发行。 + 那么加入这样一行: + Signed-off-by: Random J Developer <random@developer.example.org> + +使用你的真名(抱歉,不能使用假名或者匿名。) + +有人在最后加上标签。现在这些东西会被忽略,但是你可以这样做,来标记公司 +内部的过程,或者只是指出关于 sign-off 的一些特殊细节。 + +12)标准补丁格式 + +标准的补丁,标题行是: + Subject: [PATCH 001/123] 子系统:一句话概述 + +标准补丁的信体存在如下部分: + + - 一个 "from" 行指出补丁作者。 + + - 一个空行 + + - 说明的主体,这些说明文字会被拷贝到描述该补丁的永久改动记录里。 + + - 一个由"---"构成的标记行 + + - 
不合适放到改动记录里的额外的注解。 + + - 补丁本身(diff 输出) + +标题行的格式,使得对标题行按字母序排序非常的容易 - 很多 e-mail 客户端都 +可以支持 - 因为序列号是用零填充的,所以按数字排序和按字母排序是一样的。 + +e-mail 标题中的“子系统”标识哪个内核子系统将被打补丁。 + +e-mail 标题中的“一句话概述”扼要的描述 e-mail 中的补丁。“一句话概述” +不应该是一个文件名。对于一个补丁系列(“补丁系列”指一系列的多个相关补 +丁),不要对每个补丁都使用同样的“一句话概述”。 + +记住 e-mail 的“一句话概述”会成为该补丁的全局唯一标识。它会蔓延到 git +的改动记录里。然后“一句话概述”会被用在开发者的讨论里,用来指代这个补 +丁。用户将希望通过 google 来搜索"一句话概述"来找到那些讨论这个补丁的文 +章。 + +一些标题的例子: + + Subject: [patch 2/5] ext2: improve scalability of bitmap searching + Subject: [PATCHv2 001/207] x86: fix eflags tracking + +"from" 行是信体里的最上面一行,具有如下格式: + From: Original Author <author@example.com> + +"from" 行指明在永久改动日志里,谁会被确认为作者。如果没有 "from" 行,那 +么邮件头里的 "From: " 行会被用来决定改动日志中的作者。 + +说明的主体将会被提交到永久的源代码改动日志里,因此对那些早已经不记得和 +这个补丁相关的讨论细节的有能力的读者来说,是有意义的。 + +"---" 标记行对于补丁处理工具要找到哪里是改动日志信息的结束,是不可缺少 +的。 + +对于 "---" 标记之后的额外注解,一个好的用途就是用来写 diffstat,用来显 +示修改了什么文件和每个文件都增加和删除了多少行。diffstat 对于比较大的补 +丁特别有用。其余那些只是和时刻或者开发者相关的注解,不合适放到永久的改 +动日志里的,也应该放这里。 +使用 diffstat的选项 "-p 1 -w 70" 这样文件名就会从内核源代码树的目录开始 +,不会占用太宽的空间(很容易适合80列的宽度,也许会有一些缩进。) + +在后面的参考资料中能看到适当的补丁格式的更多细节。 + +------------------------------- +第二节 提示,建议和诀窍 +------------------------------- + +本节包含很多和提交到内核的代码有关的通常的"规则"。事情永远有例外...但是 +你必须真的有好的理由这样做。你可以把本节叫做Linus的计算机科学入门课。 + +1) 读 Documentation/CodingStyle + +无需多言。如果你的代码和这个偏离太多,那么它有可能会被拒绝,没有更多的 +审查,没有更多的评价。 + +2) #ifdef 是丑陋的 +混杂了 ifdef 的代码难以阅读和维护。别这样做。作为替代,将你的 ifdef 放 +在头文件里,有条件地定义 "static inline" 函数,或者宏,在代码里用这些东 +西。让编译器把那些"空操作"优化掉。 + +一个简单的例子,不好的代码: + + dev = alloc_etherdev (sizeof(struct funky_private)); + if (!dev) + return -ENODEV; + #ifdef CONFIG_NET_FUNKINESS + init_funky_net(dev); + #endif + +清理后的例子: + +(头文件里) + #ifndef CONFIG_NET_FUNKINESS + static inline void init_funky_net (struct net_device *d) {} + #endif + +(代码文件里) + dev = alloc_etherdev (sizeof(struct funky_private)); + if (!dev) + return -ENODEV; + init_funky_net(dev); + +3) 'static inline' 比宏好 + +Static inline 函数相比宏来说,是好得多的选择。Static inline 函数提供了 +类型安全,没有长度限制,没有格式限制,在 gcc 下开销和宏一样小。 + +宏只应该用在 static inline 函数明显不是最优的时候[在 fast paths 
里有很少的独立的 +案例],或者不可能用 static inline 函数的时候[例如字符串化]。 +应该用 'static inline' 而不是 'static __inline__', 'extern inline' 和 +'extern __inline__' 。 + +4) 不要过度设计 + +不要试图预计模糊的未来事情,这些事情也许有用也许没有用:"让事情尽可能的 +简单,而不是更简单"。 + +---------------- +第三节 参考文献 +---------------- + +Andrew Morton, "The perfect patch" (tpp). + <http://www.zip.com.au/~akpm/linux/patches/stuff/tpp.txt> + +Jeff Garzik, "Linux kernel patch submission format". + <http://linux.yyz.us/patch-format.html> + +Greg Kroah-Hartman, "How to piss off a kernel subsystem maintainer". + <http://www.kroah.com/log/2005/03/31/> + <http://www.kroah.com/log/2005/07/08/> + <http://www.kroah.com/log/2005/10/19/> + <http://www.kroah.com/log/2006/01/11/> + +NO!!!! No more huge patch bombs to linux-kernel@vger.kernel.org people! + <http://marc.theaimsgroup.com/?l=linux-kernel&m=112112749912944&w=2> + +Kernel Documentation/CodingStyle: + <http://sosdg.org/~coywolf/lxr/source/Documentation/CodingStyle> + +Linus Torvalds's mail on the canonical patch format: + <http://lkml.org/lkml/2005/4/7/183> +-- diff --git a/Documentation/zh_CN/oops-tracing.txt b/Documentation/zh_CN/oops-tracing.txt new file mode 100644 index 000000000000..9312608ffb8d --- /dev/null +++ b/Documentation/zh_CN/oops-tracing.txt @@ -0,0 +1,212 @@ +Chinese translated version of Documentation/oops-tracing.txt + +If you have any comment or update to the content, please contact the +original document maintainer directly. However, if you have a problem +communicating in English you can also ask the Chinese maintainer for +help. Contact the Chinese maintainer if this translation is outdated +or if there is a problem with the translation. 
+ +Chinese maintainer: Dave Young <hidave.darkstar@gmail.com> +--------------------------------------------------------------------- +Documentation/oops-tracing.txt 的中文翻译 + +如果想评论或更新本文的内容,请直接联系原文档的维护者。如果你使用英文 +交流有困难的话,也可以向中文版维护者求助。如果本翻译更新不及时或者翻 +译存在问题,请联系中文版维护者。 + +中文版维护者: 杨瑞 Dave Young <hidave.darkstar@gmail.com> +中文版翻译者: 杨瑞 Dave Young <hidave.darkstar@gmail.com> +中文版校译者: 李阳 Li Yang <leo@zh-kernel.org> + 王聪 Wang Cong <xiyou.wangcong@gmail.com> + +以下为正文 +--------------------------------------------------------------------- + +注意: ksymoops 在2.6中是没有用的。 请以原有格式使用Oops(来自dmesg,等等)。 +忽略任何这样那样关于“解码Oops”或者“通过ksymoops运行”的文档。 如果你贴出运行过 +ksymoops的来自2.6的Oops,人们只会让你重贴一次。 + +快速总结 +------------- + +发现Oops并发送给看似相关的内核领域的维护者。别太担心对不上号。如果你不确定就发给 +和你所做的事情相关的代码的负责人。 如果可重现试着描述怎样重构。 那甚至比oops更有 +价值。 + +如果你对于发送给谁一无所知, 发给linux-kernel@vger.kernel.org。感谢你帮助Linux +尽可能地稳定。 + +Oops在哪里? +---------------------- + +通常Oops文本由klogd从内核缓冲区里读取并传给syslogd,由syslogd写到syslog文件中, +典型地是/var/log/messages(依赖于/etc/syslog.conf)。有时klogd崩溃了,这种情况下你 +能够运行dmesg > file来从内核缓冲区中读取数据并保存下来。 否则你可以 +cat /proc/kmsg > file, 然而你必须介入中止传输, kmsg是一个“永不结束的文件”。如 +果机器崩溃坏到你不能输入命令或者磁盘不可用那么你有三种选择:- + +(1) 手抄屏幕上的文本待机器重启后再输入计算机。 麻烦但如果没有针对崩溃的准备, +这是仅有的选择。 另外,你可以用数码相机把屏幕拍下来-不太好,但比没有强。 如果信 +息滚动到了终端的上面,你会发现以高分辩率启动(比如,vga=791)会让你读到更多的文 +本。(注意:这需要vesafb,所以对‘早期’的oops没有帮助) + +(2)用串口终端启动(请参看Documentation/serial-console.txt),运行一个null +modem到另一台机器并用你喜欢的通讯工具获取输出。Minicom工作地很好。 + +(3)使用Kdump(请参看Documentation/kdump/kdump.txt), +使用在Documentation/kdump/gdbmacros.txt中定义的dmesg gdb宏,从旧的内存中提取内核 +环形缓冲区。 + +完整信息 +---------------- + +注意:以下来自于Linus的邮件适用于2.4内核。 我因为历史原因保留了它,并且因为其中 +一些信息仍然适用。 特别注意的是,请忽略任何ksymoops的引用。 + +From: Linus Torvalds <torvalds@osdl.org> + +怎样跟踪Oops.. 
[原发到linux-kernel的一封邮件] + +主要的窍门是有五年和这些烦人的oops消息打交道的经验;-) + +实际上,你有办法使它更简单。我有两个不同的方法: + + gdb /usr/src/linux/vmlinux + gdb> disassemble <offending_function> + +那是发现问题的简单办法,至少如果bug报告做的好的情况下(象这个一样-运行ksymoops +得到oops发生的函数及函数内的偏移)。 + +哦,如果报告发生的内核以相同的编译器和相似的配置编译它会有帮助的。 + +另一件要做的事是反汇编bug报告的“Code”部分:ksymoops也会用正确的工具来做这件事, +但如果没有那些工具你可以写一个傻程序: + + char str[] = "\xXX\xXX\xXX..."; + main(){} + +并用gcc -g编译它然后执行“disassemble str”(XX部分是由Oops报告的值-你可以仅剪切 +粘贴并用“\x”替换空格-我就是这么做的,因为我懒得写程序自动做这一切)。 + +另外,你可以用scripts/decodecode这个shell脚本。它的使用方法是: +decodecode < oops.txt + +“Code”之后的十六进制字节可能(在某些架构上)有一些当前指令之前的指令字节以及 +当前和之后的指令字节 + +Code: f9 0f 8d f9 00 00 00 8d 42 0c e8 dd 26 11 c7 a1 60 ea 2b f9 8b 50 08 a1 +64 ea 2b f9 8d 34 82 8b 1e 85 db 74 6d 8b 15 60 ea 2b f9 <8b> 43 04 39 42 54 +7e 04 40 89 42 54 8b 43 04 3b 05 00 f6 52 c0 + +最后,如果你想知道代码来自哪里,你可以: + + cd /usr/src/linux + make fs/buffer.s # 或任何产生BUG的文件 + +然后你会比gdb反汇编更清楚的知道发生了什么。 + +现在,问题是把你所拥有的所有数据结合起来:C源码(关于它应该怎样的一般知识), +汇编代码及其反汇编得到的代码(另外还有从“oops”消息得到的寄存器状态-对了解毁坏的 +指针有用,而且当你有了汇编代码你也能拿其它的寄存器和任何它们对应的C表达式做匹配 +)。 + +实际上,你仅需看看哪里不匹配(这个例子是“Code”反汇编和编译器生成的代码不匹配)。 +然后你须要找出为什么不匹配。通常很简单-你看到代码使用了空指针然后你看代码想知道 +空指针是怎么出现的,还有检查它是否合法.. 
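Editor's sketch, not part of the quoted mail or of the kernel tree: the cut-and-paste step described above (replace the spaces in the "Code:" bytes with "\x") could be automated with a small shell function. The name code_to_cstring is hypothetical; where scripts/decodecode is available it remains the tool the text recommends.

```shell
# Hypothetical helper: turn the hex bytes that follow "Code:" in an Oops
# into the body of a "\x"-escaped C string literal.
code_to_cstring() {
    # $1: the bytes, e.g. "c7 00 05 <8b> 43 04"
    printf '%s\n' "$1" |
        tr -d '<>' |              # drop the <..> marker around the faulting byte
        tr -s ' \t' '\n\n' |      # put one byte on each line
        sed -e '/^$/d' -e 's/^/\\x/' |   # skip blanks, prefix each byte with \x
        tr -d '\n'                # join the bytes back into one string
    printf '\n'
}

# Possible use, following the mail's "silly program" recipe:
#   printf 'char str[] = "%s";\nmain(){}\n' \
#       "$(code_to_cstring "c7 00 05 <8b> 43 04")" > str.c
#   gcc -g str.c, then "disassemble str" inside gdb
```

The angle brackets are stripped because they only mark the byte at the faulting instruction pointer; the byte itself stays in the output.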
+ +现在,如果明白这是一项耗时的工作而且需要一丁点儿的专心,没错。这就是我为什么大多 +只是忽略那些没有符号表信息的崩溃报告的原因:简单的说太难查找了(我有一些 +程序用于在内核代码段中搜索特定的模式,而且有时我也已经能找出那些崩溃的地方,但是 +仅仅是找出正确的序列也确实需要相当扎实的内核知识) + +_有时_会发生这种情况,我仅看到崩溃中的反汇编代码序列, 然后我马上就明白问题出在 +哪里。这时我才意识到自己干这个工作已经太长时间了;-) + + Linus + + +--------------------------------------------------------------------------- +关于Oops跟踪的注解: + +为了帮助Linus和其它内核开发者,klogd纳入了大量的支持来处理保护错误。为了拥有对 +地址解析的完整支持至少应该使用1.3-pl3的sysklogd包。 + +当保护错误发生时,klogd守护进程自动把内核日志信息中的重要地址翻译成它们相应的符 +号。 + +klogd执行两种类型的地址解析。首先是静态翻译其次是动态翻译。静态翻译和ksymoops +一样使用System.map文件。为了做静态翻译klogd守护进程必须在初始化时能找到system +map文件。关于klogd怎样搜索map文件请参看klogd手册页。 + +动态地址翻译在使用内核可装载模块时很重要。 因为内核模块的内存是从内核动态内存池 +里分配的,所以不管是模块开始位置还是模块中函数和符号的位置都不是固定的。 + +内核支持允许程序决定装载哪些模块和它们在内存中位置的系统调用。使用这些系统调用 +klogd守护进程生成一张符号表用于调试发生在可装载模块中的保护错误。 + +至少klogd会提供产生保护错误的模块名。还可有额外的符号信息供可装载模块开发者选择 +以从模块中输出符号信息。 + +因为内核模块环境可能是动态的,所以必须有一种机制当模块环境发生改变时来通知klogd +守护进程。 有一些可用的命令行选项允许klogd向当前执行中的守护进程发送信号,告知符 +号信息应该被刷新了。 更多信息请参看klogd手册页。 + +sysklogd发布时包含一个补丁修改了modules-2.0.0包,无论何时一个模块装载或者卸载都 +会自动向klogd发送信号。打上这个补丁提供了必要的对调试发生于内核可装载模块的保护 +错误的无缝支持。 + +以下是被klogd处理过的发生在可装载模块中的一个保护错误例子: +--------------------------------------------------------------------------- +Aug 29 09:51:01 blizard kernel: Unable to handle kernel paging request at virtual address f15e97cc +Aug 29 09:51:01 blizard kernel: current->tss.cr3 = 0062d000, %cr3 = 0062d000 +Aug 29 09:51:01 blizard kernel: *pde = 00000000 +Aug 29 09:51:01 blizard kernel: Oops: 0002 +Aug 29 09:51:01 blizard kernel: CPU: 0 +Aug 29 09:51:01 blizard kernel: EIP: 0010:[oops:_oops+16/3868] +Aug 29 09:51:01 blizard kernel: EFLAGS: 00010212 +Aug 29 09:51:01 blizard kernel: eax: 315e97cc ebx: 003a6f80 ecx: 001be77b edx: 00237c0c +Aug 29 09:51:01 blizard kernel: esi: 00000000 edi: bffffdb3 ebp: 00589f90 esp: 00589f8c +Aug 29 09:51:01 blizard kernel: ds: 0018 es: 0018 fs: 002b gs: 002b ss: 0018 +Aug 29 09:51:01 blizard kernel: Process oops_test (pid: 3374, process nr: 21, stackpage=00589000) +Aug 29 09:51:01 blizard kernel: Stack: 315e97cc 00589f98 
0100b0b4 bffffed4 0012e38e 00240c64 003a6f80 00000001 +Aug 29 09:51:01 blizard kernel: 00000000 00237810 bfffff00 0010a7fa 00000003 00000001 00000000 bfffff00 +Aug 29 09:51:01 blizard kernel: bffffdb3 bffffed4 ffffffda 0000002b 0007002b 0000002b 0000002b 00000036 +Aug 29 09:51:01 blizard kernel: Call Trace: [oops:_oops_ioctl+48/80] [_sys_ioctl+254/272] [_system_call+82/128] +Aug 29 09:51:01 blizard kernel: Code: c7 00 05 00 00 00 eb 08 90 90 90 90 90 90 90 90 89 ec 5d c3 +--------------------------------------------------------------------------- + +Dr. G.W. Wettstein Oncology Research Div. Computing Facility +Roger Maris Cancer Center INTERNET: greg@wind.rmcc.com +820 4th St. N. +Fargo, ND 58122 +Phone: 701-234-7556 + + +--------------------------------------------------------------------------- +受污染的内核 + +一些oops报告在程序记数器之后包含字符串'Tainted: '。这表明内核已经被一些东西给污 +染了。 该字符串之后紧跟着一系列的位置敏感的字符,每个代表一个特定的污染值。 + + 1:'G'如果所有装载的模块都有GPL或相容的许可证,'P'如果装载了任何的专有模块。 +没有模块MODULE_LICENSE或者带有insmod认为是与GPL不相容的的MODULE_LICENSE的模块被 +认定是专有的。 + + 2:'F'如果有任何通过“insmod -f”被强制装载的模块,' '如果所有模块都被正常装载。 + + 3:'S'如果oops发生在SMP内核中,运行于没有证明安全运行多处理器的硬件。 当前这种 +情况仅限于几种不支持SMP的速龙处理器。 + + 4:'R'如果模块通过“insmod -f”被强制装载,' '如果所有模块都被正常装载。 + + 5:'M'如果任何处理器报告了机器检查异常,' '如果没有发生机器检查异常。 + + 6:'B'如果页释放函数发现了一个错误的页引用或者一些非预期的页标志。 + + 7:'U'如果用户或者用户应用程序特别请求设置污染标志,否则' '。 + + 8:'D'如果内核刚刚死掉,比如有OOPS或者BUG。 + +使用'Tainted: '字符串的主要原因是要告诉内核调试者,这是否是一个干净的内核亦或发 +生了任何的不正常的事。污染是永久的:即使出错的模块已经被卸载了,污染值仍然存在, +以表明内核不再值得信任。 diff --git a/Documentation/zh_CN/sparse.txt b/Documentation/zh_CN/sparse.txt new file mode 100644 index 000000000000..75992a603ae3 --- /dev/null +++ b/Documentation/zh_CN/sparse.txt @@ -0,0 +1,100 @@ +Chinese translated version of Documentation/sparse.txt + +If you have any comment or update to the content, please contact the +original document maintainer directly. However, if you have a problem +communicating in English you can also ask the Chinese maintainer for +help. 
Contact the Chinese maintainer if this translation is outdated +or if there is a problem with the translation. + +Chinese maintainer: Li Yang <leo@zh-kernel.org> +--------------------------------------------------------------------- +Documentation/sparse.txt 的中文翻译 + +如果想评论或更新本文的内容,请直接联系原文档的维护者。如果你使用英文 +交流有困难的话,也可以向中文版维护者求助。如果本翻译更新不及时或者翻 +译存在问题,请联系中文版维护者。 + +中文版维护者: 李阳 Li Yang <leo@zh-kernel.org> +中文版翻译者: 李阳 Li Yang <leo@zh-kernel.org> + + +以下为正文 +--------------------------------------------------------------------- + +Copyright 2004 Linus Torvalds +Copyright 2004 Pavel Machek <pavel@suse.cz> +Copyright 2006 Bob Copeland <me@bobcopeland.com> + +使用 sparse 工具做类型检查 +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +"__bitwise" 是一种类型属性,所以你应该这样使用它: + + typedef int __bitwise pm_request_t; + + enum pm_request { + PM_SUSPEND = (__force pm_request_t) 1, + PM_RESUME = (__force pm_request_t) 2 + }; + +这样会使 PM_SUSPEND 和 PM_RESUME 成为位方式(bitwise)整数(使用"__force" +是因为 sparse 会抱怨改变位方式的类型转换,但是这里我们确实需要强制进行转 +换)。而且因为所有枚举值都使用了相同的类型,这里的"enum pm_request"也将 +会使用那个类型做为底层实现。 + +而且使用 gcc 编译的时候,所有的 __bitwise/__force 都会消失,最后在 gcc +看来它们只不过是普通的整数。 + +坦白来说,你并不需要使用枚举类型。上面那些实际都可以浓缩成一个特殊的"int +__bitwise"类型。 + +所以更简单的办法只要这样做: + + typedef int __bitwise pm_request_t; + + #define PM_SUSPEND ((__force pm_request_t) 1) + #define PM_RESUME ((__force pm_request_t) 2) + +现在你就有了严格的类型检查所需要的所有基础架构。 + +一个小提醒:常数整数"0"是特殊的。你可以直接把常数零当作位方式整数使用而 +不用担心 sparse 会抱怨。这是因为"bitwise"(恰如其名)是用来确保不同位方 +式类型不会被弄混(小尾模式,大尾模式,cpu尾模式,或者其他),对他们来说 +常数"0"确实是特殊的。 + +获取 sparse 工具 +~~~~~~~~~~~~~~~~ + +你可以从 Sparse 的主页获取最新的发布版本: + + http://www.kernel.org/pub/linux/kernel/people/josh/sparse/ + +或者,你也可以使用 git 克隆最新的 sparse 开发版本: + + git://git.kernel.org/pub/scm/linux/kernel/git/josh/sparse.git + +DaveJ 把每小时自动生成的 git 源码树 tar 包放在以下地址: + + http://www.codemonkey.org.uk/projects/git-snapshots/sparse/ + +一旦你下载了源码,只要以普通用户身份运行: + + make + make install + +它将会被自动安装到你的 ~/bin 目录下。 + +使用 sparse 工具 +~~~~~~~~~~~~~~~~ + +用"make C=1"命令来编译内核,会对所有重新编译的 C 文件使用 sparse 工具。 +或者使用"make 
C=2"命令,无论文件是否被重新编译都会对其使用 sparse 工具。 +如果你已经编译了内核,用后一种方式可以很快地检查整个源码树。 + +make 的可选变量 CHECKFLAGS 可以用来向 sparse 工具传递参数。编译系统会自 +动向 sparse 工具传递 -Wbitwise 参数。你可以定义 __CHECK_ENDIAN__ 来进行 +大小尾检查。 + + make C=2 CHECKFLAGS="-D__CHECK_ENDIAN__" + +这些检查默认都是被关闭的,因为他们通常会产生大量的警告。 diff --git a/Documentation/zh_CN/stable_kernel_rules.txt b/Documentation/zh_CN/stable_kernel_rules.txt new file mode 100644 index 000000000000..b5b9b0ab02fd --- /dev/null +++ b/Documentation/zh_CN/stable_kernel_rules.txt @@ -0,0 +1,66 @@ +Chinese translated version of Documentation/stable_kernel_rules.txt + +If you have any comment or update to the content, please contact the +original document maintainer directly. However, if you have a problem +communicating in English you can also ask the Chinese maintainer for +help. Contact the Chinese maintainer if this translation is outdated +or if there is a problem with the translation. + +Chinese maintainer: TripleX Chung <triplex@zh-kernel.org> +--------------------------------------------------------------------- +Documentation/stable_kernel_rules.txt 的中文翻译 + +如果想评论或更新本文的内容,请直接联系原文档的维护者。如果你使用英文 +交流有困难的话,也可以向中文版维护者求助。如果本翻译更新不及时或者翻 +译存在问题,请联系中文版维护者。 + + +中文版维护者: 钟宇 TripleX Chung <triplex@zh-kernel.org> +中文版翻译者: 钟宇 TripleX Chung <triplex@zh-kernel.org> +中文版校译者: 李阳 Li Yang <leo@zh-kernel.org> + Kangkai Yin <e12051@motorola.com> + +以下为正文 +--------------------------------------------------------------------- + +关于Linux 2.6稳定版发布,所有你想知道的事情。 + +关于哪些类型的补丁可以被接收进入稳定版代码树,哪些不可以的规则: + + - 必须是显而易见的正确,并且经过测试的。 + - 连同上下文,不能大于100行。 + - 必须只修正一件事情。 + - 必须修正了一个给大家带来麻烦的真正的bug(不是“这也许是一个问题...” + 那样的东西)。 + - 必须修正带来如下后果的问题:编译错误(对被标记为CONFIG_BROKEN的例外), + 内核崩溃,挂起,数据损坏,真正的安全问题,或者一些类似“哦,这不 + 好”的问题。简短的说,就是一些致命的问题。 + - 没有“理论上的竞争条件”,除非能给出竞争条件如何被利用的解释。 + - 不能存在任何的“琐碎的”修正(拼写修正,去掉多余空格之类的)。 + - 必须被相关子系统的维护者接受。 + - 必须遵循Documentation/SubmittingPatches里的规则。 + +向稳定版代码树提交补丁的过程: + + - 在确认了补丁符合以上的规则后,将补丁发送到stable@kernel.org。 + - 如果补丁被接受到队列里,发送者会收到一个ACK回复,如果没有被接受,收 + 到的是NAK回复。回复需要几天的时间,这取决于开发者的时间安排。 + - 
被接受的补丁会被加到稳定版本队列里,等待其他开发者的审查。 + - 安全方面的补丁不要发到这个列表,应该发送到security@kernel.org。 + +审查周期: + + - 当稳定版的维护者决定开始一个审查周期,补丁将被发送到审查委员会,以 + 及被补丁影响的领域的维护者(除非提交者就是该领域的维护者)并且抄送 + 到linux-kernel邮件列表。 + - 审查委员会有48小时的时间,用来决定给该补丁回复ACK还是NAK。 + - 如果委员会中有成员拒绝这个补丁,或者linux-kernel列表上有人反对这个 + 补丁,并提出维护者和审查委员会之前没有意识到的问题,补丁会从队列中 + 丢弃。 + - 在审查周期结束的时候,那些得到ACK回应的补丁将会被加入到最新的稳定版 + 发布中,一个新的稳定版发布就此产生。 + - 安全性补丁将从内核安全小组那里直接接收到稳定版代码树中,而不是通过 + 通常的审查周期。请联系内核安全小组以获得关于这个过程的更多细节。 + +审查委员会: + - 由一些自愿承担这项任务的内核开发者,和几个非志愿的组成。 diff --git a/Documentation/zh_CN/volatile-considered-harmful.txt b/Documentation/zh_CN/volatile-considered-harmful.txt new file mode 100644 index 000000000000..ba8149d2233a --- /dev/null +++ b/Documentation/zh_CN/volatile-considered-harmful.txt @@ -0,0 +1,113 @@ +Chinese translated version of Documentation/volatile-considered-harmful.txt + +If you have any comment or update to the content, please contact the +original document maintainer directly. However, if you have a problem +communicating in English you can also ask the Chinese maintainer for +help. Contact the Chinese maintainer if this translation is outdated +or if there is a problem with the translation. 
+ +Maintainer: Jonathan Corbet <corbet@lwn.net> +Chinese maintainer: Bryan Wu <bryan.wu@analog.com> +--------------------------------------------------------------------- +Documentation/volatile-considered-harmful.txt 的中文翻译 + +如果想评论或更新本文的内容,请直接联系原文档的维护者。如果你使用英文 +交流有困难的话,也可以向中文版维护者求助。如果本翻译更新不及时或者翻 +译存在问题,请联系中文版维护者。 + +英文版维护者: Jonathan Corbet <corbet@lwn.net> +中文版维护者: 伍鹏 Bryan Wu <bryan.wu@analog.com> +中文版翻译者: 伍鹏 Bryan Wu <bryan.wu@analog.com> +中文版校译者: 张汉辉 Eugene Teo <eugeneteo@kernel.sg> + 杨瑞 Dave Young <hidave.darkstar@gmail.com> +以下为正文 +--------------------------------------------------------------------- + +为什么不应该使用“volatile”类型 +------------------------------ + +C程序员通常认为volatile表示某个变量可以在当前执行的线程之外被改变;因此,在内核 +中用到共享数据结构时,常常会有C程序员喜欢使用volatile这类变量。换句话说,他们经 +常会把volatile类型看成某种简易的原子变量,当然它们不是。在内核中使用volatile几 +乎总是错误的;本文档将解释为什么这样。 + +理解volatile的关键是知道它的目的是用来消除优化,实际上很少有人真正需要这样的应 +用。在内核中,程序员必须防止意外的并发访问破坏共享的数据结构,这其实是一个完全 +不同的任务。用来防止意外并发访问的保护措施,可以更加高效的避免大多数优化相关的 +问题。 + +像volatile一样,内核提供了很多原语来保证并发访问时的数据安全(自旋锁,互斥量,内 +存屏障等等),同样可以防止意外的优化。如果可以正确使用这些内核原语,那么就没有 +必要再使用volatile。如果仍然必须使用volatile,那么几乎可以肯定在代码的某处有一 +个bug。在正确设计的内核代码中,volatile能带来的仅仅是使事情变慢。 + +思考一下这段典型的内核代码: + + spin_lock(&the_lock); + do_something_on(&shared_data); + do_something_else_with(&shared_data); + spin_unlock(&the_lock); + +如果所有的代码都遵循加锁规则,当持有the_lock的时候,不可能意外的改变shared_data的 +值。任何可能访问该数据的其他代码都会在这个锁上等待。自旋锁原语同时起着内存屏障的作用(它 +们被明确地写成这样),这意味着数据访问不会跨越它们而被优化。所以本来编译器认为 +它知道在shared_data里面将有什么,但是因为spin_lock()调用跟内存屏障一样,会强制编 +译器忘记它所知道的一切。那么在访问这些数据时不会有优化的问题。 + +如果shared_data被声明为volatile,锁操作将仍然是必须的。就算我们知道没有其他人正在 +使用它,编译器也将被阻止优化对临界区内shared_data的访问。在锁有效的同时, +shared_data不是volatile的。在处理共享数据的时候,适当的锁操作使得 +volatile不再必要,而且volatile还有潜在的危害。 + +volatile的存储类型最初是为那些内存映射的I/O寄存器而定义。在内核里,寄存器访问也应 +该被锁保护,但是人们也不希望编译器“优化”临界区内的寄存器访问。内核里I/O的内存访问 +是通过访问函数完成的;不赞成通过指针对I/O内存的直接访问,并且不是在所有体系架构上 +都能工作。那些访问函数正是为了防止意外优化而写的,因此,再说一次,volatile类型不 +是必需的。 + +另一种引起用户可能使用volatile的情况是当处理器正忙着等待一个变量的值。正确执行一 +个忙等待的方法是: + + while (my_variable != what_i_want) + cpu_relax(); + 
+cpu_relax()调用会降低CPU的能量消耗或者让位于超线程双处理器;它也作为内存屏障一样出 +现,所以,再一次,volatile不是必需的。当然,忙等待一开始就是一种反常规的做法。 + +在内核中,一些稀少的情况下volatile仍然是有意义的: + + - 在一些体系架构的系统上,允许直接的I/O内存访问,那么前面提到的访问函数可以使用 + volatile。基本上,每一个访问函数调用它自己都是一个小的临界区域并且保证了按照 + 程序员期望的那样发生访问操作。 + + - 某些会改变内存的内联汇编代码虽然没有什么其他明显的副作用,但是有被GCC删除的可 + 能性。在汇编声明中加上volatile关键字可以防止这种删除操作。 + + - Jiffies变量是一种特殊情况,虽然每次引用它的时候都可以有不同的值,但读jiffies + 变量时不需要任何特殊的加锁保护。所以jiffies变量可以使用volatile,但是不赞成 + 其他跟jiffies相同类型变量使用volatile。Jiffies被认为是一种“愚蠢的遗留物” + (Linus的话)因为解决这个问题比保持现状要麻烦的多。 + + - 由于某些I/O设备可能会修改连续一致的内存,所以有时,指向连续一致内存的数据结构 + 的指针需要正确的使用volatile。网络适配器使用的环状缓存区正是这类情形的一个例 + 子,其中适配器用改变指针来表示哪些描述符已经处理过了。 + +对于大多数代码,上述几种可以使用volatile的情况都不适用。所以,使用volatile很可能被看作一种 +bug,并且需要对这样的代码额外仔细检查。那些试图使用volatile的开发人员需要退一步想想 +他们真正想实现的是什么。 + +非常欢迎删除volatile变量的补丁 - 只要证明这些补丁完整的考虑了并发问题。 + +注释 +---- + +[1] http://lwn.net/Articles/233481/ +[2] http://lwn.net/Articles/233482/ + +致谢 +---- + +最初由Randy Dunlap推动并作初步研究 +由Jonathan Corbet撰写 +参考Satyam Sharma,Johannes Stezenbach,Jesper Juhl,Heikki Orsila, +H. Peter Anvin,Philipp Hahn和Stefan Richter的意见改善了本文档。