summaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
-rw-r--r--Documentation/drivers/edac/edac.txt673
-rw-r--r--MAINTAINERS9
-rw-r--r--arch/i386/kernel/quirks.c9
-rw-r--r--drivers/Kconfig2
-rw-r--r--drivers/Makefile1
-rw-r--r--drivers/edac/Kconfig102
-rw-r--r--drivers/edac/Makefile18
-rw-r--r--drivers/edac/amd76x_edac.c2
-rw-r--r--drivers/edac/e752x_edac.c14
-rw-r--r--drivers/edac/e7xxx_edac.c2
-rw-r--r--drivers/edac/edac_mc.c2209
-rw-r--r--drivers/edac/edac_mc.h448
-rw-r--r--drivers/edac/i82860_edac.c2
-rw-r--r--drivers/edac/i82875p_edac.c2
-rw-r--r--drivers/edac/r82600_edac.c2
-rw-r--r--include/asm-i386/atomic.h12
-rw-r--r--include/asm-i386/edac.h18
-rw-r--r--include/asm-x86_64/atomic.h12
-rw-r--r--include/asm-x86_64/edac.h18
19 files changed, 3515 insertions, 40 deletions
diff --git a/Documentation/drivers/edac/edac.txt b/Documentation/drivers/edac/edac.txt
new file mode 100644
index 000000000000..d37191fe5681
--- /dev/null
+++ b/Documentation/drivers/edac/edac.txt
@@ -0,0 +1,673 @@
+
+
+EDAC - Error Detection And Correction
+
+Written by Doug Thompson <norsk5@xmission.com>
+7 Dec 2005
+
+
+EDAC was written by:
+ Thayne Harbaugh,
+ modified by Dave Peterson, Doug Thompson, et al,
+ from the bluesmoke.sourceforge.net project.
+
+
+============================================================================
+EDAC PURPOSE
+
+The 'edac' kernel module goal is to detect and report errors that occur
+within the computer system. In the initial release, memory Correctable Errors
+(CE) and Uncorrectable Errors (UE) are the primary errors being harvested.
+
+Detecting CE events, then harvesting those events and reporting them,
+CAN be a predictor of future UE events. With CE events, the system can
+continue to operate, but with less safety. Preventive maintainence and
+proactive part replacement of memory DIMMs exhibiting CEs can reduce
+the likelihood of the dreaded UE events and system 'panics'.
+
+
+In addition, PCI Bus Parity and SERR Errors are scanned for on PCI devices
+in order to determine if errors are occurring on data transfers.
+The presence of PCI Parity errors must be examined with a grain of salt.
+There are several addin adapters that do NOT follow the PCI specification
+with regards to Parity generation and reporting. The specification says
+the vendor should tie the parity status bits to 0 if they do not intend
+to generate parity. Some vendors do not do this, and thus the parity bit
+can "float" giving false positives.
+
+The PCI Parity EDAC device has the ability to "skip" known flakey
+cards during the parity scan. These are set by the parity "blacklist"
+interface in the sysfs for PCI Parity. (See the PCI section in the sysfs
+section below.) There is also a parity "whitelist" which is used as
+an explicit list of devices to scan, while the blacklist is a list
+of devices to skip.
+
+EDAC will have future error detectors that will be added or integrated
+into EDAC in the following list:
+
+ MCE Machine Check Exception
+ MCA Machine Check Architecture
+ NMI NMI notification of ECC errors
+ MSRs Machine Specific Register error cases
+ and other mechanisms.
+
+These errors are usually bus errors, ECC errors, thermal throttling
+and the like.
+
+
+============================================================================
+EDAC VERSIONING
+
+EDAC is composed of a "core" module (edac_mc.ko) and several Memory
+Controller (MC) driver modules. On a given system, the CORE
+is loaded and one MC driver will be loaded. Both the CORE and
+the MC driver have individual versions that reflect current release
+level of their respective modules. Thus, to "report" on what version
+a system is running, one must report both the CORE's and the
+MC driver's versions.
+
+
+LOADING
+
+If 'edac' was statically linked with the kernel then no loading is
+necessary. If 'edac' was built as modules then simply modprobe the
+'edac' pieces that you need. You should be able to modprobe
+hardware-specific modules and have the dependencies load the necessary core
+modules.
+
+Example:
+
+$> modprobe amd76x_edac
+
+loads both the amd76x_edac.ko memory controller module and the edac_mc.ko
+core module.
+
+
+============================================================================
+EDAC sysfs INTERFACE
+
+EDAC presents a 'sysfs' interface for control, reporting and attribute
+reporting purposes.
+
+EDAC lives in the /sys/devices/system/edac directory. Within this directory
+there currently reside 2 'edac' components:
+
+ mc memory controller(s) system
+ pci PCI status system
+
+
+============================================================================
+Memory Controller (mc) Model
+
+First a background on the memory controller's model abstracted in EDAC.
+Each mc device controls a set of DIMM memory modules. These modules are
+layed out in a Chip-Select Row (csrowX) and Channel table (chX). There can
+be multiple csrows and two channels.
+
+Memory controllers allow for several csrows, with 8 csrows being a typical value.
+Yet, the actual number of csrows depends on the electrical "loading"
+of a given motherboard, memory controller and DIMM characteristics.
+
+Dual channels allows for 128 bit data transfers to the CPU from memory.
+
+
+ Channel 0 Channel 1
+ ===================================
+ csrow0 | DIMM_A0 | DIMM_B0 |
+ csrow1 | DIMM_A0 | DIMM_B0 |
+ ===================================
+
+ ===================================
+ csrow2 | DIMM_A1 | DIMM_B1 |
+ csrow3 | DIMM_A1 | DIMM_B1 |
+ ===================================
+
+In the above example table there are 4 physical slots on the motherboard
+for memory DIMMs:
+
+ DIMM_A0
+ DIMM_B0
+ DIMM_A1
+ DIMM_B1
+
+Labels for these slots are usually silk screened on the motherboard. Slots
+labeled 'A' are channel 0 in this example. Slots labled 'B'
+are channel 1. Notice that there are two csrows possible on a
+physical DIMM. These csrows are allocated their csrow assignment
+based on the slot into which the memory DIMM is placed. Thus, when 1 DIMM
+is placed in each Channel, the csrows cross both DIMMs.
+
+Memory DIMMs come single or dual "ranked". A rank is a populated csrow.
+Thus, 2 single ranked DIMMs, placed in slots DIMM_A0 and DIMM_B0 above
+will have 1 csrow, csrow0. csrow1 will be empty. On the other hand,
+when 2 dual ranked DIMMs are similiaryly placed, then both csrow0 and
+csrow1 will be populated. The pattern repeats itself for csrow2 and
+csrow3.
+
+The representation of the above is reflected in the directory tree
+in EDAC's sysfs interface. Starting in directory
+/sys/devices/system/edac/mc each memory controller will be represented
+by its own 'mcX' directory, where 'X" is the index of the MC.
+
+
+ ..../edac/mc/
+ |
+ |->mc0
+ |->mc1
+ |->mc2
+ ....
+
+Under each 'mcX' directory each 'csrowX' is again represented by a
+'csrowX', where 'X" is the csrow index:
+
+
+ .../mc/mc0/
+ |
+ |->csrow0
+ |->csrow2
+ |->csrow3
+ ....
+
+Notice that there is no csrow1, which indicates that csrow0 is
+composed of a single ranked DIMMs. This should also apply in both
+Channels, in order to have dual-channel mode be operational. Since
+both csrow2 and csrow3 are populated, this indicates a dual ranked
+set of DIMMs for channels 0 and 1.
+
+
+Within each of the 'mc','mcX' and 'csrowX' directories are several
+EDAC control and attribute files.
+
+
+============================================================================
+DIRECTORY 'mc'
+
+In directory 'mc' are EDAC system overall control and attribute files:
+
+
+Panic on UE control file:
+
+ 'panic_on_ue'
+
+ An uncorrectable error will cause a machine panic. This is usually
+ desirable. It is a bad idea to continue when an uncorrectable error
+ occurs - it is indeterminate what was uncorrected and the operating
+ system context might be so mangled that continuing will lead to further
+ corruption. If the kernel has MCE configured, then EDAC will never
+ notice the UE.
+
+ LOAD TIME: module/kernel parameter: panic_on_ue=[0|1]
+
+ RUN TIME: echo "1" >/sys/devices/system/edac/mc/panic_on_ue
+
+
+Log UE control file:
+
+ 'log_ue'
+
+ Generate kernel messages describing uncorrectable errors. These errors
+ are reported through the system message log system. UE statistics
+ will be accumulated even when UE logging is disabled.
+
+ LOAD TIME: module/kernel parameter: log_ue=[0|1]
+
+ RUN TIME: echo "1" >/sys/devices/system/edac/mc/log_ue
+
+
+Log CE control file:
+
+ 'log_ce'
+
+ Generate kernel messages describing correctable errors. These
+ errors are reported through the system message log system.
+ CE statistics will be accumulated even when CE logging is disabled.
+
+ LOAD TIME: module/kernel parameter: log_ce=[0|1]
+
+ RUN TIME: echo "1" >/sys/devices/system/edac/mc/log_ce
+
+
+Polling period control file:
+
+ 'poll_msec'
+
+ The time period, in milliseconds, for polling for error information.
+ Too small a value wastes resources. Too large a value might delay
+ necessary handling of errors and might loose valuable information for
+ locating the error. 1000 milliseconds (once each second) is about
+ right for most uses.
+
+ LOAD TIME: module/kernel parameter: poll_msec=[0|1]
+
+ RUN TIME: echo "1000" >/sys/devices/system/edac/mc/poll_msec
+
+
+Module Version read-only attribute file:
+
+ 'mc_version'
+
+ The EDAC CORE modules's version and compile date are shown here to
+ indicate what EDAC is running.
+
+
+
+============================================================================
+'mcX' DIRECTORIES
+
+
+In 'mcX' directories are EDAC control and attribute files for
+this 'X" instance of the memory controllers:
+
+
+Counter reset control file:
+
+ 'reset_counters'
+
+ This write-only control file will zero all the statistical counters
+ for UE and CE errors. Zeroing the counters will also reset the timer
+ indicating how long since the last counter zero. This is useful
+ for computing errors/time. Since the counters are always reset at
+ driver initialization time, no module/kernel parameter is available.
+
+ RUN TIME: echo "anything" >/sys/devices/system/edac/mc/mc0/counter_reset
+
+ This resets the counters on memory controller 0
+
+
+Seconds since last counter reset control file:
+
+ 'seconds_since_reset'
+
+ This attribute file displays how many seconds have elapsed since the
+ last counter reset. This can be used with the error counters to
+ measure error rates.
+
+
+
+DIMM capability attribute file:
+
+ 'edac_capability'
+
+ The EDAC (Error Detection and Correction) capabilities/modes of
+ the memory controller hardware.
+
+
+DIMM Current Capability attribute file:
+
+ 'edac_current_capability'
+
+ The EDAC capabilities available with the hardware
+ configuration. This may not be the same as "EDAC capability"
+ if the correct memory is not used. If a memory controller is
+ capable of EDAC, but DIMMs without check bits are in use, then
+ Parity, SECDED, S4ECD4ED capabilities will not be available
+ even though the memory controller might be capable of those
+ modes with the proper memory loaded.
+
+
+Memory Type supported on this controller attribute file:
+
+ 'supported_mem_type'
+
+ This attribute file displays the memory type, usually
+ buffered and unbuffered DIMMs.
+
+
+Memory Controller name attribute file:
+
+ 'mc_name'
+
+ This attribute file displays the type of memory controller
+ that is being utilized.
+
+
+Memory Controller Module name attribute file:
+
+ 'module_name'
+
+ This attribute file displays the memory controller module name,
+ version and date built. The name of the memory controller
+ hardware - some drivers work with multiple controllers and
+ this field shows which hardware is present.
+
+
+Total memory managed by this memory controller attribute file:
+
+ 'size_mb'
+
+ This attribute file displays, in count of megabytes, of memory
+ that this instance of memory controller manages.
+
+
+Total Uncorrectable Errors count attribute file:
+
+ 'ue_count'
+
+ This attribute file displays the total count of uncorrectable
+ errors that have occurred on this memory controller. If panic_on_ue
+ is set this counter will not have a chance to increment,
+ since EDAC will panic the system.
+
+
+Total UE count that had no information attribute fileY:
+
+ 'ue_noinfo_count'
+
+ This attribute file displays the number of UEs that
+ have occurred have occurred with no informations as to which DIMM
+ slot is having errors.
+
+
+Total Correctable Errors count attribute file:
+
+ 'ce_count'
+
+ This attribute file displays the total count of correctable
+ errors that have occurred on this memory controller. This
+ count is very important to examine. CEs provide early
+ indications that a DIMM is beginning to fail. This count
+ field should be monitored for non-zero values and report
+ such information to the system administrator.
+
+
+Total Correctable Errors count attribute file:
+
+ 'ce_noinfo_count'
+
+ This attribute file displays the number of CEs that
+ have occurred wherewith no informations as to which DIMM slot
+ is having errors. Memory is handicapped, but operational,
+ yet no information is available to indicate which slot
+ the failing memory is in. This count field should be also
+ be monitored for non-zero values.
+
+Device Symlink:
+
+ 'device'
+
+ Symlink to the memory controller device
+
+
+
+============================================================================
+'csrowX' DIRECTORIES
+
+In the 'csrowX' directories are EDAC control and attribute files for
+this 'X" instance of csrow:
+
+
+Total Uncorrectable Errors count attribute file:
+
+ 'ue_count'
+
+ This attribute file displays the total count of uncorrectable
+ errors that have occurred on this csrow. If panic_on_ue is set
+ this counter will not have a chance to increment, since EDAC
+ will panic the system.
+
+
+Total Correctable Errors count attribute file:
+
+ 'ce_count'
+
+ This attribute file displays the total count of correctable
+ errors that have occurred on this csrow. This
+ count is very important to examine. CEs provide early
+ indications that a DIMM is beginning to fail. This count
+ field should be monitored for non-zero values and report
+ such information to the system administrator.
+
+
+Total memory managed by this csrow attribute file:
+
+ 'size_mb'
+
+ This attribute file displays, in count of megabytes, of memory
+ that this csrow contatins.
+
+
+Memory Type attribute file:
+
+ 'mem_type'
+
+ This attribute file will display what type of memory is currently
+ on this csrow. Normally, either buffered or unbuffered memory.
+
+
+EDAC Mode of operation attribute file:
+
+ 'edac_mode'
+
+ This attribute file will display what type of Error detection
+ and correction is being utilized.
+
+
+Device type attribute file:
+
+ 'dev_type'
+
+ This attribute file will display what type of DIMM device is
+ being utilized. Example: x4
+
+
+Channel 0 CE Count attribute file:
+
+ 'ch0_ce_count'
+
+ This attribute file will display the count of CEs on this
+ DIMM located in channel 0.
+
+
+Channel 0 UE Count attribute file:
+
+ 'ch0_ue_count'
+
+ This attribute file will display the count of UEs on this
+ DIMM located in channel 0.
+
+
+Channel 0 DIMM Label control file:
+
+ 'ch0_dimm_label'
+
+ This control file allows this DIMM to have a label assigned
+ to it. With this label in the module, when errors occur
+ the output can provide the DIMM label in the system log.
+ This becomes vital for panic events to isolate the
+ cause of the UE event.
+
+ DIMM Labels must be assigned after booting, with information
+ that correctly identifies the physical slot with its
+ silk screen label. This information is currently very
+ motherboard specific and determination of this information
+ must occur in userland at this time.
+
+
+Channel 1 CE Count attribute file:
+
+ 'ch1_ce_count'
+
+ This attribute file will display the count of CEs on this
+ DIMM located in channel 1.
+
+
+Channel 1 UE Count attribute file:
+
+ 'ch1_ue_count'
+
+ This attribute file will display the count of UEs on this
+ DIMM located in channel 0.
+
+
+Channel 1 DIMM Label control file:
+
+ 'ch1_dimm_label'
+
+ This control file allows this DIMM to have a label assigned
+ to it. With this label in the module, when errors occur
+ the output can provide the DIMM label in the system log.
+ This becomes vital for panic events to isolate the
+ cause of the UE event.
+
+ DIMM Labels must be assigned after booting, with information
+ that correctly identifies the physical slot with its
+ silk screen label. This information is currently very
+ motherboard specific and determination of this information
+ must occur in userland at this time.
+
+
+============================================================================
+SYSTEM LOGGING
+
+If logging for UEs and CEs are enabled then system logs will have
+error notices indicating errors that have been detected:
+
+MC0: CE page 0x283, offset 0xce0, grain 8, syndrome 0x6ec3, row 0,
+channel 1 "DIMM_B1": amd76x_edac
+
+MC0: CE page 0x1e5, offset 0xfb0, grain 8, syndrome 0xb741, row 0,
+channel 1 "DIMM_B1": amd76x_edac
+
+
+The structure of the message is:
+ the memory controller (MC0)
+ Error type (CE)
+ memory page (0x283)
+ offset in the page (0xce0)
+ the byte granularity (grain 8)
+ or resolution of the error
+ the error syndrome (0xb741)
+ memory row (row 0)
+ memory channel (channel 1)
+ DIMM label, if set prior (DIMM B1
+ and then an optional, driver-specific message that may
+ have additional information.
+
+Both UEs and CEs with no info will lack all but memory controller,
+error type, a notice of "no info" and then an optional,
+driver-specific error message.
+
+
+
+============================================================================
+PCI Bus Parity Detection
+
+
+On Header Type 00 devices the primary status is looked at
+for any parity error regardless of whether Parity is enabled on the
+device. (The spec indicates parity is generated in some cases).
+On Header Type 01 bridges, the secondary status register is also
+looked at to see if parity ocurred on the bus on the other side of
+the bridge.
+
+
+SYSFS CONFIGURATION
+
+Under /sys/devices/system/edac/pci are control and attribute files as follows:
+
+
+Enable/Disable PCI Parity checking control file:
+
+ 'check_pci_parity'
+
+
+ This control file enables or disables the PCI Bus Parity scanning
+ operation. Writing a 1 to this file enables the scanning. Writing
+ a 0 to this file disables the scanning.
+
+ Enable:
+ echo "1" >/sys/devices/system/edac/pci/check_pci_parity
+
+ Disable:
+ echo "0" >/sys/devices/system/edac/pci/check_pci_parity
+
+
+
+Panic on PCI PARITY Error:
+
+ 'panic_on_pci_parity'
+
+
+ This control files enables or disables panic'ing when a parity
+ error has been detected.
+
+
+ module/kernel parameter: panic_on_pci_parity=[0|1]
+
+ Enable:
+ echo "1" >/sys/devices/system/edac/pci/panic_on_pci_parity
+
+ Disable:
+ echo "0" >/sys/devices/system/edac/pci/panic_on_pci_parity
+
+
+Parity Count:
+
+ 'pci_parity_count'
+
+ This attribute file will display the number of parity errors that
+ have been detected.
+
+
+
+PCI Device Whitelist:
+
+ 'pci_parity_whitelist'
+
+ This control file allows for an explicit list of PCI devices to be
+ scanned for parity errors. Only devices found on this list will
+ be examined. The list is a line of hexadecimel VENDOR and DEVICE
+ ID tuples:
+
+ 1022:7450,1434:16a6
+
+ One or more can be inserted, seperated by a comma.
+
+ To write the above list doing the following as one command line:
+
+ echo "1022:7450,1434:16a6"
+ > /sys/devices/system/edac/pci/pci_parity_whitelist
+
+
+
+ To display what the whitelist is, simply 'cat' the same file.
+
+
+PCI Device Blacklist:
+
+ 'pci_parity_blacklist'
+
+ This control file allows for a list of PCI devices to be
+ skipped for scanning.
+ The list is a line of hexadecimel VENDOR and DEVICE ID tuples:
+
+ 1022:7450,1434:16a6
+
+ One or more can be inserted, seperated by a comma.
+
+ To write the above list doing the following as one command line:
+
+ echo "1022:7450,1434:16a6"
+ > /sys/devices/system/edac/pci/pci_parity_blacklist
+
+
+ To display what the whitelist current contatins,
+ simply 'cat' the same file.
+
+=======================================================================
+
+PCI Vendor and Devices IDs can be obtained with the lspci command. Using
+the -n option lspci will display the vendor and device IDs. The system
+adminstrator will have to determine which devices should be scanned or
+skipped.
+
+
+
+The two lists (white and black) are prioritized. blacklist is the lower
+priority and will NOT be utilized when a whitelist has been set.
+Turn OFF a whitelist by an empty echo command:
+
+ echo > /sys/devices/system/edac/pci/pci_parity_whitelist
+
+and any previous blacklist will be utililzed.
+
diff --git a/MAINTAINERS b/MAINTAINERS
index e6dbb21a8e5b..3f8a90ac47d7 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -867,6 +867,15 @@ L: ebtables-devel@lists.sourceforge.net
W: http://ebtables.sourceforge.net/
S: Maintained
+EDAC-CORE
+P: Doug Thompson
+M: norsk5@xmission.com, dthompson@linuxnetworx.com
+P: Dave Peterson
+M: dsp@llnl.gov, dave_peterson@pobox.com
+L: bluesmoke-devel@lists.sourceforge.net
+W: bluesmoke.sourceforge.net
+S: Maintained
+
EEPRO100 NETWORK DRIVER
P: Andrey V. Savochkin
M: saw@saw.sw.com.sg
diff --git a/arch/i386/kernel/quirks.c b/arch/i386/kernel/quirks.c
index aaf89cb2bc51..87ccdac84928 100644
--- a/arch/i386/kernel/quirks.c
+++ b/arch/i386/kernel/quirks.c
@@ -25,8 +25,7 @@ static void __devinit quirk_intel_irqbalance(struct pci_dev *dev)
/* enable access to config space*/
pci_read_config_byte(dev, 0xf4, &config);
- config |= 0x2;
- pci_write_config_byte(dev, 0xf4, config);
+ pci_write_config_byte(dev, 0xf4, config|0x2);
/* read xTPR register */
raw_pci_ops->read(0, 0, 0x40, 0x4c, 2, &word);
@@ -42,9 +41,9 @@ static void __devinit quirk_intel_irqbalance(struct pci_dev *dev)
#endif
}
- config &= ~0x2;
- /* disable access to config space*/
- pci_write_config_byte(dev, 0xf4, config);
+ /* put back the original value for config space*/
+ if (!(config & 0x2))
+ pci_write_config_byte(dev, 0xf4, config);
}
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_E7320_MCH, quirk_intel_irqbalance);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_E7525_MCH, quirk_intel_irqbalance);
diff --git a/drivers/Kconfig b/drivers/Kconfig
index 283c089537bc..bddf431bbb72 100644
--- a/drivers/Kconfig
+++ b/drivers/Kconfig
@@ -68,4 +68,6 @@ source "drivers/infiniband/Kconfig"
source "drivers/sn/Kconfig"
+source "drivers/edac/Kconfig"
+
endmenu
diff --git a/drivers/Makefile b/drivers/Makefile
index 7c45050ecd03..619dd964c51c 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -63,6 +63,7 @@ obj-$(CONFIG_PHONE) += telephony/
obj-$(CONFIG_MD) += md/
obj-$(CONFIG_BT) += bluetooth/
obj-$(CONFIG_ISDN) += isdn/
+obj-$(CONFIG_EDAC) += edac/
obj-$(CONFIG_MCA) += mca/
obj-$(CONFIG_EISA) += eisa/
obj-$(CONFIG_CPU_FREQ) += cpufreq/
diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig
new file mode 100644
index 000000000000..4819e7fc00dd
--- /dev/null
+++ b/drivers/edac/Kconfig
@@ -0,0 +1,102 @@
+#
+# EDAC Kconfig
+# Copyright (c) 2003 Linux Networx
+# Licensed and distributed under the GPL
+#
+# $Id: Kconfig,v 1.4.2.7 2005/07/08 22:05:38 dsp_llnl Exp $
+#
+
+menu 'EDAC - error detection and reporting (RAS)'
+
+config EDAC
+ tristate "EDAC core system error reporting"
+ depends on X86
+ default y
+ help
+ EDAC is designed to report errors in the core system.
+ These are low-level errors that are reported in the CPU or
+ supporting chipset: memory errors, cache errors, PCI errors,
+ thermal throttling, etc.. If unsure, select 'Y'.
+
+
+comment "Reporting subsystems"
+ depends on EDAC
+
+config EDAC_DEBUG
+ bool "Debugging"
+ depends on EDAC
+ help
+ This turns on debugging information for the entire EDAC
+ sub-system. You can insert module with "debug_level=x", current
+ there're four debug levels (x=0,1,2,3 from low to high).
+ Usually you should select 'N'.
+
+config EDAC_MM_EDAC
+ tristate "Main Memory EDAC (Error Detection And Correction) reporting"
+ depends on EDAC
+ default y
+ help
+ Some systems are able to detect and correct errors in main
+ memory. EDAC can report statistics on memory error
+ detection and correction (EDAC - or commonly referred to ECC
+ errors). EDAC will also try to decode where these errors
+ occurred so that a particular failing memory module can be
+ replaced. If unsure, select 'Y'.
+
+
+config EDAC_AMD76X
+ tristate "AMD 76x (760, 762, 768)"
+ depends on EDAC_MM_EDAC && PCI
+ help
+ Support for error detection and correction on the AMD 76x
+ series of chipsets used with the Athlon processor.
+
+config EDAC_E7XXX
+ tristate "Intel e7xxx (e7205, e7500, e7501, e7505)"
+ depends on EDAC_MM_EDAC && PCI
+ help
+ Support for error detection and correction on the Intel
+ E7205, E7500, E7501 and E7505 server chipsets.
+
+config EDAC_E752X
+ tristate "Intel e752x (e7520, e7525, e7320)"
+ depends on EDAC_MM_EDAC && PCI
+ help
+ Support for error detection and correction on the Intel
+ E7520, E7525, E7320 server chipsets.
+
+config EDAC_I82875P
+ tristate "Intel 82875p (D82875P, E7210)"
+ depends on EDAC_MM_EDAC && PCI
+ help
+ Support for error detection and correction on the Intel
+ DP82785P and E7210 server chipsets.
+
+config EDAC_I82860
+ tristate "Intel 82860"
+ depends on EDAC_MM_EDAC && PCI
+ help
+ Support for error detection and correction on the Intel
+ 82860 chipset.
+
+config EDAC_R82600
+ tristate "Radisys 82600 embedded chipset"
+ depends on EDAC_MM_EDAC
+ help
+ Support for error detection and correction on the Radisys
+ 82600 embedded chipset.
+
+choice
+ prompt "Error detecting method"
+ depends on EDAC
+ default EDAC_POLL
+
+config EDAC_POLL
+ bool "Poll for errors"
+ depends on EDAC
+ help
+ Poll the chipset periodically to detect errors.
+
+endchoice
+
+endmenu
diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile
new file mode 100644
index 000000000000..93137fdab4b3
--- /dev/null
+++ b/drivers/edac/Makefile
@@ -0,0 +1,18 @@
+#
+# Makefile for the Linux kernel EDAC drivers.
+#
+# Copyright 02 Jul 2003, Linux Networx (http://lnxi.com)
+# This file may be distributed under the terms of the
+# GNU General Public License.
+#
+# $Id: Makefile,v 1.4.2.3 2005/07/08 22:05:38 dsp_llnl Exp $
+
+
+obj-$(CONFIG_EDAC_MM_EDAC) += edac_mc.o
+obj-$(CONFIG_EDAC_AMD76X) += amd76x_edac.o
+obj-$(CONFIG_EDAC_E7XXX) += e7xxx_edac.o
+obj-$(CONFIG_EDAC_E752X) += e752x_edac.o
+obj-$(CONFIG_EDAC_I82875P) += i82875p_edac.o
+obj-$(CONFIG_EDAC_I82860) += i82860_edac.o
+obj-$(CONFIG_EDAC_R82600) += r82600_edac.o
+
diff --git a/drivers/edac/amd76x_edac.c b/drivers/edac/amd76x_edac.c
index 8e2b1295e70c..2fcc8120b53c 100644
--- a/drivers/edac/amd76x_edac.c
+++ b/drivers/edac/amd76x_edac.c
@@ -338,7 +338,7 @@ static struct pci_driver amd76x_driver = {
.id_table = amd76x_pci_tbl,
};
-int __init amd76x_init(void)
+static int __init amd76x_init(void)
{
return pci_register_driver(&amd76x_driver);
}
diff --git a/drivers/edac/e752x_edac.c b/drivers/edac/e752x_edac.c
index 959f584f5687..770a5a633079 100644
--- a/drivers/edac/e752x_edac.c
+++ b/drivers/edac/e752x_edac.c
@@ -13,7 +13,7 @@
* Wang Zhenyu at intel.com
* Dave Jiang at mvista.com
*
- * $Id: bluesmoke_e752x.c,v 1.5.2.11 2005/10/05 00:43:44 dsp_llnl Exp $
+ * $Id: edac_e752x.c,v 1.5.2.11 2005/10/05 00:43:44 dsp_llnl Exp $
*
*/
@@ -376,14 +376,14 @@ static inline void process_threshold_ce(struct mem_ctl_info *mci, u16 error,
mci->mc_idx);
}
-char *global_message[11] = {
+static char *global_message[11] = {
"PCI Express C1", "PCI Express C", "PCI Express B1",
"PCI Express B", "PCI Express A1", "PCI Express A",
"DMA Controler", "HUB Interface", "System Bus",
"DRAM Controler", "Internal Buffer"
};
-char *fatal_message[2] = { "Non-Fatal ", "Fatal " };
+static char *fatal_message[2] = { "Non-Fatal ", "Fatal " };
static void do_global_error(int fatal, u32 errors)
{
@@ -405,7 +405,7 @@ static inline void global_error(int fatal, u32 errors, int *error_found,
do_global_error(fatal, errors);
}
-char *hub_message[7] = {
+static char *hub_message[7] = {
"HI Address or Command Parity", "HI Illegal Access",
"HI Internal Parity", "Out of Range Access",
"HI Data Parity", "Enhanced Config Access",
@@ -432,7 +432,7 @@ static inline void hub_error(int fatal, u8 errors, int *error_found,
do_hub_error(fatal, errors);
}
-char *membuf_message[4] = {
+static char *membuf_message[4] = {
"Internal PMWB to DRAM parity",
"Internal PMWB to System Bus Parity",
"Internal System Bus or IO to PMWB Parity",
@@ -458,6 +458,7 @@ static inline void membuf_error(u8 errors, int *error_found, int handle_error)
do_membuf_error(errors);
}
+#if 0
char *sysbus_message[10] = {
"Addr or Request Parity",
"Data Strobe Glitch",
@@ -469,6 +470,7 @@ char *sysbus_message[10] = {
"Memory Parity",
"IO Subsystem Parity"
};
+#endif /* 0 */
static void do_sysbus_error(int fatal, u32 errors)
{
@@ -1044,7 +1046,7 @@ static struct pci_driver e752x_driver = {
};
-int __init e752x_init(void)
+static int __init e752x_init(void)
{
int pci_rc;
diff --git a/drivers/edac/e7xxx_edac.c b/drivers/edac/e7xxx_edac.c
index 066be43f5ece..d5e320dfc66f 100644
--- a/drivers/edac/e7xxx_edac.c
+++ b/drivers/edac/e7xxx_edac.c
@@ -537,7 +537,7 @@ static struct pci_driver e7xxx_driver = {
};
-int __init e7xxx_init(void)
+static int __init e7xxx_init(void)
{
return pci_register_driver(&e7xxx_driver);
}
diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
new file mode 100644
index 000000000000..4be9bd0a1267
--- /dev/null
+++ b/drivers/edac/edac_mc.c
@@ -0,0 +1,2209 @@
+/*
+ * edac_mc kernel module
+ * (C) 2005 Linux Networx (http://lnxi.com)
+ * This file may be distributed under the terms of the
+ * GNU General Public License.
+ *
+ * Written by Thayne Harbaugh
+ * Based on work by Dan Hollis <goemon at anime dot net> and others.
+ * http://www.anime.net/~goemon/linux-ecc/
+ *
+ * Modified by Dave Peterson and Doug Thompson
+ *
+ */
+
+
+#include <linux/config.h>
+#include <linux/version.h>
+#include <linux/module.h>
+#include <linux/proc_fs.h>
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/smp.h>
+#include <linux/init.h>
+#include <linux/sysctl.h>
+#include <linux/highmem.h>
+#include <linux/timer.h>
+#include <linux/slab.h>
+#include <linux/jiffies.h>
+#include <linux/spinlock.h>
+#include <linux/list.h>
+#include <linux/sysdev.h>
+#include <linux/ctype.h>
+
+#include <asm/uaccess.h>
+#include <asm/page.h>
+#include <asm/edac.h>
+
+#include "edac_mc.h"
+
+#define EDAC_MC_VERSION "edac_mc Ver: 2.0.0 " __DATE__
+
+#ifdef CONFIG_EDAC_DEBUG
+/* Values of 0 to 4 will generate output */
+int edac_debug_level = 1;
+EXPORT_SYMBOL(edac_debug_level);
+#endif
+
+/* EDAC Controls, setable by module parameter, and sysfs */
+static int log_ue = 1;
+static int log_ce = 1;
+static int panic_on_ue = 1;
+static int poll_msec = 1000;
+
+static int check_pci_parity = 0; /* default YES check PCI parity */
+static int panic_on_pci_parity; /* default no panic on PCI Parity */
+static atomic_t pci_parity_count = ATOMIC_INIT(0);
+
+/* lock to memory controller's control array */
+static DECLARE_MUTEX(mem_ctls_mutex);
+static struct list_head mc_devices = LIST_HEAD_INIT(mc_devices);
+
+/* Structure of the whitelist and blacklist arrays */
+struct edac_pci_device_list {
+ unsigned int vendor; /* Vendor ID */
+ unsigned int device; /* Deviice ID */
+};
+
+
+#define MAX_LISTED_PCI_DEVICES 32
+
+/* List of PCI devices (vendor-id:device-id) that should be skipped */
+static struct edac_pci_device_list pci_blacklist[MAX_LISTED_PCI_DEVICES];
+static int pci_blacklist_count;
+
+/* List of PCI devices (vendor-id:device-id) that should be scanned */
+static struct edac_pci_device_list pci_whitelist[MAX_LISTED_PCI_DEVICES];
+static int pci_whitelist_count ;
+
+/* START sysfs data and methods */
+
+static const char *mem_types[] = {
+ [MEM_EMPTY] = "Empty",
+ [MEM_RESERVED] = "Reserved",
+ [MEM_UNKNOWN] = "Unknown",
+ [MEM_FPM] = "FPM",
+ [MEM_EDO] = "EDO",
+ [MEM_BEDO] = "BEDO",
+ [MEM_SDR] = "Unbuffered-SDR",
+ [MEM_RDR] = "Registered-SDR",
+ [MEM_DDR] = "Unbuffered-DDR",
+ [MEM_RDDR] = "Registered-DDR",
+ [MEM_RMBS] = "RMBS"
+};
+
+static const char *dev_types[] = {
+ [DEV_UNKNOWN] = "Unknown",
+ [DEV_X1] = "x1",
+ [DEV_X2] = "x2",
+ [DEV_X4] = "x4",
+ [DEV_X8] = "x8",
+ [DEV_X16] = "x16",
+ [DEV_X32] = "x32",
+ [DEV_X64] = "x64"
+};
+
+static const char *edac_caps[] = {
+ [EDAC_UNKNOWN] = "Unknown",
+ [EDAC_NONE] = "None",
+ [EDAC_RESERVED] = "Reserved",
+ [EDAC_PARITY] = "PARITY",
+ [EDAC_EC] = "EC",
+ [EDAC_SECDED] = "SECDED",
+ [EDAC_S2ECD2ED] = "S2ECD2ED",
+ [EDAC_S4ECD4ED] = "S4ECD4ED",
+ [EDAC_S8ECD8ED] = "S8ECD8ED",
+ [EDAC_S16ECD16ED] = "S16ECD16ED"
+};
+
+
+/* sysfs object: /sys/devices/system/edac */
+static struct sysdev_class edac_class = {
+ set_kset_name("edac"),
+};
+
+/* sysfs objects:
+ * /sys/devices/system/edac/mc
+ * /sys/devices/system/edac/pci
+ */
+static struct kobject edac_memctrl_kobj;
+static struct kobject edac_pci_kobj;
+
+/*
+ * /sys/devices/system/edac/mc;
+ * data structures and methods
+ */
+static ssize_t memctrl_string_show(void *ptr, char *buffer)
+{
+ char *value = (char*) ptr;
+ return sprintf(buffer, "%s\n", value);
+}
+
+static ssize_t memctrl_int_show(void *ptr, char *buffer)
+{
+ int *value = (int*) ptr;
+ return sprintf(buffer, "%d\n", *value);
+}
+
+static ssize_t memctrl_int_store(void *ptr, const char *buffer, size_t count)
+{
+ int *value = (int*) ptr;
+
+ if (isdigit(*buffer))
+ *value = simple_strtoul(buffer, NULL, 0);
+
+ return count;
+}
+
+struct memctrl_dev_attribute {
+ struct attribute attr;
+ void *value;
+ ssize_t (*show)(void *,char *);
+ ssize_t (*store)(void *, const char *, size_t);
+};
+
+/* Set of show/store abstract level functions for memory control object */
+static ssize_t
+memctrl_dev_show(struct kobject *kobj, struct attribute *attr, char *buffer)
+{
+ struct memctrl_dev_attribute *memctrl_dev;
+ memctrl_dev = (struct memctrl_dev_attribute*)attr;
+
+ if (memctrl_dev->show)
+ return memctrl_dev->show(memctrl_dev->value, buffer);
+ return -EIO;
+}
+
+static ssize_t
+memctrl_dev_store(struct kobject *kobj, struct attribute *attr,
+ const char *buffer, size_t count)
+{
+ struct memctrl_dev_attribute *memctrl_dev;
+ memctrl_dev = (struct memctrl_dev_attribute*)attr;
+
+ if (memctrl_dev->store)
+ return memctrl_dev->store(memctrl_dev->value, buffer, count);
+ return -EIO;
+}
+
+static struct sysfs_ops memctrlfs_ops = {
+ .show = memctrl_dev_show,
+ .store = memctrl_dev_store
+};
+
+#define MEMCTRL_ATTR(_name,_mode,_show,_store) \
+struct memctrl_dev_attribute attr_##_name = { \
+ .attr = {.name = __stringify(_name), .mode = _mode }, \
+ .value = &_name, \
+ .show = _show, \
+ .store = _store, \
+};
+
+#define MEMCTRL_STRING_ATTR(_name,_data,_mode,_show,_store) \
+struct memctrl_dev_attribute attr_##_name = { \
+ .attr = {.name = __stringify(_name), .mode = _mode }, \
+ .value = _data, \
+ .show = _show, \
+ .store = _store, \
+};
+
+/* cwrow<id> attribute f*/
+MEMCTRL_STRING_ATTR(mc_version,EDAC_MC_VERSION,S_IRUGO,memctrl_string_show,NULL);
+
+/* csrow<id> control files */
+MEMCTRL_ATTR(panic_on_ue,S_IRUGO|S_IWUSR,memctrl_int_show,memctrl_int_store);
+MEMCTRL_ATTR(log_ue,S_IRUGO|S_IWUSR,memctrl_int_show,memctrl_int_store);
+MEMCTRL_ATTR(log_ce,S_IRUGO|S_IWUSR,memctrl_int_show,memctrl_int_store);
+MEMCTRL_ATTR(poll_msec,S_IRUGO|S_IWUSR,memctrl_int_show,memctrl_int_store);
+
+
+/* Base Attributes of the memory ECC object */
+static struct memctrl_dev_attribute *memctrl_attr[] = {
+ &attr_panic_on_ue,
+ &attr_log_ue,
+ &attr_log_ce,
+ &attr_poll_msec,
+ &attr_mc_version,
+ NULL,
+};
+
+/* Main MC kobject release() function */
+static void edac_memctrl_master_release(struct kobject *kobj)
+{
+ debugf1("EDAC MC: " __FILE__ ": %s()\n", __func__);
+}
+
+static struct kobj_type ktype_memctrl = {
+ .release = edac_memctrl_master_release,
+ .sysfs_ops = &memctrlfs_ops,
+ .default_attrs = (struct attribute **) memctrl_attr,
+};
+
+
+/* Initialize the main sysfs entries for edac:
+ * /sys/devices/system/edac
+ *
+ * and children
+ *
+ * Return: 0 SUCCESS
+ * !0 FAILURE
+ */
+static int edac_sysfs_memctrl_setup(void)
+{
+ int err=0;
+
+ debugf1("MC: " __FILE__ ": %s()\n", __func__);
+
+ /* create the /sys/devices/system/edac directory */
+ err = sysdev_class_register(&edac_class);
+ if (!err) {
+ /* Init the MC's kobject */
+ memset(&edac_memctrl_kobj, 0, sizeof (edac_memctrl_kobj));
+ kobject_init(&edac_memctrl_kobj);
+
+ edac_memctrl_kobj.parent = &edac_class.kset.kobj;
+ edac_memctrl_kobj.ktype = &ktype_memctrl;
+
+ /* generate sysfs "..../edac/mc" */
+ err = kobject_set_name(&edac_memctrl_kobj,"mc");
+ if (!err) {
+ /* FIXME: maybe new sysdev_create_subdir() */
+ err = kobject_register(&edac_memctrl_kobj);
+ if (err) {
+ debugf1("Failed to register '.../edac/mc'\n");
+ } else {
+ debugf1("Registered '.../edac/mc' kobject\n");
+ }
+ }
+ } else {
+ debugf1(KERN_WARNING "__FILE__ %s() error=%d\n", __func__,err);
+ }
+
+ return err;
+}
+
+/*
+ * MC teardown:
+ * the '..../edac/mc' kobject followed by '..../edac' itself
+ */
+static void edac_sysfs_memctrl_teardown(void)
+{
+ debugf0("MC: " __FILE__ ": %s()\n", __func__);
+
+ /* Unregister the MC's kobject */
+ kobject_unregister(&edac_memctrl_kobj);
+
+ /* release the master edac mc kobject */
+ kobject_put(&edac_memctrl_kobj);
+
+ /* Unregister the 'edac' object */
+ sysdev_class_unregister(&edac_class);
+}
+
+/*
+ * /sys/devices/system/edac/pci;
+ * data structures and methods
+ */
+
+struct list_control {
+ struct edac_pci_device_list *list;
+ int *count;
+};
+
+/* Output the list as: vendor_id:device:id<,vendor_id:device_id> */
+static ssize_t edac_pci_list_string_show(void *ptr, char *buffer)
+{
+ struct list_control *listctl;
+ struct edac_pci_device_list *list;
+ char *p = buffer;
+ int len=0;
+ int i;
+
+ listctl = ptr;
+ list = listctl->list;
+
+ for (i = 0; i < *(listctl->count); i++, list++ ) {
+ if (len > 0)
+ len += snprintf(p + len, (PAGE_SIZE-len), ",");
+
+ len += snprintf(p + len,
+ (PAGE_SIZE-len),
+ "%x:%x",
+ list->vendor,list->device);
+ }
+
+ len += snprintf(p + len,(PAGE_SIZE-len), "\n");
+
+ return (ssize_t) len;
+}
+
+/**
+ *
+ * Scan string from **s to **e looking for one 'vendor:device' tuple
+ * where each field is a hex value
+ *
+ * return 0 if an entry is NOT found
+ * return 1 if an entry is found
+ * fill in *vendor_id and *device_id with values found
+ *
+ * In both cases, make sure *s has been moved forward toward *e
+ */
+static int parse_one_device(const char **s,const char **e,
+ unsigned int *vendor_id, unsigned int *device_id)
+{
+ const char *runner, *p;
+
+ /* if null byte, we are done */
+ if (!**s) {
+ (*s)++; /* keep *s moving */
+ return 0;
+ }
+
+ /* skip over newlines & whitespace */
+ if ((**s == '\n') || isspace(**s)) {
+ (*s)++;
+ return 0;
+ }
+
+ if (!isxdigit(**s)) {
+ (*s)++;
+ return 0;
+ }
+
+ /* parse vendor_id */
+ runner = *s;
+ while (runner < *e) {
+ /* scan for vendor:device delimiter */
+ if (*runner == ':') {
+ *vendor_id = simple_strtol((char*) *s, (char**) &p, 16);
+ runner = p + 1;
+ break;
+ }
+ runner++;
+ }
+
+ if (!isxdigit(*runner)) {
+ *s = ++runner;
+ return 0;
+ }
+
+ /* parse device_id */
+ if (runner < *e) {
+ *device_id = simple_strtol((char*)runner, (char**)&p, 16);
+ runner = p;
+ }
+
+ *s = runner;
+
+ return 1;
+}
+
+static ssize_t edac_pci_list_string_store(void *ptr, const char *buffer,
+ size_t count)
+{
+ struct list_control *listctl;
+ struct edac_pci_device_list *list;
+ unsigned int vendor_id, device_id;
+ const char *s, *e;
+ int *index;
+
+ s = (char*)buffer;
+ e = s + count;
+
+ listctl = ptr;
+ list = listctl->list;
+ index = listctl->count;
+
+ *index = 0;
+ while (*index < MAX_LISTED_PCI_DEVICES) {
+
+ if (parse_one_device(&s,&e,&vendor_id,&device_id)) {
+ list[ *index ].vendor = vendor_id;
+ list[ *index ].device = device_id;
+ (*index)++;
+ }
+
+ /* check for all data consume */
+ if (s >= e)
+ break;
+ }
+
+ return count;
+}
+
+static ssize_t edac_pci_int_show(void *ptr, char *buffer)
+{
+ int *value = ptr;
+ return sprintf(buffer,"%d\n",*value);
+}
+
+static ssize_t edac_pci_int_store(void *ptr, const char *buffer, size_t count)
+{
+ int *value = ptr;
+
+ if (isdigit(*buffer))
+ *value = simple_strtoul(buffer,NULL,0);
+
+ return count;
+}
+
+struct edac_pci_dev_attribute {
+ struct attribute attr;
+ void *value;
+ ssize_t (*show)(void *,char *);
+ ssize_t (*store)(void *, const char *,size_t);
+};
+
+/* Set of show/store abstract level functions for PCI Parity object */
+static ssize_t edac_pci_dev_show(struct kobject *kobj, struct attribute *attr,
+ char *buffer)
+{
+ struct edac_pci_dev_attribute *edac_pci_dev;
+ edac_pci_dev= (struct edac_pci_dev_attribute*)attr;
+
+ if (edac_pci_dev->show)
+ return edac_pci_dev->show(edac_pci_dev->value, buffer);
+ return -EIO;
+}
+
+static ssize_t edac_pci_dev_store(struct kobject *kobj, struct attribute *attr,
+ const char *buffer, size_t count)
+{
+ struct edac_pci_dev_attribute *edac_pci_dev;
+ edac_pci_dev= (struct edac_pci_dev_attribute*)attr;
+
+ if (edac_pci_dev->show)
+ return edac_pci_dev->store(edac_pci_dev->value, buffer, count);
+ return -EIO;
+}
+
+static struct sysfs_ops edac_pci_sysfs_ops = {
+ .show = edac_pci_dev_show,
+ .store = edac_pci_dev_store
+};
+
+
+#define EDAC_PCI_ATTR(_name,_mode,_show,_store) \
+struct edac_pci_dev_attribute edac_pci_attr_##_name = { \
+ .attr = {.name = __stringify(_name), .mode = _mode }, \
+ .value = &_name, \
+ .show = _show, \
+ .store = _store, \
+};
+
+#define EDAC_PCI_STRING_ATTR(_name,_data,_mode,_show,_store) \
+struct edac_pci_dev_attribute edac_pci_attr_##_name = { \
+ .attr = {.name = __stringify(_name), .mode = _mode }, \
+ .value = _data, \
+ .show = _show, \
+ .store = _store, \
+};
+
+static struct list_control pci_whitelist_control = {
+ .list = pci_whitelist,
+ .count = &pci_whitelist_count
+};
+
+static struct list_control pci_blacklist_control = {
+ .list = pci_blacklist,
+ .count = &pci_blacklist_count
+};
+
+/* whitelist attribute */
+EDAC_PCI_STRING_ATTR(pci_parity_whitelist,
+ &pci_whitelist_control,
+ S_IRUGO|S_IWUSR,
+ edac_pci_list_string_show,
+ edac_pci_list_string_store);
+
+EDAC_PCI_STRING_ATTR(pci_parity_blacklist,
+ &pci_blacklist_control,
+ S_IRUGO|S_IWUSR,
+ edac_pci_list_string_show,
+ edac_pci_list_string_store);
+
+/* PCI Parity control files */
+EDAC_PCI_ATTR(check_pci_parity,S_IRUGO|S_IWUSR,edac_pci_int_show,edac_pci_int_store);
+EDAC_PCI_ATTR(panic_on_pci_parity,S_IRUGO|S_IWUSR,edac_pci_int_show,edac_pci_int_store);
+EDAC_PCI_ATTR(pci_parity_count,S_IRUGO,edac_pci_int_show,NULL);
+
+/* Base Attributes of the memory ECC object */
+static struct edac_pci_dev_attribute *edac_pci_attr[] = {
+ &edac_pci_attr_check_pci_parity,
+ &edac_pci_attr_panic_on_pci_parity,
+ &edac_pci_attr_pci_parity_count,
+ &edac_pci_attr_pci_parity_whitelist,
+ &edac_pci_attr_pci_parity_blacklist,
+ NULL,
+};
+
+/* No memory to release */
+static void edac_pci_release(struct kobject *kobj)
+{
+ debugf1("EDAC PCI: " __FILE__ ": %s()\n", __func__);
+}
+
+static struct kobj_type ktype_edac_pci = {
+ .release = edac_pci_release,
+ .sysfs_ops = &edac_pci_sysfs_ops,
+ .default_attrs = (struct attribute **) edac_pci_attr,
+};
+
+/**
+ * edac_sysfs_pci_setup()
+ *
+ */
+static int edac_sysfs_pci_setup(void)
+{
+ int err;
+
+ debugf1("MC: " __FILE__ ": %s()\n", __func__);
+
+ memset(&edac_pci_kobj, 0, sizeof(edac_pci_kobj));
+
+ kobject_init(&edac_pci_kobj);
+ edac_pci_kobj.parent = &edac_class.kset.kobj;
+ edac_pci_kobj.ktype = &ktype_edac_pci;
+
+ err = kobject_set_name(&edac_pci_kobj, "pci");
+ if (!err) {
+ /* Instanstiate the csrow object */
+ /* FIXME: maybe new sysdev_create_subdir() */
+ err = kobject_register(&edac_pci_kobj);
+ if (err)
+ debugf1("Failed to register '.../edac/pci'\n");
+ else
+ debugf1("Registered '.../edac/pci' kobject\n");
+ }
+ return err;
+}
+
+
+static void edac_sysfs_pci_teardown(void)
+{
+ debugf0("MC: " __FILE__ ": %s()\n", __func__);
+
+ kobject_unregister(&edac_pci_kobj);
+ kobject_put(&edac_pci_kobj);
+}
+
+/* EDAC sysfs CSROW data structures and methods */
+
+/* Set of more detailed csrow<id> attribute show/store functions */
+static ssize_t csrow_ch0_dimm_label_show(struct csrow_info *csrow, char *data)
+{
+ ssize_t size = 0;
+
+ if (csrow->nr_channels > 0) {
+ size = snprintf(data, EDAC_MC_LABEL_LEN,"%s\n",
+ csrow->channels[0].label);
+ }
+ return size;
+}
+
+static ssize_t csrow_ch1_dimm_label_show(struct csrow_info *csrow, char *data)
+{
+ ssize_t size = 0;
+
+ if (csrow->nr_channels > 0) {
+ size = snprintf(data, EDAC_MC_LABEL_LEN, "%s\n",
+ csrow->channels[1].label);
+ }
+ return size;
+}
+
+static ssize_t csrow_ch0_dimm_label_store(struct csrow_info *csrow,
+ const char *data, size_t size)
+{
+ ssize_t max_size = 0;
+
+ if (csrow->nr_channels > 0) {
+ max_size = min((ssize_t)size,(ssize_t)EDAC_MC_LABEL_LEN-1);
+ strncpy(csrow->channels[0].label, data, max_size);
+ csrow->channels[0].label[max_size] = '\0';
+ }
+ return size;
+}
+
+static ssize_t csrow_ch1_dimm_label_store(struct csrow_info *csrow,
+ const char *data, size_t size)
+{
+ ssize_t max_size = 0;
+
+ if (csrow->nr_channels > 1) {
+ max_size = min((ssize_t)size,(ssize_t)EDAC_MC_LABEL_LEN-1);
+ strncpy(csrow->channels[1].label, data, max_size);
+ csrow->channels[1].label[max_size] = '\0';
+ }
+ return max_size;
+}
+
+static ssize_t csrow_ue_count_show(struct csrow_info *csrow, char *data)
+{
+ return sprintf(data,"%u\n", csrow->ue_count);
+}
+
+static ssize_t csrow_ce_count_show(struct csrow_info *csrow, char *data)
+{
+ return sprintf(data,"%u\n", csrow->ce_count);
+}
+
+static ssize_t csrow_ch0_ce_count_show(struct csrow_info *csrow, char *data)
+{
+ ssize_t size = 0;
+
+ if (csrow->nr_channels > 0) {
+ size = sprintf(data,"%u\n", csrow->channels[0].ce_count);
+ }
+ return size;
+}
+
+static ssize_t csrow_ch1_ce_count_show(struct csrow_info *csrow, char *data)
+{
+ ssize_t size = 0;
+
+ if (csrow->nr_channels > 1) {
+ size = sprintf(data,"%u\n", csrow->channels[1].ce_count);
+ }
+ return size;
+}
+
+static ssize_t csrow_size_show(struct csrow_info *csrow, char *data)
+{
+ return sprintf(data,"%u\n", PAGES_TO_MiB(csrow->nr_pages));
+}
+
+static ssize_t csrow_mem_type_show(struct csrow_info *csrow, char *data)
+{
+ return sprintf(data,"%s\n", mem_types[csrow->mtype]);
+}
+
+static ssize_t csrow_dev_type_show(struct csrow_info *csrow, char *data)
+{
+ return sprintf(data,"%s\n", dev_types[csrow->dtype]);
+}
+
+static ssize_t csrow_edac_mode_show(struct csrow_info *csrow, char *data)
+{
+ return sprintf(data,"%s\n", edac_caps[csrow->edac_mode]);
+}
+
+struct csrowdev_attribute {
+ struct attribute attr;
+ ssize_t (*show)(struct csrow_info *,char *);
+ ssize_t (*store)(struct csrow_info *, const char *,size_t);
+};
+
+#define to_csrow(k) container_of(k, struct csrow_info, kobj)
+#define to_csrowdev_attr(a) container_of(a, struct csrowdev_attribute, attr)
+
+/* Set of show/store higher level functions for csrow objects */
+static ssize_t csrowdev_show(struct kobject *kobj, struct attribute *attr,
+ char *buffer)
+{
+ struct csrow_info *csrow = to_csrow(kobj);
+ struct csrowdev_attribute *csrowdev_attr = to_csrowdev_attr(attr);
+
+ if (csrowdev_attr->show)
+ return csrowdev_attr->show(csrow, buffer);
+ return -EIO;
+}
+
+static ssize_t csrowdev_store(struct kobject *kobj, struct attribute *attr,
+ const char *buffer, size_t count)
+{
+ struct csrow_info *csrow = to_csrow(kobj);
+ struct csrowdev_attribute * csrowdev_attr = to_csrowdev_attr(attr);
+
+ if (csrowdev_attr->store)
+ return csrowdev_attr->store(csrow, buffer, count);
+ return -EIO;
+}
+
+static struct sysfs_ops csrowfs_ops = {
+ .show = csrowdev_show,
+ .store = csrowdev_store
+};
+
+#define CSROWDEV_ATTR(_name,_mode,_show,_store) \
+struct csrowdev_attribute attr_##_name = { \
+ .attr = {.name = __stringify(_name), .mode = _mode }, \
+ .show = _show, \
+ .store = _store, \
+};
+
+/* cwrow<id>/attribute files */
+CSROWDEV_ATTR(size_mb,S_IRUGO,csrow_size_show,NULL);
+CSROWDEV_ATTR(dev_type,S_IRUGO,csrow_dev_type_show,NULL);
+CSROWDEV_ATTR(mem_type,S_IRUGO,csrow_mem_type_show,NULL);
+CSROWDEV_ATTR(edac_mode,S_IRUGO,csrow_edac_mode_show,NULL);
+CSROWDEV_ATTR(ue_count,S_IRUGO,csrow_ue_count_show,NULL);
+CSROWDEV_ATTR(ce_count,S_IRUGO,csrow_ce_count_show,NULL);
+CSROWDEV_ATTR(ch0_ce_count,S_IRUGO,csrow_ch0_ce_count_show,NULL);
+CSROWDEV_ATTR(ch1_ce_count,S_IRUGO,csrow_ch1_ce_count_show,NULL);
+
+/* control/attribute files */
+CSROWDEV_ATTR(ch0_dimm_label,S_IRUGO|S_IWUSR,
+ csrow_ch0_dimm_label_show,
+ csrow_ch0_dimm_label_store);
+CSROWDEV_ATTR(ch1_dimm_label,S_IRUGO|S_IWUSR,
+ csrow_ch1_dimm_label_show,
+ csrow_ch1_dimm_label_store);
+
+
+/* Attributes of the CSROW<id> object */
+static struct csrowdev_attribute *csrow_attr[] = {
+ &attr_dev_type,
+ &attr_mem_type,
+ &attr_edac_mode,
+ &attr_size_mb,
+ &attr_ue_count,
+ &attr_ce_count,
+ &attr_ch0_ce_count,
+ &attr_ch1_ce_count,
+ &attr_ch0_dimm_label,
+ &attr_ch1_dimm_label,
+ NULL,
+};
+
+
+/* No memory to release */
+static void edac_csrow_instance_release(struct kobject *kobj)
+{
+ debugf1("EDAC MC: " __FILE__ ": %s()\n", __func__);
+}
+
+static struct kobj_type ktype_csrow = {
+ .release = edac_csrow_instance_release,
+ .sysfs_ops = &csrowfs_ops,
+ .default_attrs = (struct attribute **) csrow_attr,
+};
+
+/* Create a CSROW object under specifed edac_mc_device */
+static int edac_create_csrow_object(struct kobject *edac_mci_kobj,
+ struct csrow_info *csrow, int index )
+{
+ int err = 0;
+
+ debugf0("MC: " __FILE__ ": %s()\n", __func__);
+
+ memset(&csrow->kobj, 0, sizeof(csrow->kobj));
+
+ /* generate ..../edac/mc/mc<id>/csrow<index> */
+
+ kobject_init(&csrow->kobj);
+ csrow->kobj.parent = edac_mci_kobj;
+ csrow->kobj.ktype = &ktype_csrow;
+
+ /* name this instance of csrow<id> */
+ err = kobject_set_name(&csrow->kobj,"csrow%d",index);
+ if (!err) {
+ /* Instanstiate the csrow object */
+ err = kobject_register(&csrow->kobj);
+ if (err)
+ debugf0("Failed to register CSROW%d\n",index);
+ else
+ debugf0("Registered CSROW%d\n",index);
+ }
+
+ return err;
+}
+
+/* sysfs data structures and methods for the MCI kobjects */
+
+static ssize_t mci_reset_counters_store(struct mem_ctl_info *mci,
+ const char *data, size_t count )
+{
+ int row, chan;
+
+ mci->ue_noinfo_count = 0;
+ mci->ce_noinfo_count = 0;
+ mci->ue_count = 0;
+ mci->ce_count = 0;
+ for (row = 0; row < mci->nr_csrows; row++) {
+ struct csrow_info *ri = &mci->csrows[row];
+
+ ri->ue_count = 0;
+ ri->ce_count = 0;
+ for (chan = 0; chan < ri->nr_channels; chan++)
+ ri->channels[chan].ce_count = 0;
+ }
+ mci->start_time = jiffies;
+
+ return count;
+}
+
+static ssize_t mci_ue_count_show(struct mem_ctl_info *mci, char *data)
+{
+ return sprintf(data,"%d\n", mci->ue_count);
+}
+
+static ssize_t mci_ce_count_show(struct mem_ctl_info *mci, char *data)
+{
+ return sprintf(data,"%d\n", mci->ce_count);
+}
+
+static ssize_t mci_ce_noinfo_show(struct mem_ctl_info *mci, char *data)
+{
+ return sprintf(data,"%d\n", mci->ce_noinfo_count);
+}
+
+static ssize_t mci_ue_noinfo_show(struct mem_ctl_info *mci, char *data)
+{
+ return sprintf(data,"%d\n", mci->ue_noinfo_count);
+}
+
+static ssize_t mci_seconds_show(struct mem_ctl_info *mci, char *data)
+{
+ return sprintf(data,"%ld\n", (jiffies - mci->start_time) / HZ);
+}
+
+static ssize_t mci_mod_name_show(struct mem_ctl_info *mci, char *data)
+{
+ return sprintf(data,"%s %s\n", mci->mod_name, mci->mod_ver);
+}
+
+static ssize_t mci_ctl_name_show(struct mem_ctl_info *mci, char *data)
+{
+ return sprintf(data,"%s\n", mci->ctl_name);
+}
+
+static int mci_output_edac_cap(char *buf, unsigned long edac_cap)
+{
+ char *p = buf;
+ int bit_idx;
+
+ for (bit_idx = 0; bit_idx < 8 * sizeof(edac_cap); bit_idx++) {
+ if ((edac_cap >> bit_idx) & 0x1)
+ p += sprintf(p, "%s ", edac_caps[bit_idx]);
+ }
+
+ return p - buf;
+}
+
+static ssize_t mci_edac_capability_show(struct mem_ctl_info *mci, char *data)
+{
+ char *p = data;
+
+ p += mci_output_edac_cap(p,mci->edac_ctl_cap);
+ p += sprintf(p, "\n");
+
+ return p - data;
+}
+
+static ssize_t mci_edac_current_capability_show(struct mem_ctl_info *mci,
+ char *data)
+{
+ char *p = data;
+
+ p += mci_output_edac_cap(p,mci->edac_cap);
+ p += sprintf(p, "\n");
+
+ return p - data;
+}
+
+static int mci_output_mtype_cap(char *buf, unsigned long mtype_cap)
+{
+ char *p = buf;
+ int bit_idx;
+
+ for (bit_idx = 0; bit_idx < 8 * sizeof(mtype_cap); bit_idx++) {
+ if ((mtype_cap >> bit_idx) & 0x1)
+ p += sprintf(p, "%s ", mem_types[bit_idx]);
+ }
+
+ return p - buf;
+}
+
+static ssize_t mci_supported_mem_type_show(struct mem_ctl_info *mci, char *data)
+{
+ char *p = data;
+
+ p += mci_output_mtype_cap(p,mci->mtype_cap);
+ p += sprintf(p, "\n");
+
+ return p - data;
+}
+
+static ssize_t mci_size_mb_show(struct mem_ctl_info *mci, char *data)
+{
+ int total_pages, csrow_idx;
+
+ for (total_pages = csrow_idx = 0; csrow_idx < mci->nr_csrows;
+ csrow_idx++) {
+ struct csrow_info *csrow = &mci->csrows[csrow_idx];
+
+ if (!csrow->nr_pages)
+ continue;
+ total_pages += csrow->nr_pages;
+ }
+
+ return sprintf(data,"%u\n", PAGES_TO_MiB(total_pages));
+}
+
+struct mcidev_attribute {
+ struct attribute attr;
+ ssize_t (*show)(struct mem_ctl_info *,char *);
+ ssize_t (*store)(struct mem_ctl_info *, const char *,size_t);
+};
+
+#define to_mci(k) container_of(k, struct mem_ctl_info, edac_mci_kobj)
+#define to_mcidev_attr(a) container_of(a, struct mcidev_attribute, attr)
+
+static ssize_t mcidev_show(struct kobject *kobj, struct attribute *attr,
+ char *buffer)
+{
+ struct mem_ctl_info *mem_ctl_info = to_mci(kobj);
+ struct mcidev_attribute * mcidev_attr = to_mcidev_attr(attr);
+
+ if (mcidev_attr->show)
+ return mcidev_attr->show(mem_ctl_info, buffer);
+ return -EIO;
+}
+
+static ssize_t mcidev_store(struct kobject *kobj, struct attribute *attr,
+ const char *buffer, size_t count)
+{
+ struct mem_ctl_info *mem_ctl_info = to_mci(kobj);
+ struct mcidev_attribute * mcidev_attr = to_mcidev_attr(attr);
+
+ if (mcidev_attr->store)
+ return mcidev_attr->store(mem_ctl_info, buffer, count);
+ return -EIO;
+}
+
+static struct sysfs_ops mci_ops = {
+ .show = mcidev_show,
+ .store = mcidev_store
+};
+
+#define MCIDEV_ATTR(_name,_mode,_show,_store) \
+struct mcidev_attribute mci_attr_##_name = { \
+ .attr = {.name = __stringify(_name), .mode = _mode }, \
+ .show = _show, \
+ .store = _store, \
+};
+
+/* Control file */
+MCIDEV_ATTR(reset_counters,S_IWUSR,NULL,mci_reset_counters_store);
+
+/* Attribute files */
+MCIDEV_ATTR(mc_name,S_IRUGO,mci_ctl_name_show,NULL);
+MCIDEV_ATTR(module_name,S_IRUGO,mci_mod_name_show,NULL);
+MCIDEV_ATTR(edac_capability,S_IRUGO,mci_edac_capability_show,NULL);
+MCIDEV_ATTR(size_mb,S_IRUGO,mci_size_mb_show,NULL);
+MCIDEV_ATTR(seconds_since_reset,S_IRUGO,mci_seconds_show,NULL);
+MCIDEV_ATTR(ue_noinfo_count,S_IRUGO,mci_ue_noinfo_show,NULL);
+MCIDEV_ATTR(ce_noinfo_count,S_IRUGO,mci_ce_noinfo_show,NULL);
+MCIDEV_ATTR(ue_count,S_IRUGO,mci_ue_count_show,NULL);
+MCIDEV_ATTR(ce_count,S_IRUGO,mci_ce_count_show,NULL);
+MCIDEV_ATTR(edac_current_capability,S_IRUGO,
+ mci_edac_current_capability_show,NULL);
+MCIDEV_ATTR(supported_mem_type,S_IRUGO,
+ mci_supported_mem_type_show,NULL);
+
+
+static struct mcidev_attribute *mci_attr[] = {
+ &mci_attr_reset_counters,
+ &mci_attr_module_name,
+ &mci_attr_mc_name,
+ &mci_attr_edac_capability,
+ &mci_attr_edac_current_capability,
+ &mci_attr_supported_mem_type,
+ &mci_attr_size_mb,
+ &mci_attr_seconds_since_reset,
+ &mci_attr_ue_noinfo_count,
+ &mci_attr_ce_noinfo_count,
+ &mci_attr_ue_count,
+ &mci_attr_ce_count,
+ NULL
+};
+
+
+/*
+ * Release of a MC controlling instance
+ */
+static void edac_mci_instance_release(struct kobject *kobj)
+{
+ struct mem_ctl_info *mci;
+ mci = container_of(kobj,struct mem_ctl_info,edac_mci_kobj);
+
+ debugf0("MC: " __FILE__ ": %s() idx=%d calling kfree\n",
+ __func__, mci->mc_idx);
+
+ kfree(mci);
+}
+
+static struct kobj_type ktype_mci = {
+ .release = edac_mci_instance_release,
+ .sysfs_ops = &mci_ops,
+ .default_attrs = (struct attribute **) mci_attr,
+};
+
+#define EDAC_DEVICE_SYMLINK "device"
+
+/*
+ * Create a new Memory Controller kobject instance,
+ * mc<id> under the 'mc' directory
+ *
+ * Return:
+ * 0 Success
+ * !0 Failure
+ */
+static int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
+{
+ int i;
+ int err;
+ struct csrow_info *csrow;
+ struct kobject *edac_mci_kobj=&mci->edac_mci_kobj;
+
+ debugf0("MC: " __FILE__ ": %s() idx=%d\n", __func__, mci->mc_idx);
+
+ memset(edac_mci_kobj, 0, sizeof(*edac_mci_kobj));
+ kobject_init(edac_mci_kobj);
+
+ /* set the name of the mc<id> object */
+ err = kobject_set_name(edac_mci_kobj,"mc%d",mci->mc_idx);
+ if (err)
+ return err;
+
+ /* link to our parent the '..../edac/mc' object */
+ edac_mci_kobj->parent = &edac_memctrl_kobj;
+ edac_mci_kobj->ktype = &ktype_mci;
+
+ /* register the mc<id> kobject */
+ err = kobject_register(edac_mci_kobj);
+ if (err)
+ return err;
+
+ /* create a symlink for the device */
+ err = sysfs_create_link(edac_mci_kobj, &mci->pdev->dev.kobj,
+ EDAC_DEVICE_SYMLINK);
+ if (err) {
+ kobject_unregister(edac_mci_kobj);
+ return err;
+ }
+
+ /* Make directories for each CSROW object
+ * under the mc<id> kobject
+ */
+ for (i = 0; i < mci->nr_csrows; i++) {
+
+ csrow = &mci->csrows[i];
+
+ /* Only expose populated CSROWs */
+ if (csrow->nr_pages > 0) {
+ err = edac_create_csrow_object(edac_mci_kobj,csrow,i);
+ if (err)
+ goto fail;
+ }
+ }
+
+ /* Mark this MCI instance as having sysfs entries */
+ mci->sysfs_active = MCI_SYSFS_ACTIVE;
+
+ return 0;
+
+
+ /* CSROW error: backout what has already been registered, */
+fail:
+ for ( i--; i >= 0; i--) {
+ if (csrow->nr_pages > 0) {
+ kobject_unregister(&mci->csrows[i].kobj);
+ kobject_put(&mci->csrows[i].kobj);
+ }
+ }
+
+ kobject_unregister(edac_mci_kobj);
+ kobject_put(edac_mci_kobj);
+
+ return err;
+}
+
+/*
+ * remove a Memory Controller instance
+ */
+static void edac_remove_sysfs_mci_device(struct mem_ctl_info *mci)
+{
+ int i;
+
+ debugf0("MC: " __FILE__ ": %s()\n", __func__);
+
+ /* remove all csrow kobjects */
+ for (i = 0; i < mci->nr_csrows; i++) {
+ if (mci->csrows[i].nr_pages > 0) {
+ kobject_unregister(&mci->csrows[i].kobj);
+ kobject_put(&mci->csrows[i].kobj);
+ }
+ }
+
+ sysfs_remove_link(&mci->edac_mci_kobj, EDAC_DEVICE_SYMLINK);
+
+ kobject_unregister(&mci->edac_mci_kobj);
+ kobject_put(&mci->edac_mci_kobj);
+}
+
+/* END OF sysfs data and methods */
+
+#ifdef CONFIG_EDAC_DEBUG
+
+EXPORT_SYMBOL(edac_mc_dump_channel);
+
+void edac_mc_dump_channel(struct channel_info *chan)
+{
+ debugf4("\tchannel = %p\n", chan);
+ debugf4("\tchannel->chan_idx = %d\n", chan->chan_idx);
+ debugf4("\tchannel->ce_count = %d\n", chan->ce_count);
+ debugf4("\tchannel->label = '%s'\n", chan->label);
+ debugf4("\tchannel->csrow = %p\n\n", chan->csrow);
+}
+
+
+EXPORT_SYMBOL(edac_mc_dump_csrow);
+
+void edac_mc_dump_csrow(struct csrow_info *csrow)
+{
+ debugf4("\tcsrow = %p\n", csrow);
+ debugf4("\tcsrow->csrow_idx = %d\n", csrow->csrow_idx);
+ debugf4("\tcsrow->first_page = 0x%lx\n",
+ csrow->first_page);
+ debugf4("\tcsrow->last_page = 0x%lx\n", csrow->last_page);
+ debugf4("\tcsrow->page_mask = 0x%lx\n", csrow->page_mask);
+ debugf4("\tcsrow->nr_pages = 0x%x\n", csrow->nr_pages);
+ debugf4("\tcsrow->nr_channels = %d\n",
+ csrow->nr_channels);
+ debugf4("\tcsrow->channels = %p\n", csrow->channels);
+ debugf4("\tcsrow->mci = %p\n\n", csrow->mci);
+}
+
+
+EXPORT_SYMBOL(edac_mc_dump_mci);
+
+void edac_mc_dump_mci(struct mem_ctl_info *mci)
+{
+ debugf3("\tmci = %p\n", mci);
+ debugf3("\tmci->mtype_cap = %lx\n", mci->mtype_cap);
+ debugf3("\tmci->edac_ctl_cap = %lx\n", mci->edac_ctl_cap);
+ debugf3("\tmci->edac_cap = %lx\n", mci->edac_cap);
+ debugf4("\tmci->edac_check = %p\n", mci->edac_check);
+ debugf3("\tmci->nr_csrows = %d, csrows = %p\n",
+ mci->nr_csrows, mci->csrows);
+ debugf3("\tpdev = %p\n", mci->pdev);
+ debugf3("\tmod_name:ctl_name = %s:%s\n",
+ mci->mod_name, mci->ctl_name);
+ debugf3("\tpvt_info = %p\n\n", mci->pvt_info);
+}
+
+
+#endif /* CONFIG_EDAC_DEBUG */
+
+/* 'ptr' points to a possibly unaligned item X such that sizeof(X) is 'size'.
+ * Adjust 'ptr' so that its alignment is at least as stringent as what the
+ * compiler would provide for X and return the aligned result.
+ *
+ * If 'size' is a constant, the compiler will optimize this whole function
+ * down to either a no-op or the addition of a constant to the value of 'ptr'.
+ */
+static inline char * align_ptr (void *ptr, unsigned size)
+{
+ unsigned align, r;
+
+ /* Here we assume that the alignment of a "long long" is the most
+ * stringent alignment that the compiler will ever provide by default.
+ * As far as I know, this is a reasonable assumption.
+ */
+ if (size > sizeof(long))
+ align = sizeof(long long);
+ else if (size > sizeof(int))
+ align = sizeof(long);
+ else if (size > sizeof(short))
+ align = sizeof(int);
+ else if (size > sizeof(char))
+ align = sizeof(short);
+ else
+ return (char *) ptr;
+
+ r = size % align;
+
+ if (r == 0)
+ return (char *) ptr;
+
+ return (char *) (((unsigned long) ptr) + align - r);
+}
+
+
+EXPORT_SYMBOL(edac_mc_alloc);
+
+/**
+ * edac_mc_alloc: Allocate a struct mem_ctl_info structure
+ * @size_pvt: size of private storage needed
+ * @nr_csrows: Number of CWROWS needed for this MC
+ * @nr_chans: Number of channels for the MC
+ *
+ * Everything is kmalloc'ed as one big chunk - more efficient.
+ * Only can be used if all structures have the same lifetime - otherwise
+ * you have to allocate and initialize your own structures.
+ *
+ * Use edac_mc_free() to free mc structures allocated by this function.
+ *
+ * Returns:
+ * NULL allocation failed
+ * struct mem_ctl_info pointer
+ */
+struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
+ unsigned nr_chans)
+{
+ struct mem_ctl_info *mci;
+ struct csrow_info *csi, *csrow;
+ struct channel_info *chi, *chp, *chan;
+ void *pvt;
+ unsigned size;
+ int row, chn;
+
+ /* Figure out the offsets of the various items from the start of an mc
+ * structure. We want the alignment of each item to be at least as
+ * stringent as what the compiler would provide if we could simply
+ * hardcode everything into a single struct.
+ */
+ mci = (struct mem_ctl_info *) 0;
+ csi = (struct csrow_info *)align_ptr(&mci[1], sizeof(*csi));
+ chi = (struct channel_info *)
+ align_ptr(&csi[nr_csrows], sizeof(*chi));
+ pvt = align_ptr(&chi[nr_chans * nr_csrows], sz_pvt);
+ size = ((unsigned long) pvt) + sz_pvt;
+
+ if ((mci = kmalloc(size, GFP_KERNEL)) == NULL)
+ return NULL;
+
+ /* Adjust pointers so they point within the memory we just allocated
+ * rather than an imaginary chunk of memory located at address 0.
+ */
+ csi = (struct csrow_info *) (((char *) mci) + ((unsigned long) csi));
+ chi = (struct channel_info *) (((char *) mci) + ((unsigned long) chi));
+ pvt = sz_pvt ? (((char *) mci) + ((unsigned long) pvt)) : NULL;
+
+ memset(mci, 0, size); /* clear all fields */
+
+ mci->csrows = csi;
+ mci->pvt_info = pvt;
+ mci->nr_csrows = nr_csrows;
+
+ for (row = 0; row < nr_csrows; row++) {
+ csrow = &csi[row];
+ csrow->csrow_idx = row;
+ csrow->mci = mci;
+ csrow->nr_channels = nr_chans;
+ chp = &chi[row * nr_chans];
+ csrow->channels = chp;
+
+ for (chn = 0; chn < nr_chans; chn++) {
+ chan = &chp[chn];
+ chan->chan_idx = chn;
+ chan->csrow = csrow;
+ }
+ }
+
+ return mci;
+}
+
+
+EXPORT_SYMBOL(edac_mc_free);
+
+/**
+ * edac_mc_free: Free a previously allocated 'mci' structure
+ * @mci: pointer to a struct mem_ctl_info structure
+ *
+ * Free up a previously allocated mci structure
+ * A MCI structure can be in 2 states after being allocated
+ * by edac_mc_alloc().
+ * 1) Allocated in a MC driver's probe, but not yet committed
+ * 2) Allocated and committed, by a call to edac_mc_add_mc()
+ * edac_mc_add_mc() is the function that adds the sysfs entries
+ * thus, this free function must determine which state the 'mci'
+ * structure is in, then either free it directly or
+ * perform kobject cleanup by calling edac_remove_sysfs_mci_device().
+ *
+ * VOID Return
+ */
+void edac_mc_free(struct mem_ctl_info *mci)
+{
+ /* only if sysfs entries for this mci instance exist
+ * do we remove them and defer the actual kfree via
+ * the kobject 'release()' callback.
+ *
+ * Otherwise, do a straight kfree now.
+ */
+ if (mci->sysfs_active == MCI_SYSFS_ACTIVE)
+ edac_remove_sysfs_mci_device(mci);
+ else
+ kfree(mci);
+}
+
+
+
+EXPORT_SYMBOL(edac_mc_find_mci_by_pdev);
+
+struct mem_ctl_info *edac_mc_find_mci_by_pdev(struct pci_dev *pdev)
+{
+ struct mem_ctl_info *mci;
+ struct list_head *item;
+
+ debugf3("MC: " __FILE__ ": %s()\n", __func__);
+
+ list_for_each(item, &mc_devices) {
+ mci = list_entry(item, struct mem_ctl_info, link);
+
+ if (mci->pdev == pdev)
+ return mci;
+ }
+
+ return NULL;
+}
+
+static int add_mc_to_global_list (struct mem_ctl_info *mci)
+{
+ struct list_head *item, *insert_before;
+ struct mem_ctl_info *p;
+ int i;
+
+ if (list_empty(&mc_devices)) {
+ mci->mc_idx = 0;
+ insert_before = &mc_devices;
+ } else {
+ if (edac_mc_find_mci_by_pdev(mci->pdev)) {
+ printk(KERN_WARNING
+ "EDAC MC: %s (%s) %s %s already assigned %d\n",
+ mci->pdev->dev.bus_id, pci_name(mci->pdev),
+ mci->mod_name, mci->ctl_name, mci->mc_idx);
+ return 1;
+ }
+
+ insert_before = NULL;
+ i = 0;
+
+ list_for_each(item, &mc_devices) {
+ p = list_entry(item, struct mem_ctl_info, link);
+
+ if (p->mc_idx != i) {
+ insert_before = item;
+ break;
+ }
+
+ i++;
+ }
+
+ mci->mc_idx = i;
+
+ if (insert_before == NULL)
+ insert_before = &mc_devices;
+ }
+
+ list_add_tail_rcu(&mci->link, insert_before);
+ return 0;
+}
+
+
+
+EXPORT_SYMBOL(edac_mc_add_mc);
+
+/**
+ * edac_mc_add_mc: Insert the 'mci' structure into the mci global list
+ * @mci: pointer to the mci structure to be added to the list
+ *
+ * Return:
+ * 0 Success
+ * !0 Failure
+ */
+
+/* FIXME - should a warning be printed if no error detection? correction? */
+int edac_mc_add_mc(struct mem_ctl_info *mci)
+{
+ int rc = 1;
+
+ debugf0("MC: " __FILE__ ": %s()\n", __func__);
+#ifdef CONFIG_EDAC_DEBUG
+ if (edac_debug_level >= 3)
+ edac_mc_dump_mci(mci);
+ if (edac_debug_level >= 4) {
+ int i;
+
+ for (i = 0; i < mci->nr_csrows; i++) {
+ int j;
+ edac_mc_dump_csrow(&mci->csrows[i]);
+ for (j = 0; j < mci->csrows[i].nr_channels; j++)
+ edac_mc_dump_channel(&mci->csrows[i].
+ channels[j]);
+ }
+ }
+#endif
+ down(&mem_ctls_mutex);
+
+ if (add_mc_to_global_list(mci))
+ goto finish;
+
+ /* set load time so that error rate can be tracked */
+ mci->start_time = jiffies;
+
+ if (edac_create_sysfs_mci_device(mci)) {
+ printk(KERN_WARNING
+ "EDAC MC%d: failed to create sysfs device\n",
+ mci->mc_idx);
+ /* FIXME - should there be an error code and unwind? */
+ goto finish;
+ }
+
+ /* Report action taken */
+ printk(KERN_INFO
+ "EDAC MC%d: Giving out device to %s %s: PCI %s\n",
+ mci->mc_idx, mci->mod_name, mci->ctl_name,
+ pci_name(mci->pdev));
+
+
+ rc = 0;
+
+finish:
+ up(&mem_ctls_mutex);
+ return rc;
+}
+
+
+
+static void complete_mc_list_del (struct rcu_head *head)
+{
+ struct mem_ctl_info *mci;
+
+ mci = container_of(head, struct mem_ctl_info, rcu);
+ INIT_LIST_HEAD(&mci->link);
+ complete(&mci->complete);
+}
+
+static void del_mc_from_global_list (struct mem_ctl_info *mci)
+{
+ list_del_rcu(&mci->link);
+ init_completion(&mci->complete);
+ call_rcu(&mci->rcu, complete_mc_list_del);
+ wait_for_completion(&mci->complete);
+}
+
+EXPORT_SYMBOL(edac_mc_del_mc);
+
+/**
+ * edac_mc_del_mc: Remove the specified mci structure from global list
+ * @mci: Pointer to struct mem_ctl_info structure
+ *
+ * Returns:
+ * 0 Success
+ * 1 Failure
+ */
+int edac_mc_del_mc(struct mem_ctl_info *mci)
+{
+ int rc = 1;
+
+ debugf0("MC%d: " __FILE__ ": %s()\n", mci->mc_idx, __func__);
+ down(&mem_ctls_mutex);
+ del_mc_from_global_list(mci);
+ printk(KERN_INFO
+ "EDAC MC%d: Removed device %d for %s %s: PCI %s\n",
+ mci->mc_idx, mci->mc_idx, mci->mod_name, mci->ctl_name,
+ pci_name(mci->pdev));
+ rc = 0;
+ up(&mem_ctls_mutex);
+
+ return rc;
+}
+
+
+EXPORT_SYMBOL(edac_mc_scrub_block);
+
+void edac_mc_scrub_block(unsigned long page, unsigned long offset,
+ u32 size)
+{
+ struct page *pg;
+ void *virt_addr;
+ unsigned long flags = 0;
+
+ debugf3("MC: " __FILE__ ": %s()\n", __func__);
+
+ /* ECC error page was not in our memory. Ignore it. */
+ if(!pfn_valid(page))
+ return;
+
+ /* Find the actual page structure then map it and fix */
+ pg = pfn_to_page(page);
+
+ if (PageHighMem(pg))
+ local_irq_save(flags);
+
+ virt_addr = kmap_atomic(pg, KM_BOUNCE_READ);
+
+ /* Perform architecture specific atomic scrub operation */
+ atomic_scrub(virt_addr + offset, size);
+
+ /* Unmap and complete */
+ kunmap_atomic(virt_addr, KM_BOUNCE_READ);
+
+ if (PageHighMem(pg))
+ local_irq_restore(flags);
+}
+
+
+/* FIXME - should return -1 */
+EXPORT_SYMBOL(edac_mc_find_csrow_by_page);
+
+int edac_mc_find_csrow_by_page(struct mem_ctl_info *mci,
+ unsigned long page)
+{
+ struct csrow_info *csrows = mci->csrows;
+ int row, i;
+
+ debugf1("MC%d: " __FILE__ ": %s(): 0x%lx\n", mci->mc_idx, __func__,
+ page);
+ row = -1;
+
+ for (i = 0; i < mci->nr_csrows; i++) {
+ struct csrow_info *csrow = &csrows[i];
+
+ if (csrow->nr_pages == 0)
+ continue;
+
+ debugf3("MC%d: " __FILE__
+ ": %s(): first(0x%lx) page(0x%lx)"
+ " last(0x%lx) mask(0x%lx)\n", mci->mc_idx,
+ __func__, csrow->first_page, page,
+ csrow->last_page, csrow->page_mask);
+
+ if ((page >= csrow->first_page) &&
+ (page <= csrow->last_page) &&
+ ((page & csrow->page_mask) ==
+ (csrow->first_page & csrow->page_mask))) {
+ row = i;
+ break;
+ }
+ }
+
+ if (row == -1)
+ printk(KERN_ERR
+ "EDAC MC%d: could not look up page error address %lx\n",
+ mci->mc_idx, (unsigned long) page);
+
+ return row;
+}
+
+
+EXPORT_SYMBOL(edac_mc_handle_ce);
+
+/* FIXME - setable log (warning/emerg) levels */
+/* FIXME - integrate with evlog: http://evlog.sourceforge.net/ */
+void edac_mc_handle_ce(struct mem_ctl_info *mci,
+ unsigned long page_frame_number,
+ unsigned long offset_in_page,
+ unsigned long syndrome, int row, int channel,
+ const char *msg)
+{
+ unsigned long remapped_page;
+
+ debugf3("MC%d: " __FILE__ ": %s()\n", mci->mc_idx, __func__);
+
+ /* FIXME - maybe make panic on INTERNAL ERROR an option */
+ if (row >= mci->nr_csrows || row < 0) {
+ /* something is wrong */
+ printk(KERN_ERR
+ "EDAC MC%d: INTERNAL ERROR: row out of range (%d >= %d)\n",
+ mci->mc_idx, row, mci->nr_csrows);
+ edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
+ return;
+ }
+ if (channel >= mci->csrows[row].nr_channels || channel < 0) {
+ /* something is wrong */
+ printk(KERN_ERR
+ "EDAC MC%d: INTERNAL ERROR: channel out of range "
+ "(%d >= %d)\n",
+ mci->mc_idx, channel, mci->csrows[row].nr_channels);
+ edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
+ return;
+ }
+
+ if (log_ce)
+ /* FIXME - put in DIMM location */
+ printk(KERN_WARNING
+ "EDAC MC%d: CE page 0x%lx, offset 0x%lx,"
+ " grain %d, syndrome 0x%lx, row %d, channel %d,"
+ " label \"%s\": %s\n", mci->mc_idx,
+ page_frame_number, offset_in_page,
+ mci->csrows[row].grain, syndrome, row, channel,
+ mci->csrows[row].channels[channel].label, msg);
+
+ mci->ce_count++;
+ mci->csrows[row].ce_count++;
+ mci->csrows[row].channels[channel].ce_count++;
+
+ if (mci->scrub_mode & SCRUB_SW_SRC) {
+ /*
+ * Some MC's can remap memory so that it is still available
+ * at a different address when PCI devices map into memory.
+ * MC's that can't do this lose the memory where PCI devices
+ * are mapped. This mapping is MC dependant and so we call
+ * back into the MC driver for it to map the MC page to
+ * a physical (CPU) page which can then be mapped to a virtual
+ * page - which can then be scrubbed.
+ */
+ remapped_page = mci->ctl_page_to_phys ?
+ mci->ctl_page_to_phys(mci, page_frame_number) :
+ page_frame_number;
+
+ edac_mc_scrub_block(remapped_page, offset_in_page,
+ mci->csrows[row].grain);
+ }
+}
+
+
+EXPORT_SYMBOL(edac_mc_handle_ce_no_info);
+
+void edac_mc_handle_ce_no_info(struct mem_ctl_info *mci,
+ const char *msg)
+{
+ if (log_ce)
+ printk(KERN_WARNING
+ "EDAC MC%d: CE - no information available: %s\n",
+ mci->mc_idx, msg);
+ mci->ce_noinfo_count++;
+ mci->ce_count++;
+}
+
+
+EXPORT_SYMBOL(edac_mc_handle_ue);
+
+void edac_mc_handle_ue(struct mem_ctl_info *mci,
+ unsigned long page_frame_number,
+ unsigned long offset_in_page, int row,
+ const char *msg)
+{
+ int len = EDAC_MC_LABEL_LEN * 4;
+ char labels[len + 1];
+ char *pos = labels;
+ int chan;
+ int chars;
+
+ debugf3("MC%d: " __FILE__ ": %s()\n", mci->mc_idx, __func__);
+
+ /* FIXME - maybe make panic on INTERNAL ERROR an option */
+ if (row >= mci->nr_csrows || row < 0) {
+ /* something is wrong */
+ printk(KERN_ERR
+ "EDAC MC%d: INTERNAL ERROR: row out of range (%d >= %d)\n",
+ mci->mc_idx, row, mci->nr_csrows);
+ edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
+ return;
+ }
+
+ chars = snprintf(pos, len + 1, "%s",
+ mci->csrows[row].channels[0].label);
+ len -= chars;
+ pos += chars;
+ for (chan = 1; (chan < mci->csrows[row].nr_channels) && (len > 0);
+ chan++) {
+ chars = snprintf(pos, len + 1, ":%s",
+ mci->csrows[row].channels[chan].label);
+ len -= chars;
+ pos += chars;
+ }
+
+ if (log_ue)
+ printk(KERN_EMERG
+ "EDAC MC%d: UE page 0x%lx, offset 0x%lx, grain %d, row %d,"
+ " labels \"%s\": %s\n", mci->mc_idx,
+ page_frame_number, offset_in_page,
+ mci->csrows[row].grain, row, labels, msg);
+
+ if (panic_on_ue)
+ panic
+ ("EDAC MC%d: UE page 0x%lx, offset 0x%lx, grain %d, row %d,"
+ " labels \"%s\": %s\n", mci->mc_idx,
+ page_frame_number, offset_in_page,
+ mci->csrows[row].grain, row, labels, msg);
+
+ mci->ue_count++;
+ mci->csrows[row].ue_count++;
+}
+
+
+EXPORT_SYMBOL(edac_mc_handle_ue_no_info);
+
+void edac_mc_handle_ue_no_info(struct mem_ctl_info *mci,
+ const char *msg)
+{
+ if (panic_on_ue)
+ panic("EDAC MC%d: Uncorrected Error", mci->mc_idx);
+
+ if (log_ue)
+ printk(KERN_WARNING
+ "EDAC MC%d: UE - no information available: %s\n",
+ mci->mc_idx, msg);
+ mci->ue_noinfo_count++;
+ mci->ue_count++;
+}
+
+
+#ifdef CONFIG_PCI
+
+static u16 get_pci_parity_status(struct pci_dev *dev, int secondary)
+{
+ int where;
+ u16 status;
+
+ where = secondary ? PCI_SEC_STATUS : PCI_STATUS;
+ pci_read_config_word(dev, where, &status);
+
+ /* If we get back 0xFFFF then we must suspect that the card has been pulled but
+ the Linux PCI layer has not yet finished cleaning up. We don't want to report
+ on such devices */
+
+ if (status == 0xFFFF) {
+ u32 sanity;
+ pci_read_config_dword(dev, 0, &sanity);
+ if (sanity == 0xFFFFFFFF)
+ return 0;
+ }
+ status &= PCI_STATUS_DETECTED_PARITY | PCI_STATUS_SIG_SYSTEM_ERROR |
+ PCI_STATUS_PARITY;
+
+ if (status)
+ /* reset only the bits we are interested in */
+ pci_write_config_word(dev, where, status);
+
+ return status;
+}
+
+typedef void (*pci_parity_check_fn_t) (struct pci_dev *dev);
+
+/* Clear any PCI parity errors logged by this device. */
+static void edac_pci_dev_parity_clear( struct pci_dev *dev )
+{
+ u8 header_type;
+
+ get_pci_parity_status(dev, 0);
+
+ /* read the device TYPE, looking for bridges */
+ pci_read_config_byte(dev, PCI_HEADER_TYPE, &header_type);
+
+ if ((header_type & 0x7F) == PCI_HEADER_TYPE_BRIDGE)
+ get_pci_parity_status(dev, 1);
+}
+
+/*
+ * PCI Parity polling
+ *
+ */
+static void edac_pci_dev_parity_test(struct pci_dev *dev)
+{
+ u16 status;
+ u8 header_type;
+
+ /* read the STATUS register on this device
+ */
+ status = get_pci_parity_status(dev, 0);
+
+ debugf2("PCI STATUS= 0x%04x %s\n", status, dev->dev.bus_id );
+
+ /* check the status reg for errors */
+ if (status) {
+ if (status & (PCI_STATUS_SIG_SYSTEM_ERROR))
+ printk(KERN_CRIT
+ "EDAC PCI- "
+ "Signaled System Error on %s\n",
+ pci_name (dev));
+
+ if (status & (PCI_STATUS_PARITY)) {
+ printk(KERN_CRIT
+ "EDAC PCI- "
+ "Master Data Parity Error on %s\n",
+ pci_name (dev));
+
+ atomic_inc(&pci_parity_count);
+ }
+
+ if (status & (PCI_STATUS_DETECTED_PARITY)) {
+ printk(KERN_CRIT
+ "EDAC PCI- "
+ "Detected Parity Error on %s\n",
+ pci_name (dev));
+
+ atomic_inc(&pci_parity_count);
+ }
+ }
+
+ /* read the device TYPE, looking for bridges */
+ pci_read_config_byte(dev, PCI_HEADER_TYPE, &header_type);
+
+ debugf2("PCI HEADER TYPE= 0x%02x %s\n", header_type, dev->dev.bus_id );
+
+ if ((header_type & 0x7F) == PCI_HEADER_TYPE_BRIDGE) {
+ /* On bridges, need to examine secondary status register */
+ status = get_pci_parity_status(dev, 1);
+
+ debugf2("PCI SEC_STATUS= 0x%04x %s\n",
+ status, dev->dev.bus_id );
+
+ /* check the secondary status reg for errors */
+ if (status) {
+ if (status & (PCI_STATUS_SIG_SYSTEM_ERROR))
+ printk(KERN_CRIT
+ "EDAC PCI-Bridge- "
+ "Signaled System Error on %s\n",
+ pci_name (dev));
+
+ if (status & (PCI_STATUS_PARITY)) {
+ printk(KERN_CRIT
+ "EDAC PCI-Bridge- "
+ "Master Data Parity Error on %s\n",
+ pci_name (dev));
+
+ atomic_inc(&pci_parity_count);
+ }
+
+ if (status & (PCI_STATUS_DETECTED_PARITY)) {
+ printk(KERN_CRIT
+ "EDAC PCI-Bridge- "
+ "Detected Parity Error on %s\n",
+ pci_name (dev));
+
+ atomic_inc(&pci_parity_count);
+ }
+ }
+ }
+}
+
+/*
+ * check_dev_on_list: Scan for a PCI device on a white/black list
+ * @list: an EDAC &edac_pci_device_list white/black list pointer
+ * @free_index: index of next free entry on the list
+ * @pci_dev: PCI Device pointer
+ *
+ * see if list contains the device.
+ *
+ * Returns: 0 not found
+ * 1 found on list
+ */
+static int check_dev_on_list(struct edac_pci_device_list *list, int free_index,
+ struct pci_dev *dev)
+{
+ int i;
+ int rc = 0; /* Assume not found */
+ unsigned short vendor=dev->vendor;
+ unsigned short device=dev->device;
+
+ /* Scan the list, looking for a vendor/device match
+ */
+ for (i = 0; i < free_index; i++, list++ ) {
+ if ( (list->vendor == vendor ) &&
+ (list->device == device )) {
+ rc = 1;
+ break;
+ }
+ }
+
+ return rc;
+}
+
+/*
+ * pci_dev parity list iterator
+ * Scan the PCI device list for one iteration, looking for SERRORs
+ * Master Parity ERRORS or Parity ERRORs on primary or secondary devices
+ */
+static inline void edac_pci_dev_parity_iterator(pci_parity_check_fn_t fn)
+{
+ struct pci_dev *dev=NULL;
+
+ /* request for kernel access to the next PCI device, if any,
+ * and while we are looking at it have its reference count
+ * bumped until we are done with it
+ */
+ while((dev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, dev)) != NULL) {
+
+ /* if whitelist exists then it has priority, so only scan those
+ * devices on the whitelist
+ */
+ if (pci_whitelist_count > 0 ) {
+ if (check_dev_on_list(pci_whitelist,
+ pci_whitelist_count, dev))
+ fn(dev);
+ } else {
+ /*
+ * if no whitelist, then check if this devices is
+ * blacklisted
+ */
+ if (!check_dev_on_list(pci_blacklist,
+ pci_blacklist_count, dev))
+ fn(dev);
+ }
+ }
+}
+
+static void do_pci_parity_check(void)
+{
+ unsigned long flags;
+ int before_count;
+
+ debugf3("MC: " __FILE__ ": %s()\n", __func__);
+
+ if (!check_pci_parity)
+ return;
+
+ before_count = atomic_read(&pci_parity_count);
+
+ /* scan all PCI devices looking for a Parity Error on devices and
+ * bridges
+ */
+ local_irq_save(flags);
+ edac_pci_dev_parity_iterator(edac_pci_dev_parity_test);
+ local_irq_restore(flags);
+
+ /* Only if operator has selected panic on PCI Error */
+ if (panic_on_pci_parity) {
+ /* If the count is different 'after' from 'before' */
+ if (before_count != atomic_read(&pci_parity_count))
+ panic("EDAC: PCI Parity Error");
+ }
+}
+
+
+static inline void clear_pci_parity_errors(void)
+{
+ /* Clear any PCI bus parity errors that devices initially have logged
+ * in their registers.
+ */
+ edac_pci_dev_parity_iterator(edac_pci_dev_parity_clear);
+}
+
+
+#else /* CONFIG_PCI */
+
+
+static inline void do_pci_parity_check(void)
+{
+ /* no-op */
+}
+
+
+static inline void clear_pci_parity_errors(void)
+{
+ /* no-op */
+}
+
+
+#endif /* CONFIG_PCI */
+
+/*
+ * Iterate over all MC instances and check for ECC, et al, errors
+ */
+static inline void check_mc_devices (void)
+{
+ unsigned long flags;
+ struct list_head *item;
+ struct mem_ctl_info *mci;
+
+ debugf3("MC: " __FILE__ ": %s()\n", __func__);
+
+ /* during poll, have interrupts off */
+ local_irq_save(flags);
+
+ list_for_each(item, &mc_devices) {
+ mci = list_entry(item, struct mem_ctl_info, link);
+
+ if (mci->edac_check != NULL)
+ mci->edac_check(mci);
+ }
+
+ local_irq_restore(flags);
+}
+
+
+/*
+ * Check MC status every poll_msec.
+ * Check PCI status every poll_msec as well.
+ *
+ * This where the work gets done for edac.
+ *
+ * SMP safe, doesn't use NMI, and auto-rate-limits.
+ */
+static void do_edac_check(void)
+{
+
+ debugf3("MC: " __FILE__ ": %s()\n", __func__);
+
+ check_mc_devices();
+
+ do_pci_parity_check();
+}
+
+
+/*
+ * EDAC thread state information
+ */
+struct bs_thread_info
+{
+ struct task_struct *task;
+ struct completion *event;
+ char *name;
+ void (*run)(void);
+};
+
+static struct bs_thread_info bs_thread;
+
+/*
+ * edac_kernel_thread
+ * This the kernel thread that processes edac operations
+ * in a normal thread environment
+ */
+static int edac_kernel_thread(void *arg)
+{
+ struct bs_thread_info *thread = (struct bs_thread_info *) arg;
+
+ /* detach thread */
+ daemonize(thread->name);
+
+ current->exit_signal = SIGCHLD;
+ allow_signal(SIGKILL);
+ thread->task = current;
+
+ /* indicate to starting task we have started */
+ complete(thread->event);
+
+ /* loop forever, until we are told to stop */
+ while(thread->run != NULL) {
+ void (*run)(void);
+
+ /* call the function to check the memory controllers */
+ run = thread->run;
+ if (run)
+ run();
+
+ if (signal_pending(current))
+ flush_signals(current);
+
+ /* ensure we are interruptable */
+ set_current_state(TASK_INTERRUPTIBLE);
+
+ /* goto sleep for the interval */
+ schedule_timeout((HZ * poll_msec) / 1000);
+ try_to_freeze();
+ }
+
+ /* notify waiter that we are exiting */
+ complete(thread->event);
+
+ return 0;
+}
+
+/*
+ * edac_mc_init
+ * module initialization entry point
+ */
+static int __init edac_mc_init(void)
+{
+ int ret;
+ struct completion event;
+
+ printk(KERN_INFO "MC: " __FILE__ " version " EDAC_MC_VERSION "\n");
+
+ /*
+ * Harvest and clear any boot/initialization PCI parity errors
+ *
+ * FIXME: This only clears errors logged by devices present at time of
+ * module initialization. We should also do an initial clear
+ * of each newly hotplugged device.
+ */
+ clear_pci_parity_errors();
+
+ /* perform check for first time to harvest boot leftovers */
+ do_edac_check();
+
+ /* Create the MC sysfs entires */
+ if (edac_sysfs_memctrl_setup()) {
+ printk(KERN_ERR "EDAC MC: Error initializing sysfs code\n");
+ return -ENODEV;
+ }
+
+ /* Create the PCI parity sysfs entries */
+ if (edac_sysfs_pci_setup()) {
+ edac_sysfs_memctrl_teardown();
+ printk(KERN_ERR "EDAC PCI: Error initializing sysfs code\n");
+ return -ENODEV;
+ }
+
+ /* Create our kernel thread */
+ init_completion(&event);
+ bs_thread.event = &event;
+ bs_thread.name = "kedac";
+ bs_thread.run = do_edac_check;
+
+ /* create our kernel thread */
+ ret = kernel_thread(edac_kernel_thread, &bs_thread, CLONE_KERNEL);
+ if (ret < 0) {
+ /* remove the sysfs entries */
+ edac_sysfs_memctrl_teardown();
+ edac_sysfs_pci_teardown();
+ return -ENOMEM;
+ }
+
+ /* wait for our kernel theard ack that it is up and running */
+ wait_for_completion(&event);
+
+ return 0;
+}
+
+
+/*
+ * edac_mc_exit()
+ * module exit/termination functioni
+ */
+static void __exit edac_mc_exit(void)
+{
+ struct completion event;
+
+ debugf0("MC: " __FILE__ ": %s()\n", __func__);
+
+ init_completion(&event);
+ bs_thread.event = &event;
+
+ /* As soon as ->run is set to NULL, the task could disappear,
+ * so we need to hold tasklist_lock until we have sent the signal
+ */
+ read_lock(&tasklist_lock);
+ bs_thread.run = NULL;
+ send_sig(SIGKILL, bs_thread.task, 1);
+ read_unlock(&tasklist_lock);
+ wait_for_completion(&event);
+
+ /* tear down the sysfs device */
+ edac_sysfs_memctrl_teardown();
+ edac_sysfs_pci_teardown();
+}
+
+
+
+
+module_init(edac_mc_init);
+module_exit(edac_mc_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Linux Networx (http://lnxi.com) Thayne Harbaugh et al\n"
+ "Based on.work by Dan Hollis et al");
+MODULE_DESCRIPTION("Core library routines for MC reporting");
+
+module_param(panic_on_ue, int, 0644);
+MODULE_PARM_DESC(panic_on_ue, "Panic on uncorrected error: 0=off 1=on");
+module_param(check_pci_parity, int, 0644);
+MODULE_PARM_DESC(check_pci_parity, "Check for PCI bus parity errors: 0=off 1=on");
+module_param(panic_on_pci_parity, int, 0644);
+MODULE_PARM_DESC(panic_on_pci_parity, "Panic on PCI Bus Parity error: 0=off 1=on");
+module_param(log_ue, int, 0644);
+MODULE_PARM_DESC(log_ue, "Log uncorrectable error to console: 0=off 1=on");
+module_param(log_ce, int, 0644);
+MODULE_PARM_DESC(log_ce, "Log correctable error to console: 0=off 1=on");
+module_param(poll_msec, int, 0644);
+MODULE_PARM_DESC(poll_msec, "Polling period in milliseconds");
+#ifdef CONFIG_EDAC_DEBUG
+module_param(edac_debug_level, int, 0644);
+MODULE_PARM_DESC(edac_debug_level, "Debug level");
+#endif
diff --git a/drivers/edac/edac_mc.h b/drivers/edac/edac_mc.h
new file mode 100644
index 000000000000..75ecf484a43a
--- /dev/null
+++ b/drivers/edac/edac_mc.h
@@ -0,0 +1,448 @@
+/*
+ * MC kernel module
+ * (C) 2003 Linux Networx (http://lnxi.com)
+ * This file may be distributed under the terms of the
+ * GNU General Public License.
+ *
+ * Written by Thayne Harbaugh
+ * Based on work by Dan Hollis <goemon at anime dot net> and others.
+ * http://www.anime.net/~goemon/linux-ecc/
+ *
+ * NMI handling support added by
+ * Dave Peterson <dsp@llnl.gov> <dave_peterson@pobox.com>
+ *
+ * $Id: edac_mc.h,v 1.4.2.10 2005/10/05 00:43:44 dsp_llnl Exp $
+ *
+ */
+
+
+#ifndef _EDAC_MC_H_
+#define _EDAC_MC_H_
+
+
+#include <linux/config.h>
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/module.h>
+#include <linux/spinlock.h>
+#include <linux/smp.h>
+#include <linux/pci.h>
+#include <linux/time.h>
+#include <linux/nmi.h>
+#include <linux/rcupdate.h>
+#include <linux/completion.h>
+#include <linux/kobject.h>
+
+
+#define EDAC_MC_LABEL_LEN 31
+#define MC_PROC_NAME_MAX_LEN 7
+
+#if PAGE_SHIFT < 20
+#define PAGES_TO_MiB( pages ) ( ( pages ) >> ( 20 - PAGE_SHIFT ) )
+#else /* PAGE_SHIFT > 20 */
+#define PAGES_TO_MiB( pages ) ( ( pages ) << ( PAGE_SHIFT - 20 ) )
+#endif
+
+#ifdef CONFIG_EDAC_DEBUG
+extern int edac_debug_level;
+#define edac_debug_printk(level, fmt, args...) \
+do { if (level <= edac_debug_level) printk(KERN_DEBUG fmt, ##args); } while(0)
+#define debugf0( ... ) edac_debug_printk(0, __VA_ARGS__ )
+#define debugf1( ... ) edac_debug_printk(1, __VA_ARGS__ )
+#define debugf2( ... ) edac_debug_printk(2, __VA_ARGS__ )
+#define debugf3( ... ) edac_debug_printk(3, __VA_ARGS__ )
+#define debugf4( ... ) edac_debug_printk(4, __VA_ARGS__ )
+#else /* !CONFIG_EDAC_DEBUG */
+#define debugf0( ... )
+#define debugf1( ... )
+#define debugf2( ... )
+#define debugf3( ... )
+#define debugf4( ... )
+#endif /* !CONFIG_EDAC_DEBUG */
+
+
+#define bs_xstr(s) bs_str(s)
+#define bs_str(s) #s
+#define BS_MOD_STR bs_xstr(KBUILD_BASENAME)
+
+#define BIT(x) (1 << (x))
+
+#define PCI_VEND_DEV(vend, dev) PCI_VENDOR_ID_ ## vend, PCI_DEVICE_ID_ ## vend ## _ ## dev
+
+/* memory devices */
+enum dev_type {
+ DEV_UNKNOWN = 0,
+ DEV_X1,
+ DEV_X2,
+ DEV_X4,
+ DEV_X8,
+ DEV_X16,
+ DEV_X32, /* Do these parts exist? */
+ DEV_X64 /* Do these parts exist? */
+};
+
+#define DEV_FLAG_UNKNOWN BIT(DEV_UNKNOWN)
+#define DEV_FLAG_X1 BIT(DEV_X1)
+#define DEV_FLAG_X2 BIT(DEV_X2)
+#define DEV_FLAG_X4 BIT(DEV_X4)
+#define DEV_FLAG_X8 BIT(DEV_X8)
+#define DEV_FLAG_X16 BIT(DEV_X16)
+#define DEV_FLAG_X32 BIT(DEV_X32)
+#define DEV_FLAG_X64 BIT(DEV_X64)
+
+/* memory types */
+enum mem_type {
+ MEM_EMPTY = 0, /* Empty csrow */
+ MEM_RESERVED, /* Reserved csrow type */
+ MEM_UNKNOWN, /* Unknown csrow type */
+ MEM_FPM, /* Fast page mode */
+ MEM_EDO, /* Extended data out */
+ MEM_BEDO, /* Burst Extended data out */
+ MEM_SDR, /* Single data rate SDRAM */
+ MEM_RDR, /* Registered single data rate SDRAM */
+ MEM_DDR, /* Double data rate SDRAM */
+ MEM_RDDR, /* Registered Double data rate SDRAM */
+ MEM_RMBS /* Rambus DRAM */
+};
+
+#define MEM_FLAG_EMPTY BIT(MEM_EMPTY)
+#define MEM_FLAG_RESERVED BIT(MEM_RESERVED)
+#define MEM_FLAG_UNKNOWN BIT(MEM_UNKNOWN)
+#define MEM_FLAG_FPM BIT(MEM_FPM)
+#define MEM_FLAG_EDO BIT(MEM_EDO)
+#define MEM_FLAG_BEDO BIT(MEM_BEDO)
+#define MEM_FLAG_SDR BIT(MEM_SDR)
+#define MEM_FLAG_RDR BIT(MEM_RDR)
+#define MEM_FLAG_DDR BIT(MEM_DDR)
+#define MEM_FLAG_RDDR BIT(MEM_RDDR)
+#define MEM_FLAG_RMBS BIT(MEM_RMBS)
+
+
+/* chipset Error Detection and Correction capabilities and mode */
+enum edac_type {
+ EDAC_UNKNOWN = 0, /* Unknown if ECC is available */
+ EDAC_NONE, /* Doesnt support ECC */
+ EDAC_RESERVED, /* Reserved ECC type */
+ EDAC_PARITY, /* Detects parity errors */
+ EDAC_EC, /* Error Checking - no correction */
+ EDAC_SECDED, /* Single bit error correction, Double detection */
+ EDAC_S2ECD2ED, /* Chipkill x2 devices - do these exist? */
+ EDAC_S4ECD4ED, /* Chipkill x4 devices */
+ EDAC_S8ECD8ED, /* Chipkill x8 devices */
+ EDAC_S16ECD16ED, /* Chipkill x16 devices */
+};
+
+#define EDAC_FLAG_UNKNOWN BIT(EDAC_UNKNOWN)
+#define EDAC_FLAG_NONE BIT(EDAC_NONE)
+#define EDAC_FLAG_PARITY BIT(EDAC_PARITY)
+#define EDAC_FLAG_EC BIT(EDAC_EC)
+#define EDAC_FLAG_SECDED BIT(EDAC_SECDED)
+#define EDAC_FLAG_S2ECD2ED BIT(EDAC_S2ECD2ED)
+#define EDAC_FLAG_S4ECD4ED BIT(EDAC_S4ECD4ED)
+#define EDAC_FLAG_S8ECD8ED BIT(EDAC_S8ECD8ED)
+#define EDAC_FLAG_S16ECD16ED BIT(EDAC_S16ECD16ED)
+
+
+/* scrubbing capabilities */
+enum scrub_type {
+ SCRUB_UNKNOWN = 0, /* Unknown if scrubber is available */
+ SCRUB_NONE, /* No scrubber */
+ SCRUB_SW_PROG, /* SW progressive (sequential) scrubbing */
+ SCRUB_SW_SRC, /* Software scrub only errors */
+ SCRUB_SW_PROG_SRC, /* Progressive software scrub from an error */
+ SCRUB_SW_TUNABLE, /* Software scrub frequency is tunable */
+ SCRUB_HW_PROG, /* HW progressive (sequential) scrubbing */
+ SCRUB_HW_SRC, /* Hardware scrub only errors */
+ SCRUB_HW_PROG_SRC, /* Progressive hardware scrub from an error */
+ SCRUB_HW_TUNABLE /* Hardware scrub frequency is tunable */
+};
+
+#define SCRUB_FLAG_SW_PROG BIT(SCRUB_SW_PROG)
+#define SCRUB_FLAG_SW_SRC BIT(SCRUB_SW_SRC_CORR)
+#define SCRUB_FLAG_SW_PROG_SRC BIT(SCRUB_SW_PROG_SRC_CORR)
+#define SCRUB_FLAG_SW_TUN BIT(SCRUB_SW_SCRUB_TUNABLE)
+#define SCRUB_FLAG_HW_PROG BIT(SCRUB_HW_PROG)
+#define SCRUB_FLAG_HW_SRC BIT(SCRUB_HW_SRC_CORR)
+#define SCRUB_FLAG_HW_PROG_SRC BIT(SCRUB_HW_PROG_SRC_CORR)
+#define SCRUB_FLAG_HW_TUN BIT(SCRUB_HW_TUNABLE)
+
+enum mci_sysfs_status {
+ MCI_SYSFS_INACTIVE = 0, /* sysfs entries NOT registered */
+ MCI_SYSFS_ACTIVE /* sysfs entries ARE registered */
+};
+
+/* FIXME - should have notify capabilities: NMI, LOG, PROC, etc */
+
+/*
+ * There are several things to be aware of that aren't at all obvious:
+ *
+ *
+ * SOCKETS, SOCKET SETS, BANKS, ROWS, CHIP-SELECT ROWS, CHANNELS, etc..
+ *
+ * These are some of the many terms that are thrown about that don't always
+ * mean what people think they mean (Inconceivable!). In the interest of
+ * creating a common ground for discussion, terms and their definitions
+ * will be established.
+ *
+ * Memory devices: The individual chip on a memory stick. These devices
+ * commonly output 4 and 8 bits each. Grouping several
+ * of these in parallel provides 64 bits which is common
+ * for a memory stick.
+ *
+ * Memory Stick: A printed circuit board that agregates multiple
+ * memory devices in parallel. This is the atomic
+ * memory component that is purchaseable by Joe consumer
+ * and loaded into a memory socket.
+ *
+ * Socket: A physical connector on the motherboard that accepts
+ * a single memory stick.
+ *
+ * Channel: Set of memory devices on a memory stick that must be
+ * grouped in parallel with one or more additional
+ * channels from other memory sticks. This parallel
+ * grouping of the output from multiple channels are
+ * necessary for the smallest granularity of memory access.
+ * Some memory controllers are capable of single channel -
+ * which means that memory sticks can be loaded
+ * individually. Other memory controllers are only
+ * capable of dual channel - which means that memory
+ * sticks must be loaded as pairs (see "socket set").
+ *
+ * Chip-select row: All of the memory devices that are selected together.
+ * for a single, minimum grain of memory access.
+ * This selects all of the parallel memory devices across
+ * all of the parallel channels. Common chip-select rows
+ * for single channel are 64 bits, for dual channel 128
+ * bits.
+ *
+ * Single-Ranked stick: A Single-ranked stick has 1 chip-select row of memmory.
+ * Motherboards commonly drive two chip-select pins to
+ * a memory stick. A single-ranked stick, will occupy
+ * only one of those rows. The other will be unused.
+ *
+ * Double-Ranked stick: A double-ranked stick has two chip-select rows which
+ * access different sets of memory devices. The two
+ * rows cannot be accessed concurrently.
+ *
+ * Double-sided stick: DEPRECATED TERM, see Double-Ranked stick.
+ * A double-sided stick has two chip-select rows which
+ * access different sets of memory devices. The two
+ * rows cannot be accessed concurrently. "Double-sided"
+ * is irrespective of the memory devices being mounted
+ * on both sides of the memory stick.
+ *
+ * Socket set: All of the memory sticks that are required for for
+ * a single memory access or all of the memory sticks
+ * spanned by a chip-select row. A single socket set
+ * has two chip-select rows and if double-sided sticks
+ * are used these will occupy those chip-select rows.
+ *
+ * Bank: This term is avoided because it is unclear when
+ * needing to distinguish between chip-select rows and
+ * socket sets.
+ *
+ * Controller pages:
+ *
+ * Physical pages:
+ *
+ * Virtual pages:
+ *
+ *
+ * STRUCTURE ORGANIZATION AND CHOICES
+ *
+ *
+ *
+ * PS - I enjoyed writing all that about as much as you enjoyed reading it.
+ */
+
+
+struct channel_info {
+ int chan_idx; /* channel index */
+ u32 ce_count; /* Correctable Errors for this CHANNEL */
+ char label[EDAC_MC_LABEL_LEN + 1]; /* DIMM label on motherboard */
+ struct csrow_info *csrow; /* the parent */
+};
+
+
+struct csrow_info {
+ unsigned long first_page; /* first page number in dimm */
+ unsigned long last_page; /* last page number in dimm */
+ unsigned long page_mask; /* used for interleaving -
+ 0UL for non intlv */
+ u32 nr_pages; /* number of pages in csrow */
+ u32 grain; /* granularity of reported error in bytes */
+ int csrow_idx; /* the chip-select row */
+ enum dev_type dtype; /* memory device type */
+ u32 ue_count; /* Uncorrectable Errors for this csrow */
+ u32 ce_count; /* Correctable Errors for this csrow */
+ enum mem_type mtype; /* memory csrow type */
+ enum edac_type edac_mode; /* EDAC mode for this csrow */
+ struct mem_ctl_info *mci; /* the parent */
+
+ struct kobject kobj; /* sysfs kobject for this csrow */
+
+ /* FIXME the number of CHANNELs might need to become dynamic */
+ u32 nr_channels;
+ struct channel_info *channels;
+};
+
+
+struct mem_ctl_info {
+ struct list_head link; /* for global list of mem_ctl_info structs */
+ unsigned long mtype_cap; /* memory types supported by mc */
+ unsigned long edac_ctl_cap; /* Mem controller EDAC capabilities */
+ unsigned long edac_cap; /* configuration capabilities - this is
+ closely related to edac_ctl_cap. The
+ difference is that the controller
+ may be capable of s4ecd4ed which would
+ be listed in edac_ctl_cap, but if
+ channels aren't capable of s4ecd4ed then the
+ edac_cap would not have that capability. */
+ unsigned long scrub_cap; /* chipset scrub capabilities */
+ enum scrub_type scrub_mode; /* current scrub mode */
+
+ enum mci_sysfs_status sysfs_active; /* status of sysfs */
+
+ /* pointer to edac checking routine */
+ void (*edac_check) (struct mem_ctl_info * mci);
+ /*
+ * Remaps memory pages: controller pages to physical pages.
+ * For most MC's, this will be NULL.
+ */
+ /* FIXME - why not send the phys page to begin with? */
+ unsigned long (*ctl_page_to_phys) (struct mem_ctl_info * mci,
+ unsigned long page);
+ int mc_idx;
+ int nr_csrows;
+ struct csrow_info *csrows;
+ /*
+ * FIXME - what about controllers on other busses? - IDs must be
+ * unique. pdev pointer should be sufficiently unique, but
+ * BUS:SLOT.FUNC numbers may not be unique.
+ */
+ struct pci_dev *pdev;
+ const char *mod_name;
+ const char *mod_ver;
+ const char *ctl_name;
+ char proc_name[MC_PROC_NAME_MAX_LEN + 1];
+ void *pvt_info;
+ u32 ue_noinfo_count; /* Uncorrectable Errors w/o info */
+ u32 ce_noinfo_count; /* Correctable Errors w/o info */
+ u32 ue_count; /* Total Uncorrectable Errors for this MC */
+ u32 ce_count; /* Total Correctable Errors for this MC */
+ unsigned long start_time; /* mci load start time (in jiffies) */
+
+ /* this stuff is for safe removal of mc devices from global list while
+ * NMI handlers may be traversing list
+ */
+ struct rcu_head rcu;
+ struct completion complete;
+
+ /* edac sysfs device control */
+ struct kobject edac_mci_kobj;
+};
+
+
+
+/* write all or some bits in a byte-register*/
+static inline void pci_write_bits8(struct pci_dev *pdev, int offset,
+ u8 value, u8 mask)
+{
+ if (mask != 0xff) {
+ u8 buf;
+ pci_read_config_byte(pdev, offset, &buf);
+ value &= mask;
+ buf &= ~mask;
+ value |= buf;
+ }
+ pci_write_config_byte(pdev, offset, value);
+}
+
+
+/* write all or some bits in a word-register*/
+static inline void pci_write_bits16(struct pci_dev *pdev, int offset,
+ u16 value, u16 mask)
+{
+ if (mask != 0xffff) {
+ u16 buf;
+ pci_read_config_word(pdev, offset, &buf);
+ value &= mask;
+ buf &= ~mask;
+ value |= buf;
+ }
+ pci_write_config_word(pdev, offset, value);
+}
+
+
+/* write all or some bits in a dword-register*/
+static inline void pci_write_bits32(struct pci_dev *pdev, int offset,
+ u32 value, u32 mask)
+{
+ if (mask != 0xffff) {
+ u32 buf;
+ pci_read_config_dword(pdev, offset, &buf);
+ value &= mask;
+ buf &= ~mask;
+ value |= buf;
+ }
+ pci_write_config_dword(pdev, offset, value);
+}
+
+
+#ifdef CONFIG_EDAC_DEBUG
+void edac_mc_dump_channel(struct channel_info *chan);
+void edac_mc_dump_mci(struct mem_ctl_info *mci);
+void edac_mc_dump_csrow(struct csrow_info *csrow);
+#endif /* CONFIG_EDAC_DEBUG */
+
+extern int edac_mc_add_mc(struct mem_ctl_info *mci);
+extern int edac_mc_del_mc(struct mem_ctl_info *mci);
+
+extern int edac_mc_find_csrow_by_page(struct mem_ctl_info *mci,
+ unsigned long page);
+
+extern struct mem_ctl_info *edac_mc_find_mci_by_pdev(struct pci_dev
+ *pdev);
+
+extern void edac_mc_scrub_block(unsigned long page,
+ unsigned long offset, u32 size);
+
+/*
+ * The no info errors are used when error overflows are reported.
+ * There are a limited number of error logging registers that can
+ * be exausted. When all registers are exhausted and an additional
+ * error occurs then an error overflow register records that an
+ * error occured and the type of error, but doesn't have any
+ * further information. The ce/ue versions make for cleaner
+ * reporting logic and function interface - reduces conditional
+ * statement clutter and extra function arguments.
+ */
+extern void edac_mc_handle_ce(struct mem_ctl_info *mci,
+ unsigned long page_frame_number,
+ unsigned long offset_in_page,
+ unsigned long syndrome,
+ int row, int channel, const char *msg);
+
+extern void edac_mc_handle_ce_no_info(struct mem_ctl_info *mci,
+ const char *msg);
+
+extern void edac_mc_handle_ue(struct mem_ctl_info *mci,
+ unsigned long page_frame_number,
+ unsigned long offset_in_page,
+ int row, const char *msg);
+
+extern void edac_mc_handle_ue_no_info(struct mem_ctl_info *mci,
+ const char *msg);
+
+/*
+ * This kmalloc's and initializes all the structures.
+ * Can't be used if all structures don't have the same lifetime.
+ */
+extern struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt,
+ unsigned nr_csrows, unsigned nr_chans);
+
+/* Free an mc previously allocated by edac_mc_alloc() */
+extern void edac_mc_free(struct mem_ctl_info *mci);
+
+
+#endif /* _EDAC_MC_H_ */
diff --git a/drivers/edac/i82860_edac.c b/drivers/edac/i82860_edac.c
index bfb7ae02e379..52596e75f9c2 100644
--- a/drivers/edac/i82860_edac.c
+++ b/drivers/edac/i82860_edac.c
@@ -253,7 +253,7 @@ static struct pci_driver i82860_driver = {
.id_table = i82860_pci_tbl,
};
-int __init i82860_init(void)
+static int __init i82860_init(void)
{
int pci_rc;
diff --git a/drivers/edac/i82875p_edac.c b/drivers/edac/i82875p_edac.c
index 79d14dfbcbdc..009c08fe5d69 100644
--- a/drivers/edac/i82875p_edac.c
+++ b/drivers/edac/i82875p_edac.c
@@ -483,7 +483,7 @@ static struct pci_driver i82875p_driver = {
};
-int __init i82875p_init(void)
+static int __init i82875p_init(void)
{
int pci_rc;
diff --git a/drivers/edac/r82600_edac.c b/drivers/edac/r82600_edac.c
index b6399a542560..e90892831b90 100644
--- a/drivers/edac/r82600_edac.c
+++ b/drivers/edac/r82600_edac.c
@@ -381,7 +381,7 @@ static struct pci_driver r82600_driver = {
};
-int __init r82600_init(void)
+static int __init r82600_init(void)
{
return pci_register_driver(&r82600_driver);
}
diff --git a/include/asm-i386/atomic.h b/include/asm-i386/atomic.h
index e2c00c95a5e1..de649d3aa2d4 100644
--- a/include/asm-i386/atomic.h
+++ b/include/asm-i386/atomic.h
@@ -255,17 +255,5 @@ __asm__ __volatile__(LOCK "orl %0,%1" \
#define smp_mb__before_atomic_inc() barrier()
#define smp_mb__after_atomic_inc() barrier()
-/* ECC atomic, DMA, SMP and interrupt safe scrub function */
-
-static __inline__ void atomic_scrub(unsigned long *virt_addr, u32 size)
-{
- u32 i;
- for (i = 0; i < size / 4; i++, virt_addr++)
- /* Very carefully read and write to memory atomically
- * so we are interrupt, DMA and SMP safe.
- */
- __asm__ __volatile__("lock; addl $0, %0"::"m"(*virt_addr));
-}
-
#include <asm-generic/atomic.h>
#endif
diff --git a/include/asm-i386/edac.h b/include/asm-i386/edac.h
new file mode 100644
index 000000000000..3e7dd0ab68ce
--- /dev/null
+++ b/include/asm-i386/edac.h
@@ -0,0 +1,18 @@
+#ifndef ASM_EDAC_H
+#define ASM_EDAC_H
+
+/* ECC atomic, DMA, SMP and interrupt safe scrub function */
+
+static __inline__ void atomic_scrub(void *va, u32 size)
+{
+ unsigned long *virt_addr = va;
+ u32 i;
+
+ for (i = 0; i < size / 4; i++, virt_addr++)
+ /* Very carefully read and write to memory atomically
+ * so we are interrupt, DMA and SMP safe.
+ */
+ __asm__ __volatile__("lock; addl $0, %0"::"m"(*virt_addr));
+}
+
+#endif
diff --git a/include/asm-x86_64/atomic.h b/include/asm-x86_64/atomic.h
index 4048508c4f40..4b5cd553e772 100644
--- a/include/asm-x86_64/atomic.h
+++ b/include/asm-x86_64/atomic.h
@@ -426,17 +426,5 @@ __asm__ __volatile__(LOCK "orl %0,%1" \
#define smp_mb__before_atomic_inc() barrier()
#define smp_mb__after_atomic_inc() barrier()
-/* ECC atomic, DMA, SMP and interrupt safe scrub function */
-
-static __inline__ void atomic_scrub(u32 *virt_addr, u32 size)
-{
- u32 i;
- for (i = 0; i < size / 4; i++, virt_addr++)
- /* Very carefully read and write to memory atomically
- * so we are interrupt, DMA and SMP safe.
- */
- __asm__ __volatile__("lock; addl $0, %0"::"m"(*virt_addr));
-}
-
#include <asm-generic/atomic.h>
#endif
diff --git a/include/asm-x86_64/edac.h b/include/asm-x86_64/edac.h
new file mode 100644
index 000000000000..cad1cd42b4ee
--- /dev/null
+++ b/include/asm-x86_64/edac.h
@@ -0,0 +1,18 @@
+#ifndef ASM_EDAC_H
+#define ASM_EDAC_H
+
+/* ECC atomic, DMA, SMP and interrupt safe scrub function */
+
+static __inline__ void atomic_scrub(void *va, u32 size)
+{
+ unsigned int *virt_addr = va;
+ u32 i;
+
+ for (i = 0; i < size / 4; i++, virt_addr++)
+ /* Very carefully read and write to memory atomically
+ * so we are interrupt, DMA and SMP safe.
+ */
+ __asm__ __volatile__("lock; addl $0, %0"::"m"(*virt_addr));
+}
+
+#endif
OpenPOWER on IntegriCloud