Commit message (Author, Age, Files, Lines)
* i7core_edac: Use Device 3 function 2 to report errors with RDIMM's (Mauro Carvalho Chehab, 2010-05-10, 1 file, -30/+178)
|
|   Nehalem and newer chipsets provide a special device that holds counters for corrected memory errors detected on registered DIMMs. This device is only visible when registered memory is plugged in.
|
|   After this patch, a machine fully equipped with RDIMMs uses Device 3 function 2 to count corrected errors instead of relying on mcelog. For unregistered DIMMs, it keeps the old behavior, counting errors via mcelog.
|
|   This patch was developed together with Keith Mannthey <kmannth@us.ibm.com>.
|
|   Signed-off-by: Keith Mannthey <kmannth@us.ibm.com>
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
* i7core_edac: Fix ecc enable shift (Keith Mannthey, 2010-05-10, 1 file, -1/+1)
|
|   From: Keith Mannthey <kmannth@us.ibm.com>
|
|   Simple correction to a shift value. ECC_ENABLED is bit 4 of MC_STATUS, Dev 3 Fun 0 Offset 0x4c. This correctly identifies the state of ECC on the machine.
|
|   Signed-off-by: Keith Mannthey <kmannth@us.ibm.com>
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
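The bit test described in this commit can be sketched as follows. This is a minimal illustration, not the driver's code: the macro and function names are hypothetical, and only the bit position (bit 4 of MC_STATUS) comes from the commit message.

```c
#include <stdint.h>

/* Bit position quoted in the commit message: ECC_ENABLED is bit 4 of MC_STATUS.
 * The macro name itself is hypothetical. */
#define MC_STATUS_ECC_ENABLED_SHIFT 4

/* Returns 1 when the ECC-enabled bit is set in a raw MC_STATUS value, else 0. */
static inline int mc_status_ecc_enabled(uint32_t mc_status)
{
    return (int)((mc_status >> MC_STATUS_ECC_ENABLED_SHIFT) & 1u);
}
```

A shift that is off by one would read a neighbouring bit and misreport the ECC state, which is the class of bug this one-line change addresses.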
* i7core_edac: Print an error message if pci register fails (Mauro Carvalho Chehab, 2010-05-10, 1 file, -1/+7)
|
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
* i7core_edac: CodingSyle fixes/cleanups (Mauro Carvalho Chehab, 2010-05-10, 1 file, -27/+23)
|
|   No functional changes.
|
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
* Documentation/edac.txt: Add Nehalem specific EDAC characteristics (Mauro Carvalho Chehab, 2010-05-10, 1 file, -0/+110)
|
|   Nehalem has a different binding to the EDAC API and its own error injection code, so document it.
|
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
* i7core_edac: fix error injection (Mauro Carvalho Chehab, 2010-05-10, 1 file, -15/+12)
|
|   There were two stupid error injection bugs introduced by wrong cut-and-paste: one at socket store, and another at the error inject register. The latter caused the code to not work at all.
|
|   While here, add debug messages that show which registers are being set while sending error injection.
|
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
* i7core_edac: fix error codes for sysfs error injection interface (Mauro Carvalho Chehab, 2010-05-10, 1 file, -4/+4)
|
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
* i7core_edac: some fixes at error injection code (Mauro Carvalho Chehab, 2010-05-10, 1 file, -53/+51)
|
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
* i7core_edac: Some cleanups at displayed info (Mauro Carvalho Chehab, 2010-05-10, 1 file, -12/+9)
|
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
* i7core: remove some uneeded noisy debug messages (Mauro Carvalho Chehab, 2010-05-10, 1 file, -4/+0)
|
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
* i7core: add socket info at the debug msg (Mauro Carvalho Chehab, 2010-05-10, 1 file, -2/+2)
|
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
* i7core: better document i7core_get_active_channels() (Mauro Carvalho Chehab, 2010-05-10, 1 file, -1/+17)
|
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
* i7core: fix get_devices routine for Xeon55xx (Mauro Carvalho Chehab, 2010-05-10, 1 file, -78/+108)
|
|   i7core_get_devices() was prepared to get just the first device found of each type. Because of that, on Xeon 55xx, only socket 1 was retrieved.
|
|   Rework i7core_get_devices() to clean it up and to properly support Xeon 55xx.
|
|   While here, fix a small typo.
|
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
* i7core: enrich error information based on memory transaction type (Mauro Carvalho Chehab, 2010-05-10, 1 file, -5/+27)
|
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
* i7core: check if the memory error is fatal or non-fatal (Mauro Carvalho Chehab, 2010-05-10, 1 file, -3/+13)
|
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
* i7core: fix probing on Xeon55xx (Mauro Carvalho Chehab, 2010-05-10, 2 files, -3/+21)
|
|   Xeon55xx fails to probe with this error message:
|
|       EDAC DEBUG: in drivers/edac/i7core_edac.c, line at 1660: MC: drivers/edac/i7core_edac.c: i7core_init()
|       EDAC i7core: Device not found: dev 00:00.0 PCI ID 8086:2c41
|       i7core_edac: probe of 0000:00:14.0 failed with error -22
|
|   This is due to the fact that, on Xeon35xx (and i7core), device 00.0 has PCI ID 8086:2c40.
|
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
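One way to picture the fix is a probe that accepts either of the two device IDs quoted above for device 00.0. The sketch below is an assumption about the general approach, not the patch itself; the macro and function names are hypothetical, and only the vendor ID and the two device IDs come from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

#define PCI_VENDOR_INTEL 0x8086u
#define NONCORE_ID_A     0x2c41u /* ID named in the failing probe message */
#define NONCORE_ID_B     0x2c40u /* ID reported for device 00.0 on Xeon 35xx and i7core */

/* Accept either ID, so that a probe written for one variant
 * does not reject the other (hypothetical helper). */
static inline bool is_nehalem_noncore(uint16_t vendor, uint16_t device)
{
    if (vendor != PCI_VENDOR_INTEL)
        return false;
    return device == NONCORE_ID_A || device == NONCORE_ID_B;
}
```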
* i7core_edac: some fixes at memory error parser (Mauro Carvalho Chehab, 2010-05-10, 1 file, -8/+14)
|
|   m->bank is not related to the memory bank but, instead, to the MCA Error register bank. Fix it accordingly. While here, improve the comments for Nehalem bank.
|
|   A later fix is needed in order to get bank/rank information from the MCA error log.
|
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
* i7core_edac: decode mcelog error and send it via edac interface (Mauro Carvalho Chehab, 2010-05-10, 1 file, -22/+70)
|
|   Enrich the mcelog error by using the information encoded in the MCE status and misc registers (IA32_MCx_STATUS, IA32_MCx_MISC).
|
|   Some fixes are still needed here in order to properly fill the EDAC fields.
|
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
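As a rough illustration of what decoding a status register involves, the sketch below splits a raw IA32_MCi_STATUS value into its architectural flag bits and error-code fields, using the bit layout from the Intel SDM. The struct and function names are hypothetical and are not the driver's own; the Nehalem-specific meaning of the model-specific bits is not covered here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded view of a raw IA32_MCi_STATUS value (illustrative type, not a kernel one). */
struct mc_status_fields {
    bool valid;          /* bit 63: VAL, register holds a valid error */
    bool overflow;       /* bit 62: OVER, an earlier error was overwritten */
    bool uncorrected;    /* bit 61: UC */
    bool enabled;        /* bit 60: EN */
    bool miscv;          /* bit 59: IA32_MCi_MISC is valid */
    bool addrv;          /* bit 58: IA32_MCi_ADDR is valid */
    bool pcc;            /* bit 57: processor context corrupt */
    uint16_t model_code; /* bits 31:16: model-specific error code */
    uint16_t mca_code;   /* bits 15:0: MCA error code */
};

static inline struct mc_status_fields decode_mc_status(uint64_t status)
{
    struct mc_status_fields f;

    f.valid       = (status >> 63) & 1;
    f.overflow    = (status >> 62) & 1;
    f.uncorrected = (status >> 61) & 1;
    f.enabled     = (status >> 60) & 1;
    f.miscv       = (status >> 59) & 1;
    f.addrv       = (status >> 58) & 1;
    f.pcc         = (status >> 57) & 1;
    f.model_code  = (uint16_t)((status >> 16) & 0xffff);
    f.mca_code    = (uint16_t)(status & 0xffff);
    return f;
}
```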
* i7core_edac: maps all sockets as if ther are one MC controller (Mauro Carvalho Chehab, 2010-05-10, 1 file, -6/+7)
|
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
* i7core_edac: add support for more than one MC socket (Mauro Carvalho Chehab, 2010-05-10, 1 file, -113/+213)
|
|   Some Nehalem architectures have more than one MC socket. Socket 0 is located at bus 255.
|
|   Currently, it uses up to 2 sockets, but raising that limit is just a matter of increasing the MAX_SOCKETS definition.
|
|   This seems to be required for proper support of Xeon 55xx. Still needs testing with Xeon 55xx.
|
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
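A possible bus-to-socket mapping is sketched below. Only "socket 0 is at bus 255" and the MAX_SOCKETS limit of 2 come from the commit message; the rule that each further socket sits one bus lower is an assumption made for illustration, and the helper name is hypothetical.

```c
#define MAX_SOCKETS 2   /* limit named in the commit message */
#define LAST_BUS    255 /* socket 0 lives here, per the commit message */

/* Maps a PCI bus number to a socket index, or -1 if the bus is out of range.
 * Assumption: socket n is found at bus (LAST_BUS - n). */
static inline int bus_to_socket(unsigned int bus)
{
    unsigned int socket;

    if (bus > LAST_BUS)
        return -1;
    socket = LAST_BUS - bus;
    return socket < MAX_SOCKETS ? (int)socket : -1;
}
```

Under that assumption, supporting more sockets only requires a larger MAX_SOCKETS, which matches the remark in the message.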
* i7core_edac: Add a code to probe Xeon 55xx bus (Mauro Carvalho Chehab, 2010-05-10, 4 files, -4/+16)
|
|   This code changes the detection procedure of i7core_edac. Instead of directly probing for MC registers, it probes for another register found on Nehalem. If found, it tries to pick the first MC PCI bus. This should work fine with Xeon 35xx but, on Xeon 55xx, the devices are at buses 254 and 255, which are not properly detected by the non-legacy PCI methods.
|
|   The new detection code scans specifically buses 254 and 255 for the Xeon 55xx devices.
|
|   This code has not been tested yet. Once it works, a further change to the code will be needed, since i7core is not yet ready to work with 2 sets of MC.
|
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
* pci: Add a probing code that seeks for an specific bus (Aristeu Rozanski, 2010-05-10, 2 files, -17/+27)
|
|   This patch adds probing code that looks for a specific PCI bus. It still needs testing, but it is hoped that this will help to identify the memory controller with the Xeon 55xx series.
|
|   Signed-off-by: Aristeu Sergio <arozansk@redhat.com>
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
* i7core_edac: Adds write unlock to MC registers (Mauro Carvalho Chehab, 2010-05-10, 2 files, -3/+28)
|
|   The public Intel Xeon 5500 volume 2 datasheet describes, on page 53, section 2.6.7, a register called MC_CFG_CONTROL that can lock/unlock the Memory Controller configuration registers.
|
|   Add support for it in the hope that software error injection will work.
|
|   In my tests with Xeon 35xx, there's still something missing. With a program that does sequential bit writes at dev 0.0, it sometimes produces error injection after unlocking MC_CFG_CONTROL (and sometimes it just locks up my testing machine). I'll try later to discover by trial and error which register solves this issue on Xeon 35xx.
|
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
* i7core_edac: Add edac_mce glue (Mauro Carvalho Chehab, 2010-05-10, 2 files, -6/+116)
|
|   Add glue code to allow i7core to work with mcelog.
|
|   With the glue, i7core registers itself on edac_mce. When mce detects an error, it calls all registered drivers (in this case, i7core) for EDAC error handling.
|
|   TODO: It currently just prints the MCE error log using about the same format as the mce panic messages. The error message should be enhanced with mcelog userspace info and converted into the proper EDAC format, to feed the EDAC error counts.
|
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
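The register-then-dispatch pattern described here can be sketched in plain C as below. This is a userspace illustration under stated assumptions, not the kernel's edac_mce interface: every name (mce_glue_register, mce_glue_dispatch, count_event, MAX_MCE_CLIENTS) is hypothetical, and locking is omitted.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_MCE_CLIENTS 4 /* arbitrary limit for the sketch */

/* A client callback returns nonzero when it handled the event. */
typedef int (*mce_check_fn)(void *priv, uint64_t status);

struct mce_client {
    mce_check_fn check;
    void *priv;
};

static struct mce_client clients[MAX_MCE_CLIENTS];
static size_t n_clients;

/* Registers a client; returns 0 on success, -1 if the table is full
 * or no callback was supplied. */
static inline int mce_glue_register(mce_check_fn check, void *priv)
{
    if (check == NULL || n_clients == MAX_MCE_CLIENTS)
        return -1;
    clients[n_clients].check = check;
    clients[n_clients].priv = priv;
    n_clients++;
    return 0;
}

/* Shows one event to every registered client; returns how many handled it. */
static inline int mce_glue_dispatch(uint64_t status)
{
    int handled = 0;

    for (size_t i = 0; i < n_clients; i++)
        if (clients[i].check(clients[i].priv, status))
            handled++;
    return handled;
}

/* Example client: counts every event it is shown. */
static inline int count_event(void *priv, uint64_t status)
{
    (void)status;
    ++*(unsigned int *)priv;
    return 1;
}
```

A driver in the role of i7core would call the register function once at probe time, and the machine-check path would call the dispatch function for each logged error.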
* edac/Kconfig: edac_mce can't be module (Mauro Carvalho Chehab, 2010-05-10, 1 file, -1/+1)
|
|   Since mcelog is bool, the edac_mce glue should also be bool; otherwise it will not work.
|
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
* edac_mce: Add an interface driver to report mce errors via edac (Mauro Carvalho Chehab, 2010-05-10, 5 files, -1/+107)
|
|   The edac_mce module is an interface module that gets mcelog data and forwards it to any registered edac module that expects to receive data via mce.
|
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
* i7core_edac: CodingStyle fixes (Mauro Carvalho Chehab, 2010-05-10, 1 file, -27/+32)
|
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
* i7core_edac: fill csrows edac sysfs info (Mauro Carvalho Chehab, 2010-05-10, 1 file, -16/+50)
|
|   csrows is still fake, since we can't identify its representation with Nehalem registers.
|
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
* i7core_edac: Memory info fixes and preparation for properly filling cswrow data (Mauro Carvalho Chehab, 2010-05-10, 1 file, -9/+19)
|
|   Now, memory size is properly displayed:
|
|       EDAC i7core: DOD Max limits: DIMMS: 2, 1-ranked, 8-banked
|       EDAC i7core: DOD Max rows x colums = 0x4000 x 0x400
|       EDAC i7core: Memory channel configuration:
|       EDAC i7core: Ch0 phy rd0, wr0 (0x063f7c31): 2 ranks, UDIMMs
|       EDAC i7core: dimm 0 (0x00000288) 1024 Mb offset: 0, numbank: 8, numrank: 1, numrow: 0x4000, numcol: 0x400
|       EDAC i7core: dimm 1 (0x00001288) 1024 Mb offset: 4, numbank: 8, numrank: 1, numrow: 0x4000, numcol: 0x400
|       EDAC i7core: Ch1 phy rd1, wr1 (0x063f7c31): 2 ranks, UDIMMs
|       EDAC i7core: dimm 0 (0x00000288) 1024 Mb offset: 0, numbank: 8, numrank: 1, numrow: 0x4000, numcol: 0x400
|       EDAC i7core: Ch2 phy rd3, wr3 (0x063f7c31): 2 ranks, UDIMMs
|       EDAC i7core: dimm 0 (0x00000288) 1024 Mb offset: 0, numbank: 8, numrank: 1, numrow: 0x4000, numcol: 0x400
|
|   Still, as the way to retrieve csrows info is not known, it maps what's available to the csrows basic unit at the edac core.
|
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
* i7core_edac: Get more info about the memory DIMMs (Mauro Carvalho Chehab, 2010-05-10, 1 file, -63/+107)
|
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
* i7core_edac: Add more information about each active dimm (Mauro Carvalho Chehab, 2010-05-10, 1 file, -13/+30)
|
|   Thanks-to: Aristeu Rozanski <aris@redhat.com> for part of the code
|
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
* i7core_edac: Improve error handling (Mauro Carvalho Chehab, 2010-05-10, 1 file, -11/+21)
|
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
* i7core_edac: Properly fill struct csrow_info (Mauro Carvalho Chehab, 2010-05-10, 1 file, -10/+38)
|
|   Thanks-to: Aristeu Rozanski <aris@redhat.com> for part of the code
|
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
* i7core_edac: Add additional tests for error detection (Mauro Carvalho Chehab, 2010-05-10, 1 file, -60/+139)
|
|   Properly check the number of channels and improve probing error detection.
|
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
* i7core_edac: Add a memory check routine, based on device 3 function 4 (Mauro Carvalho Chehab, 2010-05-10, 1 file, -7/+108)
|
|   This function appears only on the Xeon 5500 datasheet. Yet, testing with a Xeon 3503 showed that it is also implemented on other Nehalem processors.
|
|   At the first read, MC_TEST_ERR_RCV1 and MC_TEST_ERR_RCV0 can contain any value. Modify the CE error logic to update the error count only after the second read.
|
|   An alternative approach would be to write to the rcv0 and rcv1 registers, but it seemed better to keep them untouched, since the BIOS might assume that they are exclusively for its own use.
|
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
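The "count only after the second read" logic can be sketched as a delta counter that treats the first sample purely as a baseline. This is a simplified illustration, not the driver's routine: the struct and function names are hypothetical, and the 15-bit counter width is an assumption made for the wrap-around handling.

```c
#include <stdbool.h>
#include <stdint.h>

#define CE_COUNT_MASK 0x7fffu /* assumed 15-bit hardware counter (hypothetical) */

struct ce_counter {
    bool primed;    /* false until the first raw value has been seen */
    uint32_t last;  /* previous raw value, masked to the counter width */
    uint64_t total; /* corrected errors accumulated since priming */
};

/* Feeds one raw register sample and returns the number of new errors.
 * The first sample only sets the baseline, since its value is arbitrary. */
static inline uint32_t ce_counter_update(struct ce_counter *c, uint32_t raw)
{
    uint32_t delta;

    raw &= CE_COUNT_MASK;
    if (!c->primed) {
        c->primed = true;
        c->last = raw;
        return 0;
    }
    /* Masked subtraction stays correct across a single counter wrap. */
    delta = (raw - c->last) & CE_COUNT_MASK;
    c->last = raw;
    c->total += delta;
    return delta;
}
```

Because the registers are only read, never written, this approach leaves them untouched for the BIOS, which is the trade-off the message describes.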
* i7core_edac: need mci->edac_check, otherwise module removal doesn't work (Mauro Carvalho Chehab, 2010-05-10, 1 file, -4/+16)
|
|   There are some locking troubles with edac_core: if you don't declare an edac_check, the module may suffer from a soft lockup.
|
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
* i7core_edac: A few fixes at error injection code (Mauro Carvalho Chehab, 2010-05-10, 1 file, -15/+55)
|
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
* i7core_edac: Show read/write virtual/physical channel association (Mauro Carvalho Chehab, 2010-05-10, 1 file, -6/+27)
|
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
* i7core_edac: Registers all supported MC functions (Mauro Carvalho Chehab, 2010-05-10, 1 file, -86/+131)
|
|   Now, it will try to register on all supported Memory Controller functions.
|
|   It should be noted that dev 3, function 2 is present only on chips with Registered DIMMs, according to the datasheet. So, the driver doesn't return -ENODEV if all functions but this one were successfully registered and enabled:
|
|       EDAC i7core: Registered device 8086:2c18 fn=3 0
|       EDAC i7core: Registered device 8086:2c19 fn=3 1
|       EDAC i7core: Device not found: PCI ID 8086:2c1a (dev 3, func 2)
|       EDAC i7core: Registered device 8086:2c1c fn=3 4
|       EDAC i7core: Registered device 8086:2c20 fn=4 0
|       EDAC i7core: Registered device 8086:2c21 fn=4 1
|       EDAC i7core: Registered device 8086:2c22 fn=4 2
|       EDAC i7core: Registered device 8086:2c23 fn=4 3
|       EDAC i7core: Registered device 8086:2c28 fn=5 0
|       EDAC i7core: Registered device 8086:2c29 fn=5 1
|       EDAC i7core: Registered device 8086:2c2a fn=5 2
|       EDAC i7core: Registered device 8086:2c2b fn=5 3
|       EDAC i7core: Registered device 8086:2c30 fn=6 0
|       EDAC i7core: Registered device 8086:2c31 fn=6 1
|       EDAC i7core: Registered device 8086:2c32 fn=6 2
|       EDAC i7core: Registered device 8086:2c33 fn=6 3
|       EDAC i7core: Driver loaded.
|
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
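The "optional function" rule can be sketched with a small table in which only dev 3, function 2 is allowed to be missing. The table below copies the first four device IDs from the log in this commit message; the struct, the `optional` flag, and the helper names are hypothetical and may not match how the driver represents this.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct mc_func {
    uint8_t dev;
    uint8_t func;
    uint16_t dev_id;
    bool optional; /* true if the function may legitimately be absent */
};

/* First four entries from the log; only dev 3, func 2 is optional (RDIMM only). */
static const struct mc_func mc_funcs[] = {
    { 3, 0, 0x2c18, false },
    { 3, 1, 0x2c19, false },
    { 3, 2, 0x2c1a, true  },
    { 3, 4, 0x2c1c, false },
};

static inline bool id_present(const uint16_t *found, size_t n, uint16_t id)
{
    for (size_t i = 0; i < n; i++)
        if (found[i] == id)
            return true;
    return false;
}

/* Returns 0 if every mandatory function was found, -ENODEV otherwise. */
static inline int check_mc_funcs(const uint16_t *found, size_t n_found)
{
    for (size_t i = 0; i < sizeof(mc_funcs) / sizeof(mc_funcs[0]); i++) {
        if (id_present(found, n_found, mc_funcs[i].dev_id))
            continue;
        if (!mc_funcs[i].optional)
            return -ENODEV;
    }
    return 0;
}
```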
* i7core_edac: Add more status functions to EDAC driver (Mauro Carvalho Chehab, 2010-05-10, 1 file, -19/+95)
|
|   This patch was co-authored with Aristeu Rozanski.
|
|   Signed-off-by: Aristeu Sergio <arozansk@redhat.com>
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
* i7core_edac: Add error insertion code for Nehalem (Mauro Carvalho Chehab, 2010-05-10, 1 file, -8/+419)
|
|   Implements set_inject_error() with the low-level code needed to inject memory errors on Nehalem, and adds some sysfs nodes to allow error injection.
|
|   The next patch will add an API for error injection.
|
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
* i7core_edac: Add an EDAC memory controller driver for Nehalem chipsets (Mauro Carvalho Chehab, 2010-05-10, 4 files, -0/+486)
|
|   This driver is meant to support i7 core/i7core extreme desktop processors and the Xeon 35xx/55xx series with integrated memory controller.
|
|   It is likely that it can be expanded in the future to work with other processor series based on the same Memory Controller design.
|
|   For now, it has just a few MCH status reads.
|
|   Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
* Linux 2.6.34-rc6 [tag: v2.6.34-rc6] (Linus Torvalds, 2010-04-29, 1 file, -1/+1)
|
* Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb (Linus Torvalds, 2010-04-29, 1 file, -5/+0)
|\
| |   * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb:
| |     kgdb: don't needlessly skip PAGE_USER test for Fsl booke
| * kgdb: don't needlessly skip PAGE_USER test for Fsl booke (Wufei, 2010-04-29, 1 file, -5/+0)
| |
| |   The bypassing of this test is a leftover from 2.4 vintage kernels, and is no longer appropriate, or even used by KGDB. Currently KGDB uses probe_kernel_write() for all access to memory via the KGDB core, so it can simply be deleted.
| |
| |   This fixes CVE-2010-1446.
| |
| |   CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| |   CC: Paul Mackerras <paulus@samba.org>
| |   CC: Kumar Gala <galak@kernel.crashing.org>
| |   Signed-off-by: Wufei <fei.wu@windriver.com>
| |   Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
* | Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs (Linus Torvalds, 2010-04-29, 6 files, -9/+120)
|\ \
| | |   * 'for-linus' of git://oss.sgi.com/xfs/xfs:
| | |     xfs: add a shrinker to background inode reclaim
| * | xfs: add a shrinker to background inode reclaim (Dave Chinner, 2010-04-29, 6 files, -9/+120)
| |/
| |   On low memory boxes or those with highmem, the kernel can OOM before the background reclaims inodes via xfssyncd. Add a shrinker to run inode reclaim so that inode reclaim is expedited when memory is low.
| |
| |   This is more complex than it needs to be because the VM folk don't want a context added to the shrinker infrastructure. Hence we need to add a global list of XFS mount structures so the shrinker can traverse them.
| |
| |   Signed-off-by: Dave Chinner <dchinner@redhat.com>
| |   Reviewed-by: Christoph Hellwig <hch@lst.de>
* | Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block (Linus Torvalds, 2010-04-29, 1 file, -0/+1)
|\ \
| | |   * 'for-linus' of git://git.kernel.dk/linux-2.6-block:
| | |     exofs: Fix "add bdi backing to mount session" fall out
| | |     fs: fs/super.c needs to include backing-dev.h for !CONFIG_BLOCK
| * | exofs: Fix "add bdi backing to mount session" fall out (Boaz Harrosh, 2010-04-29, 1 file, -1/+1)
| | |
| | |   The patch "add bdi backing to mount session" (b3d0ab7e60d1865bb6f6a79a77aaba22f2543236) has a bug in the placement of the bdi member in struct exofs_sb_info. The layout member must be kept last.
| | |
| | |   Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
| | |   Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | fs: fs/super.c needs to include backing-dev.h for !CONFIG_BLOCK (Jens Axboe, 2010-04-29, 1 file, -0/+1)
| |/
| |   When CONFIG_BLOCK is set, it ends up getting backing-dev.h included. But for !CONFIG_BLOCK, it isn't so lucky. The proper thing to do is include <linux/backing-dev.h> directly from the file it's used from, so do that.
| |
| |   Signed-off-by: Jens Axboe <jens.axboe@oracle.com>