summaryrefslogtreecommitdiffstats
path: root/arch/x86/kernel
Commit message (Collapse)AuthorAgeFilesLines
* x86, mce: check early in exception handler if panic is neededAndi Kleen2009-06-031-13/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | The exception handler should behave differently if the exception is fatal versus one that can be returned from. In the first case it should never clear any registers because these need to be preserved for logging after the next boot. Otherwise it should clear them on each CPU step by step so that other CPUs sharing the same bank don't see duplicate events. Otherwise we risk reporting events multiple times on any CPUs which have shared machine check banks, which is a common problem on Intel Nehalem which has both SMT (two CPU threads sharing banks) and shared machine check banks in the uncore. Determine early in a special pass if any event requires a panic. This uses the mce_severity() function added earlier. This is needed for the next patch. Also fixes a problem together with an earlier patch that corrected events weren't logged on a fatal MCE. [ Impact: Feature ] Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* x86, mce: add table driven machine check gradingAndi Kleen2009-06-033-0/+72
| | | | | | | | | | | | | | | | | | | The machine check grading (as in deciding what should be done for a given register value) has to be done multiple times soon and it's also getting more complicated. So it makes sense to consolidate it into a single function. To get smaller and more straight forward and possibly more extensible code I opted towards a new table driven method. The various rules are put into a table when is then executed by a very simple interpreter. The grading engine is in a new file mce-severity.c. I also added a private include file mce-internal.h, because mce.h is already a bit too cluttered. This is dead code right now, but will be used in followon patches. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* x86, mce: remove TSC print heuristicAndi Kleen2009-06-031-14/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously mce_panic used a simple heuristic to avoid printing old so far unreported machine check events on a mce panic. This worked by comparing the TSC value at the start of the machine check handler with the event time stamp and only printing newer ones. This has a couple of issues, in particular on systems where the TSC is not fully synchronized between CPUs it could lose events or print old ones. It is also problematic with full system synchronization as it is added by the next patch. Remove the TSC heuristic and instead replace it with a simple heuristic to print corrected errors first and after that uncorrected errors and finally the worst machine check as determined by the machine check handler. This simplifies the code because there is no need to pass the original TSC value around. Contains fixes from Ying Huang [ Impact: bug fix, cleanup ] Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Ying Huang <ying.huang@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* x86, mce: log corrected errors when panicingAndi Kleen2009-06-031-2/+2
| | | | | | | | | | | | | Normally the machine check handler ignores corrected errors and leaves them to machine_check_poll(). But when panicing mcp won't run, so log all errors. Note: this can still miss some cases until the "early no way out" patch later is applied too. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* x86, mce: extend struct mce user interface with more information.Andi Kleen2009-06-031-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Experience has shown that struct mce which is used to pass an machine check to the user space daemon currently a few limitations. Also some data which is useful to print at panic level is also missing. This patch addresses most of them. The same information is also printed out together with mce panic. struct mce can be painlessly extended in a compatible way, the mcelog user space code just ignores additional fields with a warning. - It doesn't provide a wall time timestamp. There have been a few complaints about that. Fix that by adding a 64bit time_t - It doesn't provide the exact CPU identification. This makes it awkward for mcelog to decode the event correctly, especially when there are variations in the supported MCE codes on different CPU models or when mcelog is running on a different host after a panic. Previously the administrator had to specify the correct CPU when mcelog ran on a different host, but with the more variation in machine checks now it's better to auto detect that. It's also useful for more detailed analysis of CPU events. Pass CPUID 1.EAX and the cpu vendor (as encoded in processor.h) instead. - Socket ID and initial APIC ID are useful to report because they allow to identify the failing CPU in some (not all) cases. This is also especially useful for the panic situation. This addresses one of the complaints from Thomas Gleixner earlier. - The MCG capabilities MSR needs to be reported for some advanced error processing in mcelog Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* x86, mce: support more than 256 CPUs in struct mceAndi Kleen2009-06-032-7/+7
| | | | | | | | | | The old struct mce had a limitation to 256 CPUs. But x86 Linux supports more than that now with x2apic. Add a new field extcpu to report the extended number. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* x86, mce: store record length into memory struct mce anchorAndi Kleen2009-06-031-2/+3
| | | | | | | | | | This makes it easier for tools who want to extract the mcelog out of crash images or memory dumps to adapt to changing struct mce size. The length field replaces padding, so it's fully compatible. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* x86, mce: add MCE poll count to /proc/interruptsAndi Kleen2009-06-032-0/+8
| | | | | | | | | | | | Keep a count of the machine check polls (or CMCI events) in /proc/interrupts. Andi needs this for debugging, but it's also useful in general to see what's going in by the kernel. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* x86, mce: add machine check exception count in /proc/interruptsAndi Kleen2009-06-032-0/+14
| | | | | | | | | | | | Useful for debugging, but it's also good general policy to have a counter for all special interrupts there. This makes it easier to diagnose where a CPU is spending its time. [ Impact: feature, debugging tool ] Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* Merge branch 'irq/numa' into x86/mce3H. Peter Anvin2009-06-0122-982/+871
|\ | | | | | | | | | | | | | | | | | | Merge reason: arch/x86/kernel/irqinit_{32,64}.c unified in irq/numa and modified in x86/mce3; this merge resolves the conflict. Conflicts: arch/x86/kernel/irqinit.c Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * Merge branch 'x86/cpufeature' into irq/numaIngo Molnar2009-06-011-5/+12
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge reason: irq/numa didnt build because this commit: 2759c32: x86: don't call read_apic_id if !cpu_has_apic Had a dependency on x86/cpufeature changes. Pull in that (small) branch to fix the dependency. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * x86: clean up and fix setup_clear/force_cpu_cap handlingYinghai Lu2009-05-111-5/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | setup_force_cpu_cap() only have one user (Xen guest code), but it should not reuse cleared_cpu_cpus, otherwise it will have problems on SMP. Need to have a separate cpu_cpus_set array too, for forced-on flags, beyond the forced-off flags. Also need to setup handling before all cpus caps are combined. [ Impact: fix the forced-set CPU feature flag logic ] Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Yinghai Lu <yinghai.lu@kernel.org> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | Merge branch 'linus' into irq/numaIngo Molnar2009-06-0113-27/+65
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: arch/mips/sibyte/bcm1480/irq.c arch/mips/sibyte/sb1250/irq.c Merge reason: we gathered a few conflicts plus update to latest upstream fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | acpi-cpufreq: fix printk typo and indentationJoe Perches2009-05-291-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Len Brown <len.brown@intel.com>
| * | | x86, io-apic: Don't mark pin_programmed earlyYinghai Lu2009-05-191-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Peter bisected that: | commit b9c61b70075c87a8612624736faf4a2de5b1ed30 | Date: Wed May 6 10:10:06 2009 -0700 | | x86/pci: update pirq_enable_irq() to setup io apic routing | | So we can set io apic routing only when enabling the device irq. wrecked his opteron box, ata1 interrupts fail to get through. ata1 is using irq 11: [ 1.451839] sata_svw 0000:01:0e.0: version 2.3 [ 1.456333] sata_svw 0000:01:0e.0: PCI INT A -> GSI 11 (level, low) -> IRQ 11 [ 1.463639] scsi0 : sata_svw [ 1.466949] scsi1 : sata_svw [ 1.470022] scsi2 : sata_svw [ 1.473090] scsi3 : sata_svw [ 1.476112] ata1: SATA max UDMA/133 mmio m8192@0xff3fe000 port 0xff3fe000 irq 11 [ 1.483490] ata2: SATA max UDMA/133 mmio m8192@0xff3fe000 port 0xff3fe100 irq 11 [ 1.490870] ata3: SATA max UDMA/133 mmio m8192@0xff3fe000 port 0xff3fe200 irq 11 [ 1.498247] ata4: SATA max UDMA/133 mmio m8192@0xff3fe000 port 0xff3fe300 irq 11 that pin is overlapped with pin with legacy ones. We should not set bits in pin_programmed here, so that those bit could be set later via io_apic_set_pci_routing(). [ Impact: fix boot hang on certain systems ] Reported-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Yinghai Lu <yinghai.lu@kernel.org> Tested-by: Peter Zijlstra <peterz@infradead.org> Cc: Jack Steiner <steiner@sgi.com> LKML-Reference: <4A119990.9020606@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | x86, irq: don't call mp_config_acpi_gsi() if update_mptable is not enabledYinghai Lu2009-05-182-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Len expressed concern that the update_mptable feature has side-effects on the ACPI code. Make it sure explicitly that the code only ever gets called if the (default disabled) update_mptable boot quirk option is disabled. [ Impact: isolate the update_mptable feature from ACPI code more ] Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Len Brown <lenb@kernel.org> LKML-Reference: <4A0DC832.5090200@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | x86, irq: update_mptable needs pci_routeirqYinghai Lu2009-05-181-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To get all device irq routing and to save them. This is basically an implicit pci=routeirq enablement if (and on if) the update_mptable boot option (which is off by default) has been specified. [ Impact: extend the update_mptable boot opion's scope ] Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> LKML-Reference: <4A0DB7B4.4060702@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | x86: don't call read_apic_id if !cpu_has_apicYinghai Lu2009-05-184-5/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | should not call that if apic is disabled. [ Impact: fix crash on certain UP configs ] Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Cyrill Gorcunov <gorcunov@gmail.com> LKML-Reference: <4A09CCBB.2000306@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | x86, apic: introduce io_apic_irq_attrYinghai Lu2009-05-182-28/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | according to Ingo, io_apic irq-setup related functions have too many parameters with a repetitive signature. So reduce related funcs to get less params by passing a pointer to a newly defined io_apic_irq_attr structure. v2: io_apic_irq ==> irq_attr triggering ==> trigger v3: add set_io_apic_irq_attr [ Impact: cleanup ] Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Len Brown <lenb@kernel.org> LKML-Reference: <4A08ACD3.2070401@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | x86: read apic ID in the !acpi_lapic caseYinghai Lu2009-05-122-24/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ed found that on 32-bit, boot_cpu_physical_apicid is not read right, when the mptable is broken. Interestingly, actually three paths use/set it: 1. acpi: at that time that is already read from reg 2. mptable: only read from mptable 3. no madt, and no mptable, that use default apic id 0 for 64-bit, -1 for 32-bit so we could read the apic id for the 2/3 path. We trust the hardware register more than we trust a BIOS data structure (the mptable). We can also avoid the double set_fixmap() when acpi_lapic is used, and also need to move cpu_has_apic earlier and call apic_disable(). Also when need to update the apic id, we'd better read and set the apic version as well - so that quirks are applied precisely. v2: make path 3 with 64bit, use -1 as apic id, so could read it later. v3: fix whitespace problem pointed out by Ed Swierk v5: fix boot crash [ Impact: get correct apic id for bsp other than acpi path ] Reported-by: Ed Swierk <eswierk@aristanetworks.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> LKML-Reference: <49FC85A9.2070702@kernel.org> [ v4: sanity-check in the ACPI case too ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | Merge branch 'x86/apic' into irq/numaIngo Molnar2009-05-1213-64/+109
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge reason: both topics modify the APIC code but were able to do it in parallel so far. An upcoming patch generates a conflict so merge them to avoid the conflict. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | x86: apic: Fixmap apic address even if apic disabledCyrill Gorcunov2009-05-111-8/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In case if apic were disabled by boot option we still need read_apic operation. So fixmap a fake apic area if needed. [ Impact: fix boot crash ] Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: yinghai@kernel.org Cc: eswierk@aristanetworks.com LKML-Reference: <20090511134140.GH4624@lenovo> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | x86: display extended apic registers with print_local_APIC and cpu_debug codeAndreas Herrmann2009-05-113-3/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Both print_local_APIC (used when apic=debug kernel param is set) and cpu_debug code missed support for some extended APIC registers that I'd like to see. This adds support to show: - extended APIC feature register - extended APIC control register - extended LVT registers [ Impact: print more debug info ] Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Jaswinder Singh Rajput <jaswinder@kernel.org> Cc: Cyrill Gorcunov <gorcunov@openvz.org> LKML-Reference: <20090508162350.GO29045@alberich.amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | x86: read apic ID in the !acpi_lapic caseYinghai Lu2009-05-111-10/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ed found that on 32-bit, boot_cpu_physical_apicid is not read right, when the mptable is broken. Interestingly, actually three paths use/set it: 1. acpi: at that time that is already read from reg 2. mptable: only read from mptable 3. no madt, and no mptable, that use default apic id 0 for 64-bit, -1 for 32-bit so we could read the apic id for the 2/3 path. We trust the hardware register more than we trust a BIOS data structure (the mptable). We can also avoid the double set_fixmap() when acpi_lapic is used, and also need to move cpu_has_apic earlier and call apic_disable(). Also when need to update the apic id, we'd better read and set the apic version as well - so that quirks are applied precisely. v2: make path 3 with 64bit, use -1 as apic id, so could read it later. v3: fix whitespace problem pointed out by Ed Swierk [ Impact: get correct apic id for bsp other than acpi path ] Reported-by: Ed Swierk <eswierk@aristanetworks.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> LKML-Reference: <49FC85A9.2070702@kernel.org> [ v4: sanity-check in the ACPI case too ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | x86: apic: Check rev 3 fadt correctly for physical_apic bitYinghai Lu2009-05-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: fix fadt version checking FADT2_REVISION_ID has value 3 aka rev 3 FADT. So need to use >= instead of >, as other places in the code do. [ Impact: extend scope of APIC boot quirk ] Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | Merge commit 'v2.6.30-rc5' into x86/apicIngo Molnar2009-05-1123-73/+121
| | |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge reason: this branch was on a .30-rc2 base - sync it up with all the latest fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | x86: uv - prevent NULL dereference in uv_system_init()Cyrill Gorcunov2009-05-031-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We may reach NULL dereference oops if kmalloc failed. Prevent it with explicit BUG_ON. [ Impact: more controlled assert in 'impossible' scenario ] Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: Jack Steiner <steiner@sgi.com> LKML-Reference: <20090501202511.GE4633@lenovo> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | x86: uv io-apic - use BUILD_BUG_ON instead of BUG_ONCyrill Gorcunov2009-05-031-11/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The expression is known to be true/false at compilation time so we're allowed to use build-time instead of run-time check. Also align 'entry' items assignment. [ Impact: shrink kernel a bit, cleanup ] Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Jack Steiner <steiner@sgi.com> LKML-Reference: <20090502093956.GB4791@lenovo> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | x86, apic: use pr_ macroCyrill Gorcunov2009-05-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace recenly appeared printk with pr_ macro (the file already use a lot of them). [ Impact: cleanup ] Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> LKML-Reference: <20090501195425.GB4633@lenovo> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | x86/pci: update pirq_enable_irq() to setup io apic routingYinghai Lu2009-05-111-76/+74
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | So we can set io apic routing only when enabling the device irq. This is advantageous for IRQ descriptor allocation affinity: if we set up the IO-APIC entry later, we have a chance to allocate the IRQ descriptor later and know which device it is on and can set affinity accordingly. [ Impact: standardize/enhance irq-enabling sequence for mptable irqs ] Signed-off-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Len Brown <lenb@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <4A01C46E.8000501@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | x86/acpi: move setup io apic routing out of CONFIG_ACPI scopeYinghai Lu2009-05-111-61/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | So we could set io apic routing when ACPI is not enabled. [ Impact: prepare for new functionality ] Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Len Brown <lenb@kernel.org> LKML-Reference: <4A01C422.5070400@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | x86/pci: add 4 more return parameters to IO_APIC_get_PCI_irq_vector()Yinghai Lu2009-05-111-48/+59
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To prepare those params for pcibios_irq_enable() to call setup_io_apic_routing(). [ Impact: extend function call API to prepare for new functionality ] Signed-off-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Len Brown <lenb@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <4A01C406.2040303@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | x86/acpi: move pin_programmed bit map to io_apic.cYinghai Lu2009-05-112-17/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prepare to call setup_io_apic_routing() in pcibios_irq_enable() also remove not needed member apic_id. [ Impact: clean up, prepare for future change ] Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Len Brown <lenb@kernel.org> LKML-Reference: <4A01C3DD.3050104@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | x86/acpi: call mp_config_acpi_gsi() in mp_register_gsi()Yinghai Lu2009-05-111-26/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The patch to call mp_config_acpi_gsi() from the ACPI IRQ registration code never got mainline because there were open discussions about it. This call is needed to properly update the kernel's copy of the mptable, when the update_mptable boot parameter is needed. Now that the dust has settled with the APIC unification, and since there were no objections when the patch was re-submitted, try this again. [ Impact: fix the update_mptable boot parameter ] Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Len Brown <lenb@kernel.org> LKML-Reference: <4A01C387.7090103@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | x86: fix alloc_mptable()Yinghai Lu2009-05-111-16/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix the conditions when we stop updating the mptable due to running out of slots. [ Impact: fix memory corruption / non-working update_mptable boot parameter ] Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Len Brown <lenb@kernel.org> LKML-Reference: <4A01C3BB.1000609@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | x86/acpi: remove irq-compression trick on 32-bitYinghai Lu2009-05-111-58/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We already have a per cpu vector on 32-bit via recent changes, and don't need this trick any more (which trick obfuscates the real GSI mappings and which only triggers on larger systems to begin with): On 3 ioapic system (24 per ioapic) before patch I got: ACPI: PCI Interrupt Link [ILSB] enabled at IRQ 71 IOAPIC[2]: Set routing entry (10-23 -> 0xa9 -> IRQ 64 Mode:1 Active:1) pci 0000:80:01.1: PCI INT A -> Link[ILSB] -> GSI 71 (level, low) -> IRQ 64 ACPI: PCI Interrupt Link [LE5B] enabled at IRQ 67 IOAPIC[2]: Set routing entry (10-19 -> 0xb1 -> IRQ 65 Mode:1 Active:1) pci 0000:83:00.0: PCI INT B -> Link[LE5B] -> GSI 67 (level, low) -> IRQ 65 ACPI: PCI Interrupt Link [LE5A] enabled at IRQ 66 IOAPIC[2]: Set routing entry (10-18 -> 0xb9 -> IRQ 66 Mode:1 Active:1) pci 0000:83:00.1: PCI INT A -> Link[LE5A] -> GSI 66 (level, low) -> IRQ 66 ACPI: PCI Interrupt Link [LE5D] enabled at IRQ 65 IOAPIC[2]: Set routing entry (10-17 -> 0xc1 -> IRQ 67 Mode:1 Active:1) pci 0000:84:00.0: PCI INT B -> Link[LE5D] -> GSI 65 (level, low) -> IRQ 67 ACPI: PCI Interrupt Link [LE5C] enabled at IRQ 64 IOAPIC[2]: Set routing entry (10-16 -> 0xc9 -> IRQ 68 Mode:1 Active:1) pci 0000:84:00.1: PCI INT A -> Link[LE5C] -> GSI 64 (level, low) -> IRQ 68 pci 0000:87:00.0: PCI INT B -> Link[LE5A] -> GSI 66 (level, low) -> IRQ 66 pci 0000:87:00.1: PCI INT A -> Link[LE5D] -> GSI 65 (level, low) -> IRQ 67 pci 0000:88:00.0: PCI INT B -> Link[LE5C] -> GSI 64 (level, low) -> IRQ 68 pci 0000:88:00.1: PCI INT A -> Link[LE5B] -> GSI 67 (level, low) -> IRQ 65 pci 0000:8b:00.0: PCI INT B -> Link[LE5A] -> GSI 66 (level, low) -> IRQ 66 pci 0000:8b:00.1: PCI INT A -> Link[LE5D] -> GSI 65 (level, low) -> IRQ 67 pci 0000:8c:00.0: PCI INT B -> Link[LE5C] -> GSI 64 (level, low) -> IRQ 68 pci 0000:8c:00.1: PCI INT A -> Link[LE5B] -> GSI 67 (level, low) -> IRQ 65 after the patch we get: ACPI: PCI Interrupt Link [ILSB] enabled at IRQ 71 IOAPIC[2]: Set routing entry (10-23 -> 0xa9 -> IRQ 71 Mode:1 Active:1) pci 0000:80:01.1: PCI INT A -> Link[ILSB] -> GSI 71 (level, low) -> IRQ 71 ACPI: PCI Interrupt Link [LE5B] enabled at IRQ 67 IOAPIC[2]: Set routing entry (10-19 -> 0xb1 -> IRQ 67 Mode:1 Active:1) pci 0000:83:00.0: PCI INT B -> Link[LE5B] -> GSI 67 (level, low) -> IRQ 67 ACPI: PCI Interrupt Link [LE5A] enabled at IRQ 66 IOAPIC[2]: Set routing entry (10-18 -> 0xb9 -> IRQ 66 Mode:1 Active:1) pci 0000:83:00.1: PCI INT A -> Link[LE5A] -> GSI 66 (level, low) -> IRQ 66 ACPI: PCI Interrupt Link [LE5D] enabled at IRQ 65 IOAPIC[2]: Set routing entry (10-17 -> 0xc1 -> IRQ 65 Mode:1 Active:1) pci 0000:84:00.0: PCI INT B -> Link[LE5D] -> GSI 65 (level, low) -> IRQ 65 ACPI: PCI Interrupt Link [LE5C] enabled at IRQ 64 IOAPIC[2]: Set routing entry (10-16 -> 0xc9 -> IRQ 64 Mode:1 Active:1) pci 0000:84:00.1: PCI INT A -> Link[LE5C] -> GSI 64 (level, low) -> IRQ 64 pci 0000:87:00.0: PCI INT B -> Link[LE5A] -> GSI 66 (level, low) -> IRQ 66 pci 0000:87:00.1: PCI INT A -> Link[LE5D] -> GSI 65 (level, low) -> IRQ 65 pci 0000:88:00.0: PCI INT B -> Link[LE5C] -> GSI 64 (level, low) -> IRQ 64 pci 0000:88:00.1: PCI INT A -> Link[LE5B] -> GSI 67 (level, low) -> IRQ 67 pci 0000:8b:00.0: PCI INT B -> Link[LE5A] -> GSI 66 (level, low) -> IRQ 66 pci 0000:8b:00.1: PCI INT A -> Link[LE5D] -> GSI 65 (level, low) -> IRQ 65 pci 0000:8c:00.0: PCI INT B -> Link[LE5C] -> GSI 64 (level, low) -> IRQ 64 pci 0000:8c:00.1: PCI INT A -> Link[LE5B] -> GSI 67 (level, low) -> IRQ 67 As it can be seen that GSIs now get mapped lineary. [ Impact: simplify irq number mapping on bigger 32-bit systems ] Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Len Brown <lenb@kernel.org> LKML-Reference: <4A01C35C.7060207@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | Merge branch 'x86/apic' into irq/numaIngo Molnar2009-05-0114-569/+442
| |\ \ \ \ \ | | |/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: arch/x86/kernel/apic/io_apic.c Merge reason: non-trivial interaction between ongoing work in io_apic.c and the NUMA migration feature in the irq tree. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | x86: Use dmi check in apic_is_clustered() on 64-bit to mark the TSC unstableYinghai Lu2009-04-271-27/+59
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We will have systems with 2 and more sockets 8cores/2thread, but we treat them as multi chassis - while they could have a stable TSC domain. Use DMI check instead. [ Impact: do not turn possibly stable TSCs off incorrectly ] Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Ravikiran Thirumalai <kiran@scalex86.org> LKML-Reference: <49F5532A.5000802@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | x86: apic: Remove duplicated macrosYinghai Lu2009-04-271-7/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | XAPIC_DEST_* is dupliicated to the one in apicdef.h [ Impact: cleanup ] Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <49F552D0.5050505@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | x86: x2apic, IR: remove reinit_intr_remapped_IO_APIC()Suresh Siddha2009-04-222-16/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When interrupt-remapping is enabled, we are relying on setup_IO_APIC_irqs() to configure remapped entries in the IO-APIC, which comes little bit later after enabling interrupt-remapping. Meanwhile, restoration of old io-apic entries after enabling interrupt-remapping will not make the interrupts through io-apic functional anyway. So remove the unnecessary reinit_intr_remapped_IO_APIC() step. The longer story: When interrupt-remapping is enabled, IO-APIC entries need to be setup in the re-mappable format (pointing to interrupt-remapping table entries setup by the OS). This remapping configuration is happening in the same place where we traditionally configure IO-APIC (i.e., in setup_IO_APIC_irqs()). So when we enable interrupt-remapping successfully, there is no need to restore old io-apic RTE entries before we actually do a complete configuration shortly in setup_IO_APIC_irqs(). Old IO-APIC RTE's may be in traditional format (non re-mappable) or in re-mappable format pointing to interrupt-remapping table entries setup by BIOS. Restoring both of these will not make IO-APIC functional. We have to rely on setup_IO_APIC_irqs() for proper configuration by OS. So I am removing this unnecessary and broken step. [ Impact: remove unnecessary/broken IO-APIC setup step ] Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Weidong Han <weidong.han@intel.com> Cc: dwmw2@infradead.org LKML-Reference: <20090420200450.552359000@linux-os.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | x86: x2apic, IR: Clean up panic() with nox2apic boot optionSuresh Siddha2009-04-211-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of panic() ignore the "nox2apic" boot option when BIOS has already enabled x2apic prior to OS handover. [ Impact: printk warning instead of panic() when BIOS has enabled x2apic already ] Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: dwmw2@infradead.org Cc: Weidong Han <weidong.han@intel.com> LKML-Reference: <20090420200450.425091000@linux-os.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | x86: x2apic, IR: Move eoi_ioapic_irq() into a CONFIG_INTR_REMAP sectionSuresh Siddha2009-04-211-33/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Address the following complier warning: arch/x86/kernel/apic/io_apic.c:2543: warning: `eoi_ioapic_irq' defined but not used By moving that function (and eoi_ioapic_irq()) into an existing #ifdef CONFIG_INTR_REMAP section of the code. [ Impact: cleanup ] Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: dwmw2@infradead.org Cc: Weidong Han <weidong.han@intel.com> LKML-Reference: <20090420200450.271099000@linux-os.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Weidong Han <weidong.han@intel.com>
| | * | | | x86: x2apic, IR: Clean up X86_X2APIC and INTR_REMAP config checksSuresh Siddha2009-04-213-40/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add x2apic_supported() to clean up CONFIG_X86_X2APIC checks. Fix CONFIG_INTR_REMAP checks. [ Impact: cleanup ] Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: dwmw2@infradead.org Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Weidong Han <weidong.han@intel.com> LKML-Reference: <20090420200450.128993000@linux-os.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | x86: es7000, uv - use __cpuinit for kicking secondary cpusCyrill Gorcunov2009-04-192-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The caller already has __cpuinit attribute. [ Impact: save memory, address section mismatch warning ] Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Pavel Emelyanov <xemul@openvz.org> LKML-Reference: <20090419074311.GA8670@lenovo> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | x86: smpboot - wakeup_secondary should be done via __cpuinit sectionCyrill Gorcunov2009-04-191-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A caller (do_boot_cpu) already has __cpuinit attribute. Since HOTPLUG_CPU depends on SMP && HOTPLUG it doesn't lead to panic at moment. [ Impact: cleanup ] Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> LKML-Reference: <20090418194528.GD25510@lenovo> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | x86, intr-remap: fix x2apic/intr-remap resumeWeidong Han2009-04-191-6/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Interrupt remapping was decoupled from x2apic. Shouldn't check x2apic before resume interrupt remapping. Otherwise, interrupt remapping won't be resumed when x2apic is not enabled. [ Impact: fix potential intr-remap resume hang on !x2apic ] Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Weidong Han <weidong.han@intel.com> Acked-by: David Woodhouse <David.Woodhouse@intel.com> Cc: iommu@lists.linux-foundation.org Cc: allen.m.kay@intel.com Cc: fenghua.yu@intel.com LKML-Reference: <1239957736-6161-6-git-send-email-weidong.han@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | x86, intr-remap: enable interrupt remapping earlyWeidong Han2009-04-191-40/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, when x2apic is not enabled, interrupt remapping will be enabled in init_dmars(), where it is too late to remap ioapic interrupts, that is, ioapic interrupts are really in compatibility mode, not remappable mode. This patch always enables interrupt remapping before ioapic setup, it guarantees all interrupts will be remapped when interrupt remapping is enabled. Thus it doesn't need to set the compatibility interrupt bit. [ Impact: refactor intr-remap init sequence, enable fuller remap mode ] Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Weidong Han <weidong.han@intel.com> Acked-by: David Woodhouse <David.Woodhouse@intel.com> Cc: iommu@lists.linux-foundation.org Cc: allen.m.kay@intel.com Cc: fenghua.yu@intel.com LKML-Reference: <1239957736-6161-4-git-send-email-weidong.han@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | x86, intr-remap: fix ack for interrupt remappingWeidong Han2009-04-191-27/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Shouldn't call ack_apic_edge() in ir_ack_apic_edge(), because ack_apic_edge() does more than just ack: it also does irq migration in the non-interrupt-remapping case. But there is no such need for interrupt-remapping case, as irq migration is done in the process context. Similarly, ir_ack_apic_level() shouldn't call ack_apic_level, and instead should do the local cpu's EOI + directed EOI to the io-apic. ack_x2APIC_irq() is not neccessary, because ack_APIC_irq() will use MSR write for x2apic, and uncached write for non-x2apic. [ Impact: simplify/standardize intr-remap IRQ acking, fix on !x2apic ] Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Weidong Han <weidong.han@intel.com> Acked-by: David Woodhouse <David.Woodhouse@intel.com> Cc: iommu@lists.linux-foundation.org Cc: allen.m.kay@intel.com Cc: fenghua.yu@intel.com LKML-Reference: <1239957736-6161-3-git-send-email-weidong.han@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | Merge branch 'linus' into x86/apicIngo Molnar2009-04-1718-135/+232
| | |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge reason: new intr-remap patches depend on the s2ram iommu fixes from upstream Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | | x86: use used_vectors in init_IRQ()Yinghai Lu2009-04-151-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: fix crash with many devices I found this crash: [ 552.616646] general protection fault: 0403 [#1] SMP [ 552.620013] last sysfs file: /sys/devices/pci0000:00/0000:00:02.0/usb1/1-1/1-1:1.0/host13/target13:0:0/13:0:0:0/block/sr0/size [ 552.620013] CPU 0 [ 552.620013] Modules linked in: [ 552.620013] Pid: 0, comm: swapper Not tainted 2.6.30-rc1-tip-01931-g8fcafd8-dirty #28 Sun Fire X4440 [ 552.620013] RIP: 0010:[<ffffffff8023bada>] [<ffffffff8023bada>] default_idle+0x7d/0xda [ 552.620013] RSP: 0018:ffffffff81345e68 EFLAGS: 00010246 [ 552.620013] RAX: 0000000000000000 RBX: ffffffff8133d870 RCX: ffffc20000000000 [ 552.620013] RDX: 00000000001d0620 RSI: ffffffff8023bad8 RDI: ffffffff802a3169 [ 552.620013] RBP: ffffffff81345e98 R08: 0000000000000000 R09: ffffffff812244a0 [ 552.620013] R10: ffffffff81345dc8 R11: 7ebe1b6fa0bcac50 R12: 4ec4ec4ec4ec4ec5 [ 552.620013] R13: ffffffff813a54d0 R14: ffffffff813a7a40 R15: 0000000000000000 [ 552.620013] FS: 00000000006d1880(0000) GS:ffffc20000000000(0000) knlGS:0000000000000000 [ 552.620013] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b [ 552.620013] CR2: 00007fec9d936a50 CR3: 000000007d1a9000 CR4: 00000000000006e0 [ 552.620013] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 552.620013] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 552.620013] Process swapper (pid: 0, threadinfo ffffffff81344000,task ffffffff812244a0) [ 552.620013] Stack: [ 552.620013] 0000000000000000 ffffc20000000000 00000000001d0620 7ebe1b6fa0bcac50 [ 552.620013] ffffffff8133d870 4ec4ec4ec4ec4ec5 ffffffff81345ec8 ffffffff8023bd84 [ 552.620013] 4ec4ec4ec4ec4ec5 ffffffff813a54d0 7ebe1b6fa0bcac50 ffffffff8133d870 [ 552.620013] Call Trace: [ 552.620013] [<ffffffff8023bd84>] c1e_idle+0x109/0x124 [ 552.620013] [<ffffffff8023314b>] cpu_idle+0xb8/0x101 [ 552.620013] [<ffffffff80c16d6a>] rest_init+0x7e/0x94 [ 552.620013] [<ffffffff81357efc>] start_kernel+0x3dc/0x3fd [ 552.620013] [<ffffffff813572a9>] x86_64_start_reservations+0xb9/0xd4 [ 552.620013] [<ffffffff813573b2>] x86_64_start_kernel+0xee/0x109 [ 552.620013] Code: 48 8b 04 25 f8 b4 00 00 83 a0 3c e0 ff ff fb 0f ae f0 65 48 8b 04 25 f8 b4 00 00 f6 80 38 e0 ff ff 08 75 09 e8 71 76 06 00 fb f4 <eb> 06 e8 68 76 06 00 fb 65 48 8b 04 25 f8 b4 00 00 83 88 3c e0 [ 552.620013] RIP [<ffffffff8023bada>] default_idle+0x7d/0xda [ 552.620013] RSP <ffffffff81345e68> [ 552.828646] ---[ end trace 4cbfc5c01382af7f ]--- Joerg Roedel said "The 0403 error code means that there was an external interrupt with vector 0x80. Yinghai, my theory is that the kernel on this machine has no 32bit emulation compiled in, right? In this case the selector points to a zero entry which may cause the #gpf right after the hlt. But I have no idea where the external int 0x80 comes from" it turns out that we could use 0x80 for external device on 64-bit when 32-bit emulation is disabled. But we forgot to set the gate for it. try to set gate for it by checking used_vectors. Also move apic_intr_init() early to avoid setting that gate two times. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Joerg Roedel <joerg.roedel@amd.com> LKML-Reference: <49E62DFD.6010904@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
OpenPOWER on IntegriCloud