| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
probed
commit 75013fb16f8484898eaa8d0b08fed942d790f029 upstream.
Fix to the exception table entry check by using probed address
instead of the address of copied instruction.
This bug may cause unexpected kernel panic if user probe an address
where an exception can happen which should be fixup by __ex_table
(e.g. copy_from_user.)
Unless user puts a kprobe on such address, this doesn't
cause any problem.
This bug has been introduced years ago, by commit:
464846888d9a ("x86/kprobes: Fix a bug which can modify kernel code permanently").
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 464846888d9a ("x86/kprobes: Fix a bug which can modify kernel code permanently")
Link: http://lkml.kernel.org/r/148829899399.28855.12581062400757221722.stgit@devbox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 68dee8e2f2cacc54d038394e70d22411dee89da2 upstream.
commit 8fd524b355da ("x86: Kill bad_dma_address variable") has killed
bad_dma_address variable and used instead of macro DMA_ERROR_CODE
which is always zero. Since dma_addr is unsigned, the statement
dma_addr >= DMA_ERROR_CODE
is always true, and not needed.
arch/x86/kernel/pci-calgary_64.c: In function ‘iommu_free’:
arch/x86/kernel/pci-calgary_64.c:299:2: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits]
if (unlikely((dma_addr >= DMA_ERROR_CODE) && (dma_addr < badend))) {
Fixes: 8fd524b355da ("x86: Kill bad_dma_address variable")
Signed-off-by: Nikola Pajkovsky <npajkovsky@suse.cz>
Cc: iommu@lists.linux-foundation.org
Cc: Jon Mason <jdmason@kudzu.us>
Cc: Muli Ben-Yehuda <mulix@mulix.org>
Link: http://lkml.kernel.org/r/7612c0f9dd7c1290407dbf8e809def922006920b.1479161177.git.npajkovsky@suse.cz
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit a9b4f08770b415f30f2fb0f8329a370c8f554aa3 upstream.
commit d32932d02e18 removed the irq_retrigger callback from the IO-APIC
chip and did not add it to the new IO-APIC-IR irq chip.
There is no harm because the interrupts are resent in software when the
retrigger callback is NULL, but it's less efficient. So restore them.
[ tglx: Massaged changelog ]
Fixes: d32932d02e18 ("x86/irq: Convert IOAPIC to use hierarchical irqdomain interfaces")
Signed-off-by: Ruslan Ruslichenko <rruslich@cisco.com>
Cc: xe-linux-external@cisco.com
Link: http://lkml.kernel.org/r/1484662432-13580-1-git-send-email-rruslich@cisco.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 34a477e5297cbaa6ecc6e17c042a866e1cbe80d6 upstream.
On x86-32, with CONFIG_FIRMWARE and multiple CPUs, if you enable function
graph tracing and then suspend to RAM, it will triple fault and reboot when
it resumes.
The first fault happens when booting a secondary CPU:
startup_32_smp()
load_ucode_ap()
prepare_ftrace_return()
ftrace_graph_is_dead()
(accesses 'kill_ftrace_graph')
The early head_32.S code calls into load_ucode_ap(), which has an an
ftrace hook, so it calls prepare_ftrace_return(), which calls
ftrace_graph_is_dead(), which tries to access the global
'kill_ftrace_graph' variable with a virtual address, causing a fault
because the CPU is still in real mode.
The fix is to add a check in prepare_ftrace_return() to make sure it's
running in protected mode before continuing. The check makes sure the
stack pointer is a virtual kernel address. It's a bit of a hack, but
it's not very intrusive and it works well enough.
For reference, here are a few other (more difficult) ways this could
have potentially been fixed:
- Move startup_32_smp()'s call to load_ucode_ap() down to *after* paging
is enabled. (No idea what that would break.)
- Track down load_ucode_ap()'s entire callee tree and mark all the
functions 'notrace'. (Probably not realistic.)
- Pause graph tracing in ftrace_suspend_notifier_call() or bringup_cpu()
or __cpu_up(), and ensure that the pause facility can be queried from
real mode.
Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: "Rafael J . Wysocki" <rjw@rjwysocki.net>
Cc: linux-acpi@vger.kernel.org
Cc: Borislav Petkov <bp@alien8.de>
Cc: Len Brown <lenb@kernel.org>
Link: http://lkml.kernel.org/r/5c1272269a580660703ed2eccf44308e790c7a98.1492123841.git.jpoimboe@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 29f72ce3e4d18066ec75c79c857bee0618a3504b upstream.
MCA bank 3 is reserved on systems pre-Fam17h, so it didn't have a name.
However, MCA bank 3 is defined on Fam17h systems and can be accessed
using legacy MSRs. Without a name we get a stack trace on Fam17h systems
when trying to register sysfs files for bank 3 on kernels that don't
recognize Scalable MCA.
Call MCA bank 3 "decode_unit" since this is what it represents on
Fam17h. This will allow kernels without SMCA support to see this bank on
Fam17h+ and prevent the stack trace. This will not affect older systems
since this bank is reserved on them, i.e. it'll be ignored.
Tested on AMD Fam15h and Fam17h systems.
WARNING: CPU: 26 PID: 1 at lib/kobject.c:210 kobject_add_internal
kobject: (ffff88085bb256c0): attempted to be registered with empty name!
...
Call Trace:
kobject_add_internal
kobject_add
kobject_create_and_add
threshold_create_device
threshold_init_device
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1490102285-3659-1-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 0dc9c639e6553e39c13b2c0d54c8a1b098cb95e2 upstream.
The NFIT MCE handler callback (for handling media errors on NVDIMMs)
takes a mutex to add the location of a memory error to a list. But since
the notifier call chain for machine checks (x86_mce_decoder_chain) is
atomic, we get a lockdep splat like:
BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620
in_atomic(): 1, irqs_disabled(): 0, pid: 4, name: kworker/0:0
[..]
Call Trace:
dump_stack
___might_sleep
__might_sleep
mutex_lock_nested
? __lock_acquire
nfit_handle_mce
notifier_call_chain
atomic_notifier_call_chain
? atomic_notifier_call_chain
mce_gen_pool_process
Convert the notifier to a blocking one which gets to run only in process
context.
Boris: remove the notifier call in atomic context in print_mce(). For
now, let's print the MCE on the atomic path so that we can make sure
they go out and get logged at least.
Fixes: 6839a6d96f4e ("nfit: do an ARS scrub on hitting a latent media error")
Reported-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/20170411224457.24777-1-vishal.l.verma@intel.com
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 7f00f388712b29005782bad7e4b25942620f3b9c upstream.
The schemata lock is released before freeing the resource's temporary
tmp_cbms allocation. That's racy versus another write which allocates and
uses new temporary storage, resulting in memory leaks, freeing in use
memory, double a free or any combination of those.
Move the unlock after the release code.
Fixes: 60ec2440c63d ("x86/intel_rdt: Add schemata file")
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Shaohua Li <shli@fb.com>
Link: http://lkml.kernel.org/r/20170411071446.15241-1-jolsa@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit cfac6dfa42bddfa9711b20d486e521d1a41ab09f upstream.
Put the right values from the original siginfo into the
userspace compat-siginfo.
This fixes the 32-bit MPX "tabletest" testcase on 64-bit kernels.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: a4455082dc6f0 ('x86/signals: Add missing signal_compat code for x86 features')
Link: http://lkml.kernel.org/r/1491322501-5054-1-git-send-email-joro@8bytes.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit bba8376aea1dcbbe22bbda118c52abee317c7609 ]
The reboot quirk for ASUS EeeBook X205TA contains a typo in
DMI_PRODUCT_NAME, improperly referring to X205TAW instead of
X205TA, which prevents the quirk from being triggered. The
model X205TAW already has a reboot quirk of its own.
This fix simply removes the inappropriate final letter W.
Fixes: 90b28ded88dd ("x86/reboot/quirks: Add ASUS EeeBook X205TA reboot quirk")
Signed-off-by: Matjaz Hegedic <matjaz.hegedic@gmail.com>
Link: http://lkml.kernel.org/r/1489064417-7445-1-git-send-email-matjaz.hegedic@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 3b3e78552d3077ec70d2640e629e07e3ab416a6a ]
Without the parameter reboot=a, ASUS EeeBook X205TA/W will hang
when it should reboot. This adds the appropriate quirk, thus
fixing the problem.
Signed-off-by: Matjaz Hegedic <matjaz.hegedic@gmail.com>
Link: http://lkml.kernel.org/r/1488737804-20681-1-git-send-email-matjaz.hegedic@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 90b28ded88dda8bea82b4a86923e73ba0746d884 ]
Without the parameter reboot=a, ASUS EeeBook X205TA will hang when it should reboot.
This adds the appropriate quirk, thus fixing the problem.
Signed-off-by: Matjaz Hegedic <matjaz.hegedic@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit cc66afea58f858ff6da7f79b8a595a67bbb4f9a9 upstream.
Since:
cd9c57cad3fe ("x86/MCE: Dump MCE to dmesg if no consumers")
all MCEs are printed even when mcelog is running. Fix the regression to
not print to dmesg when mcelog is running as it is a consumer too.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
[ Massage commit message. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Fixes: cd9c57cad3fe ("x86/MCE: Dump MCE to dmesg if no consumers")
Link: http://lkml.kernel.org/r/20170327093304.10683-2-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 49ec8f5b6ae3ab60385492cad900ffc8a523c895 upstream.
The rdtgroup_kn_unlock waits for the last user to release and put its
node. But it's calling kernfs_put on the node which calls the
rdtgroup_kn_unlock, which might not be the group's directory node, but
another group's file node.
This race could be easily reproduced by running 2 instances
of following script:
mount -t resctrl resctrl /sys/fs/resctrl/
pushd /sys/fs/resctrl/
mkdir krava
echo "krava" > krava/schemata
rmdir krava
popd
umount /sys/fs/resctrl
It triggers the slub debug error message with following command
line config: slub_debug=,kernfs_node_cache.
Call kernfs_put on the group's node to fix it.
Fixes: 60cf5e101fd4 ("x86/intel_rdt: Add mkdir to resctrl file system")
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Shaohua Li <shli@fb.com>
Link: http://lkml.kernel.org/r/1489501253-20248-1-git-send-email-jolsa@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit be3606ff739d1c1be36389f8737c577ad87e1f57 upstream.
The kernel doesn't boot with both PROFILE_ANNOTATED_BRANCHES=y and KASAN=y
options selected. With branch profiling enabled we end up calling
ftrace_likely_update() before kasan_early_init(). ftrace_likely_update() is
built with KASAN instrumentation, so calling it before kasan has been
initialized leads to crash.
Use DISABLE_BRANCH_PROFILING define to make sure that we don't call
ftrace_likely_update() from early code before kasan_early_init().
Fixes: ef7f0d6a6ca8 ("x86_64: add KASan support")
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: kasan-dev@googlegroups.com
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: lkp@01.org
Cc: Dmitry Vyukov <dvyukov@google.com>
Link: http://lkml.kernel.org/r/20170313163337.1704-1-aryabinin@virtuozzo.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 44fee88cea43d3c2cac962e0439cb10a3cabff6d upstream.
Subhransu reported that convert_art_to_tsc() isn't working for him.
The ART to TSC relation is only set up for systems which use the refined
TSC calibration. Systems with known TSC frequency (available via CPUID 15)
are not using the refined calibration and therefor the ART to TSC relation
is never established.
Add the setup to the known frequency init path which skips ART
calibration. The init code needs to be duplicated as for systems which use
refined calibration the ART setup must be delayed until calibration has
been done.
The problem has been there since the ART support was introdduced, but only
detected now because Subhransu tested the first time on hardware which has
TSC frequency enumerated via CPUID 15.
Note for stable: The conditional has changed from TSC_RELIABLE to
TSC_KNOWN_FREQUENCY.
[ tglx: Rewrote changelog and identified the proper 'Fixes' commit ]
Fixes: f9677e0f8308 ("x86/tsc: Always Running Timer (ART) correlated clocksource")
Reported-by: "Prusty, Subhransu S" <subhransu.s.prusty@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: christopher.s.hall@intel.com
Cc: kevin.b.stanton@intel.com
Cc: john.stultz@linaro.org
Cc: akataria@vmware.com
Link: http://lkml.kernel.org/r/20170313145712.GI3312@twins.programming.kicks-ass.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 87a6b2975f0d340c75b7488d22d61d2f98fb8abf upstream.
Pavel Machek reported the following warning on x86-32:
WARNING: kernel stack frame pointer at f50cdf98 in swapper/2:0 has bad value (null)
The warning is caused by the unwinder not realizing that it reached the
end of the stack, due to an unusual prologue which gcc sometimes
generates for aligned stacks. The prologue is based on a gcc feature
called the Dynamic Realign Argument Pointer (DRAP). It's almost always
enabled for aligned stacks when -maccumulate-outgoing-args isn't set.
This issue is similar to the one fixed by the following commit:
8023e0e2a48d ("x86/unwind: Adjust last frame check for aligned function stacks")
... but that fix was specific to x86-64.
Make the fix more generic to cover x86-32 as well, and also ensure that
the return address referred to by the frame pointer is a copy of the
original return address.
Fixes: acb4608ad186 ("x86/unwind: Create stack frames for saved syscall registers")
Reported-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: http://lkml.kernel.org/r/50d4924db716c264b14f1633037385ec80bf89d2.1489465609.git.jpoimboe@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GCC complains about unused variable 'vma' in mark_screen_rdonly() if THP is
disabled:
arch/x86/kernel/vm86_32.c: In function ‘mark_screen_rdonly’:
arch/x86/kernel/vm86_32.c:180:26: warning: unused variable ‘vma’
[-Wunused-variable]
struct vm_area_struct *vma = find_vma(mm, 0xA0000);
That's silly. pmd_trans_huge() resolves to 0 when THP is disabled, so the
whole block should be eliminated.
Moving the variable declaration outside the if() block shuts GCC up.
Reported-by: Jérémy Lefaure <jeremy.lefaure@lse.epita.fr>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Borislav Petkov <bp@suse.de>
Cc: Carlos O'Donell <carlos@redhat.com>
Link: http://lkml.kernel.org/r/20170213125228.63645-1-kirill.shutemov@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"Last minute x86 fixes:
- Fix a softlockup detector warning and long delays if using ptdump
with KASAN enabled.
- Two more TSC-adjust fixes for interesting firmware interactions.
- Two commits to fix an AMD CPU topology enumeration bug that caused
a measurable gaming performance regression"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm/ptdump: Fix soft lockup in page table walker
x86/tsc: Make the TSC ADJUST sanitizing work for tsc_reliable
x86/tsc: Avoid the large time jump when sanitizing TSC ADJUST
x86/CPU/AMD: Fix Zen SMT topology
x86/CPU/AMD: Bring back Compute Unit ID
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When the TSC is marked reliable then the synchronization check is skipped,
but that also skips the TSC ADJUST sanitizing code. So on a machine with a
wreckaged BIOS the TSC deviation between CPUs might go unnoticed.
Let the TSC adjust sanitizing code run unconditionally and just skip the
expensive synchronization checks when TSC is marked reliable.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Olof Johansson <olof@lixom.net>
Link: http://lkml.kernel.org/r/20170209151231.491189912@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Olof reported that on a machine which has a BIOS wreckaged TSC the
timestamps in dmesg are making a large jump because the TSC value is
jumping forward after resetting the TSC ADJUST register to a sane value.
This can be avoided by calling the TSC ADJUST saniziting function before
initializing the per cpu sched clock machinery. That takes the offset into
account and avoid the time jump.
What cannot be avoided is that the 'Firmware Bug' warnings on the secondary
CPUs are printed with the large time offsets because it would be too much
effort and ugly hackery to print those warnings into a buffer and emit them
after the adjustemt on the starting CPUs. It's a firmware bug and should be
fixed in firmware. The weird timestamps are collateral damage and just
illustrate the sillyness of the BIOS folks:
[ 0.397445] smp: Bringing up secondary CPUs ...
[ 0.402100] x86: Booting SMP configuration:
[ 0.406343] .... node #0, CPUs: #1
[1265776479.930667] [Firmware Bug]: TSC ADJUST differs: Reference CPU0: -2978888639075328 CPU1: -2978888639183101
[1265776479.944664] TSC ADJUST synchronize: Reference CPU0: 0 CPU1: -2978888639183101
[ 0.508119] #2
[1265776480.032346] [Firmware Bug]: TSC ADJUST differs: Reference CPU0: -2978888639075328 CPU2: -2978888639183677
[1265776480.044192] TSC ADJUST synchronize: Reference CPU0: 0 CPU2: -2978888639183677
[ 0.607643] #3
[1265776480.131874] [Firmware Bug]: TSC ADJUST differs: Reference CPU0: -2978888639075328 CPU3: -2978888639184530
[1265776480.143720] TSC ADJUST synchronize: Reference CPU0: 0 CPU3: -2978888639184530
[ 0.707108] smp: Brought up 1 node, 4 CPUs
[ 0.711271] smpboot: Total of 4 processors activated (21698.88 BogoMIPS)
Reported-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170209151231.411460506@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
After:
a33d331761bc ("x86/CPU/AMD: Fix Bulldozer topology")
our SMT scheduling topology for Fam17h systems is broken, because
the ThreadId is included in the ApicId when SMT is enabled.
So, without further decoding cpu_core_id is unique for each thread
rather than the same for threads on the same core. This didn't affect
systems with SMT disabled. Make cpu_core_id be what it is defined to be.
Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org> # 4.9
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170205105022.8705-2-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Commit:
a33d331761bc ("x86/CPU/AMD: Fix Bulldozer topology")
restored the initial approach we had with the Fam15h topology of
enumerating CU (Compute Unit) threads as cores. And this is still
correct - they're beefier than HT threads but still have some
shared functionality.
Our current approach has a problem with the Mad Max Steam game, for
example. Yves Dionne reported a certain "choppiness" while playing on
v4.9.5.
That problem stems most likely from the fact that the CU threads share
resources within one CU and when we schedule to a thread of a different
compute unit, this incurs latency due to migrating the working set to a
different CU through the caches.
When the thread siblings mask mirrors that aspect of the CUs and
threads, the scheduler pays attention to it and tries to schedule within
one CU first. Which takes care of the latency, of course.
Reported-by: Yves Dionne <yves.dionne@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org> # 4.9
Cc: Brice Goglin <Brice.Goglin@inria.fr>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yazen Ghannam <yazen.ghannam@amd.com>
Link: http://lkml.kernel.org/r/20170205105022.8705-1-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This reverts commit 020eb3daaba2857b32c4cf4c82f503d6a00a67de.
Gabriel C reports that it causes his machine to not boot, and we haven't
tracked down the reason for it yet. Since the bug it fixes has been
around for a longish time, we're better off reverting the fix for now.
Gabriel says:
"It hangs early and freezes with a lot RCU warnings.
I bisected it down to :
> Ruslan Ruslichenko (1):
> x86/ioapic: Restore IO-APIC irq_chip retrigger callback
Reverting this one fixes the problem for me..
The box is a PRIMERGY TX200 S5 , 2 socket , 2 x E5520 CPU(s) installed"
and Ruslan and Thomas are currently stumped.
Reported-and-bisected-by: Gabriel C <nix.or.die@gmail.com>
Cc: Ruslan Ruslichenko <rruslich@cisco.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@kernel.org # for the backport of the original commit
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|\ \
| |/
|/|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
- Prevent double activation of interrupt lines, which causes problems
on certain interrupt controllers
- Handle the fallout of the above because x86 (ab)uses the activation
function to reconfigure interrupts under the hood.
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/irq: Make irq activate operations symmetric
irqdomain: Avoid activating interrupts more than once
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The recent commit which prevents double activation of interrupts unearthed
interesting code in x86. The code (ab)uses irq_domain_activate_irq() to
reconfigure an already activated interrupt. That trips over the prevention
code now.
Fix it by deactivating the interrupt before activating the new configuration.
Fixes: 08d85f3ea99f1 "irqdomain: Avoid activating interrupts more than once"
Reported-and-tested-by: Mike Galbraith <efault@gmx.de>
Reported-and-tested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1701311901580.3457@nanos
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Erik reported that on a preproduction hardware a CMCI storm triggers the
BUG_ON in add_timer_on(). The reason is that the per CPU MCE timer is
started by the CMCI logic before the MCE CPU hotplug callback starts the
timer with add_timer_on(). So the timer is already queued which triggers
the BUG.
Using add_timer_on() is pretty pointless in this code because the timer is
strictlty per CPU, initialized as pinned and all operations which arm the
timer happen on the CPU to which the timer belongs.
Simplify the whole machinery by using mod_timer() instead of add_timer_on()
which avoids the problem because mod_timer() can handle already queued
timers. Use __start_timer() everywhere so the earliest armed expiry time is
preserved.
Reported-by: Erik Veijola <erik.veijola@intel.com>
Tested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1701310936080.3457@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When we look for microcode blobs, we first try builtin and if that
doesn't succeed, we fallback to the initrd supplied to the kernel.
However, at some point doing boot, that initrd gets jettisoned and we
shouldn't access it anymore. But we do, as the below KASAN report shows.
That's because find_microcode_in_initrd() doesn't check whether the
initrd is still valid or not.
So do that.
==================================================================
BUG: KASAN: use-after-free in find_cpio_data
Read of size 1 by task swapper/1/0
page:ffffea0000db9d40 count:0 mapcount:0 mapping: (null) index:0x1
flags: 0x100000000000000()
raw: 0100000000000000 0000000000000000 0000000000000001 00000000ffffffff
raw: dead000000000100 dead000000000200 0000000000000000 0000000000000000
page dumped because: kasan: bad access detected
CPU: 1 PID: 0 Comm: swapper/1 Tainted: G W 4.10.0-rc5-debug-00075-g2dbde22 #3
Hardware name: Dell Inc. XPS 13 9360/0839Y6, BIOS 1.2.3 12/01/2016
Call Trace:
dump_stack
? _atomic_dec_and_lock
? __dump_page
kasan_report_error
? pointer
? find_cpio_data
__asan_report_load1_noabort
? find_cpio_data
find_cpio_data
? vsprintf
? dump_stack
? get_ucode_user
? print_usage_bug
find_microcode_in_initrd
__load_ucode_intel
? collect_cpu_info_early
? debug_check_no_locks_freed
load_ucode_intel_ap
? collect_cpu_info
? trace_hardirqs_on
? flat_send_IPI_mask_allbutself
load_ucode_ap
? get_builtin_firmware
? flush_tlb_func
? do_raw_spin_trylock
? cpumask_weight
cpu_init
? trace_hardirqs_off
? play_dead_common
? native_play_dead
? hlt_play_dead
? syscall_init
? arch_cpu_idle_dead
? do_idle
start_secondary
start_cpu
Memory state around the buggy address:
ffff880036e74f00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
ffff880036e74f80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>ffff880036e75000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
^
ffff880036e75080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
ffff880036e75100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
==================================================================
Reported-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Tested-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170126165833.evjemhbqzaepirxo@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The compacted-format XSAVES area is determined at boot time and
never changed after. The field xsave.header.xcomp_bv indicates
which components are in the fixed XSAVES format.
In fpstate_init() we did not set xcomp_bv to reflect the XSAVES
format since at the time there is no valid data.
However, after we do copy_init_fpstate_to_fpregs() in fpu__clear(),
as in commit:
b22cbe404a9c x86/fpu: Fix invalid FPU ptrace state after execve()
and when __fpu_restore_sig() does fpu__restore() for a COMPAT-mode
app, a #GP occurs. This can be easily triggered by doing valgrind on
a COMPAT-mode "Hello World," as reported by Joakim Tjernlund and
others:
https://bugzilla.kernel.org/show_bug.cgi?id=190061
Fix it by setting xcomp_bv correctly.
This patch also moves the xcomp_bv initialization to the proper
place, which was in copyin_to_xsaves() as of:
4c833368f0bf x86/fpu: Set the xcomp_bv when we fake up a XSAVES area
which fixed the bug too, but it's more efficient and cleaner to
initialize things once per boot, not for every signal handling
operation.
Reported-by: Kevin Hao <haokexin@gmail.com>
Reported-by: Joakim Tjernlund <Joakim.Tjernlund@infinera.com>
Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: haokexin@gmail.com
Link: http://lkml.kernel.org/r/1485212084-4418-1-git-send-email-yu-cheng.yu@intel.com
[ Combined it with 4c833368f0bf. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
I got the following calltrace on a Apollo Lake SoC with 32-bit kernel:
WARNING: CPU: 2 PID: 261 at arch/x86/include/asm/fpu/internal.h:363 fpu__restore+0x1f5/0x260
[...]
Hardware name: Intel Corp. Broxton P/NOTEBOOK, BIOS APLIRVPA.X64.0138.B35.1608091058 08/09/2016
Call Trace:
dump_stack()
__warn()
? fpu__restore()
warn_slowpath_null()
fpu__restore()
__fpu__restore_sig()
fpu__restore_sig()
restore_sigcontext.isra.9()
sys_sigreturn()
do_int80_syscall_32()
entry_INT80_32()
The reason is that a #GP occurs when executing XRSTORS. The root cause
is that we forget to set the xcomp_bv when we fake up the XSAVES area
in the copyin_to_xsaves() function.
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yu-cheng Yu <yu-cheng.yu@intel.com>
Link: http://lkml.kernel.org/r/1485075023-30161-1-git-send-email-haokexin@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was meant to save us the scanning of the microcode containter in
the initrd since the first AP had already done that but it can also hurt
us:
Imagine a single hyperthreaded CPU (Intel(R) Atom(TM) CPU N270, for
example) which updates the microcode on the BSP but since the microcode
engine is shared between the two threads, the update on CPU1 doesn't
happen because it has already happened on CPU0 and we don't find a newer
microcode revision on CPU1.
Which doesn't set the intel_ucode_patch pointer and at initrd
jettisoning time we don't save the microcode patch for later
application.
Now, when we suspend to RAM, the loaded microcode gets cleared so we
need to reload but there's no patch saved in the cache.
Removing the optimization fixes this issue and all is fine and dandy.
Fixes: 06b8534cb728 ("x86/microcode: Rework microcode loading")
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170120202955.4091-2-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit d32932d02e18 removed the irq_retrigger callback from the IO-APIC
chip and did not add it to the new IO-APIC-IR irq chip.
Unfortunately the software resend fallback is not enabled on X86, so edge
interrupts which are received during the lazy disabled state of the
interrupt line are not retriggered and therefor lost.
Restore the callbacks.
[ tglx: Massaged changelog ]
Fixes: d32932d02e18 ("x86/irq: Convert IOAPIC to use hierarchical irqdomain interfaces")
Signed-off-by: Ruslan Ruslichenko <rruslich@cisco.com>
Cc: xe-linux-external@cisco.com
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1484662432-13580-1-git-send-email-rruslich@cisco.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"Misc fixes:
- unwinder fixes
- AMD CPU topology enumeration fixes
- microcode loader fixes
- x86 embedded platform fixes
- fix for a bootup crash that may trigger when clearcpuid= is used
with invalid values"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mpx: Use compatible types in comparison to fix sparse error
x86/tsc: Add the Intel Denverton Processor to native_calibrate_tsc()
x86/entry: Fix the end of the stack for newly forked tasks
x86/unwind: Include __schedule() in stack traces
x86/unwind: Disable KASAN checks for non-current tasks
x86/unwind: Silence warnings for non-current tasks
x86/microcode/intel: Use correct buffer size for saving microcode data
x86/microcode/intel: Fix allocation size of struct ucode_patch
x86/microcode/intel: Add a helper which gives the microcode revision
x86/microcode: Use native CPUID to tickle out microcode revision
x86/CPU: Add native CPUID variants returning a single datum
x86/boot: Add missing declaration of string functions
x86/CPU/AMD: Fix Bulldozer topology
x86/platform/intel-mid: Rename 'spidev' to 'mrfld_spidev'
x86/cpu: Fix typo in the comment for Anniedale
x86/cpu: Fix bootup crashes by sanitizing the argument of the 'clearcpuid=' command-line option
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The Intel Denverton microserver uses a 25 MHz TSC crystal,
so we can derive its exact [*] TSC frequency
using CPUID and some arithmetic, eg.:
TSC: 1800 MHz (25000000 Hz * 216 / 3 / 1000000)
[*] 'exact' is only as good as the crystal, which should be +/- 20ppm
Signed-off-by: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/306899f94804aece6d8fa8b4223ede3b48dbb59c.1484287748.git.len.brown@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
There are a handful of callers to save_stack_trace_tsk() and
show_stack() which try to unwind the stack of a task other than current.
In such cases, it's remotely possible that the task is running on one
CPU while the unwinder is reading its stack from another CPU, causing
the unwinder to see stack corruption.
These cases seem to be mostly harmless. The unwinder has checks which
prevent it from following bad pointers beyond the bounds of the stack.
So it's not really a bug as long as the caller understands that
unwinding another task will not always succeed.
In such cases, it's possible that the unwinder may read a KASAN-poisoned
region of the stack. Account for that by using READ_ONCE_NOCHECK() when
reading the stack of another task.
Use READ_ONCE() when reading the stack of the current task, since KASAN
warnings can still be useful for finding bugs in that case.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/4c575eb288ba9f73d498dfe0acde2f58674598f1.1483978430.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
There are a handful of callers to save_stack_trace_tsk() and
show_stack() which try to unwind the stack of a task other than current.
In such cases, it's remotely possible that the task is running on one
CPU while the unwinder is reading its stack from another CPU, causing
the unwinder to see stack corruption.
These cases seem to be mostly harmless. The unwinder has checks which
prevent it from following bad pointers beyond the bounds of the stack.
So it's not really a bug as long as the caller understands that
unwinding another task will not always succeed.
Since stack "corruption" on another task's stack isn't necessarily a
bug, silence the warnings when unwinding tasks other than current.
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/00d8c50eea3446c1524a2a755397a3966629354c.1483978430.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
In generic_load_microcode(), curr_mc_size is the size of the last
allocated buffer and since we have this performance "optimization"
there to vmalloc a new buffer only when the current one is bigger,
curr_mc_size ends up becoming the size of the biggest buffer we've seen
so far.
However, we end up saving the microcode patch which matches our CPU
and its size is not curr_mc_size but the respective mc_size during the
iteration while we're staring at it.
So save that mc_size into a separate variable and use it to store the
previously found microcode buffer.
Without this fix, we could get oops like this:
BUG: unable to handle kernel paging request at ffffc9000e30f000
IP: __memcpy+0x12/0x20
...
Call Trace:
? kmemdup+0x43/0x60
__alloc_microcode_buf+0x44/0x70
save_microcode_patch+0xd4/0x150
generic_load_microcode+0x1b8/0x260
request_microcode_user+0x15/0x20
microcode_write+0x91/0x100
__vfs_write+0x34/0x120
vfs_write+0xc1/0x130
SyS_write+0x56/0xc0
do_syscall_64+0x6c/0x160
entry_SYSCALL64_slow_path+0x25/0x25
Fixes: 06b8534cb728 ("x86/microcode: Rework microcode loading")
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/4f33cbfd-44f2-9bed-3b66-7446cd14256f@ce.jp.nec.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
We allocate struct ucode_patch here. @size is the size of microcode data
and used for kmemdup() later in this function.
Fixes: 06b8534cb728 ("x86/microcode: Rework microcode loading")
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/7a730dc9-ac17-35c4-fe76-dfc94e5ecd95@ce.jp.nec.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Since on Intel we're required to do CPUID(1) first, before reading
the microcode revision MSR, let's add a special helper which does the
required steps so that we don't forget to do them next time, when we
want to read the microcode revision.
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/20170109114147.5082-4-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Intel supplies the microcode revision value in MSR 0x8b
(IA32_BIOS_SIGN_ID) after CPUID(1) has been executed. Execute it each
time before reading that MSR.
It used to do sync_core() which did do CPUID but
c198b121b1a1 ("x86/asm: Rewrite sync_core() to use IRET-to-self")
changed the sync_core() implementation so we better make the microcode
loading case explicit, as the SDM documents it.
Reported-and-tested-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/20170109114147.5082-3-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The following commit:
8196dab4fc15 ("x86/cpu: Get rid of compute_unit_id")
... broke the initial strategy for Bulldozer-based cores' topology,
where we consider each thread of a compute unit a standalone core
and not a HT or SMT thread.
Revert to the firmware-supplied core_id numbering and do not make
them thread siblings as we don't consider them for such even if they
technically are, more or less.
Reported-and-tested-by: Brice Goglin <Brice.Goglin@inria.fr>
Tested-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org> # v4.6+
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 8196dab4fc15 ("x86/cpu: Get rid of compute_unit_id")
Link: http://lkml.kernel.org/r/20170105092638.5247-1-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
command-line option
A negative number can be specified in the cmdline which will be used as
setup_clear_cpu_cap() argument. With that we can clear/set some bit in
memory predceeding boot_cpu_data/cpu_caps_cleared which may cause kernel
to misbehave. This patch adds lower bound check to setup_disablecpuid().
Boris Petkov reproduced a crash:
[ 1.234575] BUG: unable to handle kernel paging request at ffffffff858bd540
[ 1.236535] IP: memcpy_erms+0x6/0x10
Signed-off-by: Lukasz Odzioba <lukasz.odzioba@intel.com>
Acked-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: andi.kleen@intel.com
Cc: bp@alien8.de
Cc: dave.hansen@linux.intel.com
Cc: luto@kernel.org
Cc: slaoub@gmail.com
Fixes: ac72e7888a61 ("x86: add generic clearcpuid=... option")
Link: http://lkml.kernel.org/r/1482933340-11857-1-git-send-email-lukasz.odzioba@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|\ \
| |/
|/|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb
Pull swiotlb fixes from Konrad Rzeszutek Wilk:
"This has one fix to make i915 work when using Xen SWIOTLB, and a
feature from Geert to aid in debugging of devices that can't do DMA
outside the 32-bit address space.
The feature from Geert is on top of v4.10 merge window commit
(specifically you pulling my previous branch), as his changes were
dependent on the Documentation/ movement patches.
I figured it would just easier than me trying than to cherry-pick the
Documentation patches to satisfy git.
The patches have been soaking since 12/20, albeit I updated the last
patch due to linux-next catching an compiler error and adding an
Tested-and-Reported-by tag"
* 'stable/for-linus-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb:
swiotlb: Export swiotlb_max_segment to users
swiotlb: Add swiotlb=noforce debug option
swiotlb: Convert swiotlb_force from int to enum
x86, swiotlb: Simplify pci_swiotlb_detect_override()
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Convert the flag swiotlb_force from an int to an enum, to prepare for
the advent of more possible values.
Suggested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
At the end of the function, the local variable use_swiotlb has always
the same value as the global variable swiotlb. Hence drop the local
variable completely.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
If mce_device_init() fails then the mce device pointer is NULL and the
AMD mce code happily dereferences it.
Add a sanity check.
Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer type cleanups from Thomas Gleixner:
"This series does a tree wide cleanup of types related to
timers/timekeeping.
- Get rid of cycles_t and use a plain u64. The type is not really
helpful and caused more confusion than clarity
- Get rid of the ktime union. The union has become useless as we use
the scalar nanoseconds storage unconditionally now. The 32bit
timespec alike storage got removed due to the Y2038 limitations
some time ago.
That leaves the odd union access around for no reason. Clean it up.
Both changes have been done with coccinelle and a small amount of
manual mopping up"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
ktime: Get rid of ktime_equal()
ktime: Cleanup ktime_set() usage
ktime: Get rid of the union
clocksource: Use a plain u64 instead of cycle_t
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
There is no point in having an extra type for extra confusion. u64 is
unambiguous.
Conversion was done with the following coccinelle script:
@rem@
@@
-typedef u64 cycle_t;
@fix@
typedef cycle_t;
@@
-cycle_t
+u64
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
|
|\ \ \
| |/ /
|/| |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull SMP hotplug notifier removal from Thomas Gleixner:
"This is the final cleanup of the hotplug notifier infrastructure. The
series has been reintgrated in the last two days because there came a
new driver using the old infrastructure via the SCSI tree.
Summary:
- convert the last leftover drivers utilizing notifiers
- fixup for a completely broken hotplug user
- prevent setup of already used states
- removal of the notifiers
- treewide cleanup of hotplug state names
- consolidation of state space
There is a sphinx based documentation pending, but that needs review
from the documentation folks"
* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/armada-xp: Consolidate hotplug state space
irqchip/gic: Consolidate hotplug state space
coresight/etm3/4x: Consolidate hotplug state space
cpu/hotplug: Cleanup state names
cpu/hotplug: Remove obsolete cpu hotplug register/unregister functions
staging/lustre/libcfs: Convert to hotplug state machine
scsi/bnx2i: Convert to hotplug state machine
scsi/bnx2fc: Convert to hotplug state machine
cpu/hotplug: Prevent overwriting of callbacks
x86/msr: Remove bogus cleanup from the error path
bus: arm-ccn: Prevent hotplug callback leak
perf/x86/intel/cstate: Prevent hotplug callback leak
ARM/imx/mmcd: Fix broken cpu hotplug handling
scsi: qedi: Convert to hotplug state machine
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
When the state names got added a script was used to add the extra argument
to the calls. The script basically converted the state constant to a
string, but the cleanup to convert these strings into meaningful ones did
not happen.
Replace all the useless strings with 'subsys/xxx/yyy:state' strings which
are used in all the other places already.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Link: http://lkml.kernel.org/r/20161221192112.085444152@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The error cleanup which is invoked when the hotplug state setup failed
tries to remove the failed state, which is broken.
Fixes: 8fba38c937cd ("x86/msr: Convert to hotplug state machine")
Reported-by: kernel test robot <fengguang.wu@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
|