summaryrefslogtreecommitdiffstats
path: root/kernel
diff options
context:
space:
mode:
authorWanpeng Li <wanpeng.li@hotmail.com>2016-09-02 14:38:23 +0800
committerThomas Gleixner <tglx@linutronix.de>2016-09-02 10:25:40 +0200
commit08d072599234c959b0b82b63fa252c129225a899 (patch)
tree7e39e250570300bc66e42d922fa80327aaf1e573 /kernel
parent98744b408c757901df57fa50cbd5826245dc3a1f (diff)
downloadtalos-obmc-linux-08d072599234c959b0b82b63fa252c129225a899.tar.gz
talos-obmc-linux-08d072599234c959b0b82b63fa252c129225a899.zip
tick/nohz: Fix softlockup on scheduler stalls in kvm guest
tick_nohz_start_idle() is prevented to be called if the idle tick can't be stopped since commit 1f3b0f8243cb934 ("tick/nohz: Optimize nohz idle enter"). As a result, after suspend/resume the host machine, full dynticks kvm guest will softlockup: NMI watchdog: BUG: soft lockup - CPU#0 stuck for 26s! [swapper/0:0] Call Trace: default_idle+0x31/0x1a0 arch_cpu_idle+0xf/0x20 default_idle_call+0x2a/0x50 cpu_startup_entry+0x39b/0x4d0 rest_init+0x138/0x140 ? rest_init+0x5/0x140 start_kernel+0x4c1/0x4ce ? set_init_arg+0x55/0x55 ? early_idt_handler_array+0x120/0x120 x86_64_start_reservations+0x24/0x26 x86_64_start_kernel+0x142/0x14f In addition, cat /proc/stat | grep cpu in guest or host: cpu 398 16 5049 15754 5490 0 1 46 0 0 cpu0 206 5 450 0 0 0 1 14 0 0 cpu1 81 0 3937 3149 1514 0 0 9 0 0 cpu2 45 6 332 6052 2243 0 0 11 0 0 cpu3 65 2 328 6552 1732 0 0 11 0 0 The idle and iowait states are weird 0 for cpu0(housekeeping). The bug is present in both guest and host kernels, and they both have cpu0's idle and iowait states issue, however, host kernel's suspend/resume path etc will touch watchdog to avoid the softlockup. - The watchdog will not be touched in tick_nohz_stop_idle path (need be touched since the scheduler stall is expected) if idle_active flags are not detected. - The idle and iowait states will not be accounted when exit idle loop (resched or interrupt) if idle start time and idle_active flags are not set. This patch fixes it by reverting commit 1f3b0f8243cb934 since can't stop idle tick doesn't mean can't be idle. Fixes: 1f3b0f8243cb934 ("tick/nohz: Optimize nohz idle enter") Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Cc: Sanjeev Yadav<sanjeev.yadav@spreadtrum.com> Cc: Gaurav Jindal<gaurav.jindal@spreadtrum.com> Cc: stable@vger.kernel.org Cc: kvm@vger.kernel.org Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: http://lkml.kernel.org/r/1472798303-4154-1-git-send-email-wanpeng.li@hotmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Diffstat (limited to 'kernel')
-rw-r--r--kernel/time/tick-sched.c3
1 files changed, 2 insertions, 1 deletions
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 204fdc86863d..2ec7c00228f3 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -908,10 +908,11 @@ static void __tick_nohz_idle_enter(struct tick_sched *ts)
ktime_t now, expires;
int cpu = smp_processor_id();
+ now = tick_nohz_start_idle(ts);
+
if (can_stop_idle_tick(cpu, ts)) {
int was_stopped = ts->tick_stopped;
- now = tick_nohz_start_idle(ts);
ts->idle_calls++;
expires = tick_nohz_stop_sched_tick(ts, now, cpu);
OpenPOWER on IntegriCloud