summaryrefslogtreecommitdiffstats
path: root/arch/powerpc/kernel/process.c
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'signed-for-3.13' of git://github.com/agraf/linux-2.6 into kvm-masterPaolo Bonzini2013-12-201-16/+16
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Patch queue for 3.13 - 2013-12-18 This fixes some grave issues we've only found after 3.13-rc1: - Make the modularized HV/PR book3s kvm work well as modules - Fix some race conditions - Fix compilation with certain compilers (booke) - Fix THP for book3s_hv - Fix preemption for book3s_pr Alexander Graf (4): KVM: PPC: Book3S: PR: Don't clobber our exit handler id KVM: PPC: Book3S: PR: Export kvmppc_copy_to|from_svcpu KVM: PPC: Book3S: PR: Make svcpu -> vcpu store preempt savvy KVM: PPC: Book3S: PR: Enable interrupts earlier Aneesh Kumar K.V (1): powerpc: book3s: kvm: Don't abuse host r2 in exit path Paul Mackerras (5): KVM: PPC: Book3S HV: Fix physical address calculations KVM: PPC: Book3S HV: Refine barriers in guest entry/exit KVM: PPC: Book3S HV: Make tbacct_lock irq-safe KVM: PPC: Book3S HV: Take SRCU read lock around kvm_read_guest() call KVM: PPC: Book3S HV: Don't drop low-order page address bits Scott Wood (1): powerpc/kvm/booke: Fix build break due to stack frame size warning pingfan liu (1): powerpc: kvm: fix rare but potential deadlock scene
| * powerpc/kvm/booke: Fix build break due to stack frame size warningScott Wood2013-12-111-16/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit ce11e48b7fdd256ec68b932a89b397a790566031 ("KVM: PPC: E500: Add userspace debug stub support") added "struct thread_struct" to the stack of kvmppc_vcpu_run(). thread_struct is 1152 bytes on my build, compared to 48 bytes for the recently-introduced "struct debug_reg". Use the latter instead. This fixes the following error: cc1: warnings being treated as errors arch/powerpc/kvm/booke.c: In function 'kvmppc_vcpu_run': arch/powerpc/kvm/booke.c:760:1: error: the frame size of 1424 bytes is larger than 1024 bytes make[2]: *** [arch/powerpc/kvm/booke.o] Error 1 make[1]: *** [arch/powerpc/kvm] Error 2 make[1]: *** Waiting for unfinished jobs.... Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
* | Merge branch 'merge' of ↵Linus Torvalds2013-11-221-10/+11
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull third set of powerpc updates from Benjamin Herrenschmidt: "This is a small collection of random bug fixes and a few improvements of Oops output which I deemed valuable enough to include as well. The fixes are essentially recent build breakage and regressions, and a couple of older bugs such as the DTL log duplication, the EEH issue with PCI_COMMAND_MASTER and the problem with small contexts passed to get/set_context with VSX enabled" * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc/signals: Mark VSX not saved with small contexts powerpc/pseries: Fix SMP=n build of rng.c powerpc: Make cpu_to_chip_id() available when SMP=n powerpc/vio: Fix a dma_mask issue of vio powerpc: booke: Fix build failures powerpc: ppc64 address space capped at 32TB, mmap randomisation disabled powerpc: Only print PACATMSCRATCH in oops when TM is active powerpc/pseries: Duplicate dtl entries sometimes sent to userspace powerpc: Remove a few lines of oops output powerpc: Print DAR and DSISR on machine check oopses powerpc: Fix __get_user_pages_fast() irq handling powerpc/eeh: More accurate log powerpc/eeh: Enable PCI_COMMAND_MASTER for PCI bridges
| * | powerpc: Only print PACATMSCRATCH in oops when TM is activeAnton Blanchard2013-11-211-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | If TM is not active there is no need to print PACATMSCRATCH so we can save ourselves a line. Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | powerpc: Remove a few lines of oops outputAnton Blanchard2013-11-211-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We waste quite a few lines in our oops output: ... MSR: 8000000000009032 <SF,EE,ME,IR,DR,RI> CR: 28044024 XER: 00000000 SOFTE: 0 CFAR: 0000000000009088 DAR: 000000000000001c, DSISR: 40000000 GPR00: c0000000000c74f0 c00000037cc1b010 c000000000d2bb30 0000000000000000 ... We can do a better job here and remove 3 lines: MSR: 8000000000009032 <SF,EE,ME,IR,DR,RI> CR: 28044024 XER: 00000000 CFAR: 0000000000009088 DAR: 0000000000000010, DSISR: 40000000 SOFTE: 1 GPR00: c0000000000e3d10 c00000037cc2fda0 c000000000d2c3a8 0000000000000001 Also move PACATMSCRATCH up, it doesn't really belong in the stack trace section. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | powerpc: Print DAR and DSISR on machine check oopsesAnton Blanchard2013-11-211-1/+1
| |/ | | | | | | | | | | | | | | Machine check exceptions set DAR and DSISR, so print them in our oops output. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc: ELF2 binaries launched directly.Rusty Russell2013-11-211-15/+35
|/ | | | | | | | | No function descriptor, but we set r12 up and set TIF_RESTOREALL as it normally isn't restored on return from syscall. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/tm: Remove interrupt disable in __switch_to()Michael Neuling2013-10-311-5/+2
| | | | | | | | | | | We currently turn IRQs off in __switch_to(0 but this is unnecessary as it's already disabled in the caller. This removes the IRQ disable but adds a check to make sure it is really off in case this changes in future. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: export debug registers save function for KVMBharat Bhushan2013-10-181-1/+2
| | | | | | | | | KVM need this function when switching from vcpu to user-space thread. My subsequent patch will use this function. Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Acked-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Scott Wood <scottwood@freescale.com>
* powerpc: move debug registers in a structureBharat Bhushan2013-10-181-21/+21
| | | | | | | | | | This way we can use same data type struct with KVM and also help in using other debug related function. Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Acked-by: Michael Neuling <mikey@neuling.org> [scottwood@freescale.com: removed obvious debug_reg comment] Signed-off-by: Scott Wood <scottwood@freescale.com>
* powerpc: remove unnecessary line continuationsBharat Bhushan2013-10-181-1/+1
| | | | | | Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Acked-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Scott Wood <scottwood@freescale.com>
* powerpc: Provide for giveup_fpu/altivec to save state in alternate locationPaul Mackerras2013-10-111-0/+7
| | | | | | | | | | | | | | | | | | | | | This provides a facility which is intended for use by KVM, where the contents of the FP/VSX and VMX (Altivec) registers can be saved away to somewhere other than the thread_struct when kernel code wants to use floating point or VMX instructions. This is done by providing a pointer in the thread_struct to indicate where the state should be saved to. The giveup_fpu() and giveup_altivec() functions test these pointers and save state to the indicated location if they are non-NULL. Note that the MSR_FP/VEC bits in task->thread.regs->msr are still used to indicate whether the CPU register state is live, even when an alternate save location is being used. This also provides load_fp_state() and load_vr_state() functions, which load up FP/VSX and VMX state from memory into the CPU registers, and corresponding store_fp_state() and store_vr_state() functions, which store FP/VSX and VMX state into memory from the CPU registers. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: Put FP/VSX and VR state into structuresPaul Mackerras2013-10-111-5/+3
| | | | | | | | | | | | | | | | | | | | | | | This creates new 'thread_fp_state' and 'thread_vr_state' structures to store FP/VSX state (including FPSCR) and Altivec/VSX state (including VSCR), and uses them in the thread_struct. In the thread_fp_state, the FPRs and VSRs are represented as u64 rather than double, since we rarely perform floating-point computations on the values, and this will enable the structures to be used in KVM code as well. Similarly FPSCR is now a u64 rather than a structure of two 32-bit values. This takes the offsets out of the macros such as SAVE_32FPRS, REST_32FPRS, etc. This enables the same macros to be used for normal and transactional state, enabling us to delete the transactional versions of the macros. This also removes the unused do_load_up_fpu and do_load_up_altivec, which were in fact buggy since they didn't create large enough stack frames to account for the fact that load_up_fpu and load_up_altivec are not designed to be called from C and assume that their caller's stack frame is an interrupt frame. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: Remove ksp_limit on ppc64Benjamin Herrenschmidt2013-09-251-1/+2
| | | | | | | | | | | | | | | | | | | | | | | We've been keeping that field in thread_struct for a while, it contains the "limit" of the current stack pointer and is meant to be used for detecting stack overflows. It has a few problems however: - First, it was never actually *used* on 64-bit. Set and updated but not actually exploited - When switching stack to/from irq and softirq stacks, it's update is racy unless we hard disable interrupts, which is costly. This is fine on 32-bit as we don't soft-disable there but not on 64-bit. Thus rather than fixing 2 in order to implement 1 in some hypothetical future, let's remove the code completely from 64-bit. In order to avoid a clutter of ifdef's, we remove the updates from C code completely during interrupt stack switching, and instead maintain it from the asm helper that is used to do the stack switching in the first place. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* Merge branch 'merge' into nextBenjamin Herrenschmidt2013-08-271-0/+10
|\ | | | | | | | | Merge stuff that already went into Linus via "merge" which are pre-reqs for subsequent patches
| * powerpc: Save the TAR register earlierMichael Neuling2013-08-091-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This moves us to save the Target Address Register (TAR) a earlier in __switch_to. It introduces a new function save_tar() to do this. We need to save the TAR earlier as we will overwrite it in the transactional memory reclaim/recheckpoint path. We are going to do this in a subsequent patch which will fix saving the TAR register when it's modified inside a transaction. Signed-off-by: Michael Neuling <mikey@neuling.org> Cc: <stable@vger.kernel.org> [v3.10] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc: Make flush_fp_to_thread() nop when CONFIG_PPC_FPU is disabledKevin Hao2013-08-141-0/+2
|/ | | | | | | | | | | | | | | In the current kernel, the function flush_fp_to_thread() is not dependent on CONFIG_PPC_FPU. So most invocations of this function is not wrapped by CONFIG_PPC_FPU. Even through we don't really save the FPRs to the thread struct if CONFIG_PPC_FPU is not enabled, but there does have some runtime overhead such as the check for tsk->thread.regs and preempt disable and enable. It really make no sense to do that. So make it a nop when CONFIG_PPC_FPU is disabled. Also remove the wrapped #ifdef CONFIG_PPC_FPU when invoking this function. Signed-off-by: Kevin Hao <haokexin@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* Merge tag 'v3.10' into nextBenjamin Herrenschmidt2013-07-011-2/+2
|\ | | | | | | | | | | Merge 3.10 in order to get some of the last minute powerpc changes, resolve conflicts and add additional fixes on top of them.
| * powerpc: Fix stack overflow crash in resume_kernel when ftracingMichael Ellerman2013-06-151-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It's possible for us to crash when running with ftrace enabled, eg: Bad kernel stack pointer bffffd12 at c00000000000a454 cpu 0x3: Vector: 300 (Data Access) at [c00000000ffe3d40] pc: c00000000000a454: resume_kernel+0x34/0x60 lr: c00000000000335c: performance_monitor_common+0x15c/0x180 sp: bffffd12 msr: 8000000000001032 dar: bffffd12 dsisr: 42000000 If we look at current's stack (paca->__current->stack) we see it is equal to c0000002ecab0000. Our stack is 16K, and comparing to paca->kstack (c0000002ecab3e30) we can see that we have overflowed our kernel stack. This leads to us writing over our struct thread_info, and in this case we have corrupted thread_info->flags and set _TIF_EMULATE_STACK_STORE. Dumping the stack we see: 3:mon> t c0000002ecab0000 [c0000002ecab0000] c00000000002131c .performance_monitor_exception+0x5c/0x70 [c0000002ecab0080] c00000000000335c performance_monitor_common+0x15c/0x180 --- Exception: f01 (Performance Monitor) at c0000000000fb2ec .trace_hardirqs_off+0x1c/0x30 [c0000002ecab0370] c00000000016fdb0 .trace_graph_entry+0xb0/0x280 (unreliable) [c0000002ecab0410] c00000000003d038 .prepare_ftrace_return+0x98/0x130 [c0000002ecab04b0] c00000000000a920 .ftrace_graph_caller+0x14/0x28 [c0000002ecab0520] c0000000000d6b58 .idle_cpu+0x18/0x90 [c0000002ecab05a0] c00000000000a934 .return_to_handler+0x0/0x34 [c0000002ecab0620] c00000000001e660 .timer_interrupt+0x160/0x300 [c0000002ecab06d0] c0000000000025dc decrementer_common+0x15c/0x180 --- Exception: 901 (Decrementer) at c0000000000104d4 .arch_local_irq_restore+0x74/0xa0 [c0000002ecab09c0] c0000000000fe044 .trace_hardirqs_on+0x14/0x30 (unreliable) [c0000002ecab0fb0] c00000000016fe3c .trace_graph_entry+0x13c/0x280 [c0000002ecab1050] c00000000003d038 .prepare_ftrace_return+0x98/0x130 [c0000002ecab10f0] c00000000000a920 .ftrace_graph_caller+0x14/0x28 [c0000002ecab1160] c0000000000161f0 .__ppc64_runlatch_on+0x10/0x40 [c0000002ecab11d0] c00000000000a934 .return_to_handler+0x0/0x34 --- Exception: 901 (Decrementer) at c0000000000104d4 .arch_local_irq_restore+0x74/0xa0 ... and so on __ppc64_runlatch_on() is called from RUNLATCH_ON in the exception entry path. At that point the irq state is not consistent, ie. interrupts are hard disabled (by the exception entry), but the paca soft-enabled flag may be out of sync. This leads to the local_irq_restore() in trace_graph_entry() actually enabling interrupts, which we do not want. Because we have not yet reprogrammed the decrementer we immediately take another decrementer exception, and recurse. The fix is twofold. Firstly make sure we call DISABLE_INTS before calling RUNLATCH_ON. The badly named DISABLE_INTS actually reconciles the irq state in the paca with the hardware, making it safe again to call local_irq_save/restore(). Although that should be sufficient to fix the bug, we also mark the runlatch routines as notrace. They are called very early in the exception entry and we are asking for trouble tracing them. They are also fairly uninteresting and tracing them just adds unnecessary overhead. [ This regression was introduced by fe1952fc0afb9a2e4c79f103c08aef5d13db1873 "powerpc: Rework runlatch code" by myself --BenH ] CC: <stable@vger.kernel.org> [v3.4+] Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc/perf: Core EBB support for 64-bit book3sMichael Ellerman2013-07-011-0/+4
|/ | | | | | | | | | | | | | | | Add support for EBB (Event Based Branches) on 64-bit book3s. See the included documentation for more details. EBBs are a feature which allows the hardware to branch directly to a specified user space address when a PMU event overflows. This can be used by programs for self-monitoring with no kernel involvement in the inner loop. Most of the logic is in the generic book3s code, primarily to avoid a proliferation of PMU callbacks. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/hw_breakpoints: Add DABRX cpu feature to fix 32-bit regressionMichael Neuling2013-06-101-1/+2
| | | | | | | | | | | | | | | | | | | | | | When introducing support for DABRX in 4474ef0, we broke older 32-bit CPUs that don't have that register. Some CPUs have a DABR but not DABRX. Configuration are: - No 32bit CPUs have DABRX but some have DABR. - POWER4+ and below have the DABR but no DABRX. - 970 and POWER5 and above have DABR and DABRX. - POWER8 has DAWR, hence no DABRX. This introduces CPU_FTR_DABRX and sets it on appropriate CPUs. We use the top 64 bits for CPU FTR bits since only 64 bit CPUs have this. Processors that don't have the DABRX will still work as they will fall back to software filtering these breakpoints via perf_exclude_event(). Signed-off-by: Michael Neuling <mikey@neuling.org> Reported-by: "Gorelik, Jacob (335F)" <jacob.gorelik@jpl.nasa.gov> cc: stable@vger.kernel.org (v3.9 only) Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/booke64: Fix kernel hangs at kernel_dbg_excScott Wood2013-05-141-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | MSR_DE is not cleared on entry to the kernel, and we don't clear it explicitly outside of debug code. If we have MSR_DE set in prime_debug_regs(), and the new thread has events enabled in DBCR0 (e.g. ICMP is set in thread->dbsr0, even though it was cleared in the real DBCR0 when the thread got scheduled out), we'll end up taking a debug exception in the kernel when DBCR0 is loaded. DSRR0 will not point to an exception vector, and the kernel ends up hanging at kernel_dbg_exc. Fix this by always clearing MSR_DE when we load new debug state. Another observed source of kernel_dbg_exc hangs is with the branch taken event. If this event is active, but we take a non-debug trap (e.g. a TLB miss or an asynchronous interrupt) before the next branch. We end up taking a branch-taken debug exception on the initial branch instruction of the exception vector, but because the debug exception is DBSR_BT rather than DBSR_IC we branch to kernel_dbg_exc before even checking the DSRR0 address. Fix this by checking for DBSR_BT as well as DBSR_IC, which is what 32-bit does and what the comments suggest was intended in the 64-bit code as well. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: Fix MAX_STACK_TRACE_ENTRIES too low warning againLi Zhong2013-05-141-0/+1
| | | | | | | | | | | Saw this warning again, and this time from the ret_from_fork path. It seems we could clear the back chain earlier in copy_thread(), which could cover both path, and also fix potential lockdep usage in schedule_tail(), or exception occurred before we clear the back chain. Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* Merge branch 'next' of ↵Linus Torvalds2013-05-021-4/+4
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc update from Benjamin Herrenschmidt: "The main highlights this time around are: - A pile of addition POWER8 bits and nits, such as updated performance counter support (Michael Ellerman), new branch history buffer support (Anshuman Khandual), base support for the new PCI host bridge when not using the hypervisor (Gavin Shan) and other random related bits and fixes from various contributors. - Some rework of our page table format by Aneesh Kumar which fixes a thing or two and paves the way for THP support. THP itself will not make it this time around however. - More Freescale updates, including Altivec support on the new e6500 cores, new PCI controller support, and a pile of new boards support and updates. - The usual batch of trivial cleanups & fixes" * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (156 commits) powerpc: Fix build error for book3e powerpc: Context switch the new EBB SPRs powerpc: Turn on the EBB H/FSCR bits powerpc: Replace CPU_FTR_BCTAR with CPU_FTR_ARCH_207S powerpc: Setup BHRB instructions facility in HFSCR for POWER8 powerpc: Fix interrupt range check on debug exception powerpc: Update tlbie/tlbiel as per ISA doc powerpc: Print page size info during boot powerpc: print both base and actual page size on hash failure powerpc: Fix hpte_decode to use the correct decoding for page sizes powerpc: Decode the pte-lp-encoding bits correctly. powerpc: Use encode avpn where we need only avpn values powerpc: Reduce PTE table memory wastage powerpc: Move the pte free routines from common header powerpc: Reduce the PTE_INDEX_SIZE powerpc: Switch 16GB and 16MB explicit hugepages to a different page table format powerpc: New hugepage directory format powerpc: Don't truncate pgd_index wrongly powerpc: Don't hard code the size of pte page powerpc: Save DAR and DSISR in pt_regs on MCE ...
| * Merge remote-tracking branch 'origin/master' into nextBenjamin Herrenschmidt2013-04-241-0/+2
| |\ | | | | | | | | | Merge upstream to get the audit fixes
| * | ptrace/powerpc: Don't flush_ptrace_hw_breakpoint() on fork()Oleg Nesterov2013-04-231-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | arch_dup_task_struct() does flush_ptrace_hw_breakpoint(src), this destroys the parent's breakpoints for no reason. We should clear child->thread.ptrace_bps[] copied by dup_task_struct() instead. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | | dump_stack: unify debug information printed by show_regs()Tejun Heo2013-04-301-6/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | show_regs() is inherently arch-dependent but it does make sense to print generic debug information and some archs already do albeit in slightly different forms. This patch introduces a generic function to print debug information from show_regs() so that different archs print out the same information and it's much easier to modify what's printed. show_regs_print_info() prints out the same debug info as dump_stack() does plus task and thread_info pointers. * Archs which didn't print debug info now do. alpha, arc, blackfin, c6x, cris, frv, h8300, hexagon, ia64, m32r, metag, microblaze, mn10300, openrisc, parisc, score, sh64, sparc, um, xtensa * Already prints debug info. Replaced with show_regs_print_info(). The printed information is superset of what used to be there. arm, arm64, avr32, mips, powerpc, sh32, tile, unicore32, x86 * s390 is special in that it used to print arch-specific information along with generic debug info. Heiko and Martin think that the arch-specific extra isn't worth keeping s390 specfic implementation. Converted to use the generic version. Note that now all archs print the debug info before actual register dumps. An example BUG() dump follows. kernel BUG at /work/os/work/kernel/workqueue.c:4841! invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.9.0-rc1-work+ #7 Hardware name: empty empty/S3992, BIOS 080011 10/26/2007 task: ffff88007c85e040 ti: ffff88007c860000 task.ti: ffff88007c860000 RIP: 0010:[<ffffffff8234a07e>] [<ffffffff8234a07e>] init_workqueues+0x4/0x6 RSP: 0000:ffff88007c861ec8 EFLAGS: 00010246 RAX: ffff88007c861fd8 RBX: ffffffff824466a8 RCX: 0000000000000001 RDX: 0000000000000046 RSI: 0000000000000001 RDI: ffffffff8234a07a RBP: ffff88007c861ec8 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff8234a07a R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88007dc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: ffff88015f7ff000 CR3: 00000000021f1000 CR4: 00000000000007f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Stack: ffff88007c861ef8 ffffffff81000312 ffffffff824466a8 ffff88007c85e650 0000000000000003 0000000000000000 ffff88007c861f38 ffffffff82335e5d ffff88007c862080 ffffffff8223d8c0 ffff88007c862080 ffffffff81c47760 Call Trace: [<ffffffff81000312>] do_one_initcall+0x122/0x170 [<ffffffff82335e5d>] kernel_init_freeable+0x9b/0x1c8 [<ffffffff81c47760>] ? rest_init+0x140/0x140 [<ffffffff81c4776e>] kernel_init+0xe/0xf0 [<ffffffff81c6be9c>] ret_from_fork+0x7c/0xb0 [<ffffffff81c47760>] ? rest_init+0x140/0x140 ... v2: Typo fix in x86-32. v3: CPU number dropped from show_regs_print_info() as dump_stack_print_info() has been updated to print it. s390 specific implementation dropped as requested by s390 maintainers. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Sam Ravnborg <sam@ravnborg.org> Acked-by: Chris Metcalf <cmetcalf@tilera.com> [tile bits] Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon bits] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | dump_stack: consolidate dump_stack() implementations and unify their behaviorsTejun Heo2013-04-301-6/+0
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Both dump_stack() and show_stack() are currently implemented by each architecture. show_stack(NULL, NULL) dumps the backtrace for the current task as does dump_stack(). On some archs, dump_stack() prints extra information - pid, utsname and so on - in addition to the backtrace while the two are identical on other archs. The usages in arch-independent code of the two functions indicate show_stack(NULL, NULL) should print out bare backtrace while dump_stack() is used for debugging purposes when something went wrong, so it does make sense to print additional information on the task which triggered dump_stack(). There's no reason to require archs to implement two separate but mostly identical functions. It leads to unnecessary subtle information. This patch expands the dummy fallback dump_stack() implementation in lib/dump_stack.c such that it prints out debug information (taken from x86) and invokes show_stack(NULL, NULL) and drops arch-specific dump_stack() implementations in all archs except blackfin. Blackfin's dump_stack() does something wonky that I don't understand. Debug information can be printed separately by calling dump_stack_print_info() so that arch-specific dump_stack() implementation can still emit the same debug information. This is used in blackfin. This patch brings the following behavior changes. * On some archs, an extra level in backtrace for show_stack() could be printed. This is because the top frame was determined in dump_stack() on those archs while generic dump_stack() can't do that reliably. It can be compensated by inlining dump_stack() but not sure whether that'd be necessary. * Most archs didn't use to print debug info on dump_stack(). They do now. An example WARN dump follows. WARNING: at kernel/workqueue.c:4841 init_workqueues+0x35/0x505() Hardware name: empty Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.9.0-rc1-work+ #9 0000000000000009 ffff88007c861e08 ffffffff81c614dc ffff88007c861e48 ffffffff8108f50f ffffffff82228240 0000000000000040 ffffffff8234a03c 0000000000000000 0000000000000000 0000000000000000 ffff88007c861e58 Call Trace: [<ffffffff81c614dc>] dump_stack+0x19/0x1b [<ffffffff8108f50f>] warn_slowpath_common+0x7f/0xc0 [<ffffffff8108f56a>] warn_slowpath_null+0x1a/0x20 [<ffffffff8234a071>] init_workqueues+0x35/0x505 ... v2: CPU number added to the generic debug info as requested by s390 folks and dropped the s390 specific dump_stack(). This loses %ksp from the debug message which the maintainers think isn't important enough to keep the s390-specific dump_stack() implementation. dump_stack_print_info() is moved to kernel/printk.c from lib/dump_stack.c. Because linkage is per objecct file, dump_stack_print_info() living in the same lib file as generic dump_stack() means that archs which implement custom dump_stack() - at this point, only blackfin - can't use dump_stack_print_info() as that will bring in the generic version of dump_stack() too. v1 The v1 patch broke build on blackfin due to this issue. The build breakage was reported by Fengguang Wu. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Vineet Gupta <vgupta@synopsys.com> Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> Acked-by: Vineet Gupta <vgupta@synopsys.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> [s390 bits] Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Sam Ravnborg <sam@ravnborg.org> Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon bits] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | powerpc: fix compiling CONFIG_PPC_TRANSACTIONAL_MEM when CONFIG_ALTIVEC=nMichael Neuling2013-04-101-0/+2
|/ | | | | | | | | | | | | | | | | | We can't compile a kernel with CONFIG_ALTIVEC=n when CONFIG_PPC_TRANSACTIONAL_MEM=y. We currently get: arch/powerpc/kernel/tm.S:320: Error: unsupported relocation against THREAD_VSCR arch/powerpc/kernel/tm.S:323: Error: unsupported relocation against THREAD_VR0 arch/powerpc/kernel/tm.S:323: Error: unsupported relocation against THREAD_VR0 etc. The below fixes this with a sprinkling of #ifdefs. This was found by mpe with kisskb: http://kisskb.ellerman.id.au/kisskb/buildresult/8539442/ Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
* powerpc: Hook in new transactional memory codeMichael Neuling2013-02-151-2/+13
| | | | | | | | | This hooks the new transactional memory code into context switching, FP/VMX/VMX unavailable and exception return. Signed-off-by: Matt Evans <matt@ozlabs.org> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: Add reclaim and recheckpoint functions for context switching ↵Michael Neuling2013-02-151-0/+112
| | | | | | | | | | | | | | | | | | | | transactional memory processes When we switch out a task, we need to save both the checkpointed and the speculated state into the thread struct. Similarly when we are switching in a task we need to load both the checkpointed and speculated state. If the task was using FP, we non-lazily reload both the original and the speculative FP register states. This is because the kernel doesn't see if/when a TM rollback occurs, so if we take an FP unavoidable later, we are unable to determine which set of FP regs need to be restored. This simply adds these functions. It doesn't hook them into the existing code yet. Signed-off-by: Matt Evans <matt@ozlabs.org> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: Add transactional memory paca scratch register to show_regsMichael Neuling2013-02-151-0/+3
| | | | | | | | | Add transactional memory paca scratch register to show_regs. This is useful for debugging. Signed-off-by: Matt Evans <matt@ozlabs.org> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: New macros for transactional memory supportMichael Neuling2013-02-151-0/+7
| | | | | | | | | | | | This adds new macros for saving and restoring checkpointed architected state from and to the thread_struct. It also adds some debugging macros for when your brain explodes trying to debug your transactional memory enabled kernel. Signed-off-by: Matt Evans <matt@ozlabs.org> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: Add length setting to set_dawrMichael Neuling2013-01-291-1/+9
| | | | | | | | | | Currently we set the length field in the DAWR to 0 which defaults it to one double word (64bits) which is the same as the DABR. Change this so that we can set it to longer values as supported by the DAWR. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: Rename set_break to avoid naming conflictMichael Neuling2013-01-161-3/+3
| | | | | | | | | | | | | | | With allmodconfig we are getting: drivers/tty/synclink_gt.c:160:12: error: conflicting types for 'set_break' arch/powerpc/include/asm/debug.h:49:5: note: previous declaration of 'set_break' was here drivers/tty/synclinkmp.c:526:12: error: conflicting types for 'set_break' arch/powerpc/include/asm/debug.h:49:5: note: previous declaration of 'set_break' was here This renames set_break to set_breakpoint to avoid this naming conflict Signed-off-by: Michael Neuling <mikey@neuling.org> Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: Add the DAWR support to the set_break()Michael Neuling2013-01-101-0/+23
| | | | | | | | | | | | This adds DAWR supoprt to the set_break(). It does both bare metal and PAPR versions of setting the DAWR. There is still some work we can do to make full use of the watchpoint but that will come later. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: Hardware breakpoints rewrite to handle non DABR breakpoint registersMichael Neuling2013-01-101-21/+54
| | | | | | | | | | | | | This is a rewrite so that we don't assume we are using the DABR throughout the code. We now use the arch_hw_breakpoint to store the breakpoint in a generic manner in the thread_struct, rather than storing the raw DABR value. The ptrace GET/SET_DEBUGREG interface currently passes the raw DABR in from userspace. We keep this functionality, so that future changes (like the POWER8 DAWR), will still fake the DABR to userspace. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: Define ppr in thread_structHaren Myneni2013-01-101-0/+2
| | | | | | | | | | | | | [PATCH 4/6] powerpc: Define ppr in thread_struct ppr in thread_struct is used to save PPR and restore it before process exits from kernel. This patch sets the default priority to 3 when tasks are created such that users can use 4 for higher priority tasks. Signed-off-by: Haren Myneni <haren@us.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* flagday: don't pass regs to copy_thread()Al Viro2012-11-281-2/+2
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* powerpc: switch to generic fork/clone/vforkAl Viro2012-11-281-23/+0
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* powerpc: make fork_idle() take the common "kernel thread" path in copy_thread()Al Viro2012-10-211-1/+1
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* powerpc: put the "zero usp means using parent's stack pointer" to copy_thread()Al Viro2012-10-211-5/+4
| | | | | | simplifies callers, at that... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* powerpc: don't bother with CHECK_FULL_REGS in sys_fork() et.al.Al Viro2012-10-211-3/+0
| | | | | | copy_thread() will do it anyway. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* powerpc: don't bother with zero-extending arguments in sys_clone()Al Viro2012-10-211-8/+0
| | | | | | ... since the syscall glue had been doing that for 9 years already. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* powerpc: take dereferencing to ret_from_kernel_thread()Al Viro2012-10-211-3/+1
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* powerpc: don't mess with r2 in copy_thread() and friendsAl Viro2012-10-141-2/+0
| | | | | | | | kernel_thread() callbacks are *not* in modules and are not going to be there. And it's not even read in ppc32 ret_from_kernel_thread(), so no need to bother with it there either. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* powerpc: switch to saner kernel_execve() semanticsAl Viro2012-10-141-10/+3
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* Merge branch 'for-linus' of ↵Linus Torvalds2012-10-121-32/+27
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal Pull pile 2 of execve and kernel_thread unification work from Al Viro: "Stuff in there: kernel_thread/kernel_execve/sys_execve conversions for several more architectures plus assorted signal fixes and cleanups. There'll be more (in particular, real fixes for the alpha do_notify_resume() irq mess)..." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: (43 commits) alpha: don't open-code trace_report_syscall_{enter,exit} Uninclude linux/freezer.h m32r: trim masks avr32: trim masks tile: don't bother with SIGTRAP in setup_frame microblaze: don't bother with SIGTRAP in setup_rt_frame() mn10300: don't bother with SIGTRAP in setup_frame() frv: no need to raise SIGTRAP in setup_frame() x86: get rid of duplicate code in case of CONFIG_VM86 unicore32: remove pointless test h8300: trim _TIF_WORK_MASK parisc: decide whether to go to slow path (tracesys) based on thread flags parisc: don't bother looping in do_signal() parisc: fix double restarts bury the rest of TIF_IRET sanitize tsk_is_polling() bury _TIF_RESTORE_SIGMASK unicore32: unobfuscate _TIF_WORK_MASK mips: NOTIFY_RESUME is not needed in TIF masks mips: merge the identical "return from syscall" per-ABI code ... Conflicts: arch/arm/include/asm/thread_info.h
| * powerpc: switch to generic sys_execve()/kernel_execve()Al Viro2012-09-301-19/+6
| | | | | | | | | | | | | | | | the only non-obvious part is that current_pt_regs() is really needed here - task_pt_regs() is NULL for kernel threads; it's OK for ptrace uses (the thing task_pt_regs() is intended for), but not for us. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * powerpc: split ret_from_forkAl Viro2012-09-301-13/+21
| | | | | | | | | | | | ... and get rid of in-kernel syscalls in kernel_thread() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
OpenPOWER on IntegriCloud