summaryrefslogtreecommitdiffstats
path: root/arch/x86/kernel/signal.c
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds2015-04-181-11/+11
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "This tree includes: - an FPU related crash fix - a ptrace fix (with matching testcase in tools/testing/selftests/) - an x86 Kconfig DMA-config defaults tweak to better avoid non-working drivers" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: config: Enable NEED_DMA_MAP_STATE by default when SWIOTLB is selected x86/fpu: Load xsave pointer *after* initialization x86/ptrace: Fix the TIF_FORCED_TF logic in handle_signal() x86, selftests: Add single_step_syscall test
| * x86/ptrace: Fix the TIF_FORCED_TF logic in handle_signal()Oleg Nesterov2015-04-161-11/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the TIF_SINGLESTEP tracee dequeues a signal, handle_signal() clears TIF_FORCED_TF and X86_EFLAGS_TF but leaves TIF_SINGLESTEP set. If the tracer does PTRACE_SINGLESTEP again, enable_single_step() sets X86_EFLAGS_TF but not TIF_FORCED_TF. This means that the subsequent PTRACE_CONT doesn't not clear X86_EFLAGS_TF, and the tracee gets the wrong SIGTRAP. Test-case (needs -O2 to avoid prologue insns in signal handler): #include <unistd.h> #include <stdio.h> #include <sys/ptrace.h> #include <sys/wait.h> #include <sys/user.h> #include <assert.h> #include <stddef.h> void handler(int n) { asm("nop"); } int child(void) { assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0); signal(SIGALRM, handler); kill(getpid(), SIGALRM); return 0x23; } void *getip(int pid) { return (void*)ptrace(PTRACE_PEEKUSER, pid, offsetof(struct user, regs.rip), 0); } int main(void) { int pid, status; pid = fork(); if (!pid) return child(); assert(wait(&status) == pid); assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGALRM); assert(ptrace(PTRACE_SINGLESTEP, pid, 0, SIGALRM) == 0); assert(wait(&status) == pid); assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP); assert((getip(pid) - (void*)handler) == 0); assert(ptrace(PTRACE_SINGLESTEP, pid, 0, SIGALRM) == 0); assert(wait(&status) == pid); assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP); assert((getip(pid) - (void*)handler) == 1); assert(ptrace(PTRACE_CONT, pid, 0,0) == 0); assert(wait(&status) == pid); assert(WIFEXITED(status) && WEXITSTATUS(status) == 0x23); return 0; } The last assert() fails because PTRACE_CONT wrongly triggers another single-step and X86_EFLAGS_TF can't be cleared by debugger until the tracee does sys_rt_sigreturn(). Change handle_signal() to do user_disable_single_step() if stepping, we do not need to preserve TIF_SINGLESTEP because we are going to do ptrace_notify(), and it is simply wrong to leak this bit. While at it, change the comment to explain why we also need to clear TF unconditionally after setup_rt_frame(). Note: in the longer term we should probably change setup_sigcontext() to use get_flags() and then just remove this user_disable_single_step(). And, the state of TIF_FORCED_TF can be wrong after restore_sigcontext() which can set/clear TF, this needs another fix. This fix fixes the 'single_step_syscall_32' testcase in the x86 testsuite: Before: ~/linux/tools/testing/selftests/x86> ./single_step_syscall_32 [RUN] Set TF and check nop [OK] Survived with TF set and 9 traps [RUN] Set TF and check int80 [OK] Survived with TF set and 9 traps [RUN] Set TF and check a fast syscall [WARN] Hit 10000 SIGTRAPs with si_addr 0xf7789cc0, ip 0xf7789cc0 Trace/breakpoint trap (core dumped) After: ~/linux/linux/tools/testing/selftests/x86> ./single_step_syscall_32 [RUN] Set TF and check nop [OK] Survived with TF set and 9 traps [RUN] Set TF and check int80 [OK] Survived with TF set and 9 traps [RUN] Set TF and check a fast syscall [OK] Survived with TF set and 39 traps [RUN] Fast syscall with TF cleared [OK] Nothing unexpected happened Reported-by: Evan Teran <eteran@alum.rit.edu> Reported-by: Pedro Alves <palves@redhat.com> Tested-by: Andres Freund <andres@anarazel.de> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> [ Added x86 self-test info. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | Merge branch 'exec_domain_rip_v2' of ↵Linus Torvalds2015-04-151-15/+1
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rw/misc Pull exec domain removal from Richard Weinberger: "This series removes execution domain support from Linux. The idea behind exec domains was to support different ABIs. The feature was never complete nor stable. Let's rip it out and make the kernel signal handling code less complicated" * 'exec_domain_rip_v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/misc: (27 commits) arm64: Removed unused variable sparc: Fix execution domain removal Remove rest of exec domains. arch: Remove exec_domain from remaining archs arc: Remove signal translation and exec_domain xtensa: Remove signal translation and exec_domain xtensa: Autogenerate offsets in struct thread_info x86: Remove signal translation and exec_domain unicore32: Remove signal translation and exec_domain um: Remove signal translation and exec_domain tile: Remove signal translation and exec_domain sparc: Remove signal translation and exec_domain sh: Remove signal translation and exec_domain s390: Remove signal translation and exec_domain mn10300: Remove signal translation and exec_domain microblaze: Remove signal translation and exec_domain m68k: Remove signal translation and exec_domain m32r: Remove signal translation and exec_domain m32r: Autogenerate offsets in struct thread_info frv: Remove signal translation and exec_domain ...
| * x86: Remove signal translation and exec_domainRichard Weinberger2015-04-121-15/+1
| | | | | | | | | | | | | | | | As execution domain support is gone we can remove signal translation from the signal code and remove exec_domain from thread_info. Signed-off-by: Richard Weinberger <richard@nod.at>
* | Merge branch 'x86-fpu-for-linus' of ↵Linus Torvalds2015-04-131-1/+1
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fpu changes from Ingo Molnar: "Various x86 FPU handling cleanups, refactorings and fixes (Borislav Petkov, Oleg Nesterov, Rik van Riel)" * 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits) x86/fpu: Kill eager_fpu_init_bp() x86/fpu: Don't allocate fpu->state for swapper/0 x86/fpu: Rename drop_init_fpu() to fpu_reset_state() x86/fpu: Fold __drop_fpu() into its sole user x86/fpu: Don't abuse drop_init_fpu() in flush_thread() x86/fpu: Use restore_init_xstate() instead of math_state_restore() on kthread exec x86/fpu: Introduce restore_init_xstate() x86/fpu: Document user_fpu_begin() x86/fpu: Factor out memset(xstate, 0) in fpu_finit() paths x86/fpu: Change xstateregs_get()/set() to use ->xsave.i387 rather than ->fxsave x86/fpu: Don't abuse FPU in kernel threads if use_eager_fpu() x86/fpu: Always allow FPU in interrupt if use_eager_fpu() x86/fpu: __kernel_fpu_begin() should clear fpu_owner_task even if use_eager_fpu() x86/fpu: Also check fpu_lazy_restore() when use_eager_fpu() x86/fpu: Use task_disable_lazy_fpu_restore() helper x86/fpu: Use an explicit if/else in switch_fpu_prepare() x86/fpu: Introduce task_disable_lazy_fpu_restore() helper x86/fpu: Move lazy restore functions up a few lines x86/fpu: Change math_error() to use unlazy_fpu(), kill (now) unused save_init_fpu() x86/fpu: Don't do __thread_fpu_end() if use_eager_fpu() ...
| * | x86/fpu: Rename drop_init_fpu() to fpu_reset_state()Borislav Petkov2015-03-231-1/+1
| |/ | | | | | | | | | | | | | | | | | | | | | | | | Call it what it does and in accordance with the context where it is used: we reset the FPU state either because we were unable to restore it from the one saved in the task or because we simply want to reset it. Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | x86/signal: Remove pax argument from restore_sigcontextBrian Gerst2015-04-061-14/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The 'pax' argument is unnecesary. Instead, store the RAX value directly in regs. This pattern goes all the way back to 2.1.106pre1, when restore_sigcontext() was changed to return an error code instead of EAX directly: https://git.kernel.org/cgit/linux/kernel/git/history/history.git/diff/arch/i386/kernel/signal.c?id=9a8f8b7ca3f319bd668298d447bdf32730e51174 In 2007 sigaltstack syscall support was added, where the return value of restore_sigcontext() was changed to carry the memory-copying failure code. But instead of putting 'ax' into regs->ax directly, it was carried in via a pointer and then returned, where the generic syscall return code copied it to regs->ax. So there was never any deeper reason for this suboptimal pattern, it was simply never noticed after being introduced. Signed-off-by: Brian Gerst <brgerst@gmail.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1428152303-17154-1-git-send-email-brgerst@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | x86/asm/entry: Fix execve() and sigreturn() syscalls to always return via IRETBrian Gerst2015-03-231-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Both the execve() and sigreturn() family of syscalls have the ability to change registers in ways that may not be compatabile with the syscall path they were called from. In particular, SYSRET and SYSEXIT can't handle non-default %cs and %ss, and some bits in eflags. These syscalls have stubs that are hardcoded to jump to the IRET path, and not return to the original syscall path. The following commit: 76f5df43cab5e76 ("Always allocate a complete "struct pt_regs" on the kernel stack") recently changed this for some 32-bit compat syscalls, but introduced a bug where execve from a 32-bit program to a 64-bit program would fail because it still returned via SYSRETL. This caused Wine to fail when built for both 32-bit and 64-bit. This patch sets TIF_NOTIFY_RESUME for execve() and sigreturn() so that the IRET path is always taken on exit to userspace. Signed-off-by: Brian Gerst <brgerst@gmail.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1426978461-32089-1-git-send-email-brgerst@gmail.com [ Improved the changelog and comments. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | x86/signal/64: Remove 'fs' and 'gs' from sigcontextAndy Lutomirski2015-03-171-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As far as I can tell, these fields have been set to zero on save and ignored on restore since Linux was imported into git. Rename them '__pad1' and '__pad2' to avoid confusion. This may also allow us to recycle them some day. This also adds a comment clarifying the history of those fields. I'm intentionally avoiding calling either of them '__pad0': the field formerly known as '__pad0' is now 'ss'. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/844f8490e938780c03355be4c9b69eb4c494bf4e.1426193719.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | x86/signal/64: Fix SS handling for signals delivered to 64-bit programsAndy Lutomirski2015-03-171-9/+13
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The comment in the signal code says that apps can save/restore other segments on their own. It's true that apps can *save* SS on their own, but there's no way for apps to restore it: SYSCALL effectively resets SS to __USER_DS, so any value that user code tries to load into SS gets lost on entry to sigreturn. This recycles two padding bytes in the segment selector area for SS. While we're at it, we need a second change to make this useful. If the signal we're delivering is caused by a bad SS value, saving that value isn't enough. We need to remove that bad value from the regs before we try to deliver the signal. Oddly, the i386 code already got this right. I suspect that 64-bit programs that try to run 16-bit code and use signals will have a lot of trouble without this. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Borislav Petkov <bp@suse.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/405594361340a2ec32f8e2b115c142df0e180d8e.1426193719.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
* all arches, signal: move restart_block to struct task_structAndy Lutomirski2015-02-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If an attacker can cause a controlled kernel stack overflow, overwriting the restart block is a very juicy exploit target. This is because the restart_block is held in the same memory allocation as the kernel stack. Moving the restart block to struct task_struct prevents this exploit by making the restart_block harder to locate. Note that there are other fields in thread_info that are also easy targets, at least on some architectures. It's also a decent simplification, since the restart code is more or less identical on all architectures. [james.hogan@imgtec.com: metag: align thread_info::supervisor_stack] Signed-off-by: Andy Lutomirski <luto@amacapital.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: David Miller <davem@davemloft.net> Acked-by: Richard Weinberger <richard@nod.at> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no> Cc: Steven Miao <realmz6@gmail.com> Cc: Mark Salter <msalter@redhat.com> Cc: Aurelien Jacquiot <a-jacquiot@ti.com> Cc: Mikael Starvik <starvik@axis.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: David Howells <dhowells@redhat.com> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Michal Simek <monstr@monstr.eu> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Jonas Bonn <jonas@southpole.se> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Helge Deller <deller@gmx.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Tested-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Chen Liqin <liqin.linux@gmail.com> Cc: Lennox Wu <lennox.wu@gmail.com> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Chris Zankel <chris@zankel.net> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: James Hogan <james.hogan@imgtec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* x86, mce: Get rid of TIF_MCE_NOTIFY and associated mce tricksLuck, Tony2015-01-071-6/+0
| | | | | | | | | | We now switch to the kernel stack when a machine check interrupts during user mode. This means that we can perform recovery actions in the tail of do_machine_check() Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Andy Lutomirski <luto@amacapital.net>
* x86, fpu: shift drop_init_fpu() from save_xstate_sig() to handle_signal()Oleg Nesterov2014-09-021-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | save_xstate_sig()->drop_init_fpu() doesn't look right. setup_rt_frame() can fail after that, in this case the next setup_rt_frame() triggered by SIGSEGV won't save fpu simply because the old state was lost. This obviously mean that fpu won't be restored after sys_rt_sigreturn() from SIGSEGV handler. Shift drop_init_fpu() into !failed branch in handle_signal(). Test-case (needs -O2): #include <stdio.h> #include <signal.h> #include <unistd.h> #include <sys/syscall.h> #include <sys/mman.h> #include <pthread.h> #include <assert.h> volatile double D; void test(double d) { int pid = getpid(); for (D = d; D == d; ) { /* sys_tkill(pid, SIGHUP); asm to avoid save/reload * fp regs around "C" call */ asm ("" : : "a"(200), "D"(pid), "S"(1)); asm ("syscall" : : : "ax"); } printf("ERR!!\n"); } void sigh(int sig) { } char altstack[4096 * 10] __attribute__((aligned(4096))); void *tfunc(void *arg) { for (;;) { mprotect(altstack, sizeof(altstack), PROT_READ); mprotect(altstack, sizeof(altstack), PROT_READ|PROT_WRITE); } } int main(void) { stack_t st = { .ss_sp = altstack, .ss_size = sizeof(altstack), .ss_flags = SS_ONSTACK, }; struct sigaction sa = { .sa_handler = sigh, }; pthread_t pt; sigaction(SIGSEGV, &sa, NULL); sigaltstack(&st, NULL); sa.sa_flags = SA_ONSTACK; sigaction(SIGHUP, &sa, NULL); pthread_create(&pt, NULL, tfunc, NULL); test(123.456); return 0; } Reported-by: Bean Anderson <bean@azulsystems.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Link: http://lkml.kernel.org/r/20140902175713.GA21646@redhat.com Cc: <stable@kernel.org> # v3.7+ Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* x86_32, signal: Fix vdso rt_sigreturnAndy Lutomirski2014-06-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | This commit: commit 6f121e548f83674ab4920a4e60afb58d4f61b829 Author: Andy Lutomirski <luto@amacapital.net> Date: Mon May 5 12:19:34 2014 -0700 x86, vdso: Reimplement vdso.so preparation in build-time C Contained this obvious typo: - restorer = VDSO32_SYMBOL(current->mm->context.vdso, rt_sigreturn); + restorer = current->mm->context.vdso + + selected_vdso32->sym___kernel_sigreturn; Note the missing 'rt_' in the new code. Fix it. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Link: http://lkml.kernel.org/r/1eb40ad923acde2e18357ef2832867432e70ac42.1403361010.git.luto@amacapital.net Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* x86, vdso: Reimplement vdso.so preparation in build-time CAndy Lutomirski2014-05-051-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, vdso.so files are prepared and analyzed by a combination of objcopy, nm, some linker script tricks, and some simple ELF parsers in the kernel. Replace all of that with plain C code that runs at build time. All five vdso images now generate .c files that are compiled and linked in to the kernel image. This should cause only one userspace-visible change: the loaded vDSO images are stripped more heavily than they used to be. Everything outside the loadable segment is dropped. In particular, this causes the section table and section name strings to be missing. This should be fine: real dynamic loaders don't load or inspect these tables anyway. The result is roughly equivalent to eu-strip's --strip-sections option. The purpose of this change is to enable the vvar and hpet mappings to be moved to the page following the vDSO load segment. Currently, it is possible for the section table to extend into the page after the load segment, so, if we map it, it risks overlapping the vvar or hpet page. This happens whenever the load segment is just under a multiple of PAGE_SIZE. The only real subtlety here is that the old code had a C file with inline assembler that did 'call VDSO32_vsyscall' and a linker script that defined 'VDSO32_vsyscall = __kernel_vsyscall'. This most likely worked by accident: the linker script entry defines a symbol associated with an address as opposed to an alias for the real dynamic symbol __kernel_vsyscall. That caused ld to relocate the reference at link time instead of leaving an interposable dynamic relocation. Since the VDSO32_vsyscall hack is no longer needed, I now use 'call __kernel_vsyscall', and I added -Bsymbolic to make it work. vdso2c will generate an error and abort the build if the resulting image contains any dynamic relocations, so we won't silently generate bad vdso images. (Dynamic relocations are a problem because nothing will even attempt to relocate the vdso.) Signed-off-by: Andy Lutomirski <luto@amacapital.net> Link: http://lkml.kernel.org/r/2c4fcf45524162a34d87fdda1eb046b2a5cecee7.1399317206.git.luto@amacapital.net Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* Merge branch 'x86-smap-for-linus' of ↵Linus Torvalds2013-09-041-3/+3
|\ | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 SMAP fixes from Ingo Molnar: "Fixes for Intel SMAP support, to fix SIGSEGVs during bootup" * 'x86-smap-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Introduce [compat_]save_altstack_ex() to unbreak x86 SMAP x86, smap: Handle csum_partial_copy_*_user()
| * Introduce [compat_]save_altstack_ex() to unbreak x86 SMAPAl Viro2013-09-011-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For performance reasons, when SMAP is in use, SMAP is left open for an entire put_user_try { ... } put_user_catch(); block, however, calling __put_user() in the middle of that block will close SMAP as the STAC..CLAC constructs intentionally do not nest. Furthermore, using __put_user() rather than put_user_ex() here is bad for performance. Thus, introduce new [compat_]save_altstack_ex() helpers that replace __[compat_]save_altstack() for x86, being currently the only architecture which supports put_user_try { ... } put_user_catch(). Reported-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: <stable@vger.kernel.org> # v3.8+ Link: http://lkml.kernel.org/n/tip-es5p6y64if71k8p5u08agv9n@git.kernel.org
* | x86, asmlinkage: Make several variables used from assembler/linker script ↵Andi Kleen2013-08-061-1/+1
| | | | | | | | | | | | | | | | | | | | visible Plus one function, load_gs_index(). Signed-off-by: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1375740170-7446-10-git-send-email-andi@firstfloor.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* | x86, asmlinkage: Make various syscalls asmlinkageAndi Kleen2013-08-061-2/+2
|/ | | | | | | | | | FWIW I suspect sys_rt_sigreturn/sys_sigreturn should use standard SYSCALL wrappers. But I didn't do that change in this patch. Signed-off-by: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1375740170-7446-7-git-send-email-andi@firstfloor.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* x86/signals: Merge EFLAGS bit clearing into a single statementJiri Olsa2013-05-281-7/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | Merging EFLAGS bit clearing into a single statement, to ensure EFLAGS bits are being cleared in a single instruction. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Tested-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com> Originally-Reported-by: Vince Weaver <vincent.weaver@maine.edu> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Stephane Eranian <eranian@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1367421944-19082-4-git-send-email-jolsa@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* x86/signals: Clear RF EFLAGS bit for signal handlerJiri Olsa2013-05-281-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Clearing RF EFLAGS bit for signal handler. The reason is that this flag is set by debug exception code to prevent the recursive exception entry. Leaving it set for signal handler might prevent debug exception of the signal handler itself. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Tested-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com> Originally-Reported-by: Vince Weaver <vincent.weaver@maine.edu> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Stephane Eranian <eranian@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1367421944-19082-3-git-send-email-jolsa@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* x86/signals: Propagate RF EFLAGS bit through the signal restore callJiri Olsa2013-05-281-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While porting Vince's perf overflow tests I found perf event breakpoint overflow does not work properly. I found the x86 RF EFLAG bit not being set when returning from debug exception after triggering signal handler. Which is exactly what you get when you set perf breakpoint overflow SIGIO handler. This patch and the next two patches fix the underlying bugs. This patch adds the RF EFLAGS bit to be restored on return from signal from the original register context before the signal was entered. This will prevent the RF flag to disappear when returning from exception due to the signal handler being executed. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Tested-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com> Originally-Reported-by: Vince Weaver <vincent.weaver@maine.edu> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Stephane Eranian <eranian@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1367421944-19082-2-git-send-email-jolsa@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* x86: convert to ksignalAl Viro2013-02-141-63/+54
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* x86: switch to generic old sigactionAl Viro2013-02-031-47/+0
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* x86,um: switch to generic old sigsuspend()Al Viro2013-02-031-11/+0
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* x86: get rid of pt_regs argument in sigreturn variantsAl Viro2013-02-031-3/+6
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* Merge branch 'for-linus' of ↵Linus Torvalds2012-12-201-24/+5
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal Pull signal handling cleanups from Al Viro: "sigaltstack infrastructure + conversion for x86, alpha and um, COMPAT_SYSCALL_DEFINE infrastructure. Note that there are several conflicts between "unify SS_ONSTACK/SS_DISABLE definitions" and UAPI patches in mainline; resolution is trivial - just remove definitions of SS_ONSTACK and SS_DISABLED from arch/*/uapi/asm/signal.h; they are all identical and include/uapi/linux/signal.h contains the unified variant." Fixed up conflicts as per Al. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: alpha: switch to generic sigaltstack new helpers: __save_altstack/__compat_save_altstack, switch x86 and um to those generic compat_sys_sigaltstack() introduce generic sys_sigaltstack(), switch x86 and um to it new helper: compat_user_stack_pointer() new helper: restore_altstack() unify SS_ONSTACK/SS_DISABLE definitions new helper: current_user_stack_pointer() missing user_stack_pointer() instances Bury the conditionals from kernel_thread/kernel_execve series COMPAT_SYSCALL_DEFINE: infrastructure
| * new helpers: __save_altstack/__compat_save_altstack, switch x86 and um to thoseAl Viro2012-12-191-14/+4
| | | | | | | | | | | | note that they are relying on access_ok() already checked by caller. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * generic compat_sys_sigaltstack()Al Viro2012-12-191-3/+1
| | | | | | | | | | | | Again, conditional on CONFIG_GENERIC_SIGALTSTACK Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * introduce generic sys_sigaltstack(), switch x86 and um to itAl Viro2012-12-191-7/+0
| | | | | | | | | | | | | | Conditional on CONFIG_GENERIC_SIGALTSTACK; architectures that do not select it are completely unaffected Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | Merge branch 'rcu/next' of ↵Ingo Molnar2012-12-031-2/+3
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Conflicts: arch/x86/kernel/ptrace.c Pull the latest RCU tree from Paul E. McKenney: " The major features of this series are: 1. A first version of no-callbacks CPUs. This version prohibits offlining CPU 0, but only when enabled via CONFIG_RCU_NOCB_CPU=y. Relaxing this constraint is in progress, but not yet ready for prime time. These commits were posted to LKML at https://lkml.org/lkml/2012/10/30/724, and are at branch rcu/nocb. 2. Changes to SRCU that allows statically initialized srcu_struct structures. These commits were posted to LKML at https://lkml.org/lkml/2012/10/30/296, and are at branch rcu/srcu. 3. Restructuring of RCU's debugfs output. These commits were posted to LKML at https://lkml.org/lkml/2012/10/30/341, and are at branch rcu/tracing. 4. Additional CPU-hotplug/RCU improvements, posted to LKML at https://lkml.org/lkml/2012/10/30/327, and are at branch rcu/hotplug. Note that the commit eliminating __stop_machine() was judged to be too-high of risk, so is deferred to 3.9. 5. Changes to RCU's idle interface, most notably a new module parameter that redirects normal grace-period operations to their expedited equivalents. These were posted to LKML at https://lkml.org/lkml/2012/10/30/739, and are at branch rcu/idle. 6. Additional diagnostics for RCU's CPU stall warning facility, posted to LKML at https://lkml.org/lkml/2012/10/30/315, and are at branch rcu/stall. The most notable change reduces the default RCU CPU stall-warning time from 60 seconds to 21 seconds, so that it once again happens sooner than the softlockup timeout. 7. Documentation updates, which were posted to LKML at https://lkml.org/lkml/2012/10/30/280, and are at branch rcu/doc. A couple of late-breaking changes were posted at https://lkml.org/lkml/2012/11/16/634 and https://lkml.org/lkml/2012/11/16/547. 8. Miscellaneous fixes, which were posted to LKML at https://lkml.org/lkml/2012/10/30/309, along with a late-breaking change posted at Fri, 16 Nov 2012 11:26:25 -0800 with message-ID <20121116192625.GA447@linux.vnet.ibm.com>, but which lkml.org seems to have missed. These are at branch rcu/fixes. 9. Finally, a fix for an lockdep-RCU splat was posted to LKML at https://lkml.org/lkml/2012/11/7/486. This is at rcu/next. " Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * context_tracking: New context tracking susbsystemFrederic Weisbecker2012-11-301-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Create a new subsystem that probes on kernel boundaries to keep track of the transitions between level contexts with two basic initial contexts: user or kernel. This is an abstraction of some RCU code that use such tracking to implement its userspace extended quiescent state. We need to pull this up from RCU into this new level of indirection because this tracking is also going to be used to implement an "on demand" generic virtual cputime accounting. A necessary step to shutdown the tick while still accounting the cputime. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Gilad Ben-Yossef <gilad@benyossef.com> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> [ paulmck: fix whitespace error and email address. ] Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
* | Merge branch 'uprobes/core' of ↵Ingo Molnar2012-10-211-3/+1
|\ \ | |/ |/| | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc into perf/urgent Pull various uprobes bugfixes from Oleg Nesterov - mostly race and failure path fixes. Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * uprobes: Move clear_thread_flag(TIF_UPROBE) to uprobe_notify_resume()Oleg Nesterov2012-09-291-3/+1
| | | | | | | | | | | | | | | | Move clear_thread_flag(TIF_UPROBE) from do_notify_resume() to uprobe_notify_resume() for !CONFIG_UPROBES case. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* | Merge branch 'for-linus' of ↵Linus Torvalds2012-10-101-4/+0
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal Pull generic execve() changes from Al Viro: "This introduces the generic kernel_thread() and kernel_execve() functions, and switches x86, arm, alpha, um and s390 over to them." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: (26 commits) s390: convert to generic kernel_execve() s390: switch to generic kernel_thread() s390: fold kernel_thread_helper() into ret_from_fork() s390: fold execve_tail() into start_thread(), convert to generic sys_execve() um: switch to generic kernel_thread() x86, um/x86: switch to generic sys_execve and kernel_execve x86: split ret_from_fork alpha: introduce ret_from_kernel_execve(), switch to generic kernel_execve() alpha: switch to generic kernel_thread() alpha: switch to generic sys_execve() arm: get rid of execve wrapper, switch to generic execve() implementation arm: optimized current_pt_regs() arm: introduce ret_from_kernel_execve(), switch to generic kernel_execve() arm: split ret_from_fork, simplify kernel_thread() [based on patch by rmk] generic sys_execve() generic kernel_execve() new helper: current_pt_regs() preparation for generic kernel_thread() um: kill thread->forking um: let signal_delivered() do SIGTRAP on singlestepping into handler ...
| * | x86: get rid of TIF_IRET hackeryAl Viro2012-09-201-4/+0
| |/ | | | | | | | | | | | | | | | | | | TIF_NOTIFY_RESUME will work in precisely the same way; all that is achieved by TIF_IRET is appearing that there's some work to be done, so we end up on the iret exit path. Just use NOTIFY_RESUME. And for execve() do that in 32bit start_thread(), not sys_execve() itself. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | Merge branch 'x86-smap-for-linus' of ↵Linus Torvalds2012-10-011-10/+14
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/smap support from Ingo Molnar: "This adds support for the SMAP (Supervisor Mode Access Prevention) CPU feature on Intel CPUs: a hardware feature that prevents unintended user-space data access from kernel privileged code. It's turned on automatically when possible. This, in combination with SMEP, makes it even harder to exploit kernel bugs such as NULL pointer dereferences." Fix up trivial conflict in arch/x86/kernel/entry_64.S due to newly added includes right next to each other. * 'x86-smap-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, smep, smap: Make the switching functions one-way x86, suspend: On wakeup always initialize cr4 and EFER x86-32: Start out eflags and cr4 clean x86, smap: Do not abuse the [f][x]rstor_checking() functions for user space x86-32, smap: Add STAC/CLAC instructions to 32-bit kernel entry x86, smap: Reduce the SMAP overhead for signal handling x86, smap: A page fault due to SMAP is an oops x86, smap: Turn on Supervisor Mode Access Prevention x86, smap: Add STAC and CLAC instructions to control user space access x86, uaccess: Merge prototypes for clear_user/__clear_user x86, smap: Add a header file with macros for STAC/CLAC x86, alternative: Add header guards to <asm/alternative-asm.h> x86, alternative: Use .pushsection/.popsection x86, smap: Add CR4 bit for SMAP x86-32, mm: The WP test should be done on a kernel page
| * \ Merge branch 'x86/fpu' into x86/smapH. Peter Anvin2012-09-211-124/+91
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reason for merge: x86/fpu changed the structure of some of the code that x86/smap changes; mostly fpu-internal.h but also minor changes to the signal code. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Resolved Conflicts: arch/x86/ia32/ia32_signal.c arch/x86/include/asm/fpu-internal.h arch/x86/kernel/signal.c
| * | | x86, smap: Reduce the SMAP overhead for signal handlingH. Peter Anvin2012-09-211-10/+14
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signal handling contains a bunch of accesses to individual user space items, which causes an excessive number of STAC and CLAC instructions. Instead, let get/put_user_try ... get/put_user_catch() contain the STAC and CLAC instructions. This means that get/put_user_try no longer nests, and furthermore that it is no longer legal to use user space access functions other than __get/put_user_ex() inside those blocks. However, these macros are x86-specific anyway and are only used in the signal-handling paths; a simple reordering of moving the larger subroutine calls out of the try...catch blocks resolves that problem. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Link: http://lkml.kernel.org/r/1348256595-29119-12-git-send-email-hpa@linux.intel.com
* | | Merge branch 'x86-fpu-for-linus' of ↵Linus Torvalds2012-10-011-122/+89
|\ \ \ | | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/fpu update from Ingo Molnar: "The biggest change is the addition of the non-lazy (eager) FPU saving support model and enabling it on CPUs with optimized xsaveopt/xrstor FPU state saving instructions. There are also various Sparse fixes" Fix up trivial add-add conflict in arch/x86/kernel/traps.c * 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, kvm: fix kvm's usage of kernel_fpu_begin/end() x86, fpu: remove cpu_has_xmm check in the fx_finit() x86, fpu: make eagerfpu= boot param tri-state x86, fpu: enable eagerfpu by default for xsaveopt x86, fpu: decouple non-lazy/eager fpu restore from xsave x86, fpu: use non-lazy fpu restore for processors supporting xsave lguest, x86: handle guest TS bit for lazy/non-lazy fpu host models x86, fpu: always use kernel_fpu_begin/end() for in-kernel FPU usage x86, kvm: use kernel_fpu_begin/end() in kvm_load/put_guest_fpu() x86, fpu: remove unnecessary user_fpu_end() in save_xstate_sig() x86, fpu: drop_fpu() before restoring new state from sigframe x86, fpu: Unify signal handling code paths for x86 and x86_64 kernels x86, fpu: Consolidate inline asm routines for saving/restoring fpu state x86, signal: Cleanup ifdefs and is_ia32, is_x32
| * | x86, fpu: Unify signal handling code paths for x86 and x86_64 kernelsSuresh Siddha2012-09-181-7/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently for x86 and x86_32 binaries, fpstate in the user sigframe is copied to/from the fpstate in the task struct. And in the case of signal delivery for x86_64 binaries, if the fpstate is live in the CPU registers, then the live state is copied directly to the user sigframe. Otherwise fpstate in the task struct is copied to the user sigframe. During restore, fpstate in the user sigframe is restored directly to the live CPU registers. Historically, different code paths led to different bugs. For example, x86_64 code path was not preemption safe till recently. Also there is lot of code duplication for support of new features like xsave etc. Unify signal handling code paths for x86 and x86_64 kernels. New strategy is as follows: Signal delivery: Both for 32/64-bit frames, align the core math frame area to 64bytes as needed by xsave (this where the main fpu/extended state gets copied to and excludes the legacy compatibility fsave header for the 32-bit [f]xsave frames). If the state is live, copy the register state directly to the user frame. If not live, copy the state in the thread struct to the user frame. And for 32-bit [f]xsave frames, construct the fsave header separately before the actual [f]xsave area. Signal return: As the 32-bit frames with [f]xstate has an additional 'fsave' header, copy everything back from the user sigframe to the fpstate in the task structure and reconstruct the fxstate from the 'fsave' header (Also user passed pointers may not be correctly aligned for any attempt to directly restore any partial state). At the next fpstate usage, everything will be restored to the live CPU registers. For all the 64-bit frames and the 32-bit fsave frame, restore the state from the user sigframe directly to the live CPU registers. 64-bit signals always restored the math frame directly, so we can expect the math frame pointer to be correctly aligned. For 32-bit fsave frames, there are no alignment requirements, so we can restore the state directly. "lat_sig catch" microbenchmark numbers (for x86, x86_64, x86_32 binaries) are with in the noise range with this change. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1343171129-2747-4-git-send-email-suresh.b.siddha@intel.com [ Merged in compilation fix ] Link: http://lkml.kernel.org/r/1344544736.8326.17.camel@sbsiddha-desk.sc.intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | x86, signal: Cleanup ifdefs and is_ia32, is_x32Suresh Siddha2012-09-181-115/+81
| |/ | | | | | | | | | | | | | | | | | | | | | | Use config_enabled() to cleanup the definitions of is_ia32/is_x32. Move the function prototypes to the header file to cleanup ifdefs, and move the x32_setup_rt_frame() code around. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1343171129-2747-2-git-send-email-suresh.b.siddha@intel.com Merged in compilation fix from, Link: http://lkml.kernel.org/r/1344544736.8326.17.camel@sbsiddha-desk.sc.intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* | x86: Exit RCU extended QS on notify resumeFrederic Weisbecker2012-09-261-0/+4
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | do_notify_resume() may be called on irq or exception exit. But at that time the exception has already called rcu_user_enter() and the irq has already called rcu_irq_exit(). Since it can use RCU read side critical section, we must call rcu_user_exit() before doing anything there. Then we must call back rcu_user_enter() after this function because we know we are going to userspace from there. This complete support for userspace RCU extended quiescent state in x86-64. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Alessio Igor Bogani <abogani@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Avi Kivity <avi@redhat.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Christoph Lameter <cl@linux.com> Cc: Geoff Levand <geoff@infradead.org> Cc: Gilad Ben Yossef <gilad@benyossef.com> Cc: Hakan Akkan <hakanakkan@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Kevin Hilman <khilman@ti.com> Cc: Max Krasnyansky <maxk@qualcomm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
* x86/debug: Add KERN_<LEVEL> to bare printks, convert printks to pr_<level>Joe Perches2012-06-061-1/+4
| | | | | | | | | | | | | | | | | | | Use a more current logging style: - Bare printks should have a KERN_<LEVEL> for consistency's sake - Add pr_fmt where appropriate - Neaten some macro definitions - Convert some Ok output to OK - Use "%s: ", __func__ in pr_fmt for summit - Convert some printks to pr_<level> Message output is not identical in all cases. Signed-off-by: Joe Perches <joe@perches.com> Cc: levinsasha928@gmail.com Link: http://lkml.kernel.org/r/1337655007.24226.10.camel@joe2Laptop [ merged two similar patches, tidied up the changelog ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
* x86: get rid of calling do_notify_resume() when returning to kernel modeAl Viro2012-06-011-10/+0
| | | | | | | | | | | | | | | | | | If we end up calling do_notify_resume() with !user_mode(refs), it does nothing (do_signal() explicitly bails out and we can't get there with TIF_NOTIFY_RESUME in such situations). Then we jump to resume_userspace_sig, which rechecks the same thing and bails out to resume_kernel, thus breaking the loop. It's easier and cheaper to check *before* calling do_notify_resume() and bail out to resume_kernel immediately. And kill the check in do_signal()... Note that on amd64 we can't get there with !user_mode() at all - asm glue takes care of that. Acked-and-reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* new helper: signal_delivered()Al Viro2012-06-011-4/+2
| | | | | | | | | | | | | | Does block_sigmask() + tracehook_signal_handler(); called when sigframe has been successfully built. All architectures converted to it; block_sigmask() itself is gone now (merged into this one). I'm still not too happy with the signature, but that's a separate story (IMO we need a structure that would contain signal number + siginfo + k_sigaction, so that get_signal_to_deliver() would fill one, signal_delivered(), handle_signal() and probably setup...frame() - take one). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* most of set_current_blocked() callers want SIGKILL/SIGSTOP removed from setAl Viro2012-06-011-3/+0
| | | | | | | | Only 3 out of 63 do not. Renamed the current variant to __set_current_blocked(), added set_current_blocked() that will exclude unblockable signals, switched open-coded instances to it. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* pull clearing RESTORE_SIGMASK into block_sigmask()Al Viro2012-06-011-22/+9
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* new helper: sigmask_to_save()Al Viro2012-06-011-4/+1
| | | | | | | replace boilerplate "should we use ->saved_sigmask or ->blocked?" with calls of obvious inlined helper... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* new helper: restore_saved_sigmask()Al Viro2012-06-011-4/+1
| | | | | | | | | first fruits of ..._restore_sigmask() helpers: now we can take boilerplate "signal didn't have a handler, clear RESTORE_SIGMASK and restore the blocked mask from ->saved_mask" into a common helper. Open-coded instances switched... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
OpenPOWER on IntegriCloud