summaryrefslogtreecommitdiffstats
path: root/arch/x86
Commit message (Collapse)AuthorAgeFilesLines
* x86: Be consistent with data size in getuser.SH. Peter Anvin2013-02-111-5/+5
| | | | | | | | | | Consistently use the data register by name and use a sized assembly instruction in getuser.S. There is never any reason to macroize it, and being inconsistent in the same file is just annoying. No actual code change. Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* x86, mm: Use a bitfield to mask nuisance get_user() warningsH. Peter Anvin2013-02-111-11/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Even though it is never executed, gcc wants to warn for casting from a large integer to a pointer. Furthermore, using a variable with __typeof__() doesn't work because __typeof__ retains storage specifiers (const, restrict, volatile). However, we can declare a bitfield using sizeof(), which is legal because sizeof() is a constant expression. This quiets the warning, although the code generated isn't 100% identical from the baseline before 96477b4 x86-32: Add support for 64bit get_user(): [x86-mb is baseline, x86-mm is this commit] text data bss filename 113716147 15858380 35037184 tip.x86-mb/o.i386-allconfig/vmlinux 113716145 15858380 35037184 tip.x86-mm/o.i386-allconfig/vmlinux 12989837 3597944 12255232 tip.x86-mb/o.i386-modconfig/vmlinux 12989831 3597944 12255232 tip.x86-mm/o.i386-modconfig/vmlinux 1462784 237608 1401988 tip.x86-mb/o.i386-noconfig/vmlinux 1462837 237608 1401964 tip.x86-mm/o.i386-noconfig/vmlinux 7938994 553688 7639040 tip.x86-mb/o.i386-pae/vmlinux 7943136 557784 7639040 tip.x86-mm/o.i386-pae/vmlinux 7186126 510572 6574080 tip.x86-mb/o.i386/vmlinux 7186124 510572 6574080 tip.x86-mm/o.i386/vmlinux 103747269 33578856 65888256 tip.x86-mb/o.x86_64-allconfig/vmlinux 103746949 33578856 65888256 tip.x86-mm/o.x86_64-allconfig/vmlinux 12116695 11035832 20160512 tip.x86-mb/o.x86_64-modconfig/vmlinux 12116567 11035832 20160512 tip.x86-mm/o.x86_64-modconfig/vmlinux 1700790 380524 511808 tip.x86-mb/o.x86_64-noconfig/vmlinux 1700790 380524 511808 tip.x86-mm/o.x86_64-noconfig/vmlinux 12413612 1133376 1101824 tip.x86-mb/o.x86_64/vmlinux 12413484 1133376 1101824 tip.x86-mm/o.x86_64/vmlinux Cc: Jamie Lokier <jamie@shareable.org> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20130209110031.GA17833@n2100.arm.linux.org.uk Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* x86/kvm: Fix compile warning in kvm_register_steal_time()Shuah Khan2013-02-111-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix the following compile warning in kvm_register_steal_time(): CC arch/x86/kernel/kvm.o arch/x86/kernel/kvm.c: In function ‘kvm_register_steal_time’: arch/x86/kernel/kvm.c:302:3: warning: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘phys_addr_t’ [-Wformat] Introduced via: 5dfd486c4750 x86, kvm: Fix kvm's use of __pa() on percpu areas d76565344512 x86, mm: Create slow_virt_to_phys() f3c4fbb68e93 x86, mm: Use new pagetable helpers in try_preserve_large_page() 4cbeb51b860c x86, mm: Pagetable level size/shift/mask helpers a25b9316841c x86, mm: Make DEBUG_VIRTUAL work earlier in boot Signed-off-by: Shuah Khan <shuah.khan@hp.com> Acked-by: Gleb Natapov <gleb@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Cc: Rik van Riel <riel@redhat.com> Cc: shuahkhan@gmail.com Cc: avi@redhat.com Cc: gleb@redhat.com Cc: mst@redhat.com Link: http://lkml.kernel.org/r/1360119442.8356.8.camel@lorien2 Signed-off-by: Ingo Molnar <mingo@kernel.org>
* x86-32: Add support for 64bit get_user()Ville Syrjälä2013-02-073-9/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement __get_user_8() for x86-32. It will return the 64-bit result in edx:eax register pair, and ecx is used to pass in the address and return the error value. For consistency, change the register assignment for all other __get_user_x() variants, so that address is passed in ecx/rcx, the error value is returned in ecx/rcx, and eax/rax contains the actual value. [ hpa: I modified the patch so that it does NOT change the calling conventions for the existing callsites, this also means that the code is completely unchanged for 64 bits. Instead, continue to use eax for address input/error output and use the ecx:edx register pair for the output. ] This is a partial refresh of a patch [1] by Jamie Lokier from 2004. Only the minimal changes to implement 64bit get_user() were picked from the original patch. [1] http://article.gmane.org/gmane.linux.kernel/198823 Originally-by: Jamie Lokier <jamie@shareable.org> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://lkml.kernel.org/r/1355312043-11467-1-git-send-email-ville.syrjala@linux.intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* x86-32, mm: Remove reference to alloc_remap()H. Peter Anvin2013-01-311-18/+11
| | | | | | | | | | | We have removed the remap allocator for x86-32, and x86-64 never had it (and doesn't need it). Remove residual reference to it. Reported-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/CAE9FiQVn6_QZi3fNQ-JHYiR-7jeDJ5hT0SyT_%2BzVvfOj=PzF3w@mail.gmail.com
* x86-32, mm: Remove reference to resume_map_numa_kva()H. Peter Anvin2013-01-312-8/+0
| | | | | | | | | Remove reference to removed function resume_map_numa_kva(). Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20130131005616.1C79F411@kernel.stglabs.ibm.com
* x86-32, mm: Rip out x86_32 NUMA remapping codeDave Hansen2013-01-314-174/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This code was an optimization for 32-bit NUMA systems. It has probably been the cause of a number of subtle bugs over the years, although the conditions to excite them would have been hard to trigger. Essentially, we remap part of the kernel linear mapping area, and then sometimes part of that area gets freed back in to the bootmem allocator. If those pages get used by kernel data structures (say mem_map[] or a dentry), there's no big deal. But, if anyone ever tried to use the linear mapping for these pages _and_ cared about their physical address, bad things happen. For instance, say you passed __GFP_ZERO to the page allocator and then happened to get handed one of these pages, it zero the remapped page, but it would make a pte to the _old_ page. There are probably a hundred other ways that it could screw with things. We don't need to hang on to performance optimizations for these old boxes any more. All my 32-bit NUMA systems are long dead and buried, and I probably had access to more than most people. This code is causing real things to break today: https://lkml.org/lkml/2013/1/9/376 I looked in to actually fixing this, but it requires surgery to way too much brittle code, as well as stuff like per_cpu_ptr_to_phys(). [ hpa: Cc: this for -stable, since it is a memory corruption issue. However, an alternative is to simply mark NUMA as depends BROKEN rather than EXPERIMENTAL in the X86_32 subclause... ] Link: http://lkml.kernel.org/r/20130131005616.1C79F411@kernel.stglabs.ibm.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: <stable@vger.kernel.org>
* x86/numa: Use __pa_nodebug() insteadBorislav Petkov2013-01-311-1/+1
| | | | | | | | | | | | ... and fix the following warning: arch/x86/mm/numa.c: In function ‘setup_node_data’: arch/x86/mm/numa.c:222:3: warning: passing argument 1 of ‘__phys_addr_nodebug’ makes integer from pointer without a cast Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Dave Hansen <dave@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1359245901-8512-1-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
* x86, kvm: Fix kvm's use of __pa() on percpu areasDave Hansen2013-01-252-6/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In short, it is illegal to call __pa() on an address holding a percpu variable. This replaces those __pa() calls with slow_virt_to_phys(). All of the cases in this patch are in boot time (or CPU hotplug time at worst) code, so the slow pagetable walking in slow_virt_to_phys() is not expected to have a performance impact. The times when this actually matters are pretty obscure (certain 32-bit NUMA systems), but it _does_ happen. It is important to keep KVM guests working on these systems because the real hardware is getting harder and harder to find. This bug manifested first by me seeing a plain hang at boot after this message: CPU 0 irqstacks, hard=f3018000 soft=f301a000 or, sometimes, it would actually make it out to the console: [ 0.000000] BUG: unable to handle kernel paging request at ffffffff I eventually traced it down to the KVM async pagefault code. This can be worked around by disabling that code either at compile-time, or on the kernel command-line. The kvm async pagefault code was injecting page faults in to the guest which the guest misinterpreted because its "reason" was not being properly sent from the host. The guest passes a physical address of an per-cpu async page fault structure via an MSR to the host. Since __pa() is broken on percpu data, the physical address it sent was bascially bogus and the host went scribbling on random data. The guest never saw the real reason for the page fault (it was injected by the host), assumed that the kernel had taken a _real_ page fault, and panic()'d. The behavior varied, though, depending on what got corrupted by the bad write. Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20130122212435.4905663F@kernel.stglabs.ibm.com Acked-by: Rik van Riel <riel@redhat.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* x86, mm: Create slow_virt_to_phys()Dave Hansen2013-01-252-0/+32
| | | | | | | | | | | | | | | | | | | | | | | | | This is necessary because __pa() does not work on some kinds of memory, like vmalloc() or the alloc_remap() areas on 32-bit NUMA systems. We have some functions to do conversions _like_ this in the vmalloc() code (like vmalloc_to_page()), but they do not work on sizes other than 4k pages. We would potentially need to be able to handle all the page sizes that we use for the kernel linear mapping (4k, 2M, 1G). In practice, on 32-bit NUMA systems, the percpu areas get stuck in the alloc_remap() area. Any __pa() call on them will break and basically return garbage. This patch introduces a new function slow_virt_to_phys(), which walks the kernel page tables on x86 and should do precisely the same logical thing as __pa(), but actually work on a wider range of memory. It should work on the normal linear mapping, vmalloc(), kmap(), etc... Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20130122212433.4D1FCA62@kernel.stglabs.ibm.com Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* x86, mm: Use new pagetable helpers in try_preserve_large_page()Dave Hansen2013-01-251-7/+4
| | | | | | | | | | try_preserve_large_page() can be slightly simplified by using the new page_level_*() helpers. This also moves the 'level' over to the new pg_level enum type. Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20130122212432.14F3D993@kernel.stglabs.ibm.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* x86, mm: Pagetable level size/shift/mask helpersDave Hansen2013-01-252-1/+15
| | | | | | | | | | | | | | | | | | | | | I plan to use lookup_address() to walk the kernel pagetables in a later patch. It returns a "pte" and the level in the pagetables where the "pte" was found. The level is just an enum and needs to be converted to a useful value in order to do address calculations with it. These helpers will be used in at least two places. This also gives the anonymous enum a real name so that no one gets confused about what they should be passing in to these helpers. "PTE_SHIFT" was chosen for naming consistency with the other pagetable levels (PGD/PUD/PMD_SHIFT). Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20130122212431.405D3A8C@kernel.stglabs.ibm.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* x86, mm: Make DEBUG_VIRTUAL work earlier in bootDave Hansen2013-01-253-4/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | The KVM code has some repeated bugs in it around use of __pa() on per-cpu data. Those data are not in an area on which using __pa() is valid. However, they are also called early enough in boot that __vmalloc_start_set is not set, and thus the CONFIG_DEBUG_VIRTUAL debugging does not catch them. This adds a check to also verify __pa() calls against max_low_pfn, which we can use earler in boot than is_vmalloc_addr(). However, if we are super-early in boot, max_low_pfn=0 and this will trip on every call, so also make sure that max_low_pfn is set before we try to use it. With this patch applied, CONFIG_DEBUG_VIRTUAL will actually catch the bug I was chasing (and fix later in this series). I'd love to find a generic way so that any __pa() call on percpu areas could do a BUG_ON(), but there don't appear to be any nice and easy ways to check if an address is a percpu one. Anybody have ideas on a way to do this? Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20130122212430.F46F8159@kernel.stglabs.ibm.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* Merge tag 'v3.8-rc5' into x86/mmH. Peter Anvin2013-01-25268-4017/+8209
|\ | | | | | | | | | | | | | | The __pa() fixup series that follows touches KVM code that is not present in the existing branch based on v3.7-rc5, so merge in the current upstream from Linus. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * Merge tag 'perf-urgent-for-mingo' of ↵Linus Torvalds2013-01-221-6/+0
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf/urgent fixes from Arnaldo Carvalho de Melo: . revert 20b279 - require exclude_guest to use PEBS - kernel side, now older binaries will continue working for things like cycles:pp without needing to pass extra modifiers, from David Ahern. . Fix building from 'make perf-*-src-pkg' tarballs, broken by UAPI, from Sebastian Andrzej Siewior [ Pulling directly, Ingo would normally pull but has been unresponsive ] * tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: perf tools: Fix building from 'make perf-*-src-pkg' tarballs perf x86: revert 20b279 - require exclude_guest to use PEBS - kernel side
| | * perf x86: revert 20b279 - require exclude_guest to use PEBS - kernel sideDavid Ahern2013-01-101-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch is brought to you by the letter 'H'. Commit 20b279 breaks compatiblity with older perf binaries when run with precise modifier (:p or :pp) by requiring the exclude_guest attribute to be set. Older binaries default exclude_guest to 0 (ie., wanting guest-based samples) unless host only profiling is requested (:H modifier). The workaround for older binaries is to add H to the modifier list (e.g., -e cycles:ppH - toggles exclude_guest to 1). This was deemed unacceptable by Linus: https://lkml.org/lkml/2012/12/12/570 Between family in town and the fresh snow in Breckenridge there is no time left to be working on the proper fix for this over the holidays. In the New Year I have more pressing problems to resolve -- like some memory leaks in perf which are proving to be elusive -- although the aforementioned snow is probably why they are proving to be elusive. Either way I do not have any spare time to work on this and from the time I have managed to spend on it the solution is more difficult than just moving to a new exclude_guest flag (does not work) or flipping the logic to include_guest (which is not as trivial as one would think). So, two options: silently force exclude_guest on as suggested by Gleb which means no impact to older perf binaries or revert the original patch which caused the breakage. This patch does the latter -- reverts the original patch that introduced the regression. The problem can be revisited in the future as time allows. Signed-off-by: David Ahern <dsahern@gmail.com> Cc: Avi Kivity <avi@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Richter <robert.richter@amd.com> Link: http://lkml.kernel.org/r/1356749767-17322-1-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | ptrace: ensure arch_ptrace/ptrace_request can never race with SIGKILLOleg Nesterov2013-01-221-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | putreg() assumes that the tracee is not running and pt_regs_access() can safely play with its stack. However a killed tracee can return from ptrace_stop() to the low-level asm code and do RESTORE_REST, this means that debugger can actually read/modify the kernel stack until the tracee does SAVE_REST again. set_task_blockstep() can race with SIGKILL too and in some sense this race is even worse, the very fact the tracee can be woken up breaks the logic. As Linus suggested we can clear TASK_WAKEKILL around the arch_ptrace() call, this ensures that nobody can ever wakeup the tracee while the debugger looks at it. Not only this fixes the mentioned problems, we can do some cleanups/simplifications in arch_ptrace() paths. Probably ptrace_unfreeze_traced() needs more callers, for example it makes sense to make the tracee killable for oom-killer before access_process_vm(). While at it, add the comment into may_ptrace_stop() to explain why ptrace_stop() still can't rely on SIGKILL and signal_pending_state(). Reported-by: Salman Qazi <sqazi@google.com> Reported-by: Suleiman Souhlal <suleiman@google.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | Merge tag 'stable/for-linus-3.8-rc3-tag' of ↵Linus Torvalds2013-01-182-8/+0
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen Pull Xen fixes from Konrad Rzeszutek Wilk: - CVE-2013-0190/XSA-40 (or stack corruption for 32-bit PV kernels) - Fix racy vma access spotted by Al Viro - Fix mmap batch ioctl potentially resulting in large O(n) page allcations. - Fix vcpu online/offline BUG:scheduling while atomic.. - Fix unbound buffer scanning for more than 32 vCPUs. - Fix grant table being incorrectly initialized - Fix incorrect check in pciback - Allow privcmd in backend domains. Fix up whitespace conflict due to ugly merge resolution in Xen tree in arch/arm/xen/enlighten.c * tag 'stable/for-linus-3.8-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen: Fix stack corruption in xen_failsafe_callback for 32bit PVOPS guests. Revert "xen/smp: Fix CPU online/offline bug triggering a BUG: scheduling while atomic." xen/gntdev: remove erronous use of copy_to_user xen/gntdev: correctly unmap unlinked maps in mmu notifier xen/gntdev: fix unsafe vma access xen/privcmd: Fix mmap batch ioctl. Xen: properly bound buffer access when parsing cpu/*/availability xen/grant-table: correctly initialize grant table version 1 x86/xen : Fix the wrong check in pciback xen/privcmd: Relax access control in privcmd_ioctl_mmap
| | * | xen: Fix stack corruption in xen_failsafe_callback for 32bit PVOPS guests.Andrew Cooper2013-01-161-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes CVE-2013-0190 / XSA-40 There has been an error on the xen_failsafe_callback path for failed iret, which causes the stack pointer to be wrong when entering the iret_exc error path. This can result in the kernel crashing. In the classic kernel case, the relevant code looked a little like: popl %eax # Error code from hypervisor jz 5f addl $16,%esp jmp iret_exc # Hypervisor said iret fault 5: addl $16,%esp # Hypervisor said segment selector fault Here, there are two identical addls on either option of a branch which appears to have been optimised by hoisting it above the jz, and converting it to an lea, which leaves the flags register unaffected. In the PVOPS case, the code looks like: popl_cfi %eax # Error from the hypervisor lea 16(%esp),%esp # Add $16 before choosing fault path CFI_ADJUST_CFA_OFFSET -16 jz 5f addl $16,%esp # Incorrectly adjust %esp again jmp iret_exc It is possible unprivileged userspace applications to cause this behaviour, for example by loading an LDT code selector, then changing the code selector to be not-present. At this point, there is a race condition where it is possible for the hypervisor to return back to userspace from an interrupt, fault on its own iret, and inject a failsafe_callback into the kernel. This bug has been present since the introduction of Xen PVOPS support in commit 5ead97c84 (xen: Core Xen implementation), in 2.6.23. Signed-off-by: Frediano Ziglio <frediano.ziglio@citrix.com> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Cc: stable@vger.kernel.org Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| | * | Revert "xen/smp: Fix CPU online/offline bug triggering a BUG: scheduling ↵Konrad Rzeszutek Wilk2013-01-151-7/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | while atomic." This reverts commit 41bd956de3dfdc3a43708fe2e0c8096c69064a1e. The fix is incorrect and not appropiate for the latest kernels. In fact it _causes_ the BUG: scheduling while atomic while doing vCPU hotplug. Suggested-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| | * | Merge tag 'v3.7' into stable/for-linus-3.8Konrad Rzeszutek Wilk2013-01-1523-94/+226
| | |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Linux 3.7 * tag 'v3.7': (833 commits) Linux 3.7 Input: matrix-keymap - provide proper module license Revert "revert "Revert "mm: remove __GFP_NO_KSWAPD""" and associated damage ipv4: ip_check_defrag must not modify skb before unsharing Revert "mm: avoid waking kswapd for THP allocations when compaction is deferred or contended" inet_diag: validate port comparison byte code to prevent unsafe reads inet_diag: avoid unsafe and nonsensical prefix matches in inet_diag_bc_run() inet_diag: validate byte code to prevent oops in inet_diag_bc_run() inet_diag: fix oops for IPv4 AF_INET6 TCP SYN-RECV state mm: vmscan: fix inappropriate zone congestion clearing vfs: fix O_DIRECT read past end of block device net: gro: fix possible panic in skb_gro_receive() tcp: bug fix Fast Open client retransmission tmpfs: fix shared mempolicy leak mm: vmscan: do not keep kswapd looping forever due to individual uncompactable zones mm: compaction: validate pfn range passed to isolate_freepages_block mmc: sh-mmcif: avoid oops on spurious interrupts (second try) Revert misapplied "mmc: sh-mmcif: avoid oops on spurious interrupts" mmc: sdhci-s3c: fix missing clock for gpio card-detect lib/Makefile: Fix oid_registry build dependency ... Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Conflicts: arch/arm/xen/enlighten.c drivers/xen/Makefile [We need to have the v3.7 base as the 'for-3.8' was based off v3.7-rc3 and there are some patches in v3.7-rc6 that we to have in our branch]
| * | \ \ Merge branch 'x86/urgent' of ↵Linus Torvalds2013-01-162-1/+81
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Peter Anvin: "This is mainly a workaround for a bug in Sandy Bridge graphics which causes corruption of certain memory pages." * 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/Sandy Bridge: Sandy Bridge workaround depends on CONFIG_PCI x86/Sandy Bridge: mark arrays in __init functions as __initconst x86/Sandy Bridge: reserve pages when integrated graphics is present x86, efi: correct precedence of operators in setup_efi_pci
| | * | | | x86/Sandy Bridge: Sandy Bridge workaround depends on CONFIG_PCIH. Peter Anvin2013-01-131-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | early_pci_allowed() and read_pci_config_16() are only available if CONFIG_PCI is defined. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
| | * | | | x86/Sandy Bridge: mark arrays in __init functions as __initconstH. Peter Anvin2013-01-131-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Mark static arrays as __initconst so they get removed when the init sections are flushed. Reported-by: Mathias Krause <minipli@googlemail.com> Link: http://lkml.kernel.org/r/75F4BEE6-CB0E-4426-B40B-697451677738@googlemail.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| | * | | | x86/Sandy Bridge: reserve pages when integrated graphics is presentJesse Barnes2013-01-111-0/+78
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SNB graphics devices have a bug that prevent them from accessing certain memory ranges, namely anything below 1M and in the pages listed in the table. So reserve those at boot if set detect a SNB gfx device on the CPU to avoid GPU hangs. Stephane Marchesin had a similar patch to the page allocator awhile back, but rather than reserving pages up front, it leaked them at allocation time. [ hpa: made a number of stylistic changes, marked arrays as static const, and made less verbose; use "memblock=debug" for full verbosity. ] Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| | * | | | x86, efi: correct precedence of operators in setup_efi_pciSasha Levin2012-12-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the current code, the condition in the if() doesn't make much sense due to precedence of operators. Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Link: http://lkml.kernel.org/r/1356030701-16284-25-git-send-email-sasha.levin@oracle.com Cc: Matt Fleming <matt.fleming@intel.com> Cc: Matthew Garrett <mjg@redhat.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | | Merge git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2013-01-102-8/+28
| |\ \ \ \ \ | | |_|_|_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull KVM bugfixes from Marcelo Tosatti. * git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: use dynamic percpu allocations for shared msrs area KVM: PPC: Book3S HV: Fix compilation without CONFIG_PPC_POWERNV powerpc: Corrected include header path in kvm_para.h Add rcu user eqs exception hooks for async page fault
| | * | | | KVM: x86: use dynamic percpu allocations for shared msrs areaMarcelo Tosatti2013-01-081-6/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use dynamic percpu allocations for the shared msrs structure, to avoid using the limited reserved percpu space. Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| | * | | | Add rcu user eqs exception hooks for async page faultLi Zhong2012-12-181-2/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds user eqs exception hooks for async page fault page not present code path, to exit the user eqs and re-enter it as necessary. Async page fault is different from other exceptions that it may be triggered from idle process, so we still need rcu_irq_enter() and rcu_irq_exit() to exit cpu idle eqs when needed, to protect the code that needs use rcu. As Frederic pointed out it would be safest and simplest to protect the whole kvm_async_pf_task_wait(). Otherwise, "we need to check all the code there deeply for potential RCU uses and ensure it will never be extended later to use RCU.". However, We'd better re-enter the cpu idle eqs if we get the exception in cpu idle eqs, by calling rcu_irq_exit() before native_safe_halt(). So the patch does what Frederic suggested for rcu_irq_*() API usage here, except that I moved the rcu_irq_*() pair originally in do_async_page_fault() into kvm_async_pf_task_wait(). That's because, I think it's better to have rcu_irq_*() pairs to be in one function ( rcu_irq_exit() after rcu_irq_enter() ), especially here, kvm_async_pf_task_wait() has other callers, which might cause rcu_irq_exit() be called without a matching rcu_irq_enter() before it, which is illegal if the cpu happens to be in rcu idle state. Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
| * | | | | X86: drivers: remove __dev* attributes.Greg Kroah-Hartman2013-01-0321-89/+84
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CONFIG_HOTPLUG is going away as an option. As a result, the __dev* markings need to be removed. This change removes the use of __devinit, __devexit_p, __devinitconst, and __devexit from these drivers. Based on patches originally written by Bill Pemberton, but redone by me in order to handle some of the coding style issues better, by hand. Cc: Bill Pemberton <wfp5p@virginia.edu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Daniel Drake <dsd@laptop.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | | | | PCI: Work around Stratus ftServer broken PCIe hierarchy (fix DMI check)Myron Stowe2012-12-261-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 284f5f9 was intended to disable the "only_one_child()" optimization on Stratus ftServer systems, but its DMI check is wrong. It looks for DMI_SYS_VENDOR that contains "ftServer", when it should look for DMI_SYS_VENDOR containing "Stratus" and DMI_PRODUCT_NAME containing "ftServer". Tested on Stratus ftServer 6400. Reported-by: Fadeeva Marina <astarta@rat.ru> Reference: https://bugzilla.kernel.org/show_bug.cgi?id=51331 Signed-off-by: Myron Stowe <myron.stowe@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: stable@vger.kernel.org # v3.5+
| * | | | | Merge branch 'for-linus' of ↵Linus Torvalds2012-12-2019-118/+24
| |\ \ \ \ \ | | |_|/ / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal Pull signal handling cleanups from Al Viro: "sigaltstack infrastructure + conversion for x86, alpha and um, COMPAT_SYSCALL_DEFINE infrastructure. Note that there are several conflicts between "unify SS_ONSTACK/SS_DISABLE definitions" and UAPI patches in mainline; resolution is trivial - just remove definitions of SS_ONSTACK and SS_DISABLED from arch/*/uapi/asm/signal.h; they are all identical and include/uapi/linux/signal.h contains the unified variant." Fixed up conflicts as per Al. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: alpha: switch to generic sigaltstack new helpers: __save_altstack/__compat_save_altstack, switch x86 and um to those generic compat_sys_sigaltstack() introduce generic sys_sigaltstack(), switch x86 and um to it new helper: compat_user_stack_pointer() new helper: restore_altstack() unify SS_ONSTACK/SS_DISABLE definitions new helper: current_user_stack_pointer() missing user_stack_pointer() instances Bury the conditionals from kernel_thread/kernel_execve series COMPAT_SYSCALL_DEFINE: infrastructure
| | * | | | new helpers: __save_altstack/__compat_save_altstack, switch x86 and um to thoseAl Viro2012-12-193-25/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | note that they are relying on access_ok() already checked by caller. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| | * | | | generic compat_sys_sigaltstack()Al Viro2012-12-198-67/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Again, conditional on CONFIG_GENERIC_SIGALTSTACK Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| | * | | | introduce generic sys_sigaltstack(), switch x86 and um to itAl Viro2012-12-1910-16/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conditional on CONFIG_GENERIC_SIGALTSTACK; architectures that do not select it are completely unaffected Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| | * | | | new helper: compat_user_stack_pointer()Al Viro2012-12-191-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Compat counterpart of current_user_stack_pointer(); for most of the biarch architectures those two are identical, but e.g. arm64 and arm use different registers for stack pointer... Note that amd64 variants of current_user_stack_pointer/compat_user_stack_pointer do *not* rely on pt_regs having been through FIXUP_TOP_OF_STACK. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| | * | | | unify SS_ONSTACK/SS_DISABLE definitionsAl Viro2012-12-191-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| | * | | | missing user_stack_pointer() instancesAl Viro2012-12-191-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | for the architectures that have usp in pt_regs and do not have user_stack_pointer() already defined. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| | * | | | Bury the conditionals from kernel_thread/kernel_execve seriesAl Viro2012-12-193-5/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All architectures have CONFIG_GENERIC_KERNEL_THREAD CONFIG_GENERIC_KERNEL_EXECVE __ARCH_WANT_SYS_EXECVE None of them have __ARCH_WANT_KERNEL_EXECVE and there are only two callers of kernel_execve() (which is a trivial wrapper for do_execve() now) left. Kill the conditionals and make both callers use do_execve(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | | | | Merge tag 'iommu-updates-v3.8' of ↵Linus Torvalds2012-12-201-0/+1
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU updates from Joerg Roedel: "A few new features this merge-window. The most important one is probably, that dma-debug now warns if a dma-handle is not checked with dma_mapping_error by the device driver. This requires minor changes to some architectures which make use of dma-debug. Most of these changes have the respective Acks by the Arch-Maintainers. Besides that there are updates to the AMD IOMMU driver for refactor the IOMMU-Groups support and to make sure it does not trigger a hardware erratum. The OMAP changes (for which I pulled in a branch from Tony Lindgren's tree) have a conflict in linux-next with the arm-soc tree. The conflict is in the file arch/arm/mach-omap2/clock44xx_data.c which is deleted in the arm-soc tree. It is safe to delete the file too so solve the conflict. Similar changes are done in the arm-soc tree in the common clock framework migration. A missing hunk from the patch in the IOMMU tree will be submitted as a seperate patch when the merge-window is closed." * tag 'iommu-updates-v3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (29 commits) ARM: dma-mapping: support debug_dma_mapping_error ARM: OMAP4: hwmod data: ipu and dsp to use parent clocks instead of leaf clocks iommu/omap: Adapt to runtime pm iommu/omap: Migrate to hwmod framework iommu/omap: Keep mmu enabled when requested iommu/omap: Remove redundant clock handling on ISR iommu/amd: Remove obsolete comment iommu/amd: Don't use 512GB pages iommu/tegra: smmu: Move bus_set_iommu after probe for multi arch iommu/tegra: gart: Move bus_set_iommu after probe for multi arch iommu/tegra: smmu: Remove unnecessary PTC/TLB flush all tile: dma_debug: add debug_dma_mapping_error support sh: dma_debug: add debug_dma_mapping_error support powerpc: dma_debug: add debug_dma_mapping_error support mips: dma_debug: add debug_dma_mapping_error support microblaze: dma-mapping: support debug_dma_mapping_error ia64: dma_debug: add debug_dma_mapping_error support c6x: dma_debug: add debug_dma_mapping_error support ARM64: dma_debug: add debug_dma_mapping_error support intel-iommu: Prevent devices with RMRRs from being placed into SI Domain ...
| | | \ \ \ \
| | | \ \ \ \
| | | \ \ \ \
| | | \ \ \ \
| | | \ \ \ \
| | | \ \ \ \
| | *-----. \ \ \ \ Merge branches 'iommu/fixes', 'dma-debug', 'x86/amd', 'x86/vt-d', ↵Joerg Roedel2012-12-161-0/+1
| | |\ \ \ \ \ \ \ \ | | | |_|_|_|_|_|_|/ | | |/| | | | | | | | | | | | | | | | | 'arm/tegra' and 'arm/omap' into next
| | | * | | | | | | dma-debug: New interfaces to debug dma mapping errorsShuah Khan2012-10-241-0/+1
| | | |/ / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add dma-debug interface debug_dma_mapping_error() to debug drivers that fail to check dma mapping errors on addresses returned by dma_map_single() and dma_map_page() interfaces. This interface clears a flag set by debug_dma_map_page() to indicate that dma_mapping_error() has been called by the driver. When driver does unmap, debug_dma_unmap() checks the flag and if this flag is still set, prints warning message that includes call trace that leads up to the unmap. This interface can be called from dma_mapping_error() routines to enable dma mapping error check debugging. Tested: Intel iommu and swiotlb (iommu=soft) on x86-64 with CONFIG_DMA_API_DEBUG enabled and disabled. Signed-off-by: Shuah Khan <shuah.khan@hp.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
| * | | | | | | | Merge branch 'x86/nuke386' of ↵Linus Torvalds2012-12-193-52/+1
| |\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull one final 386 removal patch from Peter Anvin. IRQ 13 FPU error handling is gone. That was not one of the proudest moments in PC history. * 'x86/nuke386' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, 386 removal: Remove support for IRQ 13 FPU error reporting
| | * | | | | | | | x86, 386 removal: Remove support for IRQ 13 FPU error reportingH. Peter Anvin2012-12-173-52/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove support for FPU error reporting via IRQ 13, as opposed to exception 16 (#MF). One last remnant of i386 gone. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Alan Cox <alan@linux.intel.com>
| * | | | | | | | | Merge tag 'modules-next-for-linus' of ↵Linus Torvalds2012-12-192-0/+2
| |\ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull module update from Rusty Russell: "Nothing all that exciting; a new module-from-fd syscall for those who want to verify the source of the module (ChromeOS) and/or use standard IMA on it or other security hooks." * tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: MODSIGN: Fix kbuild output when using default extra_certificates MODSIGN: Avoid using .incbin in C source modules: don't hand 0 to vmalloc. module: Remove a extra null character at the top of module->strtab. ASN.1: Use the ASN1_LONG_TAG and ASN1_INDEFINITE_LENGTH constants ASN.1: Define indefinite length marker constant moduleparam: use __UNIQUE_ID() __UNIQUE_ID() MODSIGN: Add modules_sign make target powerpc: add finit_module syscall. ima: support new kernel module syscall add finit_module syscall to asm-generic ARM: add finit_module syscall to ARM security: introduce kernel_module_from_file hook module: add flags arg to sys_finit_module() module: add syscall to load module from fd
| | * | | | | | | | | module: add syscall to load module from fdKees Cook2012-12-142-0/+2
| | | |/ / / / / / / | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As part of the effort to create a stronger boundary between root and kernel, Chrome OS wants to be able to enforce that kernel modules are being loaded only from our read-only crypto-hash verified (dm_verity) root filesystem. Since the init_module syscall hands the kernel a module as a memory blob, no reasoning about the origin of the blob can be made. Earlier proposals for appending signatures to kernel modules would not be useful in Chrome OS, since it would involve adding an additional set of keys to our kernel and builds for no good reason: we already trust the contents of our root filesystem. We don't need to verify those kernel modules a second time. Having to do signature checking on module loading would slow us down and be redundant. All we need to know is where a module is coming from so we can say yes/no to loading it. If a file descriptor is used as the source of a kernel module, many more things can be reasoned about. In Chrome OS's case, we could enforce that the module lives on the filesystem we expect it to live on. In the case of IMA (or other LSMs), it would be possible, for example, to examine extended attributes that may contain signatures over the contents of the module. This introduces a new syscall (on x86), similar to init_module, that has only two arguments. The first argument is used as a file descriptor to the module and the second argument is a pointer to the NULL terminated string of module arguments. Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (merge fixes)
| * | | | | | | | | Merge branch 'akpm' (more patches from Andrew)Linus Torvalds2012-12-181-10/+57
| |\ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge patches from Andrew Morton: "Most of the rest of MM, plus a few dribs and drabs. I still have quite a few irritating patches left around: ones with dubious testing results, lack of review, ones which should have gone via maintainer trees but the maintainers are slack, etc. I need to be more activist in getting these things wrapped up outside the merge window, but they're such a PITA." * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (48 commits) mm/vmscan.c: avoid possible deadlock caused by too_many_isolated() vmscan: comment too_many_isolated() mm/kmemleak.c: remove obsolete simple_strtoul mm/memory_hotplug.c: improve comments mm/hugetlb: create hugetlb cgroup file in hugetlb_init mm/mprotect.c: coding-style cleanups Documentation: ABI: /sys/devices/system/node/ slub: drop mutex before deleting sysfs entry memcg: add comments clarifying aspects of cache attribute propagation kmem: add slab-specific documentation about the kmem controller slub: slub-specific propagation changes slab: propagate tunable values memcg: aggregate memcg cache values in slabinfo memcg/sl[au]b: shrink dead caches memcg/sl[au]b: track all the memcg children of a kmem_cache memcg: destroy memcg caches sl[au]b: allocate objects from memcg cache sl[au]b: always get the cache from its page in kmem_cache_free() memcg: skip memcg kmem allocations in specified code regions memcg: infrastructure to match an allocation to the right cache ...
| | * | | | | | | | | arch/x86/platform/iris/iris.c: register a platform device and a platform driverShérab2012-12-181-10/+57
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This makes the iris driver use the platform API, so it is properly exposed in /sys. [akpm@linux-foundation.org: remove commented-out code, add missing space to printk, clean up code layout] Signed-off-by: Shérab <Sebastien.Hinderer@ens-lyon.org> Cc: Len Brown <lenb@kernel.org> Cc: Matthew Garrett <mjg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | | | | | | | Merge branch 'release' of ↵Linus Torvalds2012-12-181-0/+37
| |\ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux Pull powertool update from Len Brown: "This updates the tree w/ the latest version of turbostat, which reports temperature and - on SNB and later - Watts." Fix up semantic merge conflict as per Len. * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: tools: Allow tools to be installed in a user specified location tools/power: turbostat: make Makefile a bit more capable tools/power x86_energy_perf_policy: close /proc/stat in for_every_cpu() tools/power turbostat: v3.0: monitor Watts and Temperature tools/power turbostat: fix output buffering issue tools/power turbostat: prevent infinite loop on migration error path x86 power: define RAPL MSRs tools/power/x86/turbostat: share kernel MSR #defines
| | * | | | | | | | | | x86 power: define RAPL MSRsLen Brown2012-11-231-0/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Run Time Average Power Limiting interface is currently model specific, present on Sandy Bridge and Ivy Bridge processors. These #defines correspond to documentation in the latest "Intel® 64 and IA-32 Architectures Software Developer Manual", plus some typos in that document corrected. Signed-off-by: Len Brown <len.brown@intel.com> Cc: x86@kernel.org
OpenPOWER on IntegriCloud