summaryrefslogtreecommitdiffstats
path: root/arch/x86
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'smp-hotplug-for-linus' of ↵Linus Torvalds2016-03-152-2/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull cpu hotplug updates from Thomas Gleixner: "This is the first part of the ongoing cpu hotplug rework: - Initial implementation of the state machine - Runs all online and prepare down callbacks on the plugged cpu and not on some random processor - Replaces busy loop waiting with completions - Adds tracepoints so the states can be followed" More detailed commentary on this work from an earlier email: "What's wrong with the current cpu hotplug infrastructure? - Asymmetry The hotplug notifier mechanism is asymmetric versus the bringup and teardown. This is mostly caused by the notifier mechanism. - Largely undocumented dependencies While some notifiers use explicitely defined notifier priorities, we have quite some notifiers which use numerical priorities to express dependencies without any documentation why. - Control processor driven Most of the bringup/teardown of a cpu is driven by a control processor. While it is understandable, that preperatory steps, like idle thread creation, memory allocation for and initialization of essential facilities needs to be done before a cpu can boot, there is no reason why everything else must run on a control processor. Before this patch series, bringup looks like this: Control CPU Booting CPU do preparatory steps kick cpu into life do low level init sync with booting cpu sync with control cpu bring the rest up - All or nothing approach There is no way to do partial bringups. That's something which is really desired because we waste e.g. at boot substantial amount of time just busy waiting that the cpu comes to life. That's stupid as we could very well do preparatory steps and the initial IPI for other cpus and then go back and do the necessary low level synchronization with the freshly booted cpu. - Minimal debuggability Due to the notifier based design, it's impossible to switch between two stages of the bringup/teardown back and forth in order to test the correctness. So in many hotplug notifiers the cancel mechanisms are either not existant or completely untested. - Notifier [un]registering is tedious To [un]register notifiers we need to protect against hotplug at every callsite. There is no mechanism that bringup/teardown callbacks are issued on the online cpus, so every caller needs to do it itself. That also includes error rollback. What's the new design? The base of the new design is a symmetric state machine, where both the control processor and the booting/dying cpu execute a well defined set of states. Each state is symmetric in the end, except for some well defined exceptions, and the bringup/teardown can be stopped and reversed at almost all states. So the bringup of a cpu will look like this in the future: Control CPU Booting CPU do preparatory steps kick cpu into life do low level init sync with booting cpu sync with control cpu bring itself up The synchronization step does not require the control cpu to wait. That mechanism can be done asynchronously via a worker or some other mechanism. The teardown can be made very similar, so that the dying cpu cleans up and brings itself down. Cleanups which need to be done after the cpu is gone, can be scheduled asynchronously as well. There is a long way to this, as we need to refactor the notion when a cpu is available. Today we set the cpu online right after it comes out of the low level bringup, which is not really correct. The proper mechanism is to set it to available, i.e. cpu local threads, like softirqd, hotplug thread etc. can be scheduled on that cpu, and once it finished all booting steps, it's set to online, so general workloads can be scheduled on it. The reverse happens on teardown. First thing to do is to forbid scheduling of general workloads, then teardown all the per cpu resources and finally shut it off completely. This patch series implements the basic infrastructure for this at the core level. This includes the following: - Basic state machine implementation with well defined states, so ordering and prioritization can be expressed. - Interfaces to [un]register state callbacks This invokes the bringup/teardown callback on all online cpus with the proper protection in place and [un]installs the callbacks in the state machine array. For callbacks which have no particular ordering requirement we have a dynamic state space, so that drivers don't have to register an explicit hotplug state. If a callback fails, the code automatically does a rollback to the previous state. - Sysfs interface to drive the state machine to a particular step. This is only partially functional today. Full functionality and therefor testability will be achieved once we converted all existing hotplug notifiers over to the new scheme. - Run all CPU_ONLINE/DOWN_PREPARE notifiers on the booting/dying processor: Control CPU Booting CPU do preparatory steps kick cpu into life do low level init sync with booting cpu sync with control cpu wait for boot bring itself up Signal completion to control cpu In a previous step of this work we've done a full tree mechanical conversion of all hotplug notifiers to the new scheme. The balance is a net removal of about 4000 lines of code. This is not included in this series, as we decided to take a different approach. Instead of mechanically converting everything over, we will do a proper overhaul of the usage sites one by one so they nicely fit into the symmetric callback scheme. I decided to do that after I looked at the ugliness of some of the converted sites and figured out that their hotplug mechanism is completely buggered anyway. So there is no point to do a mechanical conversion first as we need to go through the usage sites one by one again in order to achieve a full symmetric and testable behaviour" * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits) cpu/hotplug: Document states better cpu/hotplug: Fix smpboot thread ordering cpu/hotplug: Remove redundant state check cpu/hotplug: Plug death reporting race rcu: Make CPU_DYING_IDLE an explicit call cpu/hotplug: Make wait for dead cpu completion based cpu/hotplug: Let upcoming cpu bring itself fully up arch/hotplug: Call into idle with a proper state cpu/hotplug: Move online calls to hotplugged cpu cpu/hotplug: Create hotplug threads cpu/hotplug: Split out the state walk into functions cpu/hotplug: Unpark smpboot threads from the state machine cpu/hotplug: Move scheduler cpu_online notifier to hotplug core cpu/hotplug: Implement setup/removal interface cpu/hotplug: Make target state writeable cpu/hotplug: Add sysfs state interface cpu/hotplug: Hand in target state to _cpu_up/down cpu/hotplug: Convert the hotplugged cpu work to a state machine cpu/hotplug: Convert to a state machine for the control processor cpu/hotplug: Add tracepoints ...
| * arch/hotplug: Call into idle with a proper stateThomas Gleixner2016-03-012-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Let the non boot cpus call into idle with the corresponding hotplug state, so the hotplug core can handle the further bringup. That's a first step to convert the boot side of the hotplugged cpus to do all the synchronization with the other side through the state machine. For now it'll only start the hotplug thread and kick the full bringup of the cpu. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Cc: Rik van Riel <riel@redhat.com> Cc: Rafael Wysocki <rafael.j.wysocki@intel.com> Cc: "Srivatsa S. Bhat" <srivatsa@mit.edu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Link: http://lkml.kernel.org/r/20160226182341.614102639@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* | Merge branch 'irq-core-for-linus' of ↵Linus Torvalds2016-03-153-57/+63
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq updates from Thomas Gleixner: "The 4.6 pile of irq updates contains: - Support for IPI irqdomains to support proper integration of IPIs to and from coprocessors. The first user of this new facility is MIPS. The relevant MIPS patches come with the core to avoid merge ordering issues and have been acked by Ralf. - A new command line option to set the default interrupt affinity mask at boot time. - Support for some more new ARM and MIPS interrupt controllers: tango, alpine-msix and bcm6345-l1 - Two small cleanups for x86/apic which we merged into irq/core to avoid yet another branch in x86 with two tiny commits. - The usual set of updates, cleanups in drivers/irqchip. Mostly in the area of ARM-GIC, arada-37-xp and atmel chips. Nothing outstanding here" * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (56 commits) irqchip/irq-alpine-msi: Release the correct domain on error irqchip/mxs: Fix error check of of_io_request_and_map() irqchip/sunxi-nmi: Fix error check of of_io_request_and_map() genirq: Export IRQ functions for module use irqchip/gic/realview: Support more RealView DCC variants Documentation/bindings: Document the Alpine MSIX driver irqchip: Add the Alpine MSIX interrupt controller irqchip/gic-v3: Always return IRQ_SET_MASK_OK_DONE in gic_set_affinity irqchip/gic-v3-its: Mark its_init() and its children as __init irqchip/gic-v3: Remove gic_root_node variable from the ITS code irqchip/gic-v3: ACPI: Add redistributor support via GICC structures irqchip/gic-v3: Add ACPI support for GICv3/4 initialization irqchip/gic-v3: Refactor gic_of_init() for GICv3 driver x86/apic: Deinline _flat_send_IPI_mask, save ~150 bytes x86/apic: Deinline __default_send_IPI_*, save ~200 bytes dt-bindings: interrupt-controller: Add SoC-specific compatible string to Marvell ODMI irqchip/mips-gic: Add new DT property to reserve IPIs MIPS: Delete smp-gic.c MIPS: Make smp CMP, CPS and MT use the new generic IPI functions MIPS: Add generic SMP IPI support ...
| * | x86/apic: Deinline _flat_send_IPI_mask, save ~150 bytesDenys Vlasenko2016-03-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | _flat_send_IPI_mask: 157 bytes, 3 callsites text data bss dec hex filename 96183823 20860520 36122624 153166967 9212477 vmlinux1_before 96183699 20860520 36122624 153166843 92123fb vmlinux Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Borislav Petkov <bp@alien.de> Cc: Daniel J Blueman <daniel@numascale.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Travis <travis@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Link: http://lkml.kernel.org/r/1457287876-6001-2-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | x86/apic: Deinline __default_send_IPI_*, save ~200 bytesDenys Vlasenko2016-03-082-56/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | __default_send_IPI_shortcut: 49 bytes, 2 callsites __default_send_IPI_dest_field: 108 bytes, 7 callsites text data bss dec hex filename 96184086 20860488 36122624 153167198 921255e vmlinux_before 96183823 20860520 36122624 153166967 9212477 vmlinux Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Borislav Petkov <bp@alien.de> Cc: Daniel J Blueman <daniel@numascale.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Travis <travis@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Link: http://lkml.kernel.org/r/1457287876-6001-1-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | | Merge branch 'timers-core-for-linus' of ↵Linus Torvalds2016-03-153-1/+62
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer updates from Thomas Gleixner: "The timer department delivers this time: - Support for cross clock domain timestamps in the core code plus a first user. That allows more precise timestamping for PTP and later for audio and other peripherals. The ptp/e1000e patches have been acked by the relevant maintainers and are carried in the timer tree to avoid merge ordering issues. - Support for unregistering the current clocksource watchdog. That lifts a limitation for switching clocksources which has been there from day 1 - The usual pile of fixes and updates to the core and the drivers. Nothing outstanding and exciting" * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (26 commits) time/timekeeping: Work around false positive GCC warning e1000e: Adds hardware supported cross timestamp on e1000e nic ptp: Add PTP_SYS_OFFSET_PRECISE for driver crosstimestamping x86/tsc: Always Running Timer (ART) correlated clocksource hrtimer: Revert CLOCK_MONOTONIC_RAW support time: Add history to cross timestamp interface supporting slower devices time: Add driver cross timestamp interface for higher precision time synchronization time: Remove duplicated code in ktime_get_raw_and_real() time: Add timekeeping snapshot code capturing system time and counter time: Add cycles to nanoseconds translation jiffies: Use CLOCKSOURCE_MASK instead of constant clocksource: Introduce clocksource_freq2mult() clockevents/drivers/exynos_mct: Implement ->set_state_oneshot_stopped() clockevents/drivers/arm_global_timer: Implement ->set_state_oneshot_stopped() clockevents/drivers/arm_arch_timer: Implement ->set_state_oneshot_stopped() clocksource/drivers/arm_global_timer: Register delay timer clocksource/drivers/lpc32xx: Support timer-based ARM delay clocksource/drivers/lpc32xx: Support periodic mode clocksource/drivers/lpc32xx: Don't use the prescaler counter for clockevents clocksource/drivers/rockchip: Add err handle for rk_timer_init ...
| * | | x86/tsc: Always Running Timer (ART) correlated clocksourceChristopher S. Hall2016-03-033-1/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On modern Intel systems TSC is derived from the new Always Running Timer (ART). ART can be captured simultaneous to the capture of audio and network device clocks, allowing a correlation between timebases to be constructed. Upon capture, the driver converts the captured ART value to the appropriate system clock using the correlated clocksource mechanism. On systems that support ART a new CPUID leaf (0x15) returns parameters “m” and “n” such that: TSC_value = (ART_value * m) / n + k [n >= 1] [k is an offset that can adjusted by a privileged agent. The IA32_TSC_ADJUST MSR is an example of an interface to adjust k. See 17.14.4 of the Intel SDM for more details] Cc: Prarit Bhargava <prarit@redhat.com> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: kevin.b.stanton@intel.com Cc: kevin.j.clarke@intel.com Cc: hpa@zytor.com Cc: jeffrey.t.kirsher@intel.com Cc: netdev@vger.kernel.org Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Christopher S. Hall <christopher.s.hall@intel.com> [jstultz: Tweaked to fix build issue, also reworked math for 64bit division on 32bit systems, as well as !CONFIG_CPU_FREQ build fixes] Signed-off-by: John Stultz <john.stultz@linaro.org>
* | | | Merge branch 'x86-timers-for-linus' of ↵Linus Torvalds2016-03-151-4/+4
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 timer update from Ingo Molnar: "A single simplification of the x86 TSC code" * 'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/tsc: Use topology functions
| * | | | x86/tsc: Use topology functionsThomas Gleixner2016-02-211-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It's simpler to look at the topology mask than iterating over all online cpus to find a cpu on the same package. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org>
* | | | | Merge branch 'x86-platform-for-linus' of ↵Linus Torvalds2016-03-158-102/+29
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 core platform updates from Ingo Molnar: "Intel Quark and Geode SoC platform updates" * 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/platform/intel/quark: Drop IMR lock bit support x86/platform/intel/mid: Remove dead code x86/platform: Make platform/geode/net5501.c explicitly non-modular x86/platform: Make platform/geode/alix.c explicitly non-modular x86/platform: Make platform/geode/geos.c explicitly non-modular x86/platform: Make platform/intel-quark/imr_selftest.c explicitly non-modular x86/platform: Make platform/intel-quark/imr.c explicitly non-modular
| * | | | | x86/platform/intel/quark: Drop IMR lock bit supportBryan O'Donoghue2016-02-233-27/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Isolated Memory Regions support a lock bit. The lock bit in an IMR prevents modification of the IMR until the core goes through a warm or cold reset. The lock bit feature is not useful in the context of the kernel API and is not really necessary since modification of IMRs is possible only from ring-zero anyway. This patch drops support for IMR locks bits, it simplifies the kernel API and removes an unnecessary and needlessly complex feature. Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Bryan O'Donoghue <pure.logic@nexus-software.ie> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: andriy.shevchenko@linux.intel.com Cc: boon.leong.ong@intel.com Cc: paul.gortmaker@windriver.com Link: http://lkml.kernel.org/r/1456190999-12685-3-git-send-email-pure.logic@nexus-software.ie Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | Merge branch 'x86/urgent' into x86/platform, to queue up dependent patchIngo Molnar2016-02-237-51/+120
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | | x86/platform/intel/mid: Remove dead codeAlan2016-02-172-8/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Neither ratio nor fsb are ever zero, so remove the 0 case. Signed-off-by: Alan Cox <alan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | | x86/platform: Make platform/geode/net5501.c explicitly non-modularPaul Gortmaker2016-02-161-7/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Kconfig currently controlling compilation of this code is: arch/x86/Kconfig:config NET5501 arch/x86/Kconfig: bool "Soekris Engineering net5501 System Support (LEDS, GPIO, etc)" ...meaning that it currently is not being built as a module by anyone. Lets remove the couple traces of modularity, so that when reading the driver there is no doubt it is builtin-only. Since module_init translates to device_initcall in the non-modular case, the init ordering remains unchanged with this commit. We also delete the MODULE_LICENSE tag etc. since all that information is already contained at the top of the file in the comments. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Philip Prindeville <philipp@redfish-solutions.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1455491396-30977-6-git-send-email-paul.gortmaker@windriver.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | | x86/platform: Make platform/geode/alix.c explicitly non-modularPaul Gortmaker2016-02-161-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Kconfig currently controlling compilation of this code is: arch/x86/Kconfig:config ALIX arch/x86/Kconfig: bool "PCEngines ALIX System Support (LED setup)" ...meaning that it currently is not being built as a module by anyone. Lets remove the modular code that is essentially orphaned, so that when reading the driver there is no doubt it is builtin-only. Since module_init translates to device_initcall in the non-modular case, the init ordering remains unchanged with this commit. We replace module.h with moduleparam.h since the file does declare some module parameters, and leaving them as such is currently the easiest way to remain compatible with existing boot arg use cases. We also delete the MODULE_LICENSE tag etc. since all that information is already contained at the top of the file in the comments. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Ed Wildgoose <kernel@wildgooses.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1455491396-30977-5-git-send-email-paul.gortmaker@windriver.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | | x86/platform: Make platform/geode/geos.c explicitly non-modularPaul Gortmaker2016-02-161-7/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Kconfig currently controlling compilation of this code is: arch/x86/Kconfig:config GEOS arch/x86/Kconfig: bool "Traverse Technologies GEOS System Support (LEDS, GPIO, etc)" ...meaning that it currently is not being built as a module by anyone. Lets remove the couple traces of modularity, so that when reading the code there is no doubt it is builtin-only. Since module_init translates to device_initcall in the non-modular case, the init ordering remains unchanged with this commit. We also delete the MODULE_LICENSE tag etc. since all that information is already contained at the top of the file in the comments. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Philip Prindeville <philipp@redfish-solutions.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1455491396-30977-4-git-send-email-paul.gortmaker@windriver.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | | x86/platform: Make platform/intel-quark/imr_selftest.c explicitly non-modularPaul Gortmaker2016-02-161-13/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Kconfig currently controlling compilation of this code is: arch/x86/Kconfig.debug:config DEBUG_IMR_SELFTEST arch/x86/Kconfig.debug: bool "Isolated Memory Region self test" ...meaning that it currently is not being built as a module by anyone. Lets remove the modular code that is essentially orphaned, so that when reading the driver there is no doubt it is builtin-only. Since module_init translates to device_initcall in the non-modular case, the init ordering remains unchanged with this commit. Also note that MODULE_DEVICE_TABLE is a no-op for non-modular code. We also delete the MODULE_LICENSE tag etc. since all that information was (or is now) contained at the top of the file in the comments. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Reviewed-by: Bryan O'Donoghue <pure.logic@nexus-software.ie> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1455491396-30977-3-git-send-email-paul.gortmaker@windriver.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | | x86/platform: Make platform/intel-quark/imr.c explicitly non-modularPaul Gortmaker2016-02-161-33/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Kconfig currently controlling compilation of this code is: drivers/platform/x86/Kconfig:config INTEL_IMR drivers/platform/x86/Kconfig: bool "Intel Isolated Memory Region support" ...meaning that it currently is not being built as a module by anyone. Lets remove the modular code that is essentially orphaned, so that when reading the driver there is no doubt it is builtin-only. Since module_init translates to device_initcall in the non-modular case, the init ordering remains unchanged with this commit. Also note that MODULE_DEVICE_TABLE is a no-op for non-modular code. We also delete the MODULE_LICENSE tag etc. since all that information was (or is now) contained at the top of the file in the comments. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Reviewed-by: Bryan O'Donoghue <pure.logic@nexus-software.ie> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1455491396-30977-2-git-send-email-paul.gortmaker@windriver.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | | | | | | Merge branch 'x86-mm-for-linus' of ↵Linus Torvalds2016-03-1514-101/+215
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 mm updates from Ingo Molnar: "The main changes in this cycle were: - Enable full ASLR randomization for 32-bit programs (Hector Marco-Gisbert) - Add initial minimal INVPCI support, to flush global mappings (Andy Lutomirski) - Add KASAN enhancements (Andrey Ryabinin) - Fix mmiotrace for huge pages (Karol Herbst) - ... misc cleanups and small enhancements" * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm/32: Enable full randomization on i386 and X86_32 x86/mm/kmmio: Fix mmiotrace for hugepages x86/mm: Avoid premature success when changing page attributes x86/mm/ptdump: Remove paravirt_enabled() x86/mm: Fix INVPCID asm constraint x86/dmi: Switch dmi_remap() from ioremap() [uncached] to ioremap_cache() x86/mm: If INVPCID is available, use it to flush global mappings x86/mm: Add a 'noinvpcid' boot option to turn off INVPCID x86/mm: Add INVPCID helpers x86/kasan: Write protect kasan zero shadow x86/kasan: Clear kasan_zero_page after TLB flush x86/mm/numa: Check for failures in numa_clear_kernel_node_hotplug() x86/mm/numa: Clean up numa_clear_kernel_node_hotplug() x86/mm: Make kmap_prot into a #define x86/mm/32: Set NX in __supported_pte_mask before enabling paging x86/mm: Streamline and restore probe_memory_block_size()
| * | | | | | | x86/mm/32: Enable full randomization on i386 and X86_32Hector Marco-Gisbert2016-03-111-13/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently on i386 and on X86_64 when emulating X86_32 in legacy mode, only the stack and the executable are randomized but not other mmapped files (libraries, vDSO, etc.). This patch enables randomization for the libraries, vDSO and mmap requests on i386 and in X86_32 in legacy mode. By default on i386 there are 8 bits for the randomization of the libraries, vDSO and mmaps which only uses 1MB of VA. This patch preserves the original randomness, using 1MB of VA out of 3GB or 4GB. We think that 1MB out of 3GB is not a big cost for having the ASLR. The first obvious security benefit is that all objects are randomized (not only the stack and the executable) in legacy mode which highly increases the ASLR effectiveness, otherwise the attackers may use these non-randomized areas. But also sensitive setuid/setgid applications are more secure because currently, attackers can disable the randomization of these applications by setting the ulimit stack to "unlimited". This is a very old and widely known trick to disable the ASLR in i386 which has been allowed for too long. Another trick used to disable the ASLR was to set the ADDR_NO_RANDOMIZE personality flag, but fortunately this doesn't work on setuid/setgid applications because there is security checks which clear Security-relevant flags. This patch always randomizes the mmap_legacy_base address, removing the possibility to disable the ASLR by setting the stack to "unlimited". Signed-off-by: Hector Marco-Gisbert <hecmargi@upv.es> Acked-by: Ismael Ripoll Ripoll <iripoll@upv.es> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: akpm@linux-foundation.org Cc: kees Cook <keescook@chromium.org> Link: http://lkml.kernel.org/r/1457639460-5242-1-git-send-email-hecmargi@upv.es Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | | | x86/mm/kmmio: Fix mmiotrace for hugepagesKarol Herbst2016-03-051-29/+59
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Because Linux might use bigger pages than the 4K pages to handle those mmio ioremaps, the kmmio code shouldn't rely on the pade id as it currently does. Using the memory address instead of the page id lets us look up how big the page is and what its base address is, so that we won't get a page fault within the same page twice anymore. Tested-by: Pierre Moreau <pierre.morrow@free.fr> Signed-off-by: Karol Herbst <nouveau@karolherbst.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: linux-mm@kvack.org Cc: linux-x86_64@vger.kernel.org Cc: nouveau@lists.freedesktop.org Cc: pq@iki.fi Cc: rostedt@goodmis.org Link: http://lkml.kernel.org/r/1456966991-6861-1-git-send-email-nouveau@karolherbst.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | | | x86/mm: Avoid premature success when changing page attributesJan Beulich2016-02-251-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | set_memory_nx() (and set_memory_x()) currently differ in behavior from all other set_memory_*() functions when encountering a virtual address space hole within the kernel address range: They stop processing at the hole, but nevertheless report success (making the caller believe the operation was carried out on the entire range). While observed to be a problem - triggering the CONFIG_DEBUG_WX warning - only with out of tree code, I suspect (but didn't check) that on x86-64 the CONFIG_DEBUG_PAGEALLOC logic in free_init_pages() would, when called from free_initmem(), have the same effect on the set_memory_nx() called from mark_rodata_ro(). This unexpected behavior is a result of change_page_attr_set_clr() special casing changes to only the NX bit, in that it passes "false" as the "checkalias" argument to __change_page_attr_set_clr(). Since this flag becomes the "primary" argument of both __change_page_attr() and __cpa_process_fault(), the latter would so far return success without adjusting cpa->numpages. Success to the higher level callers, however, means that whatever cpa->numpages currently holds is the count of successfully processed pages. The cases when __change_page_attr() calls __cpa_process_fault(), otoh, don't generally mean the entire range got processed (as can be seen from one of the two success return paths in __cpa_process_fault() already adjusting ->numpages). Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/56BB0AD402000078000D05BF@prv-mh.provo.novell.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | | | | | x86/mm/ptdump: Remove paravirt_enabled()Borislav Petkov2016-02-201-6/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | is_hypervisor_range() can simply check if the PGD index is within ffff800000000000 - ffff87ffffffffff which is the range reserved for a hypervisor. That range is practically an ABI, see Documentation/x86/x86_64/mm.txt. Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> # Under Xen, as PV guest Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1455825641-19585-1-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | | | x86/mm: Fix INVPCID asm constraintBorislav Petkov2016-02-141-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | So we want to specify the dependency on both @pcid and @addr so that the compiler doesn't reorder accesses to them *before* the TLB flush. But for that to work, we need to express this properly in the inline asm and deref the whole desc array, not the pointer to it. See clwb() for an example. This fixes the build error on 32-bit: arch/x86/include/asm/tlbflush.h: In function ‘__invpcid’: arch/x86/include/asm/tlbflush.h:26:18: error: memory input 0 is not directly addressable which gcc4.7 caught but 5.x didn't. Which is strange. :-\ Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Michael Matz <matz@suse.de> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: linux-mm@kvack.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | | | x86/dmi: Switch dmi_remap() from ioremap() [uncached] to ioremap_cache()Andy Lutomirski2016-02-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | DMI cacheability is very confused on x86. dmi_early_remap() uses early_ioremap(), which uses FIXMAP_PAGE_IO, which is __PAGE_KERNEL_IO, which is __PAGE_KERNEL, which is cached. Don't ask me why this makes any sense. dmi_remap() uses ioremap(), which requests an uncached mapping. However, on non-EFI systems, the DMI data generally lives between 0xf0000 and 0x100000, which is in the legacy ISA range, which triggers a special case in the PAT code that overrides the cache mode requested by ioremap() and forces a WB mapping. On a UEFI boot, however, the DMI table can live at any physical address. On my laptop, it's around 0x77dd0000. That's nowhere near the legacy ISA range, so the ioremap() implicit uncached type is honored and we end up with a UC- mapping. UC- is a very, very slow way to read from main memory, so dmi_walk() is likely to take much longer than necessary. Given that, even on UEFI, we do early cached DMI reads, it seems safe to just ask for cached access. Switch to ioremap_cache(). I haven't tried to benchmark this, but I'd guess it saves several milliseconds of boot time. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jean Delvare <jdelvare@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Link: http://lkml.kernel.org/r/3147c38e51f439f3c8911db34c7d4ab22d854915.1453791969.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | | | x86/mm: If INVPCID is available, use it to flush global mappingsAndy Lutomirski2016-02-091-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On my Skylake laptop, INVPCID function 2 (flush absolutely everything) takes about 376ns, whereas saving flags, twiddling CR4.PGE to flush global mappings, and restoring flags takes about 539ns. Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/ed0ef62581c0ea9c99b9bf6df726015e96d44743.1454096309.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | | | x86/mm: Add a 'noinvpcid' boot option to turn off INVPCIDAndy Lutomirski2016-02-091-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds a chicken bit to turn off INVPCID in case something goes wrong. It's an early_param() because we do TLB flushes before we parse __setup() parameters. Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/f586317ed1bc2b87aee652267e515b90051af385.1454096309.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | | | x86/mm: Add INVPCID helpersAndy Lutomirski2016-02-091-0/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds helpers for each of the four currently-specified INVPCID modes. Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/8a62b23ad686888cee01da134c91409e22064db9.1454096309.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | | | x86/kasan: Write protect kasan zero shadowAndrey Ryabinin2016-02-091-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After kasan_init() executed, no one is allowed to write to kasan_zero_page, so write protect it. Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/1452516679-32040-3-git-send-email-aryabinin@virtuozzo.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | | | x86/kasan: Clear kasan_zero_page after TLB flushAndrey Ryabinin2016-02-091-3/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently we clear kasan_zero_page before __flush_tlb_all(). This works with current implementation of native_flush_tlb[_global]() because it doesn't cause do any writes to kasan shadow memory. But any subtle change made in native_flush_tlb*() could break this. Also current code seems doesn't work for paravirt guests (lguest). Only after the TLB flush we can be sure that kasan_zero_page is not used as early shadow anymore (instrumented code will not write to it). So it should cleared it only after the TLB flush. Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/1452516679-32040-2-git-send-email-aryabinin@virtuozzo.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | | | x86/mm/numa: Check for failures in numa_clear_kernel_node_hotplug()Ingo Molnar2016-02-081-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | numa_clear_kernel_node_hotplug() uses memblock_set_node() without checking for failures. memblock_set_node() is a complex function that might extend the memblock array - which extension might fail - so check for this possibility. It's not supposed to happen (because realistically if we have so little memory that this fails then we likely won't be able to boot anyway), but do the check nevertheless. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Brad Spengler <spender@grsecurity.net> Cc: Chen Tang <imtangchen@gmail.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: PaX Team <pageexec@freemail.hu> Cc: Taku Izumi <izumi.taku@jp.fujitsu.com> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wen Congyang <wency@cn.fujitsu.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: y14sg1 <y14sg1@comcast.net> Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | | | x86/mm/numa: Clean up numa_clear_kernel_node_hotplug()Ingo Molnar2016-02-081-23/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | So we fixed an overflow bug in numa_clear_kernel_node_hotplug(): 2b54ab3c66d4 ("x86/mm/numa: Fix memory corruption on 32-bit NUMA kernels") ... and the bug was indirectly caused by poor coding style, such as using start/end local variables unnecessarily, which lost the physaddr_t type. So make the code more readable and try to fully comment all the thinking behind the logic. No change in functionality. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Brad Spengler <spender@grsecurity.net> Cc: Chen Tang <imtangchen@gmail.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: PaX Team <pageexec@freemail.hu> Cc: Taku Izumi <izumi.taku@jp.fujitsu.com> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wen Congyang <wency@cn.fujitsu.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: y14sg1 <y14sg1@comcast.net> Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | | | Merge branch 'x86/urgent' into x86/mm, to pick up dependent fixIngo Molnar2016-02-0855-394/+1663
| |\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | | | | x86/mm: Make kmap_prot into a #defineAndy Lutomirski2016-01-202-4/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The value (once we initialize it) is a foregone conclusion. Make it a #define to save a tiny amount of text and data size and to make it more comprehensible. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Pavel Machek <pavel@ucw.cz> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/0850eb0213de9da88544ff7fae72dc6d06d2b441.1453239349.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | | | | x86/mm/32: Set NX in __supported_pte_mask before enabling pagingAndy Lutomirski2016-01-202-3/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There's a short window in which very early mappings can end up with NX clear because they are created before we've noticed that we have NX. It turns out that we detect NX very early, so there's no need to defer __supported_pte_mask setup. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Pavel Machek <pavel@ucw.cz> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/2b544627345f7110160545a3f47031eb45c3ad4f.1453239349.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | | | | x86/mm: Streamline and restore probe_memory_block_size()Seth Jennings2016-01-191-18/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The cumulative effect of the following two commits: bdee237c0343 ("x86: mm: Use 2GB memory block size on large-memory x86-64 systems") 982792c782ef ("x86, mm: probe memory block size for generic x86 64bit") ... is some pretty convoluted code. The first commit also removed code for the UV case without stated reason, which might lead to unexpected change in behavior. This commit has no other (intended) functional change; just seeks to simplify and make the code more understandable, beyond restoring the UV behavior. The whole section with the "tail size" doesn't seem to be reachable, since both the >= 64GB and < 64GB case return, so it was removed. Signed-off-by: Seth Jennings <sjennings@variantweb.net> Cc: Daniel J Blueman <daniel@numascale.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1448902063-18885-1-git-send-email-sjennings@variantweb.net [ Rewrote the title and changelog. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | | | | | | | | Merge branch 'x86-microcode-for-linus' of ↵Linus Torvalds2016-03-157-203/+228
|\ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 microcode updates from Ingo Molnar: "The biggest change in this cycle was the separation of the microcode loading mechanism from the initrd code plus the support of built-in microcode images. There were also lots cleanups and general restructuring (by Borislav Petkov)" * 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits) x86/microcode/intel: Drop orig_sum from ext signature checksum x86/microcode/intel: Improve microcode sanity-checking error messages x86/microcode/intel: Merge two consecutive if-statements x86/microcode/intel: Get rid of DWSIZE x86/microcode/intel: Change checksum variables to u32 x86/microcode: Use kmemdup() rather than duplicating its implementation x86/microcode: Remove unnecessary paravirt_enabled check x86/microcode: Document builtin microcode loading method x86/microcode/AMD: Issue microcode updated message later x86/microcode/intel: Cleanup get_matching_model_microcode() x86/microcode/intel: Remove unused arg of get_matching_model_microcode() x86/microcode/intel: Rename mc_saved_in_initrd x86/microcode/intel: Use *wrmsrl variants x86/microcode/intel: Cleanup apply_microcode_intel() x86/microcode/intel: Move the BUG_ON up and turn it into WARN_ON x86/microcode/intel: Rename mc_intel variable to mc x86/microcode/intel: Rename mc_saved_count to num_saved x86/microcode/intel: Rename local variables of type struct mc_saved_data x86/microcode/AMD: Drop redundant printk prefix x86/microcode: Issue update message only once ...
| * | | | | | | | | x86/microcode/intel: Drop orig_sum from ext signature checksumBorislav Petkov2016-03-081-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is 0 because for !0 values we would have exited already. Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1457345404-28884-6-git-send-email-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | | | | | | | x86/microcode/intel: Improve microcode sanity-checking error messagesBorislav Petkov2016-03-081-10/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Turn them into proper sentences. Add comments to microcode_sanity_check() to explain what it does. Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1457345404-28884-5-git-send-email-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | | | | | | | x86/microcode/intel: Merge two consecutive if-statementsBorislav Petkov2016-03-081-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge the two consecutive "if (ext_table_size)". No functional change. Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1457345404-28884-4-git-send-email-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | | | | | | | x86/microcode/intel: Get rid of DWSIZEBorislav Petkov2016-03-082-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | sizeof(u32) is perfectly clear as it is. Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1457345404-28884-3-git-send-email-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | | | | | | | x86/microcode/intel: Change checksum variables to u32Chris Bainbridge2016-03-081-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Microcode checksum verification should be done using unsigned 32-bit values otherwise the calculation overflow results in undefined behaviour. This is also nicely documented in the SDM, section "Microcode Update Checksum": "To check for a corrupt microcode update, software must perform a unsigned DWORD (32-bit) checksum of the microcode update. Even though some fields are signed, the checksum procedure treats all DWORDs as unsigned. Microcode updates with a header version equal to 00000001H must sum all DWORDs that comprise the microcode update. A valid checksum check will yield a value of 00000000H." but for some reason the code has been using ints from the very beginning. In practice, this bug possibly manifested itself only when doing the microcode data checksum - apparently, currently shipped Intel microcode doesn't have an extended signature table for which we do checksum verification too. UBSAN: Undefined behaviour in arch/x86/kernel/cpu/microcode/intel_lib.c:105:12 signed integer overflow: -1500151068 + -2125470173 cannot be represented in type 'int' CPU: 0 PID: 0 Comm: swapper Not tainted 4.5.0-rc5+ #495 ... Call Trace: dump_stack ? inotify_ioctl ubsan_epilogue handle_overflow __ubsan_handle_add_overflow microcode_sanity_check get_matching_model_microcode.isra.2.constprop.8 ? early_idt_handler_common ? strlcpy ? find_cpio_data load_ucode_intel_bsp load_ucode_bsp ? load_ucode_bsp x86_64_start_kernel [ Expand and massage commit message. ] Signed-off-by: Chris Bainbridge <chris.bainbridge@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: hmh@hmh.eng.br Link: http://lkml.kernel.org/r/1456834359-5132-1-git-send-email-chris.bainbridge@gmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | | | | | | | x86/microcode: Use kmemdup() rather than duplicating its implementationAndrzej Hajda2016-02-172-6/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The patch was generated using fixed coccinelle semantic patch scripts/coccinelle/api/memdup.cocci. Signed-off-by: Andrzej Hajda <a.hajda@samsung.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1455612202-14414-3-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | | | | | x86/microcode: Remove unnecessary paravirt_enabled checkBoris Ostrovsky2016-02-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit: a18a0f6850d4 ("x86, microcode: Don't initialize microcode code on paravirt") added a paravirt test in microcode_init(), primarily to avoid making mc_bp_resume()->load_ucode_ap()->check_loader_disabled_ap() calls because on 32-bit kernels this callchain ends up using __pa_nodebug() macro which is invalid for Xen PV guests. A subsequent commit: fbae4ba8c4a3 ("x86, microcode: Reload microcode on resume") eliminated this callchain thus making a18a0f6850d4 unnecessary. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: david.vrabel@citrix.com Cc: konrad.wilk@oracle.com Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/1455612202-14414-2-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | | | | | x86/microcode/AMD: Issue microcode updated message laterBorislav Petkov2016-02-091-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before this, we issued this message from save_microcode_in_initrd() which is called from free_initrd_mem(), i.e., only when we have an initrd enabled. However, we can update from builtin microcode too but then we don't issue the update message. Fix it by issuing that message on the generic driver init path. Tested-by: Thomas Voegtle <tv@lio96.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1454499225-21544-17-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | | | | | x86/microcode/intel: Cleanup get_matching_model_microcode()Borislav Petkov2016-02-091-14/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reflow arguments, sort local variables in reverse christmas tree, kill "out" label. No functionality change. Tested-by: Thomas Voegtle <tv@lio96.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1454499225-21544-16-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | | | | | x86/microcode/intel: Remove unused arg of get_matching_model_microcode()Borislav Petkov2016-02-091-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | @cpu is unused, kill it. No functionality change. Tested-by: Thomas Voegtle <tv@lio96.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1454499225-21544-15-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | | | | | x86/microcode/intel: Rename mc_saved_in_initrdBorislav Petkov2016-02-091-24/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rename it to mc_tmp_ptrs to denote better what it is - a temporary array for saving pointers to microcode blobs. And "initrd" is not accurate anymore since initrd is not the only source for early microcode. Therefore, rename copy_initrd_ptrs() to copy_ptrs() simply and "initrd_start" to "offset". And then do the following convention: the global variable is called "mc_tmp_ptrs" and the local function arguments "mc_ptrs" for differentiation. Tested-by: Thomas Voegtle <tv@lio96.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1454499225-21544-14-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | | | | | x86/microcode/intel: Use *wrmsrl variantsBorislav Petkov2016-02-091-9/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ... and drop the 32-bit casting games which we had to do at the time because wrmsr() was unforgiving then, see c3fd0bd5e19a from the full history tree: commit c3fd0bd5e19aaff9cdd104edff136a2023db657e Author: Linus Torvalds <torvalds@home.osdl.org> Date: Tue Feb 17 23:23:41 2004 -0800 Fix up the microcode update on regular 32-bit x86. Our wrmsr() is a bit unforgiving and really doesn't like 64-bit values. ... Tested-by: Thomas Voegtle <tv@lio96.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1454499225-21544-13-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | | | | | x86/microcode/intel: Cleanup apply_microcode_intel()Borislav Petkov2016-02-091-5/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Get rid of local variable cpu_num as it is equal to @cpu now. Deref cpu_data() only when it is really needed at the end. No functionality change. Tested-by: Thomas Voegtle <tv@lio96.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1454499225-21544-12-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
OpenPOWER on IntegriCloud