summaryrefslogtreecommitdiffstats
path: root/arch/arm/include/asm/kvm_host.h
Commit message (Collapse)AuthorAgeFilesLines
...
* | KVM: arm64: Handle RAS SErrors from EL1 on guest exitJames Morse2018-01-161-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We expect to have firmware-first handling of RAS SErrors, with errors notified via an APEI method. For systems without firmware-first, add some minimal handling to KVM. There are two ways KVM can take an SError due to a guest, either may be a RAS error: we exit the guest due to an SError routed to EL2 by HCR_EL2.AMO, or we take an SError from EL2 when we unmask PSTATE.A from __guest_exit. For SError that interrupt a guest and are routed to EL2 the existing behaviour is to inject an impdef SError into the guest. Add code to handle RAS SError based on the ESR. For uncontained and uncategorized errors arm64_is_fatal_ras_serror() will panic(), these errors compromise the host too. All other error types are contained: For the fatal errors the vCPU can't make progress, so we inject a virtual SError. We ignore contained errors where we can make progress as if we're lucky, we may not hit them again. If only some of the CPUs support RAS the guest will see the cpufeature sanitised version of the id registers, but we may still take RAS SError on this CPU. Move the SError handling out of handle_exit() into a new handler that runs before we can be preempted. This allows us to use this_cpu_has_cap(), via arm64_is_ras_serror(). Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
* | KVM: arm/arm64: mask/unmask daif around VHE guestsJames Morse2018-01-161-0/+2
|/ | | | | | | | | | | | | | | | | | | | | | | Non-VHE systems take an exception to EL2 in order to world-switch into the guest. When returning from the guest KVM implicitly restores the DAIF flags when it returns to the kernel at EL1. With VHE none of this exception-level jumping happens, so KVMs world-switch code is exposed to the host kernel's DAIF values, and KVM spills the guest-exit DAIF values back into the host kernel. On entry to a guest we have Debug and SError exceptions unmasked, KVM has switched VBAR but isn't prepared to handle these. On guest exit Debug exceptions are left disabled once we return to the host and will stay this way until we enter user space. Add a helper to mask/unmask DAIF around VHE guests. The unmask can only happen after the hosts VBAR value has been synchronised by the isb in __vhe_hyp_call (via kvm_call_hyp()). Masking could be as late as setting KVMs VBAR value, but is kept here for symmetry. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
* KVM: arm/arm64: debug: Introduce helper for single-stepAlex Bennée2017-11-291-0/+5
| | | | | | | | | | | After emulating instructions we may want return to user-space to handle single-step debugging. Introduce a helper function, which, if single-step is enabled, sets the run structure for return and returns true. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
* arm64/sve: KVM: Prevent guests from using SVEDave Martin2017-11-031-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Until KVM has full SVE support, guests must not be allowed to execute SVE instructions. This patch enables the necessary traps, and also ensures that the traps are disabled again on exit from the guest so that the host can still use SVE if it wants to. On guest exit, high bits of the SVE Zn registers may have been clobbered as a side-effect the execution of FPSIMD instructions in the guest. The existing KVM host FPSIMD restore code is not sufficient to restore these bits, so this patch explicitly marks the CPU as not containing cached vector state for any task, thus forcing a reload on the next return to userspace. This is an interim measure, in advance of adding full SVE awareness to KVM. This marking of cached vector state in the CPU as invalid is done using __this_cpu_write(fpsimd_last_state, NULL) in fpsimd.c. Due to the repeated use of this rather obscure operation, it makes sense to factor it out as a separate helper with a clearer name. This patch factors it out as fpsimd_flush_cpu_state(), and ports all callers to use it. As a side effect of this refactoring, a this_cpu_write() in fpsimd_cpu_pm_notifier() is changed to __this_cpu_write(). This should be fine, since cpu_pm_enter() is supposed to be called only with interrupts disabled. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
* KVM: update to new mmu_notifier semantic v2Jérôme Glisse2017-08-311-6/+0
| | | | | | | | | | | | | | | | | | | | | | Calls to mmu_notifier_invalidate_page() were replaced by calls to mmu_notifier_invalidate_range() and are now bracketed by calls to mmu_notifier_invalidate_range_start()/end() Remove now useless invalidate_page callback. Changed since v1 (Linus Torvalds) - remove now useless kvm_arch_mmu_notifier_invalidate_page() Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Tested-by: Mike Galbraith <efault@gmx.de> Tested-by: Adam Borowski <kilobyte@angband.pl> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: kvm@vger.kernel.org Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* KVM: arm: Handle VCPU device attributes in guest.cChristoffer Dall2017-06-081-15/+7
| | | | | | | | | As we are about to support VCPU attributes to set the timer IRQ numbers in guest.c, move the static inlines for the VCPU attributes handlers from the header file to guest.c. Signed-off-by: Christoffer Dall <cdall@linaro.org> Acked-by: Marc Zyngier <marc.zyngier@arm.com>
* KVM: arm/arm64: use vcpu requests for irq injectionAndrew Jones2017-06-041-0/+1
| | | | | | | | | | | | | | | | | Don't use request-less VCPU kicks when injecting IRQs, as a VCPU kick meant to trigger the interrupt injection could be sent while the VCPU is outside guest mode, which means no IPI is sent, and after it has called kvm_vgic_flush_hwstate(), meaning it won't see the updated GIC state until its next exit some time later for some other reason. The receiving VCPU only needs to check this request in VCPU RUN to handle it. By checking it, if it's pending, a memory barrier will be issued that ensures all state is visible. See "Ensuring Requests Are Seen" of Documentation/virtual/kvm/vcpu-requests.rst Signed-off-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Christoffer Dall <cdall@linaro.org>
* KVM: arm/arm64: change exit request to sleep requestAndrew Jones2017-06-041-1/+1
| | | | | | | | | | | | | A request called EXIT is too generic. All requests are meant to cause exits, but different requests have different flags. Let's not make it difficult to decide if the EXIT request is correct for some case by just always providing unique requests for each case. This patch changes EXIT to SLEEP, because that's what the request is asking the VCPU to do. Signed-off-by: Andrew Jones <drjones@redhat.com> Acked-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Christoffer Dall <cdall@linaro.org>
* KVM: improve arch vcpu request definingAndrew Jones2017-06-041-1/+2
| | | | | | | | | | | | | | | | | Marc Zyngier suggested that we define the arch specific VCPU request base, rather than requiring each arch to remember to start from 8. That suggestion, along with Radim Krcmar's recent VCPU request flag addition, snowballed into defining something of an arch VCPU request defining API. No functional change. (Looks like x86 is running out of arch VCPU request bits. Maybe someday we'll need to extend to 64.) Signed-off-by: Andrew Jones <drjones@redhat.com> Acked-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Christoffer Dall <cdall@linaro.org>
* KVM: arm/arm64: Simplify active_change_prepare and plug raceChristoffer Dall2017-05-231-2/+0
| | | | | | | | | | | | | | | | | We don't need to stop a specific VCPU when changing the active state, because private IRQs can only be modified by a running VCPU for the VCPU itself and it is therefore already stopped. However, it is also possible for two VCPUs to be modifying the active state of SPIs at the same time, which can cause the thread being stuck in the loop that checks other VCPU threads for a potentially very long time, or to modify the active state of a running VCPU. Fix this by serializing all accesses to setting and clearing the active state of interrupts using the KVM mutex. Reported-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Christoffer Dall <cdall@linaro.org> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
* Merge tag 'kvm-arm-for-v4.12' of ↵Paolo Bonzini2017-04-271-6/+0
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/ARM Changes for v4.12. Changes include: - Using the common sysreg definitions between KVM and arm64 - Improved hyp-stub implementation with support for kexec and kdump on the 32-bit side - Proper PMU exception handling - Performance improvements of our GIC handling - Support for irqchip in userspace with in-kernel arch-timers and PMU support - A fix for a race condition in our PSCI code Conflicts: Documentation/virtual/kvm/api.txt include/uapi/linux/kvm.h
| * arm/arm64: KVM: Use __hyp_reset_vectors() directlyMarc Zyngier2017-04-091-6/+0
| | | | | | | | | | | | | | | | | | | | | | __cpu_reset_hyp_mode doesn't need to be passed any argument now, as the hyp-stub implementations are self-contained, and is now reduced to just calling __hyp_reset_vectors(). Let's drop the wrapper and use the stub hypercall directly. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@linaro.org>
| * ARM: KVM: Convert __cpu_reset_hyp_mode to using __hyp_reset_vectorsMarc Zyngier2017-04-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | We are now able to use the hyp stub to reset HYP mode. Time to kiss __kvm_hyp_reset goodbye, and use __hyp_reset_vectors. Tested-by: Keerthy <j-keerthy@ti.com> Acked-by: Russell King <rmk+kernel@armlinux.org.uk> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@linaro.org>
* | KVM: mark requests that need synchronizationPaolo Bonzini2017-04-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | kvm_make_all_requests() provides a synchronization that waits until all kicked VCPUs have acknowledged the kick. This is important for KVM_REQ_MMU_RELOAD as it prevents freeing while lockless paging is underway. This patch adds the synchronization property into all requests that are currently being used with kvm_make_all_requests() in order to preserve the current behavior and only introduce a new framework. Removing it from requests where it is not necessary is left for future patches. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | KVM: mark requests that do not need a wakeupRadim Krčmář2017-04-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some operations must ensure that the guest is not running with stale data, but if the guest is halted, then the update can wait until another event happens. kvm_make_all_requests() currently doesn't wake up, so we can mark all requests used with it. First 8 bits were arbitrarily reserved for request numbers. Most uses of requests have the request type as a constant, so a compiler will optimize the '&'. An alternative would be to have an inline function that would return whether the request needs a wake-up or not, but I like this one better even though it might produce worse assembly. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | kvm: make KVM_COALESCED_MMIO_PAGE_OFFSET publicPaolo Bonzini2017-04-071-1/+0
|/ | | | | | | | | | | Its value has never changed; we might as well make it part of the ABI instead of using the return value of KVM_CHECK_EXTENSION(KVM_CAP_COALESCED_MMIO). Because PPC does not always make MMIO available, the code has to be made dependent on CONFIG_KVM_MMIO rather than KVM_COALESCED_MMIO_PAGE_OFFSET. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
* KVM: arm/arm64: Remove KVM_PRIVATE_MEM_SLOTS definition that are unusedLinu Cherian2017-03-091-1/+0
| | | | | | | | | arm/arm64 architecture doesnt use private memslots, hence removing KVM_PRIVATE_MEM_SLOTS macro definition. Reviewed-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Linu Cherian <linu.cherian@cavium.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
* KVM: arm/arm64: Move cntvoff to each timer contextJintack Lim2017-02-081-3/+0
| | | | | | | | | | | | | Make cntvoff per each timer context. This is helpful to abstract kvm timer functions to work with timer context without considering timer types (e.g. physical timer or virtual timer). This also would pave the way for ever doing adjustments of the cntvoff on a per-CPU basis if that should ever make sense. Signed-off-by: Jintack Lim <jintack@cs.columbia.edu> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
* arm/arm64: KVM: Perform local TLB invalidation when multiplexing vcpus on a ↵Marc Zyngier2016-11-041-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | single CPU Architecturally, TLBs are private to the (physical) CPU they're associated with. But when multiple vcpus from the same VM are being multiplexed on the same CPU, the TLBs are not private to the vcpus (and are actually shared across the VMID). Let's consider the following scenario: - vcpu-0 maps PA to VA - vcpu-1 maps PA' to VA If run on the same physical CPU, vcpu-1 can hit TLB entries generated by vcpu-0 accesses, and access the wrong physical page. The solution to this is to keep a per-VM map of which vcpu ran last on each given physical CPU, and invalidate local TLBs when switching to a different vcpu from the same VM. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
* Merge tag 'kvm-arm-for-v4.9' of ↵Radim Krčmář2016-09-291-0/+5
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into next KVM/ARM Changes for v4.9 - Various cleanups and removal of redundant code - Two important fixes for not using an in-kernel irqchip - A bit of optimizations - Handle SError exceptions and present them to guests if appropriate - Proxying of GICV access at EL2 if guest mappings are unsafe - GICv3 on AArch32 on ARMv8 - Preparations for GICv3 save/restore, including ABI docs
| * ARM: KVM: Support vgic-v3Vladimir Murzin2016-09-221-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch allows to build and use vgic-v3 in 32-bit mode. Unfortunately, it can not be split in several steps without extra stubs to keep patches independent and bisectable. For instance, virt/kvm/arm/vgic/vgic-v3.c uses function from vgic-v3-sr.c, handling access to GICv3 cpu interface from the guest requires vgic_v3.vgic_sre to be already defined. It is how support has been done: * handle SGI requests from the guest * report configured SRE on access to GICv3 cpu interface from the guest * required vgic-v3 macros are provided via uapi.h * static keys are used to select GIC backend * to make vgic-v3 build KVM_ARM_VGIC_V3 guard is removed along with the static inlines Acked-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
* | KVM: Add provisioning for ulong vm stats and u64 vcpu statsSuraj Jitindar Singh2016-09-081-6/+6
|/ | | | | | | | | | | | | | | | | | | | | | vms and vcpus have statistics associated with them which can be viewed within the debugfs. Currently it is assumed within the vcpu_stat_get() and vm_stat_get() functions that all of these statistics are represented as u32s, however the next patch adds some u64 vcpu statistics. Change all vcpu statistics to u64 and modify vcpu_stat_get() accordingly. Since vcpu statistics are per vcpu, they will only be updated by a single vcpu at a time so this shouldn't present a problem on 32-bit machines which can't atomically increment 64-bit numbers. However vm statistics could potentially be updated by multiple vcpus from that vm at a time. To avoid the overhead of atomics make all vm statistics ulong such that they are 64-bit on 64-bit systems where they can be atomically incremented and are 32-bit on 32-bit systems which may not be able to atomically increment 64-bit numbers. Modify vm_stat_get() to expect ulongs. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Reviewed-by: David Matlack <dmatlack@google.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
* KVM: arm/arm64: Extend arch CAP checks to allow per-VM capabilitiesAndre Przywara2016-07-181-1/+1
| | | | | | | | | | | | | | | KVM capabilities can be a per-VM property, though ARM/ARM64 currently does not pass on the VM pointer to the architecture specific capability handlers. Add a "struct kvm*" parameter to those function to later allow proper per-VM capability reporting. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Eric Auger <eric.auger@linaro.org> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
* arm: KVM: Allow hyp teardownMarc Zyngier2016-07-031-5/+3
| | | | | | | | | | | So far, KVM was getting in the way of kexec on 32bit (and the arm64 kexec hackers couldn't be bothered to fix it on 32bit...). With simpler page tables, tearing KVM down becomes very easy, so let's just do it. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
* arm: KVM: Simplify HYP initMarc Zyngier2016-07-031-10/+5
| | | | | | | | | Just like for arm64, we can now make the HYP setup a lot simpler, and we can now initialise it in one go (instead of the two phases we currently have). Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
* arm/arm64: KVM: Drop boot_pgdMarc Zyngier2016-07-031-5/+3
| | | | | | | | | Since we now only have one set of page tables, the concept of boot_pgd is useless and can be removed. We still keep it as an element of the "extended idmap" thing. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
* Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2016-05-271-0/+6
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull second batch of KVM updates from Radim Krčmář: "General: - move kvm_stat tool from QEMU repo into tools/kvm/kvm_stat (kvm_stat had nothing to do with QEMU in the first place -- the tool only interprets debugfs) - expose per-vm statistics in debugfs and support them in kvm_stat (KVM always collected per-vm statistics, but they were summarised into global statistics) x86: - fix dynamic APICv (VMX was improperly configured and a guest could access host's APIC MSRs, CVE-2016-4440) - minor fixes ARM changes from Christoffer Dall: - new vgic reimplementation of our horribly broken legacy vgic implementation. The two implementations will live side-by-side (with the new being the configured default) for one kernel release and then we'll remove the legacy one. - fix for a non-critical issue with virtual abort injection to guests" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (70 commits) tools: kvm_stat: Add comments tools: kvm_stat: Introduce pid monitoring KVM: Create debugfs dir and stat files for each VM MAINTAINERS: Add kvm tools tools: kvm_stat: Powerpc related fixes tools: Add kvm_stat man page tools: Add kvm_stat vm monitor script kvm:vmx: more complete state update on APICv on/off KVM: SVM: Add more SVM_EXIT_REASONS KVM: Unify traced vector format svm: bitwise vs logical op typo KVM: arm/arm64: vgic-new: Synchronize changes to active state KVM: arm/arm64: vgic-new: enable build KVM: arm/arm64: vgic-new: implement mapped IRQ handling KVM: arm/arm64: vgic-new: Wire up irqfd injection KVM: arm/arm64: vgic-new: Add vgic_v2/v3_enable KVM: arm/arm64: vgic-new: vgic_init: implement map_resources KVM: arm/arm64: vgic-new: vgic_init: implement vgic_init KVM: arm/arm64: vgic-new: vgic_init: implement vgic_create KVM: arm/arm64: vgic-new: vgic_init: implement kvm_vgic_hyp_init ...
| * Merge tag 'kvm-arm-for-4-7-take2' of ↵Paolo Bonzini2016-05-241-0/+6
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-next KVM/ARM Changes for v4.7 take 2 "The GIC is dead; Long live the GIC" This set of changes include the new vgic, which is a reimplementation of our horribly broken legacy vgic implementation. The two implementations will live side-by-side (with the new being the configured default) for one kernel release and then we'll remove it. Also fixes a non-critical issue with virtual abort injection to guests.
| | * KVM: arm/arm64: vgic-new: Synchronize changes to active stateChristoffer Dall2016-05-201-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When modifying the active state of an interrupt via the MMIO interface, we should ensure that the write has the intended effect. If a guest sets an interrupt to active, but that interrupt is already flushed into a list register on a running VCPU, then that VCPU will write the active state back into the struct vgic_irq upon returning from the guest and syncing its state. This is a non-benign race, because the guest can observe that an interrupt is not active, and it can have a reasonable expectations that other VCPUs will not ack any IRQs, and then set the state to active, and expect it to stay that way. Currently we are not honoring this case. Thefore, change both the SACTIVE and CACTIVE mmio handlers to stop the world, change the irq state, potentially queue the irq if we're setting it to active, and then continue. We take this chance to slightly optimize these functions by not stopping the world when touching private interrupts where there is inherently no possible race. Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
| | * KVM: arm/arm64: Provide functionality to pause and resume a guestChristoffer Dall2016-05-201-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | For some rare corner cases in our VGIC emulation later we have to stop the guest to make sure the VGIC state is consistent. Provide the necessary framework to pause and resume a guest. Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
* | | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2016-05-191-0/+2
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull KVM updates from Paolo Bonzini: "Small release overall. x86: - miscellaneous fixes - AVIC support (local APIC virtualization, AMD version) s390: - polling for interrupts after a VCPU goes to halted state is now enabled for s390 - use hardware provided information about facility bits that do not need any hypervisor activity, and other fixes for cpu models and facilities - improve perf output - floating interrupt controller improvements. MIPS: - miscellaneous fixes PPC: - bugfixes only ARM: - 16K page size support - generic firmware probing layer for timer and GIC Christoffer Dall (KVM-ARM maintainer) says: "There are a few changes in this pull request touching things outside KVM, but they should all carry the necessary acks and it made the merge process much easier to do it this way." though actually the irqchip maintainers' acks didn't make it into the patches. Marc Zyngier, who is both irqchip and KVM-ARM maintainer, later acked at http://mid.gmane.org/573351D1.4060303@arm.com ('more formally and for documentation purposes')" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (82 commits) KVM: MTRR: remove MSR 0x2f8 KVM: x86: make hwapic_isr_update and hwapic_irr_update look the same svm: Manage vcpu load/unload when enable AVIC svm: Do not intercept CR8 when enable AVIC svm: Do not expose x2APIC when enable AVIC KVM: x86: Introducing kvm_x86_ops.apicv_post_state_restore svm: Add VMEXIT handlers for AVIC svm: Add interrupt injection via AVIC KVM: x86: Detect and Initialize AVIC support svm: Introduce new AVIC VMCB registers KVM: split kvm_vcpu_wake_up from kvm_vcpu_kick KVM: x86: Introducing kvm_x86_ops VCPU blocking/unblocking hooks KVM: x86: Introducing kvm_x86_ops VM init/destroy hooks KVM: x86: Rename kvm_apic_get_reg to kvm_lapic_get_reg KVM: x86: Misc LAPIC changes to expose helper functions KVM: shrink halt polling even more for invalid wakeups KVM: s390: set halt polling to 80 microseconds KVM: halt_polling: provide a way to qualify wakeups during poll KVM: PPC: Book3S HV: Re-enable XICS fast path for irqfd-generated interrupts kvm: Conditionally register IRQ bypass consumer ...
| * | KVM: halt_polling: provide a way to qualify wakeups during pollChristian Borntraeger2016-05-131-0/+2
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some wakeups should not be considered a sucessful poll. For example on s390 I/O interrupts are usually floating, which means that _ALL_ CPUs would be considered runnable - letting all vCPUs poll all the time for transactional like workload, even if one vCPU would be enough. This can result in huge CPU usage for large guests. This patch lets architectures provide a way to qualify wakeups if they should be considered a good/bad wakeups in regard to polls. For s390 the implementation will fence of halt polling for anything but known good, single vCPU events. The s390 implementation for floating interrupts does a wakeup for one vCPU, but the interrupt will be delivered by whatever CPU checks first for a pending interrupt. We prefer the woken up CPU by marking the poll of this CPU as "good" poll. This code will also mark several other wakeup reasons like IPI or expired timers as "good". This will of course also mark some events as not sucessful. As KVM on z runs always as a 2nd level hypervisor, we prefer to not poll, unless we are really sure, though. This patch successfully limits the CPU usage for cases like uperf 1byte transactional ping pong workload or wakeup heavy workload like OLTP while still providing a proper speedup. This also introduced a new vcpu stat "halt_poll_no_tuning" that marks wakeups that are considered not good for polling. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Radim Krčmář <rkrcmar@redhat.com> (for an earlier version) Cc: David Matlack <dmatlack@google.com> Cc: Wanpeng Li <kernellwp@gmail.com> [Rename config symbol. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* / arm64: kvm: allows kvm cpu hotplugAKASHI Takahiro2016-04-281-1/+9
|/ | | | | | | | | | | | | | | | | | | | | | | | The current kvm implementation on arm64 does cpu-specific initialization at system boot, and has no way to gracefully shutdown a core in terms of kvm. This prevents kexec from rebooting the system at EL2. This patch adds a cpu tear-down function and also puts an existing cpu-init code into a separate function, kvm_arch_hardware_disable() and kvm_arch_hardware_enable() respectively. We don't need the arm64 specific cpu hotplug hook any more. Since this patch modifies common code between arm and arm64, one stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid compilation errors. Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> [Rebase, added separate VHE init/exit path, changed resets use of kvm_call_hyp() to the __version, en/disabled hardware in init_subsystems(), added icache maintenance to __kvm_hyp_reset() and removed lr restore, removed guest-enter after teardown handling] Signed-off-by: James Morse <james.morse@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
* arm64: KVM: Add a new vcpu device control group for PMUv3Shannon Zhao2016-02-291-0/+15
| | | | | | | | | | | | | | | To configure the virtual PMUv3 overflow interrupt number, we use the vcpu kvm_device ioctl, encapsulating the KVM_ARM_VCPU_PMU_V3_IRQ attribute within the KVM_ARM_VCPU_PMU_V3_CTRL group. After configuring the PMUv3, call the vcpu ioctl with attribute KVM_ARM_VCPU_PMU_V3_INIT to initialize the PMUv3. Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org> Acked-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
* ARM: KVM: Remove unused hyp_pc fieldMarc Zyngier2016-02-291-1/+0
| | | | | | | | This field was never populated, and the panic code already does something similar. Delete the related code. Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
* ARM: KVM: Turn CP15 defines to an enumMarc Zyngier2016-02-291-0/+39
| | | | | | | | | Just like on arm64, having the CP15 registers expressed as a set of #defines has been very conflict-prone. Let's turn it into an enum, which should make it more manageable. Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
* ARM: KVM: Switch to C-based stage2 initMarc Zyngier2016-02-291-0/+1
| | | | | | | | As we now have hooks to setup VTCR from C code, let's drop the original VTCR setup and reimplement it as part of the HYP code. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
* ARM: KVM: Change kvm_call_hyp return type to unsigned longMarc Zyngier2016-02-291-1/+1
| | | | | | | | | | | | | | | | Having u64 as the kvm_call_hyp return type is problematic, as it forces all kind of tricks for the return values from HYP to be promoted to 64bit (LE has the LSB in r0, and BE has them in r1). Since the only user of the return value is perfectly happy with a 32bit value, let's make kvm_call_hyp return an unsigned long, which is 32bit on ARM. This solves yet another headache. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
* ARM: KVM: Move GP registers into the CPU context structureMarc Zyngier2016-02-291-2/+1
| | | | | | | | Continuing our rework of the CPU context, we now move the GP registers into the CPU context structure. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
* ARM: KVM: Move CP15 array into the CPU context structureMarc Zyngier2016-02-291-3/+3
| | | | | | | | | | | Continuing our rework of the CPU context, we now move the CP15 array into the CPU context structure. As this causes quite a bit of churn, we introduce the vcpu_cp15() macro that abstract the location of the actual array. This will probably help next time we have to revisit that code. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
* ARM: KVM: Move VFP registers to a CPU context structureMarc Zyngier2016-02-291-4/+7
| | | | | | | | | In order to turn the WS code into something that looks a bit more like the arm64 version, move the VFP registers into a CPU context container for both the host and the guest. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
* arm/arm64: KVM: Add hook for C-based stage2 initMarc Zyngier2016-02-291-0/+4
| | | | | | | | As we're about to move the stage2 init to C code, introduce some C hooks that will later be populated with arch-specific implementations. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
* KVM: arm/arm64: Count guest exit due to various reasonsAmit Tomar2015-12-141-0/+6
| | | | | | | | | | It would add guest exit statistics to debugfs, this can be helpful while measuring KVM performance. [ Renamed some of the field names - Christoffer ] Signed-off-by: Amit Singh Tomar <amittomer25@gmail.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
* KVM: arm/arm64: implement kvm_arm_[halt,resume]_guestEric Auger2015-10-221-0/+3
| | | | | | | | | | | | | | We introduce kvm_arm_halt_guest and resume functions. They will be used for IRQ forward state change. Halt is synchronous and prevents the guest from being re-entered. We use the same mechanism put in place for PSCI former pause, now renamed power_off. A new flag is introduced in arch vcpu state, pause, only meant to be used by those functions. Signed-off-by: Eric Auger <eric.auger@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
* KVM: arm/arm64: rename pause into power_offEric Auger2015-10-221-2/+2
| | | | | | | | | | | The kvm_vcpu_arch pause field is renamed into power_off to prepare for the introduction of a new pause field. Also vcpu_pause is renamed into vcpu_sleep since we will sleep until both power_off and pause are false. Signed-off-by: Eric Auger <eric.auger@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
* arm/arm64: KVM: arch_timer: Only schedule soft timer on vcpu_blockChristoffer Dall2015-10-221-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We currently schedule a soft timer every time we exit the guest if the timer did not expire while running the guest. This is really not necessary, because the only work we do in the timer work function is to kick the vcpu. Kicking the vcpu does two things: (1) If the vpcu thread is on a waitqueue, make it runnable and remove it from the waitqueue. (2) If the vcpu is running on a different physical CPU from the one doing the kick, it sends a reschedule IPI. The second case cannot happen, because the soft timer is only ever scheduled when the vcpu is not running. The first case is only relevant when the vcpu thread is on a waitqueue, which is only the case when the vcpu thread has called kvm_vcpu_block(). Therefore, we only need to make sure a timer is scheduled for kvm_vcpu_block(), which we do by encapsulating all calls to kvm_vcpu_block() with kvm_timer_{un}schedule calls. Additionally, we only schedule a soft timer if the timer is enabled and unmasked, since it is useless otherwise. Note that theoretically userspace can use the SET_ONE_REG interface to change registers that should cause the timer to fire, even if the vcpu is blocked without a scheduled timer, but this case was not supported before this patch and we leave it for future work for now. Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
* KVM: Add kvm_arch_vcpu_{un}blocking callbacksChristoffer Dall2015-10-221-0/+3
| | | | | | | | | | | | | Some times it is useful for architecture implementations of KVM to know when the VCPU thread is about to block or when it comes back from blocking (arm/arm64 needs to know this to properly implement timers, for example). Therefore provide a generic architecture callback function in line with what we do elsewhere for KVM generic-arch interactions. Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
* KVM: disable halt_poll_ns as default for s390xDavid Hildenbrand2015-09-251-0/+1
| | | | | | | | | | | | We observed some performance degradation on s390x with dynamic halt polling. Until we can provide a proper fix, let's enable halt_poll_ns as default only for supported architectures. Architectures are now free to set their own halt_poll_ns default value. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* Merge tag 'kvm-arm-for-4.3-rc2-2' of ↵Paolo Bonzini2015-09-171-6/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master Second set of KVM/ARM changes for 4.3-rc2 - Workaround for a Cortex-A57 erratum - Bug fix for the debugging infrastructure - Fix for 32bit guests with more than 4GB of address space on a 32bit host - A number of fixes for the (unusual) case when we don't use the in-kernel GIC emulation - Removal of ThumbEE handling on arm64, since these have been dropped from the architecture before anyone actually ever built a CPU - Remove the KVM_ARM_MAX_VCPUS limitation which has become fairly pointless
| * arm/arm64: KVM: Remove 'config KVM_ARM_MAX_VCPUS'Ming Lei2015-09-171-6/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch removes config option of KVM_ARM_MAX_VCPUS, and like other ARCHs, just choose the maximum allowed value from hardware, and follows the reasons: 1) from distribution view, the option has to be defined as the max allowed value because it need to meet all kinds of virtulization applications and need to support most of SoCs; 2) using a bigger value doesn't introduce extra memory consumption, and the help text in Kconfig isn't accurate because kvm_vpu structure isn't allocated until request of creating VCPU is sent from QEMU; 3) the main effect is that the field of vcpus[] in 'struct kvm' becomes a bit bigger(sizeof(void *) per vcpu) and need more cache lines to hold the structure, but 'struct kvm' is one generic struct, and it has worked well on other ARCHs already in this way. Also, the world switch frequecy is often low, for example, it is ~2000 when running kernel building load in VM from APM xgene KVM host, so the effect is very small, and the difference can't be observed in my test at all. Cc: Dann Frazier <dann.frazier@canonical.com> Signed-off-by: Ming Lei <ming.lei@canonical.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
OpenPOWER on IntegriCloud