summaryrefslogtreecommitdiffstats
path: root/arch/x86/kvm
Commit message (Collapse)AuthorAgeFilesLines
* KVM: SVM: Lazy fpu with nptAvi Kivity2010-03-011-8/+0
| | | | | | | | Now that we can allow the guest to play with cr0 when the fpu is loaded, we can enable lazy fpu when npt is in use. Acked-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: SVM: Selective cr0 interceptAvi Kivity2010-03-011-6/+26
| | | | | | | | | | | | | | If two conditions apply: - no bits outside TS and EM differ between the host and guest cr0 - the fpu is active then we can activate the selective cr0 write intercept and drop the unconditional cr0 read and write intercept, and allow the guest to run with the host fpu state. This reduces cr0 exits due to guest fpu management while the guest fpu is loaded. Acked-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: SVM: Restore unconditional cr0 intercept under nptAvi Kivity2010-03-011-22/+7
| | | | | | | | | | Currently we don't intercept cr0 at all when npt is enabled. This improves performance but requires us to activate the fpu at all times. Remove this behaviour in preparation for adding selective cr0 intercepts. Acked-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: SVM: Initialize fpu_active in init_vmcb()Avi Kivity2010-03-011-1/+2
| | | | | | | | | init_vmcb() sets up the intercepts as if the fpu is active, so initialize it there. This avoids an INIT from setting up intercepts inconsistent with fpu_active. Acked-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Set cr0.et when the guest writes cr0Avi Kivity2010-03-011-0/+2
| | | | | | Follow the hardware. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Give the guest ownership of cr0.ts when the fpu is activeAvi Kivity2010-03-011-2/+9
| | | | | | | | | | | | If the guest fpu is loaded, there is nothing interesing about cr0.ts; let the guest play with it as it will. This makes context switches between fpu intensive guest processes faster, as we won't trap the clts and cr0 write instructions. [marcelo: fix cr0 read shadow update on fpu deactivation; kills F8 install] Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: Lazify fpu activation and deactivationAvi Kivity2010-03-013-31/+36
| | | | | | | | | | | Defer fpu deactivation as much as possible - if the guest fpu is loaded, keep it loaded until the next heavyweight exit (where we are forced to unload it). This reduces unnecessary exits. We also defer fpu activation on clts; while clts signals the intent to use the fpu, we can't be sure the guest will actually use it. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Allow the guest to own some cr0 bitsAvi Kivity2010-03-013-0/+16
| | | | | | We will use this later to give the guest ownership of cr0.ts. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Replace read accesses of vcpu->arch.cr0 by an accessorAvi Kivity2010-03-017-27/+38
| | | | | | | Since we'd like to allow the guest to own a few bits of cr0 at times, we need to know when we access those bits. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: trace clts and lmsw instructions as cr accessesAvi Kivity2010-03-011-1/+4
| | | | | | clts writes cr0.ts; lmsw writes cr0[0:15] - record that in ftrace. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Enable EPT 1GB page supportSheng Yang2010-03-012-4/+15
| | | | | Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: x86: Rename gb_page_enable() to get_lpage_level() in kvm_x86_opsSheng Yang2010-03-013-7/+9
| | | | | | | | | | Then the callback can provide the maximum supported large page level, which is more flexible. Also move the gb page support into x86_64 specific. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: x86: Moving PT_*_LEVEL to mmu.hSheng Yang2010-03-012-4/+4
| | | | | | | We can use them in x86.c and vmx.c now... Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: Fill out ftrace exit reason stringsAvi Kivity2010-03-011-19/+39
| | | | | | | Some exit reasons missed their strings; fill out the table. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: convert slots_lock to a mutexMarcelo Tosatti2010-03-014-15/+15
| | | | Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: switch vcpu context to use SRCUMarcelo Tosatti2010-03-013-26/+30
| | | | Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: convert io_bus to SRCUMarcelo Tosatti2010-03-013-10/+13
| | | | Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: x86: switch kvm_set_memory_alias to SRCU updateMarcelo Tosatti2010-03-011-9/+51
| | | | | | Using a similar two-step procedure as for memslots. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: use SRCU for dirty logMarcelo Tosatti2010-03-011-8/+41
| | | | Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: introduce kvm->srcu and convert kvm_set_memory_region to SRCU updateMarcelo Tosatti2010-03-012-15/+19
| | | | | | | | | | Use two steps for memslot deletion: mark the slot invalid (which stops instantiation of new shadow pages for that slot, but allows destruction), then instantiate the new empty slot. Also simplifies kvm_handle_hva locking. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: split kvm_arch_set_memory_region into prepare and commitMarcelo Tosatti2010-03-011-22/+29
| | | | | | Required for SRCU convertion later. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: modify alias layout in x86s struct kvm_archMarcelo Tosatti2010-03-011-5/+16
| | | | | | Have a pointer to an allocated region inside x86's kvm_arch. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: modify memslots layout in struct kvmMarcelo Tosatti2010-03-013-9/+10
| | | | | | | | | Have a pointer to an allocated region inside struct kvm. [alex: fix ppc book 3s] Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: Add KVM_MMIO kconfig itemAvi Kivity2010-03-011-0/+1
| | | | | | s390 doesn't have mmio, this will simplify ifdefing it out. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: SVM: Adjust tsc_offset only if tsc_unstableJoerg Roedel2010-03-011-8/+10
| | | | | | | | | | | | | The tsc_offset adjustment in svm_vcpu_load is executed unconditionally even if Linux considers the host tsc as stable. This causes a Linux guest detecting an unstable tsc in any case. This patch removes the tsc_offset adjustment if the host tsc is stable. The guest will now get the benefit of a stable tsc too. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Add instruction rdtscp support for guestSheng Yang2010-03-013-4/+66
| | | | | | | Before enabling, execution of "rdtscp" in guest would result in #UD. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Add cpuid_update() callback to kvm_x86_opsSheng Yang2010-03-013-0/+15
| | | | | | | | | | | Sometime, we need to adjust some state in order to reflect guest CPUID setting, e.g. if we don't expose rdtscp to guest, we won't want to enable it on hardware. cpuid_update() is introduced for this purpose. Also export kvm_find_cpuid_entry() for later use. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Extended shared_msr_global to per CPUSheng Yang2010-03-011-22/+33
| | | | | | | | | | | | | | | shared_msr_global saved host value of relevant MSRs, but it have an assumption that all MSRs it tracked shared the value across the different CPUs. It's not true with some MSRs, e.g. MSR_TSC_AUX. Extend it to per CPU to provide the support of MSR_TSC_AUX, and more alike MSRs. Notice now the shared_msr_global still have one assumption: it can only deal with the MSRs that won't change in host after KVM module loaded. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Remove redundant variableSheng Yang2010-03-011-2/+0
| | | | | | | It's no longer necessary. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Fold ept_update_paging_mode_cr4() into its callerAvi Kivity2010-03-011-12/+8
| | | | | | | | | | ept_update_paging_mode_cr4() accesses vcpu->arch.cr4 directly, which usually needs to be accessed via kvm_read_cr4(). In this case, we can't, since cr4 is in the process of being updated. Instead of adding inane comments, fold the function into its caller (vmx_set_cr4), so it can use the not-yet-committed cr4 directly. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: When using ept, allow the guest to own cr4.pgeAvi Kivity2010-03-011-0/+2
| | | | | | | We make no use of cr4.pge if ept is enabled, but the guest does (to flush global mappings, as with vmap()), so give the guest ownership of this bit. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Make guest cr4 mask more conservativeAvi Kivity2010-03-011-4/+6
| | | | | | | | Instead of specifying the bits which we want to trap on, specify the bits which we allow the guest to change transparently. This is safer wrt future changes to cr4. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Add accessor for reading cr4 (or some bits of cr4)Avi Kivity2010-03-014-17/+29
| | | | | | | | | | | Some bits of cr4 can be owned by the guest on vmx, so when we read them, we copy them to the vcpu structure. In preparation for making the set of guest-owned bits dynamic, use helpers to access these bits so we don't need to know where the bit resides. No changes to svm since all bits are host-owned there. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Move some cr[04] related constants to vmx.cAvi Kivity2010-03-011-0/+13
| | | | | | They have no place in common code. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Trap and invalid MWAIT/MONITOR instructionSheng Yang2010-03-011-0/+10
| | | | | | | | | | We don't support these instructions, but guest can execute them even if the feature('monitor') haven't been exposed in CPUID. So we would trap and inject a #UD if guest try this way. Cc: stable@kernel.org Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: MMU: Report spte not found in rmap before BUG()Avi Kivity2010-03-011-0/+1
| | | | | | | In the past we've had errors of single-bit in the other two cases; the printk() may confirm it for the third case (many->many). Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86: raise TSS exception for NULL CS and SS segmentsMarcelo Tosatti2010-03-011-0/+11
| | | | | | | Windows 2003 uses task switch to triple fault and reboot (the other exception being reserved pdptrs bits). Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: x86: make double/triple fault promotion generic to all exceptionsEddie Dong2010-03-011-28/+61
| | | | | | | | | Move Double-Fault generation logic out of page fault exception generating function to cover more generic case. Signed-off-by: Eddie Dong <eddie.dong@intel.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: PIT: control word is write-onlyMarcelo Tosatti2010-02-091-0/+3
| | | | | | | PIT control word (address 0x43) is write-only, reads are undefined. Cc: stable@kernel.org Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* kvmclock: count total_sleep_time when updating guest clockJason Wang2010-02-091-4/+3
| | | | | | | | | | | Current kvm wallclock does not consider the total_sleep_time which could cause wrong wallclock in guest after host suspend/resume. This patch solve this issue by counting total_sleep_time to get the correct host boot time. Cc: stable@kernel.org Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Glauber Costa <glommer@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: x86: Fix leak of free lapic date in kvm_arch_vcpu_init()Wei Yongjun2010-01-251-2/+3
| | | | | | | | | | In function kvm_arch_vcpu_init(), if the memory malloc for vcpu->arch.mce_banks is fail, it does not free the memory of lapic date. This patch fixed it. Cc: stable@kernel.org Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: x86: Fix probable memory leak of vcpu->arch.mce_banksWei Yongjun2010-01-251-0/+1
| | | | | | | | | | vcpu->arch.mce_banks is malloc in kvm_arch_vcpu_init(), but never free in any place, this may cause memory leak. So this patch fixed to free it in kvm_arch_vcpu_uninit(). Cc: stable@kernel.org Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: MMU: bail out pagewalk on kvm_read_guest errorMarcelo Tosatti2010-01-251-1/+3
| | | | | | | | Exit the guest pagetable walk loop if reading gpte failed. Otherwise its possible to enter an endless loop processing the previous present pte. Cc: stable@kernel.org Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: x86: Fix host_mapping_level()Sheng Yang2010-01-251-4/+2
| | | | | | | | | | When found a error hva, should not return PAGE_SIZE but the level... Also clean up the coding style of the following loop. Cc: stable@kernel.org Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: Fix race between APIC TMR and IRRAvi Kivity2010-01-251-5/+6
| | | | | | | | | | | | | | | | When we queue an interrupt to the local apic, we set the IRR before the TMR. The vcpu can pick up the IRR and inject the interrupt before setting the TMR, and perhaps even EOI it, causing incorrect behaviour. The race is really insignificant since it can only occur on the first interrupt (usually following interrupts will not change TMR), but it's better closed than open. Fixed by reordering setting the TMR vs IRR. Cc: stable@kernel.org Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: x86: Extend KVM_SET_VCPU_EVENTS with selective updatesJan Kiszka2009-12-271-4/+8
| | | | | | | | | | | User space may not want to overwrite asynchronously changing VCPU event states on write-back. So allow to skip nmi.pending and sipi_vector by setting corresponding bits in the flags field of kvm_vcpu_events. [avi: advertise the bits in KVM_GET_VCPU_EVENTS] Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: LAPIC: make sure IRR bitmap is scanned after vm loadMarcelo Tosatti2009-12-271-0/+1
| | | | | | | | | | The vcpus are initialized with irr_pending set to false, but loading the LAPIC registers with pending IRR fails to reset the irr_pending variable. Cc: stable@kernel.org Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: MMU: remove prefault from invlpg handlerMarcelo Tosatti2009-12-271-18/+0
| | | | | | | | | | | | | | | | | | The invlpg prefault optimization breaks Windows 2008 R2 occasionally. The visible effect is that the invlpg handler instantiates a pte which is, microseconds later, written with a different gfn by another vcpu. The OS could have other mechanisms to prevent a present translation from being used, which the hypervisor is unaware of. While the documentation states that the cpu is at liberty to prefetch tlb entries, it looks like this is not heeded, so remove tlb prefetch from invlpg. Cc: stable@kernel.org Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* Merge branch 'for-linus' of ↵Linus Torvalds2009-12-141-33/+31
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (34 commits) m68k: rename global variable vmalloc_end to m68k_vmalloc_end percpu: add missing per_cpu_ptr_to_phys() definition for UP percpu: Fix kdump failure if booted with percpu_alloc=page percpu: make misc percpu symbols unique percpu: make percpu symbols in ia64 unique percpu: make percpu symbols in powerpc unique percpu: make percpu symbols in x86 unique percpu: make percpu symbols in xen unique percpu: make percpu symbols in cpufreq unique percpu: make percpu symbols in oprofile unique percpu: make percpu symbols in tracer unique percpu: make percpu symbols under kernel/ and mm/ unique percpu: remove some sparse warnings percpu: make alloc_percpu() handle array types vmalloc: fix use of non-existent percpu variable in put_cpu_var() this_cpu: Use this_cpu_xx in trace_functions_graph.c this_cpu: Use this_cpu_xx for ftrace this_cpu: Use this_cpu_xx in nmi handling this_cpu: Use this_cpu operations in RCU this_cpu: Use this_cpu ops for VM statistics ... Fix up trivial (famous last words) global per-cpu naming conflicts in arch/x86/kvm/svm.c mm/slab.c
| * percpu: make percpu symbols in x86 uniqueTejun Heo2009-10-291-32/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch updates percpu related symbols in x86 such that percpu symbols are unique and don't clash with local symbols. This serves two purposes of decreasing the possibility of global percpu symbol collision and allowing dropping per_cpu__ prefix from percpu symbols. * arch/x86/kernel/cpu/common.c: rename local variable to avoid collision * arch/x86/kvm/svm.c: s/svm_data/sd/ for local variables to avoid collision * arch/x86/kernel/cpu/cpu_debug.c: s/cpu_arr/cpud_arr/ s/priv_arr/cpud_priv_arr/ s/cpu_priv_count/cpud_priv_count/ * arch/x86/kernel/cpu/intel_cacheinfo.c: s/cpuid4_info/ici_cpuid4_info/ s/cache_kobject/ici_cache_kobject/ s/index_kobject/ici_index_kobject/ * arch/x86/kernel/ds.c: s/cpu_context/cpu_ds_context/ Partly based on Rusty Russell's "alloc_percpu: rename percpu vars which cause name clashes" patch. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: (kvm) Avi Kivity <avi@redhat.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: x86@kernel.org
OpenPOWER on IntegriCloud