From 3a4d5c94e959359ece6d6b55045c3f046677f55c Mon Sep 17 00:00:00 2001 From: "Michael S. Tsirkin" Date: Thu, 14 Jan 2010 06:17:27 +0000 Subject: vhost_net: a kernel-level virtio server What it is: vhost net is a character device that can be used to reduce the number of system calls involved in virtio networking. Existing virtio net code is used in the guest without modification. There's similarity with vringfd, with some differences and reduced scope - uses eventfd for signalling - structures can be moved around in memory at any time (good for migration, bug work-arounds in userspace) - write logging is supported (good for migration) - support memory table and not just an offset (needed for kvm) common virtio related code has been put in a separate file vhost.c and can be made into a separate module if/when more backends appear. I used Rusty's lguest.c as the source for developing this part : this supplied me with witty comments I wouldn't be able to write myself. What it is not: vhost net is not a bus, and not a generic new system call. No assumptions are made on how guest performs hypercalls. Userspace hypervisors are supported as well as kvm. How it works: Basically, we connect virtio frontend (configured by userspace) to a backend. The backend could be a network device, or a tap device. Backend is also configured by userspace, including vlan/mac etc. Status: This works for me, and I haven't see any crashes. Compared to userspace, people reported improved latency (as I save up to 4 system calls per packet), as well as better bandwidth and CPU utilization. Features that I plan to look at in the future: - mergeable buffers - zero copy - scalability tuning: figure out the best threading model to use Note on RCU usage (this is also documented in vhost.h, near private_pointer which is the value protected by this variant of RCU): what is happening is that the rcu_dereference() is being used in a workqueue item. The role of rcu_read_lock() is taken on by the start of execution of the workqueue item, of rcu_read_unlock() by the end of execution of the workqueue item, and of synchronize_rcu() by flush_workqueue()/flush_work(). In the future we might need to apply some gcc attribute or sparse annotation to the function passed to INIT_WORK(). Paul's ack below is for this RCU usage. (Includes fixes by Alan Cox , David L Stevens , Chris Wright ) Acked-by: Rusty Russell Acked-by: Arnd Bergmann Acked-by: "Paul E. McKenney" Signed-off-by: Michael S. Tsirkin Signed-off-by: David S. Miller --- arch/ia64/kvm/Kconfig | 1 + 1 file changed, 1 insertion(+) (limited to 'arch/ia64/kvm') diff --git a/arch/ia64/kvm/Kconfig b/arch/ia64/kvm/Kconfig index ef3e7be29caf..01c75797119c 100644 --- a/arch/ia64/kvm/Kconfig +++ b/arch/ia64/kvm/Kconfig @@ -47,6 +47,7 @@ config KVM_INTEL Provides support for KVM on Itanium 2 processors equipped with the VT extensions. +source drivers/vhost/Kconfig source drivers/virtio/Kconfig endif # VIRTUALIZATION -- cgit v1.2.1 From 50eb2a3cd0f50d912b26d0b79b7f443344608390 Mon Sep 17 00:00:00 2001 From: Avi Kivity Date: Sun, 20 Dec 2009 15:00:10 +0200 Subject: KVM: Add KVM_MMIO kconfig item s390 doesn't have mmio, this will simplify ifdefing it out. Signed-off-by: Avi Kivity --- arch/ia64/kvm/Kconfig | 1 + 1 file changed, 1 insertion(+) (limited to 'arch/ia64/kvm') diff --git a/arch/ia64/kvm/Kconfig b/arch/ia64/kvm/Kconfig index ef3e7be29caf..bf82e47c98bb 100644 --- a/arch/ia64/kvm/Kconfig +++ b/arch/ia64/kvm/Kconfig @@ -26,6 +26,7 @@ config KVM select ANON_INODES select HAVE_KVM_IRQCHIP select KVM_APIC_ARCHITECTURE + select KVM_MMIO ---help--- Support hosting fully virtualized guest machines using hardware virtualization extensions. You will need a fairly recent -- cgit v1.2.1 From 46a26bf55714c1e2f17e34683292a389acb8e601 Mon Sep 17 00:00:00 2001 From: Marcelo Tosatti Date: Wed, 23 Dec 2009 14:35:16 -0200 Subject: KVM: modify memslots layout in struct kvm Have a pointer to an allocated region inside struct kvm. [alex: fix ppc book 3s] Signed-off-by: Alexander Graf Signed-off-by: Marcelo Tosatti --- arch/ia64/kvm/kvm-ia64.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) (limited to 'arch/ia64/kvm') diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c index 5fdeec5fddcf..1ca1dbf48117 100644 --- a/arch/ia64/kvm/kvm-ia64.c +++ b/arch/ia64/kvm/kvm-ia64.c @@ -1377,12 +1377,14 @@ static void free_kvm(struct kvm *kvm) static void kvm_release_vm_pages(struct kvm *kvm) { + struct kvm_memslots *slots; struct kvm_memory_slot *memslot; int i, j; unsigned long base_gfn; - for (i = 0; i < kvm->nmemslots; i++) { - memslot = &kvm->memslots[i]; + slots = kvm->memslots; + for (i = 0; i < slots->nmemslots; i++) { + memslot = &slots->memslots[i]; base_gfn = memslot->base_gfn; for (j = 0; j < memslot->npages; j++) { @@ -1802,7 +1804,7 @@ static int kvm_ia64_sync_dirty_log(struct kvm *kvm, if (log->slot >= KVM_MEMORY_SLOTS) goto out; - memslot = &kvm->memslots[log->slot]; + memslot = &kvm->memslots->memslots[log->slot]; r = -ENOENT; if (!memslot->dirty_bitmap) goto out; @@ -1840,7 +1842,7 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, /* If nothing is dirty, don't bother messing with page tables. */ if (is_dirty) { kvm_flush_remote_tlbs(kvm); - memslot = &kvm->memslots[log->slot]; + memslot = &kvm->memslots->memslots[log->slot]; n = ALIGN(memslot->npages, BITS_PER_LONG) / 8; memset(memslot->dirty_bitmap, 0, n); } -- cgit v1.2.1 From f7784b8ec9b6a041fa828cfbe9012fe51933f5ac Mon Sep 17 00:00:00 2001 From: Marcelo Tosatti Date: Wed, 23 Dec 2009 14:35:18 -0200 Subject: KVM: split kvm_arch_set_memory_region into prepare and commit Required for SRCU convertion later. Signed-off-by: Marcelo Tosatti --- arch/ia64/kvm/kvm-ia64.c | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-) (limited to 'arch/ia64/kvm') diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c index 1ca1dbf48117..0757c7027986 100644 --- a/arch/ia64/kvm/kvm-ia64.c +++ b/arch/ia64/kvm/kvm-ia64.c @@ -1578,15 +1578,15 @@ out: return r; } -int kvm_arch_set_memory_region(struct kvm *kvm, - struct kvm_userspace_memory_region *mem, +int kvm_arch_prepare_memory_region(struct kvm *kvm, + struct kvm_memory_slot *memslot, struct kvm_memory_slot old, + struct kvm_userspace_memory_region *mem, int user_alloc) { unsigned long i; unsigned long pfn; - int npages = mem->memory_size >> PAGE_SHIFT; - struct kvm_memory_slot *memslot = &kvm->memslots[mem->slot]; + int npages = memslot->npages; unsigned long base_gfn = memslot->base_gfn; if (base_gfn + npages > (KVM_MAX_MEM_SIZE >> PAGE_SHIFT)) @@ -1610,6 +1610,14 @@ int kvm_arch_set_memory_region(struct kvm *kvm, return 0; } +void kvm_arch_commit_memory_region(struct kvm *kvm, + struct kvm_userspace_memory_region *mem, + struct kvm_memory_slot old, + int user_alloc) +{ + return; +} + void kvm_arch_flush_shadow(struct kvm *kvm) { kvm_flush_remote_tlbs(kvm); -- cgit v1.2.1 From bc6678a33d9b952981a8e44a4f876c3ad64ca4d8 Mon Sep 17 00:00:00 2001 From: Marcelo Tosatti Date: Wed, 23 Dec 2009 14:35:21 -0200 Subject: KVM: introduce kvm->srcu and convert kvm_set_memory_region to SRCU update Use two steps for memslot deletion: mark the slot invalid (which stops instantiation of new shadow pages for that slot, but allows destruction), then instantiate the new empty slot. Also simplifies kvm_handle_hva locking. Signed-off-by: Marcelo Tosatti --- arch/ia64/kvm/kvm-ia64.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) (limited to 'arch/ia64/kvm') diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c index 0757c7027986..b2e4d16dd39e 100644 --- a/arch/ia64/kvm/kvm-ia64.c +++ b/arch/ia64/kvm/kvm-ia64.c @@ -1382,7 +1382,7 @@ static void kvm_release_vm_pages(struct kvm *kvm) int i, j; unsigned long base_gfn; - slots = kvm->memslots; + slots = rcu_dereference(kvm->memslots); for (i = 0; i < slots->nmemslots; i++) { memslot = &slots->memslots[i]; base_gfn = memslot->base_gfn; @@ -1837,6 +1837,7 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot; int is_dirty = 0; + down_write(&kvm->slots_lock); spin_lock(&kvm->arch.dirty_log_lock); r = kvm_ia64_sync_dirty_log(kvm, log); @@ -1856,6 +1857,7 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, } r = 0; out: + up_write(&kvm->slots_lock); spin_unlock(&kvm->arch.dirty_log_lock); return r; } -- cgit v1.2.1 From e93f8a0f821e290ac5149830110a5f704db7a1fc Mon Sep 17 00:00:00 2001 From: Marcelo Tosatti Date: Wed, 23 Dec 2009 14:35:24 -0200 Subject: KVM: convert io_bus to SRCU Signed-off-by: Marcelo Tosatti --- arch/ia64/kvm/kvm-ia64.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'arch/ia64/kvm') diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c index b2e4d16dd39e..d0ad538f0083 100644 --- a/arch/ia64/kvm/kvm-ia64.c +++ b/arch/ia64/kvm/kvm-ia64.c @@ -241,10 +241,10 @@ static int handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) return 0; mmio: if (p->dir) - r = kvm_io_bus_read(&vcpu->kvm->mmio_bus, p->addr, + r = kvm_io_bus_read(vcpu->kvm, KVM_MMIO_BUS, p->addr, p->size, &p->data); else - r = kvm_io_bus_write(&vcpu->kvm->mmio_bus, p->addr, + r = kvm_io_bus_write(vcpu->kvm, KVM_MMIO_BUS, p->addr, p->size, &p->data); if (r) printk(KERN_ERR"kvm: No iodevice found! addr:%lx\n", p->addr); -- cgit v1.2.1 From f656ce0185cabbbb0cf96877306879661297c7ad Mon Sep 17 00:00:00 2001 From: Marcelo Tosatti Date: Wed, 23 Dec 2009 14:35:25 -0200 Subject: KVM: switch vcpu context to use SRCU Signed-off-by: Marcelo Tosatti --- arch/ia64/kvm/kvm-ia64.c | 15 ++++++--------- 1 file changed, 6 insertions(+), 9 deletions(-) (limited to 'arch/ia64/kvm') diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c index d0ad538f0083..d5e384641275 100644 --- a/arch/ia64/kvm/kvm-ia64.c +++ b/arch/ia64/kvm/kvm-ia64.c @@ -636,12 +636,9 @@ static void kvm_vcpu_post_transition(struct kvm_vcpu *vcpu) static int __vcpu_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) { union context *host_ctx, *guest_ctx; - int r; + int r, idx; - /* - * down_read() may sleep and return with interrupts enabled - */ - down_read(&vcpu->kvm->slots_lock); + idx = srcu_read_lock(&vcpu->kvm->srcu); again: if (signal_pending(current)) { @@ -663,7 +660,7 @@ again: if (r < 0) goto vcpu_run_fail; - up_read(&vcpu->kvm->slots_lock); + srcu_read_unlock(&vcpu->kvm->srcu, idx); kvm_guest_enter(); /* @@ -687,7 +684,7 @@ again: kvm_guest_exit(); preempt_enable(); - down_read(&vcpu->kvm->slots_lock); + idx = srcu_read_lock(&vcpu->kvm->srcu); r = kvm_handle_exit(kvm_run, vcpu); @@ -697,10 +694,10 @@ again: } out: - up_read(&vcpu->kvm->slots_lock); + srcu_read_unlock(&vcpu->kvm->srcu, idx); if (r > 0) { kvm_resched(vcpu); - down_read(&vcpu->kvm->slots_lock); + idx = srcu_read_lock(&vcpu->kvm->srcu); goto again; } -- cgit v1.2.1 From 79fac95ecfa3969aab8119d37ccd7226165f933a Mon Sep 17 00:00:00 2001 From: Marcelo Tosatti Date: Wed, 23 Dec 2009 14:35:26 -0200 Subject: KVM: convert slots_lock to a mutex Signed-off-by: Marcelo Tosatti --- arch/ia64/kvm/kvm-ia64.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'arch/ia64/kvm') diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c index d5e384641275..e6ac549f8d55 100644 --- a/arch/ia64/kvm/kvm-ia64.c +++ b/arch/ia64/kvm/kvm-ia64.c @@ -1834,7 +1834,7 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot; int is_dirty = 0; - down_write(&kvm->slots_lock); + mutex_lock(&kvm->slots_lock); spin_lock(&kvm->arch.dirty_log_lock); r = kvm_ia64_sync_dirty_log(kvm, log); @@ -1854,7 +1854,7 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, } r = 0; out: - up_write(&kvm->slots_lock); + mutex_unlock(&kvm->slots_lock); spin_unlock(&kvm->arch.dirty_log_lock); return r; } -- cgit v1.2.1 From 0f0412c1a734700fb8ccccb1c09371642e029e2e Mon Sep 17 00:00:00 2001 From: Roel Kluin Date: Thu, 14 Jan 2010 18:05:58 +0100 Subject: KVM: ia64: remove redundant kvm_get_exit_data() NULL tests kvm_get_exit_data() cannot return a NULL pointer. Signed-off-by: Roel Kluin Signed-off-by: Avi Kivity --- arch/ia64/kvm/kvm_fw.c | 28 +++++++++++++--------------- 1 file changed, 13 insertions(+), 15 deletions(-) (limited to 'arch/ia64/kvm') diff --git a/arch/ia64/kvm/kvm_fw.c b/arch/ia64/kvm/kvm_fw.c index e4b82319881d..cb548ee9fcae 100644 --- a/arch/ia64/kvm/kvm_fw.c +++ b/arch/ia64/kvm/kvm_fw.c @@ -75,7 +75,7 @@ static void set_pal_result(struct kvm_vcpu *vcpu, struct exit_ctl_data *p; p = kvm_get_exit_data(vcpu); - if (p && p->exit_reason == EXIT_REASON_PAL_CALL) { + if (p->exit_reason == EXIT_REASON_PAL_CALL) { p->u.pal_data.ret = result; return ; } @@ -87,7 +87,7 @@ static void set_sal_result(struct kvm_vcpu *vcpu, struct exit_ctl_data *p; p = kvm_get_exit_data(vcpu); - if (p && p->exit_reason == EXIT_REASON_SAL_CALL) { + if (p->exit_reason == EXIT_REASON_SAL_CALL) { p->u.sal_data.ret = result; return ; } @@ -322,7 +322,7 @@ static u64 kvm_get_pal_call_index(struct kvm_vcpu *vcpu) struct exit_ctl_data *p; p = kvm_get_exit_data(vcpu); - if (p && (p->exit_reason == EXIT_REASON_PAL_CALL)) + if (p->exit_reason == EXIT_REASON_PAL_CALL) index = p->u.pal_data.gr28; return index; @@ -646,18 +646,16 @@ static void kvm_get_sal_call_data(struct kvm_vcpu *vcpu, u64 *in0, u64 *in1, p = kvm_get_exit_data(vcpu); - if (p) { - if (p->exit_reason == EXIT_REASON_SAL_CALL) { - *in0 = p->u.sal_data.in0; - *in1 = p->u.sal_data.in1; - *in2 = p->u.sal_data.in2; - *in3 = p->u.sal_data.in3; - *in4 = p->u.sal_data.in4; - *in5 = p->u.sal_data.in5; - *in6 = p->u.sal_data.in6; - *in7 = p->u.sal_data.in7; - return ; - } + if (p->exit_reason == EXIT_REASON_SAL_CALL) { + *in0 = p->u.sal_data.in0; + *in1 = p->u.sal_data.in1; + *in2 = p->u.sal_data.in2; + *in3 = p->u.sal_data.in3; + *in4 = p->u.sal_data.in4; + *in5 = p->u.sal_data.in5; + *in6 = p->u.sal_data.in6; + *in7 = p->u.sal_data.in7; + return ; } *in0 = 0; } -- cgit v1.2.1 From 647492047763c3ee8fe51ecf9a04f39040aa495b Mon Sep 17 00:00:00 2001 From: Marcelo Tosatti Date: Tue, 19 Jan 2010 12:45:23 -0200 Subject: KVM: fix cleanup_srcu_struct on vm destruction cleanup_srcu_struct on VM destruction remains broken: BUG: unable to handle kernel paging request at ffffffffffffffff IP: [] srcu_read_lock+0x16/0x21 RIP: 0010:[] [] srcu_read_lock+0x16/0x21 Call Trace: [] kvm_arch_vcpu_uninit+0x1b/0x48 [kvm] [] kvm_vcpu_uninit+0x9/0x15 [kvm] [] vmx_free_vcpu+0x7f/0x8f [kvm_intel] [] kvm_arch_destroy_vm+0x78/0x111 [kvm] [] kvm_put_kvm+0xd4/0xfe [kvm] Move it to kvm_arch_destroy_vm. Signed-off-by: Marcelo Tosatti Reported-by: Jan Kiszka --- arch/ia64/kvm/kvm-ia64.c | 1 + 1 file changed, 1 insertion(+) (limited to 'arch/ia64/kvm') diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c index e6ac549f8d55..06188988ed27 100644 --- a/arch/ia64/kvm/kvm-ia64.c +++ b/arch/ia64/kvm/kvm-ia64.c @@ -1404,6 +1404,7 @@ void kvm_arch_destroy_vm(struct kvm *kvm) kfree(kvm->arch.vioapic); kvm_release_vm_pages(kvm); kvm_free_physmem(kvm); + cleanup_srcu_struct(&kvm->srcu); free_kvm(kvm); } -- cgit v1.2.1 From 16fbb5eecc4ab8d4eb4d2a683d7fd74ccacc890b Mon Sep 17 00:00:00 2001 From: Joe Perches Date: Mon, 1 Feb 2010 23:22:07 -0800 Subject: KVM: ia64: Fix string literal continuation lines String constants that are continued on subsequent lines with \ are not good. Signed-off-by: Joe Perches Signed-off-by: Avi Kivity --- arch/ia64/kvm/mmio.c | 4 ++-- arch/ia64/kvm/vcpu.c | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-) (limited to 'arch/ia64/kvm') diff --git a/arch/ia64/kvm/mmio.c b/arch/ia64/kvm/mmio.c index 9bf55afd08d0..fb8f9f59a1ed 100644 --- a/arch/ia64/kvm/mmio.c +++ b/arch/ia64/kvm/mmio.c @@ -316,8 +316,8 @@ void emulate_io_inst(struct kvm_vcpu *vcpu, u64 padr, u64 ma) return; } else { inst_type = -1; - panic_vm(vcpu, "Unsupported MMIO access instruction! \ - Bunld[0]=0x%lx, Bundle[1]=0x%lx\n", + panic_vm(vcpu, "Unsupported MMIO access instruction! " + "Bunld[0]=0x%lx, Bundle[1]=0x%lx\n", bundle.i64[0], bundle.i64[1]); } diff --git a/arch/ia64/kvm/vcpu.c b/arch/ia64/kvm/vcpu.c index dce75b70cdd5..958815c9787d 100644 --- a/arch/ia64/kvm/vcpu.c +++ b/arch/ia64/kvm/vcpu.c @@ -1639,8 +1639,8 @@ void vcpu_set_psr(struct kvm_vcpu *vcpu, unsigned long val) * Otherwise panic */ if (val & (IA64_PSR_PK | IA64_PSR_IS | IA64_PSR_VM)) - panic_vm(vcpu, "Only support guests with vpsr.pk =0 \ - & vpsr.is=0\n"); + panic_vm(vcpu, "Only support guests with vpsr.pk =0 " + "& vpsr.is=0\n"); /* * For those IA64_PSR bits: id/da/dd/ss/ed/ia -- cgit v1.2.1 From 4b7bb9210047fe880bb71e6273c3a4526799dbd7 Mon Sep 17 00:00:00 2001 From: Wei Yongjun Date: Tue, 9 Feb 2010 10:41:56 +0800 Subject: KVM: ia64: destroy ioapic device if fail to setup default irq routing If KVM_CREATE_IRQCHIP fail due to kvm_setup_default_irq_routing(), ioapic device is not destroyed and kvm->arch.vioapic is not set to NULL, this may cause KVM_GET_IRQCHIP and KVM_SET_IRQCHIP access to unexcepted memory. Signed-off-by: Wei Yongjun Signed-off-by: Avi Kivity --- arch/ia64/kvm/kvm-ia64.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'arch/ia64/kvm') diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c index 06188988ed27..26e0e089bfe7 100644 --- a/arch/ia64/kvm/kvm-ia64.c +++ b/arch/ia64/kvm/kvm-ia64.c @@ -968,7 +968,7 @@ long kvm_arch_vm_ioctl(struct file *filp, goto out; r = kvm_setup_default_irq_routing(kvm); if (r) { - kfree(kvm->arch.vioapic); + kvm_ioapic_destroy(kvm); goto out; } break; -- cgit v1.2.1