From 51c71a3bbaca868043cc45b3ad3786dd48a90235 Mon Sep 17 00:00:00 2001 From: Konrad Rzeszutek Wilk Date: Tue, 26 Nov 2013 15:05:40 -0500 Subject: xen/pvhvm: If xen_platform_pci=0 is set don't blow up (v4). The user has the option of disabling the platform driver: 00:02.0 Unassigned class [ff80]: XenSource, Inc. Xen Platform Device (rev 01) which is used to unplug the emulated drivers (IDE, Realtek 8169, etc) and allow the PV drivers to take over. If the user wishes to disable that they can set: xen_platform_pci=0 (in the guest config file) or xen_emul_unplug=never (on the Linux command line) except it does not work properly. The PV drivers still try to load and since the Xen platform driver is not run - and it has not initialized the grant tables, most of the PV drivers stumble upon: input: Xen Virtual Keyboard as /devices/virtual/input/input5 input: Xen Virtual Pointer as /devices/virtual/input/input6M ------------[ cut here ]------------ kernel BUG at /home/konrad/ssd/konrad/linux/drivers/xen/grant-table.c:1206! invalid opcode: 0000 [#1] SMP Modules linked in: xen_kbdfront(+) xenfs xen_privcmd CPU: 6 PID: 1389 Comm: modprobe Not tainted 3.13.0-rc1upstream-00021-ga6c892b-dirty #1 Hardware name: Xen HVM domU, BIOS 4.4-unstable 11/26/2013 RIP: 0010:[] [] get_free_entries+0x2e0/0x300 Call Trace: [] ? evdev_connect+0x1e3/0x240 [] gnttab_grant_foreign_access+0x2e/0x70 [] xenkbd_connect_backend+0x41/0x290 [xen_kbdfront] [] xenkbd_probe+0x2f2/0x324 [xen_kbdfront] [] xenbus_dev_probe+0x77/0x130 [] xenbus_frontend_dev_probe+0x47/0x50 [] driver_probe_device+0x89/0x230 [] __driver_attach+0x9b/0xa0 [] ? driver_probe_device+0x230/0x230 [] ? driver_probe_device+0x230/0x230 [] bus_for_each_dev+0x8c/0xb0 [] driver_attach+0x19/0x20 [] bus_add_driver+0x1a0/0x220 [] driver_register+0x5f/0xf0 [] xenbus_register_driver_common+0x15/0x20 [] xenbus_register_frontend+0x23/0x40 [] ? 0xffffffffa0014fff [] xenkbd_init+0x2b/0x1000 [xen_kbdfront] [] do_one_initcall+0x49/0x170 .. snip.. 
which is hardly nice. This patch fixes this by having each PV driver check for: - if running in PV, then it is fine to execute (as that is their native environment). - if running in HVM, check if user wanted 'xen_emul_unplug=never', in which case bail out and don't load any PV drivers. - if running in HVM, and if PCI device 5853:0001 (xen_platform_pci) does not exist, then bail out and do not load PV drivers. - (v2) if running in HVM, and if the user wanted 'xen_emul_unplug=ide-disks', then bail out for all PV devices _except_ the block one. Ditto for the network one ('nics'). - (v2) if running in HVM, and if the user wanted 'xen_emul_unplug=unnecessary' then load block PV driver, and also set up the legacy IDE paths. In (v3) make it actually load PV drivers. Reported-by: Sander Eikelenboom Reported-and-Tested-by: Fabio Fantoni Signed-off-by: Konrad Rzeszutek Wilk [v2: Add extra logic to handle the myriad ways 'xen_emul_unplug' can be used per Ian and Stefano's suggestion] [v3: Make the unnecessary case work properly] [v4: s/disks/ide-disks/ spotted by Fabio] Reviewed-by: Stefano Stabellini Acked-by: Bjorn Helgaas [for PCI parts] CC: stable@vger.kernel.org --- drivers/xen/xenbus/xenbus_probe_frontend.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'drivers/xen') diff --git a/drivers/xen/xenbus/xenbus_probe_frontend.c b/drivers/xen/xenbus/xenbus_probe_frontend.c index 129bf84c19ec..cb385c10d2b1 100644 --- a/drivers/xen/xenbus/xenbus_probe_frontend.c +++ b/drivers/xen/xenbus/xenbus_probe_frontend.c @@ -496,7 +496,7 @@ subsys_initcall(xenbus_probe_frontend_init); #ifndef MODULE static int __init boot_wait_for_devices(void) { - if (xen_hvm_domain() && !xen_platform_pci_unplug) + if (!xen_has_pv_devices()) return -ENODEV; ready_to_wait_for_devices = 1; -- cgit v1.2.1 From 72f28071f14fd9b6cc03aaf83b057d169d817411 Mon Sep 17 00:00:00 2001 From: Ian Campbell Date: Wed, 11 Dec 2013 12:03:17 +0000 Subject: xen: balloon: enable for ARM Since c275a57f5ec3 
"xen/balloon: Set balloon's initial state to number of existing RAM pages" the balloon driver appears to work fine on ARM as far as I can tell. Prior to that commit it was broken because on ARM RAM doesn't typically start at zero, effectively leaving a big MMIO hole at the start. This would cause the balloon driver to give away all of RAM at start of day, which is rather inconvenient. It was already enabled (or rather not excluded) on ARM64. Commit c1d15f5c8bc1170dafe16e988e55437245966dfe "xen/balloon: Seperate the auto-translate logic properly (v2)" added the proper plumbing to work with ARM and PVH type guests. Signed-off-by: Ian Campbell Cc: Konrad Rzeszutek Wilk Cc: Boris Ostrovsky Cc: David Vrabel Acked-by: Stefano Stabellini [v2: Added the bit about PVH] Signed-off-by: Konrad Rzeszutek Wilk --- drivers/xen/Kconfig | 1 - 1 file changed, 1 deletion(-) (limited to 'drivers/xen') diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig index c794ea182140..b9ea2abe5628 100644 --- a/drivers/xen/Kconfig +++ b/drivers/xen/Kconfig @@ -3,7 +3,6 @@ menu "Xen driver support" config XEN_BALLOON bool "Xen memory balloon driver" - depends on !ARM default y help The balloon driver allows the Xen domain to request more memory from -- cgit v1.2.1 From 9346c2a8defab777d1fba6bcc284f6ada181fe96 Mon Sep 17 00:00:00 2001 From: Jie Liu Date: Wed, 13 Nov 2013 20:59:58 +0800 Subject: xen: simplify balloon_first_page() with list_first_entry_or_null() Replace the logic in balloon_first_page() with a direct call to list_first_entry_or_null(). Since there is only one user of that routine, we can just remove it. 
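Note for readers unfamiliar with the helper: the following is a minimal userspace sketch of the equivalence this change relies on. The `struct list_head`, `list_entry()` and `list_first_entry_or_null()` definitions below are simplified stand-ins written for illustration, not the kernel's `<linux/list.h>`, and `struct demo_page`, `first_page_old()` and `first_page_new()` are hypothetical names.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

#define LIST_HEAD_INIT(name) { &(name), &(name) }

/* An empty list is a head that points back at itself. */
int list_empty(const struct list_head *head)
{
	return head->next == head;
}

/* Append an item just before the head, i.e. at the tail. */
void list_add_tail(struct list_head *item, struct list_head *head)
{
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

/* Recover the containing structure from a pointer to its list member. */
#define list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Illustrative equivalent of the kernel helper: NULL when the list is empty. */
#define list_first_entry_or_null(head, type, member) \
	(list_empty(head) ? NULL : list_entry((head)->next, type, member))

/* Hypothetical payload standing in for struct page. */
struct demo_page {
	int id;
	struct list_head lru;
};

/* The open-coded form that the patch removes. */
struct demo_page *first_page_old(struct list_head *pages)
{
	if (list_empty(pages))
		return NULL;
	return list_entry(pages->next, struct demo_page, lru);
}

/* The single-call form used after the patch. */
struct demo_page *first_page_new(struct list_head *pages)
{
	return list_first_entry_or_null(pages, struct demo_page, lru);
}
```

Under these assumptions both functions return NULL for an empty list and the same first element otherwise, which is why the wrapper can be dropped without changing the caller's NULL check.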
Signed-off-by: Jie Liu Signed-off-by: Konrad Rzeszutek Wilk Reviewed-by: David Vrabel --- drivers/xen/balloon.c | 9 +-------- 1 file changed, 1 insertion(+), 8 deletions(-) (limited to 'drivers/xen') diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c index 4c02e2b94103..37d06ea624aa 100644 --- a/drivers/xen/balloon.c +++ b/drivers/xen/balloon.c @@ -157,13 +157,6 @@ static struct page *balloon_retrieve(bool prefer_highmem) return page; } -static struct page *balloon_first_page(void) -{ - if (list_empty(&ballooned_pages)) - return NULL; - return list_entry(ballooned_pages.next, struct page, lru); -} - static struct page *balloon_next_page(struct page *page) { struct list_head *next = page->lru.next; @@ -328,7 +321,7 @@ static enum bp_state increase_reservation(unsigned long nr_pages) if (nr_pages > ARRAY_SIZE(frame_list)) nr_pages = ARRAY_SIZE(frame_list); - page = balloon_first_page(); + page = list_first_entry_or_null(&ballooned_pages, struct page, lru); for (i = 0; i < nr_pages; i++) { if (!page) { nr_pages = i; -- cgit v1.2.1 From b7ef4a6dd35d1b47db72fbd1a31c8fd0da7a74f3 Mon Sep 17 00:00:00 2001 From: Ben Hutchings Date: Tue, 31 Dec 2013 20:46:27 +0100 Subject: xen/pci: Fix build on non-x86 We can't include if this isn't x86, and we only need it if CONFIG_PCI_MMCONFIG is enabled. 
Fixes: 8deb3eb1461e ('xen/mcfg: Call PHYSDEVOP_pci_mmcfg_reserved for MCFG areas.') Signed-off-by: Ben Hutchings Signed-off-by: Konrad Rzeszutek Wilk Reviewed-by: David Vrabel Acked-by: Ian Campbell --- drivers/xen/pci.c | 2 ++ 1 file changed, 2 insertions(+) (limited to 'drivers/xen') diff --git a/drivers/xen/pci.c b/drivers/xen/pci.c index 188825122aae..dd9c249ea311 100644 --- a/drivers/xen/pci.c +++ b/drivers/xen/pci.c @@ -26,7 +26,9 @@ #include #include #include "../pci/pci.h" +#ifdef CONFIG_PCI_MMCONFIG #include +#endif static bool __read_mostly pci_seg_supported = true; -- cgit v1.2.1 From 872951850666689e931e567ebdc7c483135d14cf Mon Sep 17 00:00:00 2001 From: David Vrabel Date: Tue, 12 Mar 2013 18:28:04 +0000 Subject: xen/events: refactor retrigger_dynirq() and resend_irq_on_evtchn() These two functions did the same thing with different parameters; put the common bits in retrigger_evtchn(). This changes the return value of resend_irq_on_evtchn(), but the only caller (in arch/ia64/xen/irq_xen.c) ignores the return value, so this is fine. 
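Note: the shape of this refactoring can be sketched in userspace as follows. This is an illustration only, assuming plain `bool` arrays in place of the shared-info bitmaps and a hypothetical `demo_irq_to_evtchn[]` table in place of `evtchn_from_irq()`; the `demo_` names do not exist in the kernel, and the real code uses synchronized bit operations and `unmask_evtchn()`.

```c
#include <stdbool.h>

#define DEMO_NR_PORTS 64	/* hypothetical size, for illustration only */

/* Userspace stand-ins for the shared-info pending and mask bitmaps. */
bool demo_pending[DEMO_NR_PORTS];
bool demo_mask[DEMO_NR_PORTS];

/* Hypothetical irq -> port table; 0 means "no event channel bound". */
int demo_irq_to_evtchn[4] = { 0, 5, 9, 0 };

/* Port zero is never allocated, so it doubles as the invalid value. */
#define VALID_EVTCHN(chn) ((chn) != 0)

/* Common helper: mark the port pending while preserving its mask state. */
int retrigger_evtchn(int evtchn)
{
	bool masked;

	if (!VALID_EVTCHN(evtchn))
		return 0;			/* nothing was retriggered */

	masked = demo_mask[evtchn];		/* remember the previous mask state */
	demo_mask[evtchn] = true;		/* mask while setting pending */
	demo_pending[evtchn] = true;
	if (!masked)
		demo_mask[evtchn] = false;	/* stands in for unmask_evtchn() */

	return 1;
}

/* Both former duplicates reduce to a lookup plus the common helper. */
int demo_resend_irq_on_evtchn(unsigned int irq)
{
	return retrigger_evtchn(demo_irq_to_evtchn[irq]);
}

int demo_retrigger_dynirq(unsigned int irq)
{
	return retrigger_evtchn(demo_irq_to_evtchn[irq]);
}
```

In this sketch an unbound irq yields 0 from either wrapper, which mirrors the return-value change described above: the old resend_irq_on_evtchn() returned 1 for an invalid port, the shared helper returns 0.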
Signed-off-by: David Vrabel Reviewed-by: Konrad Rzeszutek Wilk Reviewed-by: Boris Ostrovsky --- drivers/xen/events.c | 27 +++++++++------------------ 1 file changed, 9 insertions(+), 18 deletions(-) (limited to 'drivers/xen') diff --git a/drivers/xen/events.c b/drivers/xen/events.c index 4035e833ea26..ddcdbb508dab 100644 --- a/drivers/xen/events.c +++ b/drivers/xen/events.c @@ -1558,13 +1558,13 @@ static int set_affinity_irq(struct irq_data *data, const struct cpumask *dest, return rebind_irq_to_cpu(data->irq, tcpu); } -int resend_irq_on_evtchn(unsigned int irq) +static int retrigger_evtchn(int evtchn) { - int masked, evtchn = evtchn_from_irq(irq); + int masked; struct shared_info *s = HYPERVISOR_shared_info; if (!VALID_EVTCHN(evtchn)) - return 1; + return 0; masked = sync_test_and_set_bit(evtchn, BM(s->evtchn_mask)); sync_set_bit(evtchn, BM(s->evtchn_pending)); @@ -1574,6 +1574,11 @@ int resend_irq_on_evtchn(unsigned int irq) return 1; } +int resend_irq_on_evtchn(unsigned int irq) +{ + return retrigger_evtchn(evtchn_from_irq(irq)); +} + static void enable_dynirq(struct irq_data *data) { int evtchn = evtchn_from_irq(data->irq); @@ -1608,21 +1613,7 @@ static void mask_ack_dynirq(struct irq_data *data) static int retrigger_dynirq(struct irq_data *data) { - int evtchn = evtchn_from_irq(data->irq); - struct shared_info *sh = HYPERVISOR_shared_info; - int ret = 0; - - if (VALID_EVTCHN(evtchn)) { - int masked; - - masked = sync_test_and_set_bit(evtchn, BM(sh->evtchn_mask)); - sync_set_bit(evtchn, BM(sh->evtchn_pending)); - if (!masked) - unmask_evtchn(evtchn); - ret = 1; - } - - return ret; + return retrigger_evtchn(evtchn_from_irq(data->irq)); } static void restore_pirqs(void) -- cgit v1.2.1 From fc087e10734a4d3e40693fc099461ec1270b3fff Mon Sep 17 00:00:00 2001 From: David Vrabel Date: Wed, 13 Mar 2013 13:20:52 +0000 Subject: xen/events: remove unnecessary init_evtchn_cpu_bindings() Because the guest-side binding of an event to a VCPU (i.e., setting the local per-cpu 
masks) is always explicitly done after an event channel is bound to a port, there is no need to initialize all possible events as bound to VCPU 0 at start of day or after a resume. Signed-off-by: David Vrabel Reviewed-by: Konrad Rzeszutek Wilk Reviewed-by: Boris Ostrovsky --- drivers/xen/events.c | 22 ---------------------- 1 file changed, 22 deletions(-) (limited to 'drivers/xen') diff --git a/drivers/xen/events.c b/drivers/xen/events.c index ddcdbb508dab..1e2c74bcd0c8 100644 --- a/drivers/xen/events.c +++ b/drivers/xen/events.c @@ -334,24 +334,6 @@ static void bind_evtchn_to_cpu(unsigned int chn, unsigned int cpu) info_for_irq(irq)->cpu = cpu; } -static void init_evtchn_cpu_bindings(void) -{ - int i; -#ifdef CONFIG_SMP - struct irq_info *info; - - /* By default all event channels notify CPU#0. */ - list_for_each_entry(info, &xen_irq_list_head, list) { - struct irq_desc *desc = irq_to_desc(info->irq); - cpumask_copy(desc->irq_data.affinity, cpumask_of(0)); - } -#endif - - for_each_possible_cpu(i) - memset(per_cpu(cpu_evtchn_mask, i), - (i == 0) ? ~0 : 0, NR_EVENT_CHANNELS/8); -} - static inline void clear_evtchn(int port) { struct shared_info *s = HYPERVISOR_shared_info; @@ -1778,8 +1760,6 @@ void xen_irq_resume(void) unsigned int cpu, evtchn; struct irq_info *info; - init_evtchn_cpu_bindings(); - /* New event-channel space is not 'live' yet. */ for (evtchn = 0; evtchn < NR_EVENT_CHANNELS; evtchn++) mask_evtchn(evtchn); @@ -1890,8 +1870,6 @@ void __init xen_init_IRQ(void) for (i = 0; i < NR_EVENT_CHANNELS; i++) evtchn_to_irq[i] = -1; - init_evtchn_cpu_bindings(); - /* No event channels are 'live' right now. */ for (i = 0; i < NR_EVENT_CHANNELS; i++) mask_evtchn(i); -- cgit v1.2.1 From 3f70fa828249e3f37883be98f5b4d08e947f55b0 Mon Sep 17 00:00:00 2001 From: Wei Liu Date: Thu, 7 Mar 2013 15:50:27 +0000 Subject: xen/events: introduce test_and_set_mask() In preparation for adding event channel port ops, add test_and_set_mask(). 
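Note: a helper like test_and_set_mask() atomically sets a port's mask bit and reports whether it was already set, so the caller knows whether to unmask afterwards. A minimal userspace sketch of that semantics, assuming C11 atomics in place of the kernel's sync_test_and_set_bit() and a hypothetical four-word bitmap (`demo_evtchn_mask`, `demo_test_and_set_mask()` are illustrative names):

```c
#include <limits.h>
#include <stdatomic.h>

#define DEMO_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical mask bitmap; static storage starts out all zero. */
atomic_ulong demo_evtchn_mask[4];

/*
 * Atomically set the mask bit for a port and return its previous
 * state: 1 if the port was already masked, 0 if it was not.
 */
int demo_test_and_set_mask(unsigned int port)
{
	unsigned long bit = 1UL << (port % DEMO_BITS_PER_WORD);
	unsigned long old;

	old = atomic_fetch_or(&demo_evtchn_mask[port / DEMO_BITS_PER_WORD], bit);
	return (old & bit) != 0;
}
```

Hiding the bitmap behind a function like this is what lets later changes swap in a different event channel layout without touching each caller.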
Signed-off-by: Wei Liu Signed-off-by: David Vrabel Reviewed-by: Konrad Rzeszutek Wilk Reviewed-by: Boris Ostrovsky --- drivers/xen/events.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) (limited to 'drivers/xen') diff --git a/drivers/xen/events.c b/drivers/xen/events.c index 1e2c74bcd0c8..359e983d97e4 100644 --- a/drivers/xen/events.c +++ b/drivers/xen/events.c @@ -352,6 +352,12 @@ static inline int test_evtchn(int port) return sync_test_bit(port, BM(&s->evtchn_pending[0])); } +static inline int test_and_set_mask(int port) +{ + struct shared_info *s = HYPERVISOR_shared_info; + return sync_test_and_set_bit(port, BM(&s->evtchn_mask[0])); +} + /** * notify_remote_via_irq - send event to remote end of event channel via irq @@ -1493,7 +1499,6 @@ void rebind_evtchn_irq(int evtchn, int irq) /* Rebind an evtchn so that it gets delivered to a specific cpu */ static int rebind_irq_to_cpu(unsigned irq, unsigned tcpu) { - struct shared_info *s = HYPERVISOR_shared_info; struct evtchn_bind_vcpu bind_vcpu; int evtchn = evtchn_from_irq(irq); int masked; @@ -1516,7 +1521,7 @@ static int rebind_irq_to_cpu(unsigned irq, unsigned tcpu) * Mask the event while changing the VCPU binding to prevent * it being delivered on an unexpected VCPU. */ - masked = sync_test_and_set_bit(evtchn, BM(s->evtchn_mask)); + masked = test_and_set_mask(evtchn); /* * If this fails, it usually just indicates that we're dealing with a @@ -1548,7 +1553,7 @@ static int retrigger_evtchn(int evtchn) if (!VALID_EVTCHN(evtchn)) return 0; - masked = sync_test_and_set_bit(evtchn, BM(s->evtchn_mask)); + masked = test_and_set_mask(evtchn); sync_set_bit(evtchn, BM(s->evtchn_pending)); if (!masked) unmask_evtchn(evtchn); -- cgit v1.2.1 From 76ec8d64ce50acc8a159740b08a721b7259f9ae7 Mon Sep 17 00:00:00 2001 From: Wei Liu Date: Thu, 7 Mar 2013 15:50:28 +0000 Subject: xen/events: replace raw bit ops with functions In preparation for adding event channel port ops, use set_evtchn() instead of sync_set_bit(). 
Signed-off-by: Wei Liu Signed-off-by: David Vrabel Reviewed-by: Konrad Rzeszutek Wilk Reviewed-by: Boris Ostrovsky --- drivers/xen/events.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) (limited to 'drivers/xen') diff --git a/drivers/xen/events.c b/drivers/xen/events.c index 359e983d97e4..fec5da4ff3a0 100644 --- a/drivers/xen/events.c +++ b/drivers/xen/events.c @@ -1548,13 +1548,12 @@ static int set_affinity_irq(struct irq_data *data, const struct cpumask *dest, static int retrigger_evtchn(int evtchn) { int masked; - struct shared_info *s = HYPERVISOR_shared_info; if (!VALID_EVTCHN(evtchn)) return 0; masked = test_and_set_mask(evtchn); - sync_set_bit(evtchn, BM(s->evtchn_pending)); + set_evtchn(evtchn); if (!masked) unmask_evtchn(evtchn); -- cgit v1.2.1 From d2ba3166f23baa53f5ee9c5c2ca43b42fb4e9e62 Mon Sep 17 00:00:00 2001 From: David Vrabel Date: Wed, 7 Aug 2013 14:32:12 +0100 Subject: xen/events: move drivers/xen/events.c into drivers/xen/events/ events.c will be split into multiple files so move it into its own directory. 
Signed-off-by: David Vrabel Reviewed-by: Konrad Rzeszutek Wilk Reviewed-by: Boris Ostrovsky --- drivers/xen/Makefile | 3 +- drivers/xen/events.c | 1908 -------------------------------------- drivers/xen/events/Makefile | 3 + drivers/xen/events/events_base.c | 1908 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 1913 insertions(+), 1909 deletions(-) delete mode 100644 drivers/xen/events.c create mode 100644 drivers/xen/events/Makefile create mode 100644 drivers/xen/events/events_base.c (limited to 'drivers/xen') diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile index 14fe79d8634a..d75c811bfa56 100644 --- a/drivers/xen/Makefile +++ b/drivers/xen/Makefile @@ -2,7 +2,8 @@ ifeq ($(filter y, $(CONFIG_ARM) $(CONFIG_ARM64)),) obj-$(CONFIG_HOTPLUG_CPU) += cpu_hotplug.o endif obj-$(CONFIG_X86) += fallback.o -obj-y += grant-table.o features.o events.o balloon.o manage.o +obj-y += grant-table.o features.o balloon.o manage.o +obj-y += events/ obj-y += xenbus/ nostackp := $(call cc-option, -fno-stack-protector) diff --git a/drivers/xen/events.c b/drivers/xen/events.c deleted file mode 100644 index fec5da4ff3a0..000000000000 --- a/drivers/xen/events.c +++ /dev/null @@ -1,1908 +0,0 @@ -/* - * Xen event channels - * - * Xen models interrupts with abstract event channels. Because each - * domain gets 1024 event channels, but NR_IRQ is not that large, we - * must dynamically map irqs<->event channels. The event channels - * interface with the rest of the kernel by defining a xen interrupt - * chip. When an event is received, it is mapped to an irq and sent - * through the normal interrupt processing path. - * - * There are four kinds of events which can be mapped to an event - * channel: - * - * 1. Inter-domain notifications. This includes all the virtual - * device events, since they're driven by front-ends in another domain - * (typically dom0). - * 2. VIRQs, typically used for timers. These are per-cpu events. - * 3. IPIs. - * 4. PIRQs - Hardware interrupts. 
- * - * Jeremy Fitzhardinge , XenSource Inc, 2007 - */ - -#define pr_fmt(fmt) "xen:" KBUILD_MODNAME ": " fmt - -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#ifdef CONFIG_X86 -#include -#include -#include -#include -#include -#include -#include -#endif -#include -#include -#include - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -/* - * This lock protects updates to the following mapping and reference-count - * arrays. The lock does not need to be acquired to read the mapping tables. - */ -static DEFINE_MUTEX(irq_mapping_update_lock); - -static LIST_HEAD(xen_irq_list_head); - -/* IRQ <-> VIRQ mapping. */ -static DEFINE_PER_CPU(int [NR_VIRQS], virq_to_irq) = {[0 ... NR_VIRQS-1] = -1}; - -/* IRQ <-> IPI mapping */ -static DEFINE_PER_CPU(int [XEN_NR_IPIS], ipi_to_irq) = {[0 ... XEN_NR_IPIS-1] = -1}; - -/* Interrupt types. */ -enum xen_irq_type { - IRQT_UNBOUND = 0, - IRQT_PIRQ, - IRQT_VIRQ, - IRQT_IPI, - IRQT_EVTCHN -}; - -/* - * Packed IRQ information: - * type - enum xen_irq_type - * event channel - irq->event channel mapping - * cpu - cpu this event channel is bound to - * index - type-specific information: - * PIRQ - physical IRQ, GSI, flags, and owner domain - * VIRQ - virq number - * IPI - IPI vector - * EVTCHN - - */ -struct irq_info { - struct list_head list; - int refcnt; - enum xen_irq_type type; /* type */ - unsigned irq; - unsigned short evtchn; /* event channel */ - unsigned short cpu; /* cpu bound */ - - union { - unsigned short virq; - enum ipi_vector ipi; - struct { - unsigned short pirq; - unsigned short gsi; - unsigned char flags; - uint16_t domid; - } pirq; - } u; -}; -#define PIRQ_NEEDS_EOI (1 << 0) -#define PIRQ_SHAREABLE (1 << 1) - -static int *evtchn_to_irq; -#ifdef CONFIG_X86 -static unsigned long *pirq_eoi_map; -#endif -static bool (*pirq_needs_eoi)(unsigned irq); - -/* - * Note sizeof(xen_ulong_t) can be more than 
sizeof(unsigned long). Be - * careful to only use bitops which allow for this (e.g - * test_bit/find_first_bit and friends but not __ffs) and to pass - * BITS_PER_EVTCHN_WORD as the bitmask length. - */ -#define BITS_PER_EVTCHN_WORD (sizeof(xen_ulong_t)*8) -/* - * Make a bitmask (i.e. unsigned long *) of a xen_ulong_t - * array. Primarily to avoid long lines (hence the terse name). - */ -#define BM(x) (unsigned long *)(x) -/* Find the first set bit in a evtchn mask */ -#define EVTCHN_FIRST_BIT(w) find_first_bit(BM(&(w)), BITS_PER_EVTCHN_WORD) - -static DEFINE_PER_CPU(xen_ulong_t [NR_EVENT_CHANNELS/BITS_PER_EVTCHN_WORD], - cpu_evtchn_mask); - -/* Xen will never allocate port zero for any purpose. */ -#define VALID_EVTCHN(chn) ((chn) != 0) - -static struct irq_chip xen_dynamic_chip; -static struct irq_chip xen_percpu_chip; -static struct irq_chip xen_pirq_chip; -static void enable_dynirq(struct irq_data *data); -static void disable_dynirq(struct irq_data *data); - -/* Get info for IRQ */ -static struct irq_info *info_for_irq(unsigned irq) -{ - return irq_get_handler_data(irq); -} - -/* Constructors for packed IRQ information. 
*/ -static void xen_irq_info_common_init(struct irq_info *info, - unsigned irq, - enum xen_irq_type type, - unsigned short evtchn, - unsigned short cpu) -{ - - BUG_ON(info->type != IRQT_UNBOUND && info->type != type); - - info->type = type; - info->irq = irq; - info->evtchn = evtchn; - info->cpu = cpu; - - evtchn_to_irq[evtchn] = irq; - - irq_clear_status_flags(irq, IRQ_NOREQUEST|IRQ_NOAUTOEN); -} - -static void xen_irq_info_evtchn_init(unsigned irq, - unsigned short evtchn) -{ - struct irq_info *info = info_for_irq(irq); - - xen_irq_info_common_init(info, irq, IRQT_EVTCHN, evtchn, 0); -} - -static void xen_irq_info_ipi_init(unsigned cpu, - unsigned irq, - unsigned short evtchn, - enum ipi_vector ipi) -{ - struct irq_info *info = info_for_irq(irq); - - xen_irq_info_common_init(info, irq, IRQT_IPI, evtchn, 0); - - info->u.ipi = ipi; - - per_cpu(ipi_to_irq, cpu)[ipi] = irq; -} - -static void xen_irq_info_virq_init(unsigned cpu, - unsigned irq, - unsigned short evtchn, - unsigned short virq) -{ - struct irq_info *info = info_for_irq(irq); - - xen_irq_info_common_init(info, irq, IRQT_VIRQ, evtchn, 0); - - info->u.virq = virq; - - per_cpu(virq_to_irq, cpu)[virq] = irq; -} - -static void xen_irq_info_pirq_init(unsigned irq, - unsigned short evtchn, - unsigned short pirq, - unsigned short gsi, - uint16_t domid, - unsigned char flags) -{ - struct irq_info *info = info_for_irq(irq); - - xen_irq_info_common_init(info, irq, IRQT_PIRQ, evtchn, 0); - - info->u.pirq.pirq = pirq; - info->u.pirq.gsi = gsi; - info->u.pirq.domid = domid; - info->u.pirq.flags = flags; -} - -/* - * Accessors for packed IRQ information. 
- */ -static unsigned int evtchn_from_irq(unsigned irq) -{ - if (unlikely(WARN(irq < 0 || irq >= nr_irqs, "Invalid irq %d!\n", irq))) - return 0; - - return info_for_irq(irq)->evtchn; -} - -unsigned irq_from_evtchn(unsigned int evtchn) -{ - return evtchn_to_irq[evtchn]; -} -EXPORT_SYMBOL_GPL(irq_from_evtchn); - -static enum ipi_vector ipi_from_irq(unsigned irq) -{ - struct irq_info *info = info_for_irq(irq); - - BUG_ON(info == NULL); - BUG_ON(info->type != IRQT_IPI); - - return info->u.ipi; -} - -static unsigned virq_from_irq(unsigned irq) -{ - struct irq_info *info = info_for_irq(irq); - - BUG_ON(info == NULL); - BUG_ON(info->type != IRQT_VIRQ); - - return info->u.virq; -} - -static unsigned pirq_from_irq(unsigned irq) -{ - struct irq_info *info = info_for_irq(irq); - - BUG_ON(info == NULL); - BUG_ON(info->type != IRQT_PIRQ); - - return info->u.pirq.pirq; -} - -static enum xen_irq_type type_from_irq(unsigned irq) -{ - return info_for_irq(irq)->type; -} - -static unsigned cpu_from_irq(unsigned irq) -{ - return info_for_irq(irq)->cpu; -} - -static unsigned int cpu_from_evtchn(unsigned int evtchn) -{ - int irq = evtchn_to_irq[evtchn]; - unsigned ret = 0; - - if (irq != -1) - ret = cpu_from_irq(irq); - - return ret; -} - -#ifdef CONFIG_X86 -static bool pirq_check_eoi_map(unsigned irq) -{ - return test_bit(pirq_from_irq(irq), pirq_eoi_map); -} -#endif - -static bool pirq_needs_eoi_flag(unsigned irq) -{ - struct irq_info *info = info_for_irq(irq); - BUG_ON(info->type != IRQT_PIRQ); - - return info->u.pirq.flags & PIRQ_NEEDS_EOI; -} - -static inline xen_ulong_t active_evtchns(unsigned int cpu, - struct shared_info *sh, - unsigned int idx) -{ - return sh->evtchn_pending[idx] & - per_cpu(cpu_evtchn_mask, cpu)[idx] & - ~sh->evtchn_mask[idx]; -} - -static void bind_evtchn_to_cpu(unsigned int chn, unsigned int cpu) -{ - int irq = evtchn_to_irq[chn]; - - BUG_ON(irq == -1); -#ifdef CONFIG_SMP - cpumask_copy(irq_to_desc(irq)->irq_data.affinity, cpumask_of(cpu)); -#endif - - 
clear_bit(chn, BM(per_cpu(cpu_evtchn_mask, cpu_from_irq(irq)))); - set_bit(chn, BM(per_cpu(cpu_evtchn_mask, cpu))); - - info_for_irq(irq)->cpu = cpu; -} - -static inline void clear_evtchn(int port) -{ - struct shared_info *s = HYPERVISOR_shared_info; - sync_clear_bit(port, BM(&s->evtchn_pending[0])); -} - -static inline void set_evtchn(int port) -{ - struct shared_info *s = HYPERVISOR_shared_info; - sync_set_bit(port, BM(&s->evtchn_pending[0])); -} - -static inline int test_evtchn(int port) -{ - struct shared_info *s = HYPERVISOR_shared_info; - return sync_test_bit(port, BM(&s->evtchn_pending[0])); -} - -static inline int test_and_set_mask(int port) -{ - struct shared_info *s = HYPERVISOR_shared_info; - return sync_test_and_set_bit(port, BM(&s->evtchn_mask[0])); -} - - -/** - * notify_remote_via_irq - send event to remote end of event channel via irq - * @irq: irq of event channel to send event to - * - * Unlike notify_remote_via_evtchn(), this is safe to use across - * save/restore. Notifications on a broken connection are silently - * dropped. - */ -void notify_remote_via_irq(int irq) -{ - int evtchn = evtchn_from_irq(irq); - - if (VALID_EVTCHN(evtchn)) - notify_remote_via_evtchn(evtchn); -} -EXPORT_SYMBOL_GPL(notify_remote_via_irq); - -static void mask_evtchn(int port) -{ - struct shared_info *s = HYPERVISOR_shared_info; - sync_set_bit(port, BM(&s->evtchn_mask[0])); -} - -static void unmask_evtchn(int port) -{ - struct shared_info *s = HYPERVISOR_shared_info; - unsigned int cpu = get_cpu(); - int do_hypercall = 0, evtchn_pending = 0; - - BUG_ON(!irqs_disabled()); - - if (unlikely((cpu != cpu_from_evtchn(port)))) - do_hypercall = 1; - else { - /* - * Need to clear the mask before checking pending to - * avoid a race with an event becoming pending. - * - * EVTCHNOP_unmask will only trigger an upcall if the - * mask bit was set, so if a hypercall is needed - * remask the event. 
- */ - sync_clear_bit(port, BM(&s->evtchn_mask[0])); - evtchn_pending = sync_test_bit(port, BM(&s->evtchn_pending[0])); - - if (unlikely(evtchn_pending && xen_hvm_domain())) { - sync_set_bit(port, BM(&s->evtchn_mask[0])); - do_hypercall = 1; - } - } - - /* Slow path (hypercall) if this is a non-local port or if this is - * an hvm domain and an event is pending (hvm domains don't have - * their own implementation of irq_enable). */ - if (do_hypercall) { - struct evtchn_unmask unmask = { .port = port }; - (void)HYPERVISOR_event_channel_op(EVTCHNOP_unmask, &unmask); - } else { - struct vcpu_info *vcpu_info = __this_cpu_read(xen_vcpu); - - /* - * The following is basically the equivalent of - * 'hw_resend_irq'. Just like a real IO-APIC we 'lose - * the interrupt edge' if the channel is masked. - */ - if (evtchn_pending && - !sync_test_and_set_bit(port / BITS_PER_EVTCHN_WORD, - BM(&vcpu_info->evtchn_pending_sel))) - vcpu_info->evtchn_upcall_pending = 1; - } - - put_cpu(); -} - -static void xen_irq_init(unsigned irq) -{ - struct irq_info *info; -#ifdef CONFIG_SMP - struct irq_desc *desc = irq_to_desc(irq); - - /* By default all event channels notify CPU#0. */ - cpumask_copy(desc->irq_data.affinity, cpumask_of(0)); -#endif - - info = kzalloc(sizeof(*info), GFP_KERNEL); - if (info == NULL) - panic("Unable to allocate metadata for IRQ%d\n", irq); - - info->type = IRQT_UNBOUND; - info->refcnt = -1; - - irq_set_handler_data(irq, info); - - list_add_tail(&info->list, &xen_irq_list_head); -} - -static int __must_check xen_allocate_irq_dynamic(void) -{ - int first = 0; - int irq; - -#ifdef CONFIG_X86_IO_APIC - /* - * For an HVM guest or domain 0 which see "real" (emulated or - * actual respectively) GSIs we allocate dynamic IRQs - * e.g. those corresponding to event channels or MSIs - * etc. from the range above those "real" GSIs to avoid - * collisions. 
- */ - if (xen_initial_domain() || xen_hvm_domain()) - first = get_nr_irqs_gsi(); -#endif - - irq = irq_alloc_desc_from(first, -1); - - if (irq >= 0) - xen_irq_init(irq); - - return irq; -} - -static int __must_check xen_allocate_irq_gsi(unsigned gsi) -{ - int irq; - - /* - * A PV guest has no concept of a GSI (since it has no ACPI - * nor access to/knowledge of the physical APICs). Therefore - * all IRQs are dynamically allocated from the entire IRQ - * space. - */ - if (xen_pv_domain() && !xen_initial_domain()) - return xen_allocate_irq_dynamic(); - - /* Legacy IRQ descriptors are already allocated by the arch. */ - if (gsi < NR_IRQS_LEGACY) - irq = gsi; - else - irq = irq_alloc_desc_at(gsi, -1); - - xen_irq_init(irq); - - return irq; -} - -static void xen_free_irq(unsigned irq) -{ - struct irq_info *info = irq_get_handler_data(irq); - - if (WARN_ON(!info)) - return; - - list_del(&info->list); - - irq_set_handler_data(irq, NULL); - - WARN_ON(info->refcnt > 0); - - kfree(info); - - /* Legacy IRQ descriptors are managed by the arch. 
*/ - if (irq < NR_IRQS_LEGACY) - return; - - irq_free_desc(irq); -} - -static void pirq_query_unmask(int irq) -{ - struct physdev_irq_status_query irq_status; - struct irq_info *info = info_for_irq(irq); - - BUG_ON(info->type != IRQT_PIRQ); - - irq_status.irq = pirq_from_irq(irq); - if (HYPERVISOR_physdev_op(PHYSDEVOP_irq_status_query, &irq_status)) - irq_status.flags = 0; - - info->u.pirq.flags &= ~PIRQ_NEEDS_EOI; - if (irq_status.flags & XENIRQSTAT_needs_eoi) - info->u.pirq.flags |= PIRQ_NEEDS_EOI; -} - -static bool probing_irq(int irq) -{ - struct irq_desc *desc = irq_to_desc(irq); - - return desc && desc->action == NULL; -} - -static void eoi_pirq(struct irq_data *data) -{ - int evtchn = evtchn_from_irq(data->irq); - struct physdev_eoi eoi = { .irq = pirq_from_irq(data->irq) }; - int rc = 0; - - irq_move_irq(data); - - if (VALID_EVTCHN(evtchn)) - clear_evtchn(evtchn); - - if (pirq_needs_eoi(data->irq)) { - rc = HYPERVISOR_physdev_op(PHYSDEVOP_eoi, &eoi); - WARN_ON(rc); - } -} - -static void mask_ack_pirq(struct irq_data *data) -{ - disable_dynirq(data); - eoi_pirq(data); -} - -static unsigned int __startup_pirq(unsigned int irq) -{ - struct evtchn_bind_pirq bind_pirq; - struct irq_info *info = info_for_irq(irq); - int evtchn = evtchn_from_irq(irq); - int rc; - - BUG_ON(info->type != IRQT_PIRQ); - - if (VALID_EVTCHN(evtchn)) - goto out; - - bind_pirq.pirq = pirq_from_irq(irq); - /* NB. We are happy to share unless we are probing. */ - bind_pirq.flags = info->u.pirq.flags & PIRQ_SHAREABLE ? 
- BIND_PIRQ__WILL_SHARE : 0; - rc = HYPERVISOR_event_channel_op(EVTCHNOP_bind_pirq, &bind_pirq); - if (rc != 0) { - if (!probing_irq(irq)) - pr_info("Failed to obtain physical IRQ %d\n", irq); - return 0; - } - evtchn = bind_pirq.port; - - pirq_query_unmask(irq); - - evtchn_to_irq[evtchn] = irq; - bind_evtchn_to_cpu(evtchn, 0); - info->evtchn = evtchn; - -out: - unmask_evtchn(evtchn); - eoi_pirq(irq_get_irq_data(irq)); - - return 0; -} - -static unsigned int startup_pirq(struct irq_data *data) -{ - return __startup_pirq(data->irq); -} - -static void shutdown_pirq(struct irq_data *data) -{ - struct evtchn_close close; - unsigned int irq = data->irq; - struct irq_info *info = info_for_irq(irq); - int evtchn = evtchn_from_irq(irq); - - BUG_ON(info->type != IRQT_PIRQ); - - if (!VALID_EVTCHN(evtchn)) - return; - - mask_evtchn(evtchn); - - close.port = evtchn; - if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0) - BUG(); - - bind_evtchn_to_cpu(evtchn, 0); - evtchn_to_irq[evtchn] = -1; - info->evtchn = 0; -} - -static void enable_pirq(struct irq_data *data) -{ - startup_pirq(data); -} - -static void disable_pirq(struct irq_data *data) -{ - disable_dynirq(data); -} - -int xen_irq_from_gsi(unsigned gsi) -{ - struct irq_info *info; - - list_for_each_entry(info, &xen_irq_list_head, list) { - if (info->type != IRQT_PIRQ) - continue; - - if (info->u.pirq.gsi == gsi) - return info->irq; - } - - return -1; -} -EXPORT_SYMBOL_GPL(xen_irq_from_gsi); - -/* - * Do not make any assumptions regarding the relationship between the - * IRQ number returned here and the Xen pirq argument. - * - * Note: We don't assign an event channel until the irq actually started - * up. Return an existing irq if we've already got one for the gsi. - * - * Shareable implies level triggered, not shareable implies edge - * triggered here. 
- */ -int xen_bind_pirq_gsi_to_irq(unsigned gsi, - unsigned pirq, int shareable, char *name) -{ - int irq = -1; - struct physdev_irq irq_op; - - mutex_lock(&irq_mapping_update_lock); - - irq = xen_irq_from_gsi(gsi); - if (irq != -1) { - pr_info("%s: returning irq %d for gsi %u\n", - __func__, irq, gsi); - goto out; - } - - irq = xen_allocate_irq_gsi(gsi); - if (irq < 0) - goto out; - - irq_op.irq = irq; - irq_op.vector = 0; - - /* Only the privileged domain can do this. For non-priv, the pcifront - * driver provides a PCI bus that does the call to do exactly - * this in the priv domain. */ - if (xen_initial_domain() && - HYPERVISOR_physdev_op(PHYSDEVOP_alloc_irq_vector, &irq_op)) { - xen_free_irq(irq); - irq = -ENOSPC; - goto out; - } - - xen_irq_info_pirq_init(irq, 0, pirq, gsi, DOMID_SELF, - shareable ? PIRQ_SHAREABLE : 0); - - pirq_query_unmask(irq); - /* We try to use the handler with the appropriate semantic for the - * type of interrupt: if the interrupt is an edge triggered - * interrupt we use handle_edge_irq. - * - * On the other hand if the interrupt is level triggered we use - * handle_fasteoi_irq like the native code does for this kind of - * interrupts. - * - * Depending on the Xen version, pirq_needs_eoi might return true - * not only for level triggered interrupts but for edge triggered - * interrupts too. In any case Xen always honors the eoi mechanism, - * not injecting any more pirqs of the same kind if the first one - * hasn't received an eoi yet. Therefore using the fasteoi handler - * is the right choice either way. 
- */ - if (shareable) - irq_set_chip_and_handler_name(irq, &xen_pirq_chip, - handle_fasteoi_irq, name); - else - irq_set_chip_and_handler_name(irq, &xen_pirq_chip, - handle_edge_irq, name); - -out: - mutex_unlock(&irq_mapping_update_lock); - - return irq; -} - -#ifdef CONFIG_PCI_MSI -int xen_allocate_pirq_msi(struct pci_dev *dev, struct msi_desc *msidesc) -{ - int rc; - struct physdev_get_free_pirq op_get_free_pirq; - - op_get_free_pirq.type = MAP_PIRQ_TYPE_MSI; - rc = HYPERVISOR_physdev_op(PHYSDEVOP_get_free_pirq, &op_get_free_pirq); - - WARN_ONCE(rc == -ENOSYS, - "hypervisor does not support the PHYSDEVOP_get_free_pirq interface\n"); - - return rc ? -1 : op_get_free_pirq.pirq; -} - -int xen_bind_pirq_msi_to_irq(struct pci_dev *dev, struct msi_desc *msidesc, - int pirq, const char *name, domid_t domid) -{ - int irq, ret; - - mutex_lock(&irq_mapping_update_lock); - - irq = xen_allocate_irq_dynamic(); - if (irq < 0) - goto out; - - irq_set_chip_and_handler_name(irq, &xen_pirq_chip, handle_edge_irq, - name); - - xen_irq_info_pirq_init(irq, 0, pirq, 0, domid, 0); - ret = irq_set_msi_desc(irq, msidesc); - if (ret < 0) - goto error_irq; -out: - mutex_unlock(&irq_mapping_update_lock); - return irq; -error_irq: - mutex_unlock(&irq_mapping_update_lock); - xen_free_irq(irq); - return ret; -} -#endif - -int xen_destroy_irq(int irq) -{ - struct irq_desc *desc; - struct physdev_unmap_pirq unmap_irq; - struct irq_info *info = info_for_irq(irq); - int rc = -ENOENT; - - mutex_lock(&irq_mapping_update_lock); - - desc = irq_to_desc(irq); - if (!desc) - goto out; - - if (xen_initial_domain()) { - unmap_irq.pirq = info->u.pirq.pirq; - unmap_irq.domid = info->u.pirq.domid; - rc = HYPERVISOR_physdev_op(PHYSDEVOP_unmap_pirq, &unmap_irq); - /* If another domain quits without making the pci_disable_msix - * call, the Xen hypervisor takes care of freeing the PIRQs - * (free_domain_pirqs). 
- */ - if ((rc == -ESRCH && info->u.pirq.domid != DOMID_SELF)) - pr_info("domain %d does not have %d anymore\n", - info->u.pirq.domid, info->u.pirq.pirq); - else if (rc) { - pr_warn("unmap irq failed %d\n", rc); - goto out; - } - } - - xen_free_irq(irq); - -out: - mutex_unlock(&irq_mapping_update_lock); - return rc; -} - -int xen_irq_from_pirq(unsigned pirq) -{ - int irq; - - struct irq_info *info; - - mutex_lock(&irq_mapping_update_lock); - - list_for_each_entry(info, &xen_irq_list_head, list) { - if (info->type != IRQT_PIRQ) - continue; - irq = info->irq; - if (info->u.pirq.pirq == pirq) - goto out; - } - irq = -1; -out: - mutex_unlock(&irq_mapping_update_lock); - - return irq; -} - - -int xen_pirq_from_irq(unsigned irq) -{ - return pirq_from_irq(irq); -} -EXPORT_SYMBOL_GPL(xen_pirq_from_irq); -int bind_evtchn_to_irq(unsigned int evtchn) -{ - int irq; - - mutex_lock(&irq_mapping_update_lock); - - irq = evtchn_to_irq[evtchn]; - - if (irq == -1) { - irq = xen_allocate_irq_dynamic(); - if (irq < 0) - goto out; - - irq_set_chip_and_handler_name(irq, &xen_dynamic_chip, - handle_edge_irq, "event"); - - xen_irq_info_evtchn_init(irq, evtchn); - } else { - struct irq_info *info = info_for_irq(irq); - WARN_ON(info == NULL || info->type != IRQT_EVTCHN); - } - -out: - mutex_unlock(&irq_mapping_update_lock); - - return irq; -} -EXPORT_SYMBOL_GPL(bind_evtchn_to_irq); - -static int bind_ipi_to_irq(unsigned int ipi, unsigned int cpu) -{ - struct evtchn_bind_ipi bind_ipi; - int evtchn, irq; - - mutex_lock(&irq_mapping_update_lock); - - irq = per_cpu(ipi_to_irq, cpu)[ipi]; - - if (irq == -1) { - irq = xen_allocate_irq_dynamic(); - if (irq < 0) - goto out; - - irq_set_chip_and_handler_name(irq, &xen_percpu_chip, - handle_percpu_irq, "ipi"); - - bind_ipi.vcpu = cpu; - if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_ipi, - &bind_ipi) != 0) - BUG(); - evtchn = bind_ipi.port; - - xen_irq_info_ipi_init(cpu, irq, evtchn, ipi); - - bind_evtchn_to_cpu(evtchn, cpu); - } else { - struct 
irq_info *info = info_for_irq(irq); - WARN_ON(info == NULL || info->type != IRQT_IPI); - } - - out: - mutex_unlock(&irq_mapping_update_lock); - return irq; -} - -static int bind_interdomain_evtchn_to_irq(unsigned int remote_domain, - unsigned int remote_port) -{ - struct evtchn_bind_interdomain bind_interdomain; - int err; - - bind_interdomain.remote_dom = remote_domain; - bind_interdomain.remote_port = remote_port; - - err = HYPERVISOR_event_channel_op(EVTCHNOP_bind_interdomain, - &bind_interdomain); - - return err ? : bind_evtchn_to_irq(bind_interdomain.local_port); -} - -static int find_virq(unsigned int virq, unsigned int cpu) -{ - struct evtchn_status status; - int port, rc = -ENOENT; - - memset(&status, 0, sizeof(status)); - for (port = 0; port <= NR_EVENT_CHANNELS; port++) { - status.dom = DOMID_SELF; - status.port = port; - rc = HYPERVISOR_event_channel_op(EVTCHNOP_status, &status); - if (rc < 0) - continue; - if (status.status != EVTCHNSTAT_virq) - continue; - if (status.u.virq == virq && status.vcpu == cpu) { - rc = port; - break; - } - } - return rc; -} - -int bind_virq_to_irq(unsigned int virq, unsigned int cpu) -{ - struct evtchn_bind_virq bind_virq; - int evtchn, irq, ret; - - mutex_lock(&irq_mapping_update_lock); - - irq = per_cpu(virq_to_irq, cpu)[virq]; - - if (irq == -1) { - irq = xen_allocate_irq_dynamic(); - if (irq < 0) - goto out; - - irq_set_chip_and_handler_name(irq, &xen_percpu_chip, - handle_percpu_irq, "virq"); - - bind_virq.virq = virq; - bind_virq.vcpu = cpu; - ret = HYPERVISOR_event_channel_op(EVTCHNOP_bind_virq, - &bind_virq); - if (ret == 0) - evtchn = bind_virq.port; - else { - if (ret == -EEXIST) - ret = find_virq(virq, cpu); - BUG_ON(ret < 0); - evtchn = ret; - } - - xen_irq_info_virq_init(cpu, irq, evtchn, virq); - - bind_evtchn_to_cpu(evtchn, cpu); - } else { - struct irq_info *info = info_for_irq(irq); - WARN_ON(info == NULL || info->type != IRQT_VIRQ); - } - -out: - mutex_unlock(&irq_mapping_update_lock); - - return irq; -} - 
-static void unbind_from_irq(unsigned int irq) -{ - struct evtchn_close close; - int evtchn = evtchn_from_irq(irq); - struct irq_info *info = irq_get_handler_data(irq); - - if (WARN_ON(!info)) - return; - - mutex_lock(&irq_mapping_update_lock); - - if (info->refcnt > 0) { - info->refcnt--; - if (info->refcnt != 0) - goto done; - } - - if (VALID_EVTCHN(evtchn)) { - close.port = evtchn; - if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0) - BUG(); - - switch (type_from_irq(irq)) { - case IRQT_VIRQ: - per_cpu(virq_to_irq, cpu_from_evtchn(evtchn)) - [virq_from_irq(irq)] = -1; - break; - case IRQT_IPI: - per_cpu(ipi_to_irq, cpu_from_evtchn(evtchn)) - [ipi_from_irq(irq)] = -1; - break; - default: - break; - } - - /* Closed ports are implicitly re-bound to VCPU0. */ - bind_evtchn_to_cpu(evtchn, 0); - - evtchn_to_irq[evtchn] = -1; - } - - BUG_ON(info_for_irq(irq)->type == IRQT_UNBOUND); - - xen_free_irq(irq); - - done: - mutex_unlock(&irq_mapping_update_lock); -} - -int bind_evtchn_to_irqhandler(unsigned int evtchn, - irq_handler_t handler, - unsigned long irqflags, - const char *devname, void *dev_id) -{ - int irq, retval; - - irq = bind_evtchn_to_irq(evtchn); - if (irq < 0) - return irq; - retval = request_irq(irq, handler, irqflags, devname, dev_id); - if (retval != 0) { - unbind_from_irq(irq); - return retval; - } - - return irq; -} -EXPORT_SYMBOL_GPL(bind_evtchn_to_irqhandler); - -int bind_interdomain_evtchn_to_irqhandler(unsigned int remote_domain, - unsigned int remote_port, - irq_handler_t handler, - unsigned long irqflags, - const char *devname, - void *dev_id) -{ - int irq, retval; - - irq = bind_interdomain_evtchn_to_irq(remote_domain, remote_port); - if (irq < 0) - return irq; - - retval = request_irq(irq, handler, irqflags, devname, dev_id); - if (retval != 0) { - unbind_from_irq(irq); - return retval; - } - - return irq; -} -EXPORT_SYMBOL_GPL(bind_interdomain_evtchn_to_irqhandler); - -int bind_virq_to_irqhandler(unsigned int virq, unsigned int cpu, 
- irq_handler_t handler, - unsigned long irqflags, const char *devname, void *dev_id) -{ - int irq, retval; - - irq = bind_virq_to_irq(virq, cpu); - if (irq < 0) - return irq; - retval = request_irq(irq, handler, irqflags, devname, dev_id); - if (retval != 0) { - unbind_from_irq(irq); - return retval; - } - - return irq; -} -EXPORT_SYMBOL_GPL(bind_virq_to_irqhandler); - -int bind_ipi_to_irqhandler(enum ipi_vector ipi, - unsigned int cpu, - irq_handler_t handler, - unsigned long irqflags, - const char *devname, - void *dev_id) -{ - int irq, retval; - - irq = bind_ipi_to_irq(ipi, cpu); - if (irq < 0) - return irq; - - irqflags |= IRQF_NO_SUSPEND | IRQF_FORCE_RESUME | IRQF_EARLY_RESUME; - retval = request_irq(irq, handler, irqflags, devname, dev_id); - if (retval != 0) { - unbind_from_irq(irq); - return retval; - } - - return irq; -} - -void unbind_from_irqhandler(unsigned int irq, void *dev_id) -{ - struct irq_info *info = irq_get_handler_data(irq); - - if (WARN_ON(!info)) - return; - free_irq(irq, dev_id); - unbind_from_irq(irq); -} -EXPORT_SYMBOL_GPL(unbind_from_irqhandler); - -int evtchn_make_refcounted(unsigned int evtchn) -{ - int irq = evtchn_to_irq[evtchn]; - struct irq_info *info; - - if (irq == -1) - return -ENOENT; - - info = irq_get_handler_data(irq); - - if (!info) - return -ENOENT; - - WARN_ON(info->refcnt != -1); - - info->refcnt = 1; - - return 0; -} -EXPORT_SYMBOL_GPL(evtchn_make_refcounted); - -int evtchn_get(unsigned int evtchn) -{ - int irq; - struct irq_info *info; - int err = -ENOENT; - - if (evtchn >= NR_EVENT_CHANNELS) - return -EINVAL; - - mutex_lock(&irq_mapping_update_lock); - - irq = evtchn_to_irq[evtchn]; - if (irq == -1) - goto done; - - info = irq_get_handler_data(irq); - - if (!info) - goto done; - - err = -EINVAL; - if (info->refcnt <= 0) - goto done; - - info->refcnt++; - err = 0; - done: - mutex_unlock(&irq_mapping_update_lock); - - return err; -} -EXPORT_SYMBOL_GPL(evtchn_get); - -void evtchn_put(unsigned int evtchn) -{ - int irq = 
evtchn_to_irq[evtchn]; - if (WARN_ON(irq == -1)) - return; - unbind_from_irq(irq); -} -EXPORT_SYMBOL_GPL(evtchn_put); - -void xen_send_IPI_one(unsigned int cpu, enum ipi_vector vector) -{ - int irq; - -#ifdef CONFIG_X86 - if (unlikely(vector == XEN_NMI_VECTOR)) { - int rc = HYPERVISOR_vcpu_op(VCPUOP_send_nmi, cpu, NULL); - if (rc < 0) - printk(KERN_WARNING "Sending nmi to CPU%d failed (rc:%d)\n", cpu, rc); - return; - } -#endif - irq = per_cpu(ipi_to_irq, cpu)[vector]; - BUG_ON(irq < 0); - notify_remote_via_irq(irq); -} - -irqreturn_t xen_debug_interrupt(int irq, void *dev_id) -{ - struct shared_info *sh = HYPERVISOR_shared_info; - int cpu = smp_processor_id(); - xen_ulong_t *cpu_evtchn = per_cpu(cpu_evtchn_mask, cpu); - int i; - unsigned long flags; - static DEFINE_SPINLOCK(debug_lock); - struct vcpu_info *v; - - spin_lock_irqsave(&debug_lock, flags); - - printk("\nvcpu %d\n ", cpu); - - for_each_online_cpu(i) { - int pending; - v = per_cpu(xen_vcpu, i); - pending = (get_irq_regs() && i == cpu) - ? xen_irqs_disabled(get_irq_regs()) - : v->evtchn_upcall_mask; - printk("%d: masked=%d pending=%d event_sel %0*"PRI_xen_ulong"\n ", i, - pending, v->evtchn_upcall_pending, - (int)(sizeof(v->evtchn_pending_sel)*2), - v->evtchn_pending_sel); - } - v = per_cpu(xen_vcpu, cpu); - - printk("\npending:\n "); - for (i = ARRAY_SIZE(sh->evtchn_pending)-1; i >= 0; i--) - printk("%0*"PRI_xen_ulong"%s", - (int)sizeof(sh->evtchn_pending[0])*2, - sh->evtchn_pending[i], - i % 8 == 0 ? "\n " : " "); - printk("\nglobal mask:\n "); - for (i = ARRAY_SIZE(sh->evtchn_mask)-1; i >= 0; i--) - printk("%0*"PRI_xen_ulong"%s", - (int)(sizeof(sh->evtchn_mask[0])*2), - sh->evtchn_mask[i], - i % 8 == 0 ? "\n " : " "); - - printk("\nglobally unmasked:\n "); - for (i = ARRAY_SIZE(sh->evtchn_mask)-1; i >= 0; i--) - printk("%0*"PRI_xen_ulong"%s", - (int)(sizeof(sh->evtchn_mask[0])*2), - sh->evtchn_pending[i] & ~sh->evtchn_mask[i], - i % 8 == 0 ? 
"\n " : " "); - - printk("\nlocal cpu%d mask:\n ", cpu); - for (i = (NR_EVENT_CHANNELS/BITS_PER_EVTCHN_WORD)-1; i >= 0; i--) - printk("%0*"PRI_xen_ulong"%s", (int)(sizeof(cpu_evtchn[0])*2), - cpu_evtchn[i], - i % 8 == 0 ? "\n " : " "); - - printk("\nlocally unmasked:\n "); - for (i = ARRAY_SIZE(sh->evtchn_mask)-1; i >= 0; i--) { - xen_ulong_t pending = sh->evtchn_pending[i] - & ~sh->evtchn_mask[i] - & cpu_evtchn[i]; - printk("%0*"PRI_xen_ulong"%s", - (int)(sizeof(sh->evtchn_mask[0])*2), - pending, i % 8 == 0 ? "\n " : " "); - } - - printk("\npending list:\n"); - for (i = 0; i < NR_EVENT_CHANNELS; i++) { - if (sync_test_bit(i, BM(sh->evtchn_pending))) { - int word_idx = i / BITS_PER_EVTCHN_WORD; - printk(" %d: event %d -> irq %d%s%s%s\n", - cpu_from_evtchn(i), i, - evtchn_to_irq[i], - sync_test_bit(word_idx, BM(&v->evtchn_pending_sel)) - ? "" : " l2-clear", - !sync_test_bit(i, BM(sh->evtchn_mask)) - ? "" : " globally-masked", - sync_test_bit(i, BM(cpu_evtchn)) - ? "" : " locally-masked"); - } - } - - spin_unlock_irqrestore(&debug_lock, flags); - - return IRQ_HANDLED; -} - -static DEFINE_PER_CPU(unsigned, xed_nesting_count); -static DEFINE_PER_CPU(unsigned int, current_word_idx); -static DEFINE_PER_CPU(unsigned int, current_bit_idx); - -/* - * Mask out the i least significant bits of w - */ -#define MASK_LSBS(w, i) (w & ((~((xen_ulong_t)0UL)) << i)) - -/* - * Search the CPUs pending events bitmasks. For each one found, map - * the event number to an irq, and feed it into do_IRQ() for - * handling. - * - * Xen uses a two-level bitmap to speed searching. The first level is - * a bitset of words which contain pending event bits. The second - * level is a bitset of pending events themselves. 
- */ -static void __xen_evtchn_do_upcall(void) -{ - int start_word_idx, start_bit_idx; - int word_idx, bit_idx; - int i, irq; - int cpu = get_cpu(); - struct shared_info *s = HYPERVISOR_shared_info; - struct vcpu_info *vcpu_info = __this_cpu_read(xen_vcpu); - unsigned count; - - do { - xen_ulong_t pending_words; - xen_ulong_t pending_bits; - struct irq_desc *desc; - - vcpu_info->evtchn_upcall_pending = 0; - - if (__this_cpu_inc_return(xed_nesting_count) - 1) - goto out; - - /* - * Master flag must be cleared /before/ clearing - * selector flag. xchg_xen_ulong must contain an - * appropriate barrier. - */ - if ((irq = per_cpu(virq_to_irq, cpu)[VIRQ_TIMER]) != -1) { - int evtchn = evtchn_from_irq(irq); - word_idx = evtchn / BITS_PER_LONG; - pending_bits = evtchn % BITS_PER_LONG; - if (active_evtchns(cpu, s, word_idx) & (1ULL << pending_bits)) { - desc = irq_to_desc(irq); - if (desc) - generic_handle_irq_desc(irq, desc); - } - } - - pending_words = xchg_xen_ulong(&vcpu_info->evtchn_pending_sel, 0); - - start_word_idx = __this_cpu_read(current_word_idx); - start_bit_idx = __this_cpu_read(current_bit_idx); - - word_idx = start_word_idx; - - for (i = 0; pending_words != 0; i++) { - xen_ulong_t words; - - words = MASK_LSBS(pending_words, word_idx); - - /* - * If we masked out all events, wrap to beginning. - */ - if (words == 0) { - word_idx = 0; - bit_idx = 0; - continue; - } - word_idx = EVTCHN_FIRST_BIT(words); - - pending_bits = active_evtchns(cpu, s, word_idx); - bit_idx = 0; /* usually scan entire word from start */ - /* - * We scan the starting word in two parts. - * - * 1st time: start in the middle, scanning the - * upper bits. - * - * 2nd time: scan the whole word (not just the - * parts skipped in the first pass) -- if an - * event in the previously scanned bits is - * pending again it would just be scanned on - * the next loop anyway. 
- */ - if (word_idx == start_word_idx) { - if (i == 0) - bit_idx = start_bit_idx; - } - - do { - xen_ulong_t bits; - int port; - - bits = MASK_LSBS(pending_bits, bit_idx); - - /* If we masked out all events, move on. */ - if (bits == 0) - break; - - bit_idx = EVTCHN_FIRST_BIT(bits); - - /* Process port. */ - port = (word_idx * BITS_PER_EVTCHN_WORD) + bit_idx; - irq = evtchn_to_irq[port]; - - if (irq != -1) { - desc = irq_to_desc(irq); - if (desc) - generic_handle_irq_desc(irq, desc); - } - - bit_idx = (bit_idx + 1) % BITS_PER_EVTCHN_WORD; - - /* Next caller starts at last processed + 1 */ - __this_cpu_write(current_word_idx, - bit_idx ? word_idx : - (word_idx+1) % BITS_PER_EVTCHN_WORD); - __this_cpu_write(current_bit_idx, bit_idx); - } while (bit_idx != 0); - - /* Scan start_l1i twice; all others once. */ - if ((word_idx != start_word_idx) || (i != 0)) - pending_words &= ~(1UL << word_idx); - - word_idx = (word_idx + 1) % BITS_PER_EVTCHN_WORD; - } - - BUG_ON(!irqs_disabled()); - - count = __this_cpu_read(xed_nesting_count); - __this_cpu_write(xed_nesting_count, 0); - } while (count != 1 || vcpu_info->evtchn_upcall_pending); - -out: - - put_cpu(); -} - -void xen_evtchn_do_upcall(struct pt_regs *regs) -{ - struct pt_regs *old_regs = set_irq_regs(regs); - - irq_enter(); -#ifdef CONFIG_X86 - exit_idle(); -#endif - - __xen_evtchn_do_upcall(); - - irq_exit(); - set_irq_regs(old_regs); -} - -void xen_hvm_evtchn_do_upcall(void) -{ - __xen_evtchn_do_upcall(); -} -EXPORT_SYMBOL_GPL(xen_hvm_evtchn_do_upcall); - -/* Rebind a new event channel to an existing irq. */ -void rebind_evtchn_irq(int evtchn, int irq) -{ - struct irq_info *info = info_for_irq(irq); - - if (WARN_ON(!info)) - return; - - /* Make sure the irq is masked, since the new event channel - will also be masked. 
*/ - disable_irq(irq); - - mutex_lock(&irq_mapping_update_lock); - - /* After resume the irq<->evtchn mappings are all cleared out */ - BUG_ON(evtchn_to_irq[evtchn] != -1); - /* Expect irq to have been bound before, - so there should be a proper type */ - BUG_ON(info->type == IRQT_UNBOUND); - - xen_irq_info_evtchn_init(irq, evtchn); - - mutex_unlock(&irq_mapping_update_lock); - - /* new event channels are always bound to cpu 0 */ - irq_set_affinity(irq, cpumask_of(0)); - - /* Unmask the event channel. */ - enable_irq(irq); -} - -/* Rebind an evtchn so that it gets delivered to a specific cpu */ -static int rebind_irq_to_cpu(unsigned irq, unsigned tcpu) -{ - struct evtchn_bind_vcpu bind_vcpu; - int evtchn = evtchn_from_irq(irq); - int masked; - - if (!VALID_EVTCHN(evtchn)) - return -1; - - /* - * Events delivered via platform PCI interrupts are always - * routed to vcpu 0 and hence cannot be rebound. - */ - if (xen_hvm_domain() && !xen_have_vector_callback) - return -1; - - /* Send future instances of this interrupt to other vcpu. */ - bind_vcpu.port = evtchn; - bind_vcpu.vcpu = tcpu; - - /* - * Mask the event while changing the VCPU binding to prevent - * it being delivered on an unexpected VCPU. - */ - masked = test_and_set_mask(evtchn); - - /* - * If this fails, it usually just indicates that we're dealing with a - * virq or IPI channel, which don't actually need to be rebound. Ignore - * it, but don't do the xenlinux-level rebind in that case. 
- */ - if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_vcpu, &bind_vcpu) >= 0) - bind_evtchn_to_cpu(evtchn, tcpu); - - if (!masked) - unmask_evtchn(evtchn); - - return 0; -} - -static int set_affinity_irq(struct irq_data *data, const struct cpumask *dest, - bool force) -{ - unsigned tcpu = cpumask_first(dest); - - return rebind_irq_to_cpu(data->irq, tcpu); -} - -static int retrigger_evtchn(int evtchn) -{ - int masked; - - if (!VALID_EVTCHN(evtchn)) - return 0; - - masked = test_and_set_mask(evtchn); - set_evtchn(evtchn); - if (!masked) - unmask_evtchn(evtchn); - - return 1; -} - -int resend_irq_on_evtchn(unsigned int irq) -{ - return retrigger_evtchn(evtchn_from_irq(irq)); -} - -static void enable_dynirq(struct irq_data *data) -{ - int evtchn = evtchn_from_irq(data->irq); - - if (VALID_EVTCHN(evtchn)) - unmask_evtchn(evtchn); -} - -static void disable_dynirq(struct irq_data *data) -{ - int evtchn = evtchn_from_irq(data->irq); - - if (VALID_EVTCHN(evtchn)) - mask_evtchn(evtchn); -} - -static void ack_dynirq(struct irq_data *data) -{ - int evtchn = evtchn_from_irq(data->irq); - - irq_move_irq(data); - - if (VALID_EVTCHN(evtchn)) - clear_evtchn(evtchn); -} - -static void mask_ack_dynirq(struct irq_data *data) -{ - disable_dynirq(data); - ack_dynirq(data); -} - -static int retrigger_dynirq(struct irq_data *data) -{ - return retrigger_evtchn(evtchn_from_irq(data->irq)); -} - -static void restore_pirqs(void) -{ - int pirq, rc, irq, gsi; - struct physdev_map_pirq map_irq; - struct irq_info *info; - - list_for_each_entry(info, &xen_irq_list_head, list) { - if (info->type != IRQT_PIRQ) - continue; - - pirq = info->u.pirq.pirq; - gsi = info->u.pirq.gsi; - irq = info->irq; - - /* save/restore of PT devices doesn't work, so at this point the - * only devices present are GSI based emulated devices */ - if (!gsi) - continue; - - map_irq.domid = DOMID_SELF; - map_irq.type = MAP_PIRQ_TYPE_GSI; - map_irq.index = gsi; - map_irq.pirq = pirq; - - rc = 
HYPERVISOR_physdev_op(PHYSDEVOP_map_pirq, &map_irq); - if (rc) { - pr_warn("xen map irq failed gsi=%d irq=%d pirq=%d rc=%d\n", - gsi, irq, pirq, rc); - xen_free_irq(irq); - continue; - } - - printk(KERN_DEBUG "xen: --> irq=%d, pirq=%d\n", irq, map_irq.pirq); - - __startup_pirq(irq); - } -} - -static void restore_cpu_virqs(unsigned int cpu) -{ - struct evtchn_bind_virq bind_virq; - int virq, irq, evtchn; - - for (virq = 0; virq < NR_VIRQS; virq++) { - if ((irq = per_cpu(virq_to_irq, cpu)[virq]) == -1) - continue; - - BUG_ON(virq_from_irq(irq) != virq); - - /* Get a new binding from Xen. */ - bind_virq.virq = virq; - bind_virq.vcpu = cpu; - if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_virq, - &bind_virq) != 0) - BUG(); - evtchn = bind_virq.port; - - /* Record the new mapping. */ - xen_irq_info_virq_init(cpu, irq, evtchn, virq); - bind_evtchn_to_cpu(evtchn, cpu); - } -} - -static void restore_cpu_ipis(unsigned int cpu) -{ - struct evtchn_bind_ipi bind_ipi; - int ipi, irq, evtchn; - - for (ipi = 0; ipi < XEN_NR_IPIS; ipi++) { - if ((irq = per_cpu(ipi_to_irq, cpu)[ipi]) == -1) - continue; - - BUG_ON(ipi_from_irq(irq) != ipi); - - /* Get a new binding from Xen. */ - bind_ipi.vcpu = cpu; - if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_ipi, - &bind_ipi) != 0) - BUG(); - evtchn = bind_ipi.port; - - /* Record the new mapping. 
*/ - xen_irq_info_ipi_init(cpu, irq, evtchn, ipi); - bind_evtchn_to_cpu(evtchn, cpu); - } -} - -/* Clear an irq's pending state, in preparation for polling on it */ -void xen_clear_irq_pending(int irq) -{ - int evtchn = evtchn_from_irq(irq); - - if (VALID_EVTCHN(evtchn)) - clear_evtchn(evtchn); -} -EXPORT_SYMBOL(xen_clear_irq_pending); -void xen_set_irq_pending(int irq) -{ - int evtchn = evtchn_from_irq(irq); - - if (VALID_EVTCHN(evtchn)) - set_evtchn(evtchn); -} - -bool xen_test_irq_pending(int irq) -{ - int evtchn = evtchn_from_irq(irq); - bool ret = false; - - if (VALID_EVTCHN(evtchn)) - ret = test_evtchn(evtchn); - - return ret; -} - -/* Poll waiting for an irq to become pending with timeout. In the usual case, - * the irq will be disabled so it won't deliver an interrupt. */ -void xen_poll_irq_timeout(int irq, u64 timeout) -{ - evtchn_port_t evtchn = evtchn_from_irq(irq); - - if (VALID_EVTCHN(evtchn)) { - struct sched_poll poll; - - poll.nr_ports = 1; - poll.timeout = timeout; - set_xen_guest_handle(poll.ports, &evtchn); - - if (HYPERVISOR_sched_op(SCHEDOP_poll, &poll) != 0) - BUG(); - } -} -EXPORT_SYMBOL(xen_poll_irq_timeout); -/* Poll waiting for an irq to become pending. In the usual case, the - * irq will be disabled so it won't deliver an interrupt. */ -void xen_poll_irq(int irq) -{ - xen_poll_irq_timeout(irq, 0 /* no timeout */); -} - -/* Check whether the IRQ line is shared with other guests. */ -int xen_test_irq_shared(int irq) -{ - struct irq_info *info = info_for_irq(irq); - struct physdev_irq_status_query irq_status; - - if (WARN_ON(!info)) - return -ENOENT; - - irq_status.irq = info->u.pirq.pirq; - - if (HYPERVISOR_physdev_op(PHYSDEVOP_irq_status_query, &irq_status)) - return 0; - return !(irq_status.flags & XENIRQSTAT_shared); -} -EXPORT_SYMBOL_GPL(xen_test_irq_shared); - -void xen_irq_resume(void) -{ - unsigned int cpu, evtchn; - struct irq_info *info; - - /* New event-channel space is not 'live' yet. 
*/ - for (evtchn = 0; evtchn < NR_EVENT_CHANNELS; evtchn++) - mask_evtchn(evtchn); - - /* No IRQ <-> event-channel mappings. */ - list_for_each_entry(info, &xen_irq_list_head, list) - info->evtchn = 0; /* zap event-channel binding */ - - for (evtchn = 0; evtchn < NR_EVENT_CHANNELS; evtchn++) - evtchn_to_irq[evtchn] = -1; - - for_each_possible_cpu(cpu) { - restore_cpu_virqs(cpu); - restore_cpu_ipis(cpu); - } - - restore_pirqs(); -} - -static struct irq_chip xen_dynamic_chip __read_mostly = { - .name = "xen-dyn", - - .irq_disable = disable_dynirq, - .irq_mask = disable_dynirq, - .irq_unmask = enable_dynirq, - - .irq_ack = ack_dynirq, - .irq_mask_ack = mask_ack_dynirq, - - .irq_set_affinity = set_affinity_irq, - .irq_retrigger = retrigger_dynirq, -}; - -static struct irq_chip xen_pirq_chip __read_mostly = { - .name = "xen-pirq", - - .irq_startup = startup_pirq, - .irq_shutdown = shutdown_pirq, - .irq_enable = enable_pirq, - .irq_disable = disable_pirq, - - .irq_mask = disable_dynirq, - .irq_unmask = enable_dynirq, - - .irq_ack = eoi_pirq, - .irq_eoi = eoi_pirq, - .irq_mask_ack = mask_ack_pirq, - - .irq_set_affinity = set_affinity_irq, - - .irq_retrigger = retrigger_dynirq, -}; - -static struct irq_chip xen_percpu_chip __read_mostly = { - .name = "xen-percpu", - - .irq_disable = disable_dynirq, - .irq_mask = disable_dynirq, - .irq_unmask = enable_dynirq, - - .irq_ack = ack_dynirq, -}; - -int xen_set_callback_via(uint64_t via) -{ - struct xen_hvm_param a; - a.domid = DOMID_SELF; - a.index = HVM_PARAM_CALLBACK_IRQ; - a.value = via; - return HYPERVISOR_hvm_op(HVMOP_set_param, &a); -} -EXPORT_SYMBOL_GPL(xen_set_callback_via); - -#ifdef CONFIG_XEN_PVHVM -/* Vector callbacks are better than PCI interrupts to receive event - * channel notifications because we can receive vector callbacks on any - * vcpu and we don't need PCI support or APIC interactions. 
*/ -void xen_callback_vector(void) -{ - int rc; - uint64_t callback_via; - if (xen_have_vector_callback) { - callback_via = HVM_CALLBACK_VECTOR(HYPERVISOR_CALLBACK_VECTOR); - rc = xen_set_callback_via(callback_via); - if (rc) { - pr_err("Request for Xen HVM callback vector failed\n"); - xen_have_vector_callback = 0; - return; - } - pr_info("Xen HVM callback vector for event delivery is enabled\n"); - /* in the restore case the vector has already been allocated */ - if (!test_bit(HYPERVISOR_CALLBACK_VECTOR, used_vectors)) - alloc_intr_gate(HYPERVISOR_CALLBACK_VECTOR, - xen_hvm_callback_vector); - } -} -#else -void xen_callback_vector(void) {} -#endif - -void __init xen_init_IRQ(void) -{ - int i; - - evtchn_to_irq = kcalloc(NR_EVENT_CHANNELS, sizeof(*evtchn_to_irq), - GFP_KERNEL); - BUG_ON(!evtchn_to_irq); - for (i = 0; i < NR_EVENT_CHANNELS; i++) - evtchn_to_irq[i] = -1; - - /* No event channels are 'live' right now. */ - for (i = 0; i < NR_EVENT_CHANNELS; i++) - mask_evtchn(i); - - pirq_needs_eoi = pirq_needs_eoi_flag; - -#ifdef CONFIG_X86 - if (xen_hvm_domain()) { - xen_callback_vector(); - native_init_IRQ(); - /* pci_xen_hvm_init must be called after native_init_IRQ so that - * __acpi_register_gsi can point at the right function */ - pci_xen_hvm_init(); - } else { - int rc; - struct physdev_pirq_eoi_gmfn eoi_gmfn; - - irq_ctx_init(smp_processor_id()); - if (xen_initial_domain()) - pci_xen_initial_domain(); - - pirq_eoi_map = (void *)__get_free_page(GFP_KERNEL|__GFP_ZERO); - eoi_gmfn.gmfn = virt_to_mfn(pirq_eoi_map); - rc = HYPERVISOR_physdev_op(PHYSDEVOP_pirq_eoi_gmfn_v2, &eoi_gmfn); - if (rc != 0) { - free_page((unsigned long) pirq_eoi_map); - pirq_eoi_map = NULL; - } else - pirq_needs_eoi = pirq_check_eoi_map; - } -#endif -} diff --git a/drivers/xen/events/Makefile b/drivers/xen/events/Makefile new file mode 100644 index 000000000000..f0bc6071fd84 --- /dev/null +++ b/drivers/xen/events/Makefile @@ -0,0 +1,3 @@ +obj-y += events.o + +events-y += events_base.o 
diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c new file mode 100644 index 000000000000..fec5da4ff3a0 --- /dev/null +++ b/drivers/xen/events/events_base.c @@ -0,0 +1,1908 @@ +/* + * Xen event channels + * + * Xen models interrupts with abstract event channels. Because each + * domain gets 1024 event channels, but NR_IRQ is not that large, we + * must dynamically map irqs<->event channels. The event channels + * interface with the rest of the kernel by defining a xen interrupt + * chip. When an event is received, it is mapped to an irq and sent + * through the normal interrupt processing path. + * + * There are four kinds of events which can be mapped to an event + * channel: + * + * 1. Inter-domain notifications. This includes all the virtual + * device events, since they're driven by front-ends in another domain + * (typically dom0). + * 2. VIRQs, typically used for timers. These are per-cpu events. + * 3. IPIs. + * 4. PIRQs - Hardware interrupts. + * + * Jeremy Fitzhardinge , XenSource Inc, 2007 + */ + +#define pr_fmt(fmt) "xen:" KBUILD_MODNAME ": " fmt + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#ifdef CONFIG_X86 +#include +#include +#include +#include +#include +#include +#include +#endif +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +/* + * This lock protects updates to the following mapping and reference-count + * arrays. The lock does not need to be acquired to read the mapping tables. + */ +static DEFINE_MUTEX(irq_mapping_update_lock); + +static LIST_HEAD(xen_irq_list_head); + +/* IRQ <-> VIRQ mapping. */ +static DEFINE_PER_CPU(int [NR_VIRQS], virq_to_irq) = {[0 ... NR_VIRQS-1] = -1}; + +/* IRQ <-> IPI mapping */ +static DEFINE_PER_CPU(int [XEN_NR_IPIS], ipi_to_irq) = {[0 ... XEN_NR_IPIS-1] = -1}; + +/* Interrupt types. 
*/
+enum xen_irq_type {
+	IRQT_UNBOUND = 0,
+	IRQT_PIRQ,
+	IRQT_VIRQ,
+	IRQT_IPI,
+	IRQT_EVTCHN
+};
+
+/*
+ * Packed IRQ information:
+ * type - enum xen_irq_type
+ * event channel - irq->event channel mapping
+ * cpu - cpu this event channel is bound to
+ * index - type-specific information:
+ *    PIRQ - physical IRQ, GSI, flags, and owner domain
+ *    VIRQ - virq number
+ *    IPI - IPI vector
+ *    EVTCHN -
+ */
+struct irq_info {
+	struct list_head list;
+	int refcnt;
+	enum xen_irq_type type;	/* type */
+	unsigned irq;
+	unsigned short evtchn;	/* event channel */
+	unsigned short cpu;	/* cpu bound */
+
+	union {
+		unsigned short virq;
+		enum ipi_vector ipi;
+		struct {
+			unsigned short pirq;
+			unsigned short gsi;
+			unsigned char flags;
+			uint16_t domid;
+		} pirq;
+	} u;
+};
+#define PIRQ_NEEDS_EOI	(1 << 0)
+#define PIRQ_SHAREABLE	(1 << 1)
+
+static int *evtchn_to_irq;
+#ifdef CONFIG_X86
+static unsigned long *pirq_eoi_map;
+#endif
+static bool (*pirq_needs_eoi)(unsigned irq);
+
+/*
+ * Note sizeof(xen_ulong_t) can be more than sizeof(unsigned long). Be
+ * careful to only use bitops which allow for this (e.g
+ * test_bit/find_first_bit and friends but not __ffs) and to pass
+ * BITS_PER_EVTCHN_WORD as the bitmask length.
+ */
+#define BITS_PER_EVTCHN_WORD (sizeof(xen_ulong_t)*8)
+/*
+ * Make a bitmask (i.e. unsigned long *) of a xen_ulong_t
+ * array. Primarily to avoid long lines (hence the terse name).
+ */
+#define BM(x) (unsigned long *)(x)
+/* Find the first set bit in a evtchn mask */
+#define EVTCHN_FIRST_BIT(w) find_first_bit(BM(&(w)), BITS_PER_EVTCHN_WORD)
+
+static DEFINE_PER_CPU(xen_ulong_t [NR_EVENT_CHANNELS/BITS_PER_EVTCHN_WORD],
+		      cpu_evtchn_mask);
+
+/* Xen will never allocate port zero for any purpose. */
+#define VALID_EVTCHN(chn)	((chn) != 0)
+
+static struct irq_chip xen_dynamic_chip;
+static struct irq_chip xen_percpu_chip;
+static struct irq_chip xen_pirq_chip;
+static void enable_dynirq(struct irq_data *data);
+static void disable_dynirq(struct irq_data *data);
+
+/* Get info for IRQ */
+static struct irq_info *info_for_irq(unsigned irq)
+{
+	return irq_get_handler_data(irq);
+}
+
+/* Constructors for packed IRQ information. */
+static void xen_irq_info_common_init(struct irq_info *info,
+				     unsigned irq,
+				     enum xen_irq_type type,
+				     unsigned short evtchn,
+				     unsigned short cpu)
+{
+
+	BUG_ON(info->type != IRQT_UNBOUND && info->type != type);
+
+	info->type = type;
+	info->irq = irq;
+	info->evtchn = evtchn;
+	info->cpu = cpu;
+
+	evtchn_to_irq[evtchn] = irq;
+
+	irq_clear_status_flags(irq, IRQ_NOREQUEST|IRQ_NOAUTOEN);
+}
+
+static void xen_irq_info_evtchn_init(unsigned irq,
+				     unsigned short evtchn)
+{
+	struct irq_info *info = info_for_irq(irq);
+
+	xen_irq_info_common_init(info, irq, IRQT_EVTCHN, evtchn, 0);
+}
+
+static void xen_irq_info_ipi_init(unsigned cpu,
+				  unsigned irq,
+				  unsigned short evtchn,
+				  enum ipi_vector ipi)
+{
+	struct irq_info *info = info_for_irq(irq);
+
+	xen_irq_info_common_init(info, irq, IRQT_IPI, evtchn, 0);
+
+	info->u.ipi = ipi;
+
+	per_cpu(ipi_to_irq, cpu)[ipi] = irq;
+}
+
+static void xen_irq_info_virq_init(unsigned cpu,
+				   unsigned irq,
+				   unsigned short evtchn,
+				   unsigned short virq)
+{
+	struct irq_info *info = info_for_irq(irq);
+
+	xen_irq_info_common_init(info, irq, IRQT_VIRQ, evtchn, 0);
+
+	info->u.virq = virq;
+
+	per_cpu(virq_to_irq, cpu)[virq] = irq;
+}
+
+static void xen_irq_info_pirq_init(unsigned irq,
+				   unsigned short evtchn,
+				   unsigned short pirq,
+				   unsigned short gsi,
+				   uint16_t domid,
+				   unsigned char flags)
+{
+	struct irq_info *info = info_for_irq(irq);
+
+	xen_irq_info_common_init(info, irq, IRQT_PIRQ, evtchn, 0);
+
+	info->u.pirq.pirq = pirq;
+	info->u.pirq.gsi = gsi;
+	info->u.pirq.domid = domid;
+	info->u.pirq.flags = flags;
+}
+
+/*
+ * Accessors for packed IRQ information.
+ */
+static unsigned int evtchn_from_irq(unsigned irq)
+{
+	if (unlikely(WARN(irq < 0 || irq >= nr_irqs, "Invalid irq %d!\n", irq)))
+		return 0;
+
+	return info_for_irq(irq)->evtchn;
+}
+
+unsigned irq_from_evtchn(unsigned int evtchn)
+{
+	return evtchn_to_irq[evtchn];
+}
+EXPORT_SYMBOL_GPL(irq_from_evtchn);
+
+static enum ipi_vector ipi_from_irq(unsigned irq)
+{
+	struct irq_info *info = info_for_irq(irq);
+
+	BUG_ON(info == NULL);
+	BUG_ON(info->type != IRQT_IPI);
+
+	return info->u.ipi;
+}
+
+static unsigned virq_from_irq(unsigned irq)
+{
+	struct irq_info *info = info_for_irq(irq);
+
+	BUG_ON(info == NULL);
+	BUG_ON(info->type != IRQT_VIRQ);
+
+	return info->u.virq;
+}
+
+static unsigned pirq_from_irq(unsigned irq)
+{
+	struct irq_info *info = info_for_irq(irq);
+
+	BUG_ON(info == NULL);
+	BUG_ON(info->type != IRQT_PIRQ);
+
+	return info->u.pirq.pirq;
+}
+
+static enum xen_irq_type type_from_irq(unsigned irq)
+{
+	return info_for_irq(irq)->type;
+}
+
+static unsigned cpu_from_irq(unsigned irq)
+{
+	return info_for_irq(irq)->cpu;
+}
+
+static unsigned int cpu_from_evtchn(unsigned int evtchn)
+{
+	int irq = evtchn_to_irq[evtchn];
+	unsigned ret = 0;
+
+	if (irq != -1)
+		ret = cpu_from_irq(irq);
+
+	return ret;
+}
+
+#ifdef CONFIG_X86
+static bool pirq_check_eoi_map(unsigned irq)
+{
+	return test_bit(pirq_from_irq(irq), pirq_eoi_map);
+}
+#endif
+
+static bool pirq_needs_eoi_flag(unsigned irq)
+{
+	struct irq_info *info = info_for_irq(irq);
+	BUG_ON(info->type != IRQT_PIRQ);
+
+	return info->u.pirq.flags & PIRQ_NEEDS_EOI;
+}
+
+static inline xen_ulong_t active_evtchns(unsigned int cpu,
+					 struct shared_info *sh,
+					 unsigned int idx)
+{
+	return sh->evtchn_pending[idx] &
+		per_cpu(cpu_evtchn_mask, cpu)[idx] &
+		~sh->evtchn_mask[idx];
+}
+
+static void bind_evtchn_to_cpu(unsigned int chn, unsigned int cpu)
+{
+	int irq = evtchn_to_irq[chn];
+
+	BUG_ON(irq == -1);
+#ifdef CONFIG_SMP
+	cpumask_copy(irq_to_desc(irq)->irq_data.affinity, cpumask_of(cpu));
+#endif
+
+	clear_bit(chn, BM(per_cpu(cpu_evtchn_mask, cpu_from_irq(irq))));
+	set_bit(chn, BM(per_cpu(cpu_evtchn_mask, cpu)));
+
+	info_for_irq(irq)->cpu = cpu;
+}
+
+static inline void clear_evtchn(int port)
+{
+	struct shared_info *s = HYPERVISOR_shared_info;
+	sync_clear_bit(port, BM(&s->evtchn_pending[0]));
+}
+
+static inline void set_evtchn(int port)
+{
+	struct shared_info *s = HYPERVISOR_shared_info;
+	sync_set_bit(port, BM(&s->evtchn_pending[0]));
+}
+
+static inline int test_evtchn(int port)
+{
+	struct shared_info *s = HYPERVISOR_shared_info;
+	return sync_test_bit(port, BM(&s->evtchn_pending[0]));
+}
+
+static inline int test_and_set_mask(int port)
+{
+	struct shared_info *s = HYPERVISOR_shared_info;
+	return sync_test_and_set_bit(port, BM(&s->evtchn_mask[0]));
+}
+
+
+/**
+ * notify_remote_via_irq - send event to remote end of event channel via irq
+ * @irq: irq of event channel to send event to
+ *
+ * Unlike notify_remote_via_evtchn(), this is safe to use across
+ * save/restore. Notifications on a broken connection are silently
+ * dropped.
+ */
+void notify_remote_via_irq(int irq)
+{
+	int evtchn = evtchn_from_irq(irq);
+
+	if (VALID_EVTCHN(evtchn))
+		notify_remote_via_evtchn(evtchn);
+}
+EXPORT_SYMBOL_GPL(notify_remote_via_irq);
+
+static void mask_evtchn(int port)
+{
+	struct shared_info *s = HYPERVISOR_shared_info;
+	sync_set_bit(port, BM(&s->evtchn_mask[0]));
+}
+
+static void unmask_evtchn(int port)
+{
+	struct shared_info *s = HYPERVISOR_shared_info;
+	unsigned int cpu = get_cpu();
+	int do_hypercall = 0, evtchn_pending = 0;
+
+	BUG_ON(!irqs_disabled());
+
+	if (unlikely((cpu != cpu_from_evtchn(port))))
+		do_hypercall = 1;
+	else {
+		/*
+		 * Need to clear the mask before checking pending to
+		 * avoid a race with an event becoming pending.
+		 *
+		 * EVTCHNOP_unmask will only trigger an upcall if the
+		 * mask bit was set, so if a hypercall is needed
+		 * remask the event.
+		 */
+		sync_clear_bit(port, BM(&s->evtchn_mask[0]));
+		evtchn_pending = sync_test_bit(port, BM(&s->evtchn_pending[0]));
+
+		if (unlikely(evtchn_pending && xen_hvm_domain())) {
+			sync_set_bit(port, BM(&s->evtchn_mask[0]));
+			do_hypercall = 1;
+		}
+	}
+
+	/* Slow path (hypercall) if this is a non-local port or if this is
+	 * an hvm domain and an event is pending (hvm domains don't have
+	 * their own implementation of irq_enable). */
+	if (do_hypercall) {
+		struct evtchn_unmask unmask = { .port = port };
+		(void)HYPERVISOR_event_channel_op(EVTCHNOP_unmask, &unmask);
+	} else {
+		struct vcpu_info *vcpu_info = __this_cpu_read(xen_vcpu);
+
+		/*
+		 * The following is basically the equivalent of
+		 * 'hw_resend_irq'. Just like a real IO-APIC we 'lose
+		 * the interrupt edge' if the channel is masked.
+		 */
+		if (evtchn_pending &&
+		    !sync_test_and_set_bit(port / BITS_PER_EVTCHN_WORD,
+					   BM(&vcpu_info->evtchn_pending_sel)))
+			vcpu_info->evtchn_upcall_pending = 1;
+	}
+
+	put_cpu();
+}
+
+static void xen_irq_init(unsigned irq)
+{
+	struct irq_info *info;
+#ifdef CONFIG_SMP
+	struct irq_desc *desc = irq_to_desc(irq);
+
+	/* By default all event channels notify CPU#0. */
+	cpumask_copy(desc->irq_data.affinity, cpumask_of(0));
+#endif
+
+	info = kzalloc(sizeof(*info), GFP_KERNEL);
+	if (info == NULL)
+		panic("Unable to allocate metadata for IRQ%d\n", irq);
+
+	info->type = IRQT_UNBOUND;
+	info->refcnt = -1;
+
+	irq_set_handler_data(irq, info);
+
+	list_add_tail(&info->list, &xen_irq_list_head);
+}
+
+static int __must_check xen_allocate_irq_dynamic(void)
+{
+	int first = 0;
+	int irq;
+
+#ifdef CONFIG_X86_IO_APIC
+	/*
+	 * For an HVM guest or domain 0 which see "real" (emulated or
+	 * actual respectively) GSIs we allocate dynamic IRQs
+	 * e.g. those corresponding to event channels or MSIs
+	 * etc. from the range above those "real" GSIs to avoid
+	 * collisions.
+	 */
+	if (xen_initial_domain() || xen_hvm_domain())
+		first = get_nr_irqs_gsi();
+#endif
+
+	irq = irq_alloc_desc_from(first, -1);
+
+	if (irq >= 0)
+		xen_irq_init(irq);
+
+	return irq;
+}
+
+static int __must_check xen_allocate_irq_gsi(unsigned gsi)
+{
+	int irq;
+
+	/*
+	 * A PV guest has no concept of a GSI (since it has no ACPI
+	 * nor access to/knowledge of the physical APICs). Therefore
+	 * all IRQs are dynamically allocated from the entire IRQ
+	 * space.
+	 */
+	if (xen_pv_domain() && !xen_initial_domain())
+		return xen_allocate_irq_dynamic();
+
+	/* Legacy IRQ descriptors are already allocated by the arch. */
+	if (gsi < NR_IRQS_LEGACY)
+		irq = gsi;
+	else
+		irq = irq_alloc_desc_at(gsi, -1);
+
+	xen_irq_init(irq);
+
+	return irq;
+}
+
+static void xen_free_irq(unsigned irq)
+{
+	struct irq_info *info = irq_get_handler_data(irq);
+
+	if (WARN_ON(!info))
+		return;
+
+	list_del(&info->list);
+
+	irq_set_handler_data(irq, NULL);
+
+	WARN_ON(info->refcnt > 0);
+
+	kfree(info);
+
+	/* Legacy IRQ descriptors are managed by the arch. */
+	if (irq < NR_IRQS_LEGACY)
+		return;
+
+	irq_free_desc(irq);
+}
+
+static void pirq_query_unmask(int irq)
+{
+	struct physdev_irq_status_query irq_status;
+	struct irq_info *info = info_for_irq(irq);
+
+	BUG_ON(info->type != IRQT_PIRQ);
+
+	irq_status.irq = pirq_from_irq(irq);
+	if (HYPERVISOR_physdev_op(PHYSDEVOP_irq_status_query, &irq_status))
+		irq_status.flags = 0;
+
+	info->u.pirq.flags &= ~PIRQ_NEEDS_EOI;
+	if (irq_status.flags & XENIRQSTAT_needs_eoi)
+		info->u.pirq.flags |= PIRQ_NEEDS_EOI;
+}
+
+static bool probing_irq(int irq)
+{
+	struct irq_desc *desc = irq_to_desc(irq);
+
+	return desc && desc->action == NULL;
+}
+
+static void eoi_pirq(struct irq_data *data)
+{
+	int evtchn = evtchn_from_irq(data->irq);
+	struct physdev_eoi eoi = { .irq = pirq_from_irq(data->irq) };
+	int rc = 0;
+
+	irq_move_irq(data);
+
+	if (VALID_EVTCHN(evtchn))
+		clear_evtchn(evtchn);
+
+	if (pirq_needs_eoi(data->irq)) {
+		rc = HYPERVISOR_physdev_op(PHYSDEVOP_eoi, &eoi);
+		WARN_ON(rc);
+	}
+}
+
+static void mask_ack_pirq(struct irq_data *data)
+{
+	disable_dynirq(data);
+	eoi_pirq(data);
+}
+
+static unsigned int __startup_pirq(unsigned int irq)
+{
+	struct evtchn_bind_pirq bind_pirq;
+	struct irq_info *info = info_for_irq(irq);
+	int evtchn = evtchn_from_irq(irq);
+	int rc;
+
+	BUG_ON(info->type != IRQT_PIRQ);
+
+	if (VALID_EVTCHN(evtchn))
+		goto out;
+
+	bind_pirq.pirq = pirq_from_irq(irq);
+	/* NB. We are happy to share unless we are probing. */
+	bind_pirq.flags = info->u.pirq.flags & PIRQ_SHAREABLE ?
+					BIND_PIRQ__WILL_SHARE : 0;
+	rc = HYPERVISOR_event_channel_op(EVTCHNOP_bind_pirq, &bind_pirq);
+	if (rc != 0) {
+		if (!probing_irq(irq))
+			pr_info("Failed to obtain physical IRQ %d\n", irq);
+		return 0;
+	}
+	evtchn = bind_pirq.port;
+
+	pirq_query_unmask(irq);
+
+	evtchn_to_irq[evtchn] = irq;
+	bind_evtchn_to_cpu(evtchn, 0);
+	info->evtchn = evtchn;
+
+out:
+	unmask_evtchn(evtchn);
+	eoi_pirq(irq_get_irq_data(irq));
+
+	return 0;
+}
+
+static unsigned int startup_pirq(struct irq_data *data)
+{
+	return __startup_pirq(data->irq);
+}
+
+static void shutdown_pirq(struct irq_data *data)
+{
+	struct evtchn_close close;
+	unsigned int irq = data->irq;
+	struct irq_info *info = info_for_irq(irq);
+	int evtchn = evtchn_from_irq(irq);
+
+	BUG_ON(info->type != IRQT_PIRQ);
+
+	if (!VALID_EVTCHN(evtchn))
+		return;
+
+	mask_evtchn(evtchn);
+
+	close.port = evtchn;
+	if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0)
+		BUG();
+
+	bind_evtchn_to_cpu(evtchn, 0);
+	evtchn_to_irq[evtchn] = -1;
+	info->evtchn = 0;
+}
+
+static void enable_pirq(struct irq_data *data)
+{
+	startup_pirq(data);
+}
+
+static void disable_pirq(struct irq_data *data)
+{
+	disable_dynirq(data);
+}
+
+int xen_irq_from_gsi(unsigned gsi)
+{
+	struct irq_info *info;
+
+	list_for_each_entry(info, &xen_irq_list_head, list) {
+		if (info->type != IRQT_PIRQ)
+			continue;
+
+		if (info->u.pirq.gsi == gsi)
+			return info->irq;
+	}
+
+	return -1;
+}
+EXPORT_SYMBOL_GPL(xen_irq_from_gsi);
+
+/*
+ * Do not make any assumptions regarding the relationship between the
+ * IRQ number returned here and the Xen pirq argument.
+ *
+ * Note: We don't assign an event channel until the irq actually started
+ * up. Return an existing irq if we've already got one for the gsi.
+ *
+ * Shareable implies level triggered, not shareable implies edge
+ * triggered here.
+ */
+int xen_bind_pirq_gsi_to_irq(unsigned gsi,
+			     unsigned pirq, int shareable, char *name)
+{
+	int irq = -1;
+	struct physdev_irq irq_op;
+
+	mutex_lock(&irq_mapping_update_lock);
+
+	irq = xen_irq_from_gsi(gsi);
+	if (irq != -1) {
+		pr_info("%s: returning irq %d for gsi %u\n",
+			__func__, irq, gsi);
+		goto out;
+	}
+
+	irq = xen_allocate_irq_gsi(gsi);
+	if (irq < 0)
+		goto out;
+
+	irq_op.irq = irq;
+	irq_op.vector = 0;
+
+	/* Only the privileged domain can do this. For non-priv, the pcifront
+	 * driver provides a PCI bus that does the call to do exactly
+	 * this in the priv domain. */
+	if (xen_initial_domain() &&
+	    HYPERVISOR_physdev_op(PHYSDEVOP_alloc_irq_vector, &irq_op)) {
+		xen_free_irq(irq);
+		irq = -ENOSPC;
+		goto out;
+	}
+
+	xen_irq_info_pirq_init(irq, 0, pirq, gsi, DOMID_SELF,
+			       shareable ? PIRQ_SHAREABLE : 0);
+
+	pirq_query_unmask(irq);
+	/* We try to use the handler with the appropriate semantic for the
+	 * type of interrupt: if the interrupt is an edge triggered
+	 * interrupt we use handle_edge_irq.
+	 *
+	 * On the other hand if the interrupt is level triggered we use
+	 * handle_fasteoi_irq like the native code does for this kind of
+	 * interrupts.
+	 *
+	 * Depending on the Xen version, pirq_needs_eoi might return true
+	 * not only for level triggered interrupts but for edge triggered
+	 * interrupts too. In any case Xen always honors the eoi mechanism,
+	 * not injecting any more pirqs of the same kind if the first one
+	 * hasn't received an eoi yet. Therefore using the fasteoi handler
+	 * is the right choice either way.
+	 */
+	if (shareable)
+		irq_set_chip_and_handler_name(irq, &xen_pirq_chip,
+				handle_fasteoi_irq, name);
+	else
+		irq_set_chip_and_handler_name(irq, &xen_pirq_chip,
+				handle_edge_irq, name);
+
+out:
+	mutex_unlock(&irq_mapping_update_lock);
+
+	return irq;
+}
+
+#ifdef CONFIG_PCI_MSI
+int xen_allocate_pirq_msi(struct pci_dev *dev, struct msi_desc *msidesc)
+{
+	int rc;
+	struct physdev_get_free_pirq op_get_free_pirq;
+
+	op_get_free_pirq.type = MAP_PIRQ_TYPE_MSI;
+	rc = HYPERVISOR_physdev_op(PHYSDEVOP_get_free_pirq, &op_get_free_pirq);
+
+	WARN_ONCE(rc == -ENOSYS,
+		  "hypervisor does not support the PHYSDEVOP_get_free_pirq interface\n");
+
+	return rc ? -1 : op_get_free_pirq.pirq;
+}
+
+int xen_bind_pirq_msi_to_irq(struct pci_dev *dev, struct msi_desc *msidesc,
+			     int pirq, const char *name, domid_t domid)
+{
+	int irq, ret;
+
+	mutex_lock(&irq_mapping_update_lock);
+
+	irq = xen_allocate_irq_dynamic();
+	if (irq < 0)
+		goto out;
+
+	irq_set_chip_and_handler_name(irq, &xen_pirq_chip, handle_edge_irq,
+			name);
+
+	xen_irq_info_pirq_init(irq, 0, pirq, 0, domid, 0);
+	ret = irq_set_msi_desc(irq, msidesc);
+	if (ret < 0)
+		goto error_irq;
+out:
+	mutex_unlock(&irq_mapping_update_lock);
+	return irq;
+error_irq:
+	mutex_unlock(&irq_mapping_update_lock);
+	xen_free_irq(irq);
+	return ret;
+}
+#endif
+
+int xen_destroy_irq(int irq)
+{
+	struct irq_desc *desc;
+	struct physdev_unmap_pirq unmap_irq;
+	struct irq_info *info = info_for_irq(irq);
+	int rc = -ENOENT;
+
+	mutex_lock(&irq_mapping_update_lock);
+
+	desc = irq_to_desc(irq);
+	if (!desc)
+		goto out;
+
+	if (xen_initial_domain()) {
+		unmap_irq.pirq = info->u.pirq.pirq;
+		unmap_irq.domid = info->u.pirq.domid;
+		rc = HYPERVISOR_physdev_op(PHYSDEVOP_unmap_pirq, &unmap_irq);
+		/* If another domain quits without making the pci_disable_msix
+		 * call, the Xen hypervisor takes care of freeing the PIRQs
+		 * (free_domain_pirqs).
+		 */
+		if ((rc == -ESRCH && info->u.pirq.domid != DOMID_SELF))
+			pr_info("domain %d does not have %d anymore\n",
+				info->u.pirq.domid, info->u.pirq.pirq);
+		else if (rc) {
+			pr_warn("unmap irq failed %d\n", rc);
+			goto out;
+		}
+	}
+
+	xen_free_irq(irq);
+
+out:
+	mutex_unlock(&irq_mapping_update_lock);
+	return rc;
+}
+
+int xen_irq_from_pirq(unsigned pirq)
+{
+	int irq;
+
+	struct irq_info *info;
+
+	mutex_lock(&irq_mapping_update_lock);
+
+	list_for_each_entry(info, &xen_irq_list_head, list) {
+		if (info->type != IRQT_PIRQ)
+			continue;
+		irq = info->irq;
+		if (info->u.pirq.pirq == pirq)
+			goto out;
+	}
+	irq = -1;
+out:
+	mutex_unlock(&irq_mapping_update_lock);
+
+	return irq;
+}
+
+
+int xen_pirq_from_irq(unsigned irq)
+{
+	return pirq_from_irq(irq);
+}
+EXPORT_SYMBOL_GPL(xen_pirq_from_irq);
+int bind_evtchn_to_irq(unsigned int evtchn)
+{
+	int irq;
+
+	mutex_lock(&irq_mapping_update_lock);
+
+	irq = evtchn_to_irq[evtchn];
+
+	if (irq == -1) {
+		irq = xen_allocate_irq_dynamic();
+		if (irq < 0)
+			goto out;
+
+		irq_set_chip_and_handler_name(irq, &xen_dynamic_chip,
+					      handle_edge_irq, "event");
+
+		xen_irq_info_evtchn_init(irq, evtchn);
+	} else {
+		struct irq_info *info = info_for_irq(irq);
+		WARN_ON(info == NULL || info->type != IRQT_EVTCHN);
+	}
+
+out:
+	mutex_unlock(&irq_mapping_update_lock);
+
+	return irq;
+}
+EXPORT_SYMBOL_GPL(bind_evtchn_to_irq);
+
+static int bind_ipi_to_irq(unsigned int ipi, unsigned int cpu)
+{
+	struct evtchn_bind_ipi bind_ipi;
+	int evtchn, irq;
+
+	mutex_lock(&irq_mapping_update_lock);
+
+	irq = per_cpu(ipi_to_irq, cpu)[ipi];
+
+	if (irq == -1) {
+		irq = xen_allocate_irq_dynamic();
+		if (irq < 0)
+			goto out;
+
+		irq_set_chip_and_handler_name(irq, &xen_percpu_chip,
+					      handle_percpu_irq, "ipi");
+
+		bind_ipi.vcpu = cpu;
+		if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_ipi,
+						&bind_ipi) != 0)
+			BUG();
+		evtchn = bind_ipi.port;
+
+		xen_irq_info_ipi_init(cpu, irq, evtchn, ipi);
+
+		bind_evtchn_to_cpu(evtchn, cpu);
+	} else {
+		struct irq_info *info = info_for_irq(irq);
+		WARN_ON(info == NULL || info->type != IRQT_IPI);
+	}
+
+ out:
+	mutex_unlock(&irq_mapping_update_lock);
+	return irq;
+}
+
+static int bind_interdomain_evtchn_to_irq(unsigned int remote_domain,
+					  unsigned int remote_port)
+{
+	struct evtchn_bind_interdomain bind_interdomain;
+	int err;
+
+	bind_interdomain.remote_dom = remote_domain;
+	bind_interdomain.remote_port = remote_port;
+
+	err = HYPERVISOR_event_channel_op(EVTCHNOP_bind_interdomain,
+					  &bind_interdomain);
+
+	return err ? : bind_evtchn_to_irq(bind_interdomain.local_port);
+}
+
+static int find_virq(unsigned int virq, unsigned int cpu)
+{
+	struct evtchn_status status;
+	int port, rc = -ENOENT;
+
+	memset(&status, 0, sizeof(status));
+	for (port = 0; port <= NR_EVENT_CHANNELS; port++) {
+		status.dom = DOMID_SELF;
+		status.port = port;
+		rc = HYPERVISOR_event_channel_op(EVTCHNOP_status, &status);
+		if (rc < 0)
+			continue;
+		if (status.status != EVTCHNSTAT_virq)
+			continue;
+		if (status.u.virq == virq && status.vcpu == cpu) {
+			rc = port;
+			break;
+		}
+	}
+	return rc;
+}
+
+int bind_virq_to_irq(unsigned int virq, unsigned int cpu)
+{
+	struct evtchn_bind_virq bind_virq;
+	int evtchn, irq, ret;
+
+	mutex_lock(&irq_mapping_update_lock);
+
+	irq = per_cpu(virq_to_irq, cpu)[virq];
+
+	if (irq == -1) {
+		irq = xen_allocate_irq_dynamic();
+		if (irq < 0)
+			goto out;
+
+		irq_set_chip_and_handler_name(irq, &xen_percpu_chip,
+					      handle_percpu_irq, "virq");
+
+		bind_virq.virq = virq;
+		bind_virq.vcpu = cpu;
+		ret = HYPERVISOR_event_channel_op(EVTCHNOP_bind_virq,
+						&bind_virq);
+		if (ret == 0)
+			evtchn = bind_virq.port;
+		else {
+			if (ret == -EEXIST)
+				ret = find_virq(virq, cpu);
+			BUG_ON(ret < 0);
+			evtchn = ret;
+		}
+
+		xen_irq_info_virq_init(cpu, irq, evtchn, virq);
+
+		bind_evtchn_to_cpu(evtchn, cpu);
+	} else {
+		struct irq_info *info = info_for_irq(irq);
+		WARN_ON(info == NULL || info->type != IRQT_VIRQ);
+	}
+
+out:
+	mutex_unlock(&irq_mapping_update_lock);
+
+	return irq;
+}
+
+static void unbind_from_irq(unsigned int irq)
+{
+	struct evtchn_close close;
+	int evtchn = evtchn_from_irq(irq);
+	struct irq_info *info = irq_get_handler_data(irq);
+
+	if (WARN_ON(!info))
+		return;
+
+	mutex_lock(&irq_mapping_update_lock);
+
+	if (info->refcnt > 0) {
+		info->refcnt--;
+		if (info->refcnt != 0)
+			goto done;
+	}
+
+	if (VALID_EVTCHN(evtchn)) {
+		close.port = evtchn;
+		if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0)
+			BUG();
+
+		switch (type_from_irq(irq)) {
+		case IRQT_VIRQ:
+			per_cpu(virq_to_irq, cpu_from_evtchn(evtchn))
+				[virq_from_irq(irq)] = -1;
+			break;
+		case IRQT_IPI:
+			per_cpu(ipi_to_irq, cpu_from_evtchn(evtchn))
+				[ipi_from_irq(irq)] = -1;
+			break;
+		default:
+			break;
+		}
+
+		/* Closed ports are implicitly re-bound to VCPU0. */
+		bind_evtchn_to_cpu(evtchn, 0);
+
+		evtchn_to_irq[evtchn] = -1;
+	}
+
+	BUG_ON(info_for_irq(irq)->type == IRQT_UNBOUND);
+
+	xen_free_irq(irq);
+
+ done:
+	mutex_unlock(&irq_mapping_update_lock);
+}
+
+int bind_evtchn_to_irqhandler(unsigned int evtchn,
+			      irq_handler_t handler,
+			      unsigned long irqflags,
+			      const char *devname, void *dev_id)
+{
+	int irq, retval;
+
+	irq = bind_evtchn_to_irq(evtchn);
+	if (irq < 0)
+		return irq;
+	retval = request_irq(irq, handler, irqflags, devname, dev_id);
+	if (retval != 0) {
+		unbind_from_irq(irq);
+		return retval;
+	}
+
+	return irq;
+}
+EXPORT_SYMBOL_GPL(bind_evtchn_to_irqhandler);
+
+int bind_interdomain_evtchn_to_irqhandler(unsigned int remote_domain,
+					  unsigned int remote_port,
+					  irq_handler_t handler,
+					  unsigned long irqflags,
+					  const char *devname,
+					  void *dev_id)
+{
+	int irq, retval;
+
+	irq = bind_interdomain_evtchn_to_irq(remote_domain, remote_port);
+	if (irq < 0)
+		return irq;
+
+	retval = request_irq(irq, handler, irqflags, devname, dev_id);
+	if (retval != 0) {
+		unbind_from_irq(irq);
+		return retval;
+	}
+
+	return irq;
+}
+EXPORT_SYMBOL_GPL(bind_interdomain_evtchn_to_irqhandler);
+
+int bind_virq_to_irqhandler(unsigned int virq, unsigned int cpu,
+			    irq_handler_t handler,
+			    unsigned long irqflags, const char *devname, void *dev_id)
+{
+	int irq, retval;
+
+	irq = bind_virq_to_irq(virq, cpu);
+	if (irq < 0)
+		return irq;
+	retval = request_irq(irq, handler, irqflags, devname, dev_id);
+	if (retval != 0) {
+		unbind_from_irq(irq);
+		return retval;
+	}
+
+	return irq;
+}
+EXPORT_SYMBOL_GPL(bind_virq_to_irqhandler);
+
+int bind_ipi_to_irqhandler(enum ipi_vector ipi,
+			   unsigned int cpu,
+			   irq_handler_t handler,
+			   unsigned long irqflags,
+			   const char *devname,
+			   void *dev_id)
+{
+	int irq, retval;
+
+	irq = bind_ipi_to_irq(ipi, cpu);
+	if (irq < 0)
+		return irq;
+
+	irqflags |= IRQF_NO_SUSPEND | IRQF_FORCE_RESUME | IRQF_EARLY_RESUME;
+	retval = request_irq(irq, handler, irqflags, devname, dev_id);
+	if (retval != 0) {
+		unbind_from_irq(irq);
+		return retval;
+	}
+
+	return irq;
+}
+
+void unbind_from_irqhandler(unsigned int irq, void *dev_id)
+{
+	struct irq_info *info = irq_get_handler_data(irq);
+
+	if (WARN_ON(!info))
+		return;
+	free_irq(irq, dev_id);
+	unbind_from_irq(irq);
+}
+EXPORT_SYMBOL_GPL(unbind_from_irqhandler);
+
+int evtchn_make_refcounted(unsigned int evtchn)
+{
+	int irq = evtchn_to_irq[evtchn];
+	struct irq_info *info;
+
+	if (irq == -1)
+		return -ENOENT;
+
+	info = irq_get_handler_data(irq);
+
+	if (!info)
+		return -ENOENT;
+
+	WARN_ON(info->refcnt != -1);
+
+	info->refcnt = 1;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(evtchn_make_refcounted);
+
+int evtchn_get(unsigned int evtchn)
+{
+	int irq;
+	struct irq_info *info;
+	int err = -ENOENT;
+
+	if (evtchn >= NR_EVENT_CHANNELS)
+		return -EINVAL;
+
+	mutex_lock(&irq_mapping_update_lock);
+
+	irq = evtchn_to_irq[evtchn];
+	if (irq == -1)
+		goto done;
+
+	info = irq_get_handler_data(irq);
+
+	if (!info)
+		goto done;
+
+	err = -EINVAL;
+	if (info->refcnt <= 0)
+		goto done;
+
+	info->refcnt++;
+	err = 0;
+ done:
+	mutex_unlock(&irq_mapping_update_lock);
+
+	return err;
+}
+EXPORT_SYMBOL_GPL(evtchn_get);
+
+void evtchn_put(unsigned int evtchn)
+{
+	int irq = evtchn_to_irq[evtchn];
+	if (WARN_ON(irq == -1))
+		return;
+	unbind_from_irq(irq);
+}
+EXPORT_SYMBOL_GPL(evtchn_put);
+
+void xen_send_IPI_one(unsigned int cpu, enum ipi_vector vector)
+{
+	int irq;
+
+#ifdef CONFIG_X86
+	if (unlikely(vector == XEN_NMI_VECTOR)) {
+		int rc = HYPERVISOR_vcpu_op(VCPUOP_send_nmi, cpu, NULL);
+		if (rc < 0)
+			printk(KERN_WARNING "Sending nmi to CPU%d failed (rc:%d)\n", cpu, rc);
+		return;
+	}
+#endif
+	irq = per_cpu(ipi_to_irq, cpu)[vector];
+	BUG_ON(irq < 0);
+	notify_remote_via_irq(irq);
+}
+
+irqreturn_t xen_debug_interrupt(int irq, void *dev_id)
+{
+	struct shared_info *sh = HYPERVISOR_shared_info;
+	int cpu = smp_processor_id();
+	xen_ulong_t *cpu_evtchn = per_cpu(cpu_evtchn_mask, cpu);
+	int i;
+	unsigned long flags;
+	static DEFINE_SPINLOCK(debug_lock);
+	struct vcpu_info *v;
+
+	spin_lock_irqsave(&debug_lock, flags);
+
+	printk("\nvcpu %d\n ", cpu);
+
+	for_each_online_cpu(i) {
+		int pending;
+		v = per_cpu(xen_vcpu, i);
+		pending = (get_irq_regs() && i == cpu)
+			? xen_irqs_disabled(get_irq_regs())
+			: v->evtchn_upcall_mask;
+		printk("%d: masked=%d pending=%d event_sel %0*"PRI_xen_ulong"\n ", i,
+		       pending, v->evtchn_upcall_pending,
+		       (int)(sizeof(v->evtchn_pending_sel)*2),
+		       v->evtchn_pending_sel);
+	}
+	v = per_cpu(xen_vcpu, cpu);
+
+	printk("\npending:\n ");
+	for (i = ARRAY_SIZE(sh->evtchn_pending)-1; i >= 0; i--)
+		printk("%0*"PRI_xen_ulong"%s",
+		       (int)sizeof(sh->evtchn_pending[0])*2,
+		       sh->evtchn_pending[i],
+		       i % 8 == 0 ? "\n " : " ");
+	printk("\nglobal mask:\n ");
+	for (i = ARRAY_SIZE(sh->evtchn_mask)-1; i >= 0; i--)
+		printk("%0*"PRI_xen_ulong"%s",
+		       (int)(sizeof(sh->evtchn_mask[0])*2),
+		       sh->evtchn_mask[i],
+		       i % 8 == 0 ? "\n " : " ");
+
+	printk("\nglobally unmasked:\n ");
+	for (i = ARRAY_SIZE(sh->evtchn_mask)-1; i >= 0; i--)
+		printk("%0*"PRI_xen_ulong"%s",
+		       (int)(sizeof(sh->evtchn_mask[0])*2),
+		       sh->evtchn_pending[i] & ~sh->evtchn_mask[i],
+		       i % 8 == 0 ? "\n " : " ");
+
+	printk("\nlocal cpu%d mask:\n ", cpu);
+	for (i = (NR_EVENT_CHANNELS/BITS_PER_EVTCHN_WORD)-1; i >= 0; i--)
+		printk("%0*"PRI_xen_ulong"%s", (int)(sizeof(cpu_evtchn[0])*2),
+		       cpu_evtchn[i],
+		       i % 8 == 0 ? "\n " : " ");
+
+	printk("\nlocally unmasked:\n ");
+	for (i = ARRAY_SIZE(sh->evtchn_mask)-1; i >= 0; i--) {
+		xen_ulong_t pending = sh->evtchn_pending[i]
+			& ~sh->evtchn_mask[i]
+			& cpu_evtchn[i];
+		printk("%0*"PRI_xen_ulong"%s",
+		       (int)(sizeof(sh->evtchn_mask[0])*2),
+		       pending, i % 8 == 0 ? "\n " : " ");
+	}
+
+	printk("\npending list:\n");
+	for (i = 0; i < NR_EVENT_CHANNELS; i++) {
+		if (sync_test_bit(i, BM(sh->evtchn_pending))) {
+			int word_idx = i / BITS_PER_EVTCHN_WORD;
+			printk(" %d: event %d -> irq %d%s%s%s\n",
+			       cpu_from_evtchn(i), i,
+			       evtchn_to_irq[i],
+			       sync_test_bit(word_idx, BM(&v->evtchn_pending_sel))
+			       ? "" : " l2-clear",
+			       !sync_test_bit(i, BM(sh->evtchn_mask))
+			       ? "" : " globally-masked",
+			       sync_test_bit(i, BM(cpu_evtchn))
+			       ? "" : " locally-masked");
+		}
+	}
+
+	spin_unlock_irqrestore(&debug_lock, flags);
+
+	return IRQ_HANDLED;
+}
+
+static DEFINE_PER_CPU(unsigned, xed_nesting_count);
+static DEFINE_PER_CPU(unsigned int, current_word_idx);
+static DEFINE_PER_CPU(unsigned int, current_bit_idx);
+
+/*
+ * Mask out the i least significant bits of w
+ */
+#define MASK_LSBS(w, i) (w & ((~((xen_ulong_t)0UL)) << i))
+
+/*
+ * Search the CPUs pending events bitmasks. For each one found, map
+ * the event number to an irq, and feed it into do_IRQ() for
+ * handling.
+ *
+ * Xen uses a two-level bitmap to speed searching. The first level is
+ * a bitset of words which contain pending event bits. The second
+ * level is a bitset of pending events themselves.
+ */
+static void __xen_evtchn_do_upcall(void)
+{
+	int start_word_idx, start_bit_idx;
+	int word_idx, bit_idx;
+	int i, irq;
+	int cpu = get_cpu();
+	struct shared_info *s = HYPERVISOR_shared_info;
+	struct vcpu_info *vcpu_info = __this_cpu_read(xen_vcpu);
+	unsigned count;
+
+	do {
+		xen_ulong_t pending_words;
+		xen_ulong_t pending_bits;
+		struct irq_desc *desc;
+
+		vcpu_info->evtchn_upcall_pending = 0;
+
+		if (__this_cpu_inc_return(xed_nesting_count) - 1)
+			goto out;
+
+		/*
+		 * Master flag must be cleared /before/ clearing
+		 * selector flag. xchg_xen_ulong must contain an
+		 * appropriate barrier.
+		 */
+		if ((irq = per_cpu(virq_to_irq, cpu)[VIRQ_TIMER]) != -1) {
+			int evtchn = evtchn_from_irq(irq);
+			word_idx = evtchn / BITS_PER_LONG;
+			pending_bits = evtchn % BITS_PER_LONG;
+			if (active_evtchns(cpu, s, word_idx) & (1ULL << pending_bits)) {
+				desc = irq_to_desc(irq);
+				if (desc)
+					generic_handle_irq_desc(irq, desc);
+			}
+		}
+
+		pending_words = xchg_xen_ulong(&vcpu_info->evtchn_pending_sel, 0);
+
+		start_word_idx = __this_cpu_read(current_word_idx);
+		start_bit_idx = __this_cpu_read(current_bit_idx);
+
+		word_idx = start_word_idx;
+
+		for (i = 0; pending_words != 0; i++) {
+			xen_ulong_t words;
+
+			words = MASK_LSBS(pending_words, word_idx);
+
+			/*
+			 * If we masked out all events, wrap to beginning.
+			 */
+			if (words == 0) {
+				word_idx = 0;
+				bit_idx = 0;
+				continue;
+			}
+			word_idx = EVTCHN_FIRST_BIT(words);
+
+			pending_bits = active_evtchns(cpu, s, word_idx);
+			bit_idx = 0; /* usually scan entire word from start */
+			/*
+			 * We scan the starting word in two parts.
+			 *
+			 * 1st time: start in the middle, scanning the
+			 * upper bits.
+			 *
+			 * 2nd time: scan the whole word (not just the
+			 * parts skipped in the first pass) -- if an
+			 * event in the previously scanned bits is
+			 * pending again it would just be scanned on
+			 * the next loop anyway.
+			 */
+			if (word_idx == start_word_idx) {
+				if (i == 0)
+					bit_idx = start_bit_idx;
+			}
+
+			do {
+				xen_ulong_t bits;
+				int port;
+
+				bits = MASK_LSBS(pending_bits, bit_idx);
+
+				/* If we masked out all events, move on. */
+				if (bits == 0)
+					break;
+
+				bit_idx = EVTCHN_FIRST_BIT(bits);
+
+				/* Process port. */
+				port = (word_idx * BITS_PER_EVTCHN_WORD) + bit_idx;
+				irq = evtchn_to_irq[port];
+
+				if (irq != -1) {
+					desc = irq_to_desc(irq);
+					if (desc)
+						generic_handle_irq_desc(irq, desc);
+				}
+
+				bit_idx = (bit_idx + 1) % BITS_PER_EVTCHN_WORD;
+
+				/* Next caller starts at last processed + 1 */
+				__this_cpu_write(current_word_idx,
+						 bit_idx ? word_idx :
+						 (word_idx+1) % BITS_PER_EVTCHN_WORD);
+				__this_cpu_write(current_bit_idx, bit_idx);
+			} while (bit_idx != 0);
+
+			/* Scan start_l1i twice; all others once. */
+			if ((word_idx != start_word_idx) || (i != 0))
+				pending_words &= ~(1UL << word_idx);
+
+			word_idx = (word_idx + 1) % BITS_PER_EVTCHN_WORD;
+		}
+
+		BUG_ON(!irqs_disabled());
+
+		count = __this_cpu_read(xed_nesting_count);
+		__this_cpu_write(xed_nesting_count, 0);
+	} while (count != 1 || vcpu_info->evtchn_upcall_pending);
+
+out:
+
+	put_cpu();
+}
+
+void xen_evtchn_do_upcall(struct pt_regs *regs)
+{
+	struct pt_regs *old_regs = set_irq_regs(regs);
+
+	irq_enter();
+#ifdef CONFIG_X86
+	exit_idle();
+#endif
+
+	__xen_evtchn_do_upcall();
+
+	irq_exit();
+	set_irq_regs(old_regs);
+}
+
+void xen_hvm_evtchn_do_upcall(void)
+{
+	__xen_evtchn_do_upcall();
+}
+EXPORT_SYMBOL_GPL(xen_hvm_evtchn_do_upcall);
+
+/* Rebind a new event channel to an existing irq. */
+void rebind_evtchn_irq(int evtchn, int irq)
+{
+	struct irq_info *info = info_for_irq(irq);
+
+	if (WARN_ON(!info))
+		return;
+
+	/* Make sure the irq is masked, since the new event channel
+	   will also be masked. */
+	disable_irq(irq);
+
+	mutex_lock(&irq_mapping_update_lock);
+
+	/* After resume the irq<->evtchn mappings are all cleared out */
+	BUG_ON(evtchn_to_irq[evtchn] != -1);
+	/* Expect irq to have been bound before,
+	   so there should be a proper type */
+	BUG_ON(info->type == IRQT_UNBOUND);
+
+	xen_irq_info_evtchn_init(irq, evtchn);
+
+	mutex_unlock(&irq_mapping_update_lock);
+
+	/* new event channels are always bound to cpu 0 */
+	irq_set_affinity(irq, cpumask_of(0));
+
+	/* Unmask the event channel. */
+	enable_irq(irq);
+}
+
+/* Rebind an evtchn so that it gets delivered to a specific cpu */
+static int rebind_irq_to_cpu(unsigned irq, unsigned tcpu)
+{
+	struct evtchn_bind_vcpu bind_vcpu;
+	int evtchn = evtchn_from_irq(irq);
+	int masked;
+
+	if (!VALID_EVTCHN(evtchn))
+		return -1;
+
+	/*
+	 * Events delivered via platform PCI interrupts are always
+	 * routed to vcpu 0 and hence cannot be rebound.
+	 */
+	if (xen_hvm_domain() && !xen_have_vector_callback)
+		return -1;
+
+	/* Send future instances of this interrupt to other vcpu. */
+	bind_vcpu.port = evtchn;
+	bind_vcpu.vcpu = tcpu;
+
+	/*
+	 * Mask the event while changing the VCPU binding to prevent
+	 * it being delivered on an unexpected VCPU.
+	 */
+	masked = test_and_set_mask(evtchn);
+
+	/*
+	 * If this fails, it usually just indicates that we're dealing with a
+	 * virq or IPI channel, which don't actually need to be rebound. Ignore
+	 * it, but don't do the xenlinux-level rebind in that case.
+	 */
+	if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_vcpu, &bind_vcpu) >= 0)
+		bind_evtchn_to_cpu(evtchn, tcpu);
+
+	if (!masked)
+		unmask_evtchn(evtchn);
+
+	return 0;
+}
+
+static int set_affinity_irq(struct irq_data *data, const struct cpumask *dest,
+			    bool force)
+{
+	unsigned tcpu = cpumask_first(dest);
+
+	return rebind_irq_to_cpu(data->irq, tcpu);
+}
+
+static int retrigger_evtchn(int evtchn)
+{
+	int masked;
+
+	if (!VALID_EVTCHN(evtchn))
+		return 0;
+
+	masked = test_and_set_mask(evtchn);
+	set_evtchn(evtchn);
+	if (!masked)
+		unmask_evtchn(evtchn);
+
+	return 1;
+}
+
+int resend_irq_on_evtchn(unsigned int irq)
+{
+	return retrigger_evtchn(evtchn_from_irq(irq));
+}
+
+static void enable_dynirq(struct irq_data *data)
+{
+	int evtchn = evtchn_from_irq(data->irq);
+
+	if (VALID_EVTCHN(evtchn))
+		unmask_evtchn(evtchn);
+}
+
+static void disable_dynirq(struct irq_data *data)
+{
+	int evtchn = evtchn_from_irq(data->irq);
+
+	if (VALID_EVTCHN(evtchn))
+		mask_evtchn(evtchn);
+}
+
+static void ack_dynirq(struct irq_data *data)
+{
+	int evtchn = evtchn_from_irq(data->irq);
+
+	irq_move_irq(data);
+
+	if (VALID_EVTCHN(evtchn))
+		clear_evtchn(evtchn);
+}
+
+static void mask_ack_dynirq(struct irq_data *data)
+{
+	disable_dynirq(data);
+	ack_dynirq(data);
+}
+
+static int retrigger_dynirq(struct irq_data *data)
+{
+	return retrigger_evtchn(evtchn_from_irq(data->irq));
+}
+
+static void restore_pirqs(void)
+{
+	int pirq, rc, irq, gsi;
+	struct physdev_map_pirq map_irq;
+	struct irq_info *info;
+
+	list_for_each_entry(info, &xen_irq_list_head, list) {
+		if (info->type != IRQT_PIRQ)
+			continue;
+
+		pirq = info->u.pirq.pirq;
+		gsi = info->u.pirq.gsi;
+		irq = info->irq;
+
+		/* save/restore of PT devices doesn't work, so at this point the
+		 * only devices present are GSI based emulated devices */
+		if (!gsi)
+			continue;
+
+		map_irq.domid = DOMID_SELF;
+		map_irq.type = MAP_PIRQ_TYPE_GSI;
+		map_irq.index = gsi;
+		map_irq.pirq = pirq;
+
+		rc = HYPERVISOR_physdev_op(PHYSDEVOP_map_pirq, &map_irq);
+		if (rc) {
+			pr_warn("xen map irq failed gsi=%d irq=%d pirq=%d rc=%d\n",
+				gsi, irq, pirq, rc);
+			xen_free_irq(irq);
+			continue;
+		}
+
+		printk(KERN_DEBUG "xen: --> irq=%d, pirq=%d\n", irq, map_irq.pirq);
+
+		__startup_pirq(irq);
+	}
+}
+
+static void restore_cpu_virqs(unsigned int cpu)
+{
+	struct evtchn_bind_virq bind_virq;
+	int virq, irq, evtchn;
+
+	for (virq = 0; virq < NR_VIRQS; virq++) {
+		if ((irq = per_cpu(virq_to_irq, cpu)[virq]) == -1)
+			continue;
+
+		BUG_ON(virq_from_irq(irq) != virq);
+
+		/* Get a new binding from Xen. */
+		bind_virq.virq = virq;
+		bind_virq.vcpu = cpu;
+		if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_virq,
+						&bind_virq) != 0)
+			BUG();
+		evtchn = bind_virq.port;
+
+		/* Record the new mapping. */
+		xen_irq_info_virq_init(cpu, irq, evtchn, virq);
+		bind_evtchn_to_cpu(evtchn, cpu);
+	}
+}
+
+static void restore_cpu_ipis(unsigned int cpu)
+{
+	struct evtchn_bind_ipi bind_ipi;
+	int ipi, irq, evtchn;
+
+	for (ipi = 0; ipi < XEN_NR_IPIS; ipi++) {
+		if ((irq = per_cpu(ipi_to_irq, cpu)[ipi]) == -1)
+			continue;
+
+		BUG_ON(ipi_from_irq(irq) != ipi);
+
+		/* Get a new binding from Xen. */
+		bind_ipi.vcpu = cpu;
+		if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_ipi,
+						&bind_ipi) != 0)
+			BUG();
+		evtchn = bind_ipi.port;
+
+		/* Record the new mapping. */
+		xen_irq_info_ipi_init(cpu, irq, evtchn, ipi);
+		bind_evtchn_to_cpu(evtchn, cpu);
+	}
+}
+
+/* Clear an irq's pending state, in preparation for polling on it */
+void xen_clear_irq_pending(int irq)
+{
+	int evtchn = evtchn_from_irq(irq);
+
+	if (VALID_EVTCHN(evtchn))
+		clear_evtchn(evtchn);
+}
+EXPORT_SYMBOL(xen_clear_irq_pending);
+void xen_set_irq_pending(int irq)
+{
+	int evtchn = evtchn_from_irq(irq);
+
+	if (VALID_EVTCHN(evtchn))
+		set_evtchn(evtchn);
+}
+
+bool xen_test_irq_pending(int irq)
+{
+	int evtchn = evtchn_from_irq(irq);
+	bool ret = false;
+
+	if (VALID_EVTCHN(evtchn))
+		ret = test_evtchn(evtchn);
+
+	return ret;
+}
+
+/* Poll waiting for an irq to become pending with timeout. In the usual case,
+ * the irq will be disabled so it won't deliver an interrupt. */
+void xen_poll_irq_timeout(int irq, u64 timeout)
+{
+	evtchn_port_t evtchn = evtchn_from_irq(irq);
+
+	if (VALID_EVTCHN(evtchn)) {
+		struct sched_poll poll;
+
+		poll.nr_ports = 1;
+		poll.timeout = timeout;
+		set_xen_guest_handle(poll.ports, &evtchn);
+
+		if (HYPERVISOR_sched_op(SCHEDOP_poll, &poll) != 0)
+			BUG();
+	}
+}
+EXPORT_SYMBOL(xen_poll_irq_timeout);
+/* Poll waiting for an irq to become pending. In the usual case, the
+ * irq will be disabled so it won't deliver an interrupt. */
+void xen_poll_irq(int irq)
+{
+	xen_poll_irq_timeout(irq, 0 /* no timeout */);
+}
+
+/* Check whether the IRQ line is shared with other guests. */
+int xen_test_irq_shared(int irq)
+{
+	struct irq_info *info = info_for_irq(irq);
+	struct physdev_irq_status_query irq_status;
+
+	if (WARN_ON(!info))
+		return -ENOENT;
+
+	irq_status.irq = info->u.pirq.pirq;
+
+	if (HYPERVISOR_physdev_op(PHYSDEVOP_irq_status_query, &irq_status))
+		return 0;
+	return !(irq_status.flags & XENIRQSTAT_shared);
+}
+EXPORT_SYMBOL_GPL(xen_test_irq_shared);
+
+void xen_irq_resume(void)
+{
+	unsigned int cpu, evtchn;
+	struct irq_info *info;
+
+	/* New event-channel space is not 'live' yet.
*/ + for (evtchn = 0; evtchn < NR_EVENT_CHANNELS; evtchn++) + mask_evtchn(evtchn); + + /* No IRQ <-> event-channel mappings. */ + list_for_each_entry(info, &xen_irq_list_head, list) + info->evtchn = 0; /* zap event-channel binding */ + + for (evtchn = 0; evtchn < NR_EVENT_CHANNELS; evtchn++) + evtchn_to_irq[evtchn] = -1; + + for_each_possible_cpu(cpu) { + restore_cpu_virqs(cpu); + restore_cpu_ipis(cpu); + } + + restore_pirqs(); +} + +static struct irq_chip xen_dynamic_chip __read_mostly = { + .name = "xen-dyn", + + .irq_disable = disable_dynirq, + .irq_mask = disable_dynirq, + .irq_unmask = enable_dynirq, + + .irq_ack = ack_dynirq, + .irq_mask_ack = mask_ack_dynirq, + + .irq_set_affinity = set_affinity_irq, + .irq_retrigger = retrigger_dynirq, +}; + +static struct irq_chip xen_pirq_chip __read_mostly = { + .name = "xen-pirq", + + .irq_startup = startup_pirq, + .irq_shutdown = shutdown_pirq, + .irq_enable = enable_pirq, + .irq_disable = disable_pirq, + + .irq_mask = disable_dynirq, + .irq_unmask = enable_dynirq, + + .irq_ack = eoi_pirq, + .irq_eoi = eoi_pirq, + .irq_mask_ack = mask_ack_pirq, + + .irq_set_affinity = set_affinity_irq, + + .irq_retrigger = retrigger_dynirq, +}; + +static struct irq_chip xen_percpu_chip __read_mostly = { + .name = "xen-percpu", + + .irq_disable = disable_dynirq, + .irq_mask = disable_dynirq, + .irq_unmask = enable_dynirq, + + .irq_ack = ack_dynirq, +}; + +int xen_set_callback_via(uint64_t via) +{ + struct xen_hvm_param a; + a.domid = DOMID_SELF; + a.index = HVM_PARAM_CALLBACK_IRQ; + a.value = via; + return HYPERVISOR_hvm_op(HVMOP_set_param, &a); +} +EXPORT_SYMBOL_GPL(xen_set_callback_via); + +#ifdef CONFIG_XEN_PVHVM +/* Vector callbacks are better than PCI interrupts to receive event + * channel notifications because we can receive vector callbacks on any + * vcpu and we don't need PCI support or APIC interactions. 
*/ +void xen_callback_vector(void) +{ + int rc; + uint64_t callback_via; + if (xen_have_vector_callback) { + callback_via = HVM_CALLBACK_VECTOR(HYPERVISOR_CALLBACK_VECTOR); + rc = xen_set_callback_via(callback_via); + if (rc) { + pr_err("Request for Xen HVM callback vector failed\n"); + xen_have_vector_callback = 0; + return; + } + pr_info("Xen HVM callback vector for event delivery is enabled\n"); + /* in the restore case the vector has already been allocated */ + if (!test_bit(HYPERVISOR_CALLBACK_VECTOR, used_vectors)) + alloc_intr_gate(HYPERVISOR_CALLBACK_VECTOR, + xen_hvm_callback_vector); + } +} +#else +void xen_callback_vector(void) {} +#endif + +void __init xen_init_IRQ(void) +{ + int i; + + evtchn_to_irq = kcalloc(NR_EVENT_CHANNELS, sizeof(*evtchn_to_irq), + GFP_KERNEL); + BUG_ON(!evtchn_to_irq); + for (i = 0; i < NR_EVENT_CHANNELS; i++) + evtchn_to_irq[i] = -1; + + /* No event channels are 'live' right now. */ + for (i = 0; i < NR_EVENT_CHANNELS; i++) + mask_evtchn(i); + + pirq_needs_eoi = pirq_needs_eoi_flag; + +#ifdef CONFIG_X86 + if (xen_hvm_domain()) { + xen_callback_vector(); + native_init_IRQ(); + /* pci_xen_hvm_init must be called after native_init_IRQ so that + * __acpi_register_gsi can point at the right function */ + pci_xen_hvm_init(); + } else { + int rc; + struct physdev_pirq_eoi_gmfn eoi_gmfn; + + irq_ctx_init(smp_processor_id()); + if (xen_initial_domain()) + pci_xen_initial_domain(); + + pirq_eoi_map = (void *)__get_free_page(GFP_KERNEL|__GFP_ZERO); + eoi_gmfn.gmfn = virt_to_mfn(pirq_eoi_map); + rc = HYPERVISOR_physdev_op(PHYSDEVOP_pirq_eoi_gmfn_v2, &eoi_gmfn); + if (rc != 0) { + free_page((unsigned long) pirq_eoi_map); + pirq_eoi_map = NULL; + } else + pirq_needs_eoi = pirq_check_eoi_map; + } +#endif +} -- cgit v1.2.1 From 9a489f45a155fe96b9b55fbbef2b757ef7737cfc Mon Sep 17 00:00:00 2001 From: David Vrabel Date: Wed, 13 Mar 2013 15:29:25 +0000 Subject: xen/events: move 2-level specific code into its own file In preparation for alternative 
event channel ABIs, move all the functions accessing the shared data structures into their own file. Signed-off-by: David Vrabel Reviewed-by: Konrad Rzeszutek Wilk Reviewed-by: Boris Ostrovsky --- drivers/xen/events/Makefile | 1 + drivers/xen/events/events_2l.c | 348 ++++++++++++++++++++++++++++++++ drivers/xen/events/events_base.c | 379 ++--------------------------------- drivers/xen/events/events_internal.h | 74 +++++++ 4 files changed, 440 insertions(+), 362 deletions(-) create mode 100644 drivers/xen/events/events_2l.c create mode 100644 drivers/xen/events/events_internal.h (limited to 'drivers/xen') diff --git a/drivers/xen/events/Makefile b/drivers/xen/events/Makefile index f0bc6071fd84..08179fe04612 100644 --- a/drivers/xen/events/Makefile +++ b/drivers/xen/events/Makefile @@ -1,3 +1,4 @@ obj-y += events.o events-y += events_base.o +events-y += events_2l.o diff --git a/drivers/xen/events/events_2l.c b/drivers/xen/events/events_2l.c new file mode 100644 index 000000000000..a77e98d025fa --- /dev/null +++ b/drivers/xen/events/events_2l.c @@ -0,0 +1,348 @@ +/* + * Xen event channels (2-level ABI) + * + * Jeremy Fitzhardinge , XenSource Inc, 2007 + */ + +#define pr_fmt(fmt) "xen:" KBUILD_MODNAME ": " fmt + +#include +#include +#include +#include + +#include +#include +#include + +#include +#include +#include +#include +#include + +#include "events_internal.h" + +/* + * Note sizeof(xen_ulong_t) can be more than sizeof(unsigned long). Be + * careful to only use bitops which allow for this (e.g + * test_bit/find_first_bit and friends but not __ffs) and to pass + * BITS_PER_EVTCHN_WORD as the bitmask length. + */ +#define BITS_PER_EVTCHN_WORD (sizeof(xen_ulong_t)*8) +/* + * Make a bitmask (i.e. unsigned long *) of a xen_ulong_t + * array. Primarily to avoid long lines (hence the terse name). 
+ */ +#define BM(x) (unsigned long *)(x) +/* Find the first set bit in a evtchn mask */ +#define EVTCHN_FIRST_BIT(w) find_first_bit(BM(&(w)), BITS_PER_EVTCHN_WORD) + +static DEFINE_PER_CPU(xen_ulong_t [NR_EVENT_CHANNELS/BITS_PER_EVTCHN_WORD], + cpu_evtchn_mask); + +void xen_evtchn_port_bind_to_cpu(struct irq_info *info, int cpu) +{ + clear_bit(info->evtchn, BM(per_cpu(cpu_evtchn_mask, info->cpu))); + set_bit(info->evtchn, BM(per_cpu(cpu_evtchn_mask, cpu))); +} + +void clear_evtchn(int port) +{ + struct shared_info *s = HYPERVISOR_shared_info; + sync_clear_bit(port, BM(&s->evtchn_pending[0])); +} + +void set_evtchn(int port) +{ + struct shared_info *s = HYPERVISOR_shared_info; + sync_set_bit(port, BM(&s->evtchn_pending[0])); +} + +int test_evtchn(int port) +{ + struct shared_info *s = HYPERVISOR_shared_info; + return sync_test_bit(port, BM(&s->evtchn_pending[0])); +} + +int test_and_set_mask(int port) +{ + struct shared_info *s = HYPERVISOR_shared_info; + return sync_test_and_set_bit(port, BM(&s->evtchn_mask[0])); +} + +void mask_evtchn(int port) +{ + struct shared_info *s = HYPERVISOR_shared_info; + sync_set_bit(port, BM(&s->evtchn_mask[0])); +} + +void unmask_evtchn(int port) +{ + struct shared_info *s = HYPERVISOR_shared_info; + unsigned int cpu = get_cpu(); + int do_hypercall = 0, evtchn_pending = 0; + + BUG_ON(!irqs_disabled()); + + if (unlikely((cpu != cpu_from_evtchn(port)))) + do_hypercall = 1; + else { + /* + * Need to clear the mask before checking pending to + * avoid a race with an event becoming pending. + * + * EVTCHNOP_unmask will only trigger an upcall if the + * mask bit was set, so if a hypercall is needed + * remask the event. 
+ */ + sync_clear_bit(port, BM(&s->evtchn_mask[0])); + evtchn_pending = sync_test_bit(port, BM(&s->evtchn_pending[0])); + + if (unlikely(evtchn_pending && xen_hvm_domain())) { + sync_set_bit(port, BM(&s->evtchn_mask[0])); + do_hypercall = 1; + } + } + + /* Slow path (hypercall) if this is a non-local port or if this is + * an hvm domain and an event is pending (hvm domains don't have + * their own implementation of irq_enable). */ + if (do_hypercall) { + struct evtchn_unmask unmask = { .port = port }; + (void)HYPERVISOR_event_channel_op(EVTCHNOP_unmask, &unmask); + } else { + struct vcpu_info *vcpu_info = __this_cpu_read(xen_vcpu); + + /* + * The following is basically the equivalent of + * 'hw_resend_irq'. Just like a real IO-APIC we 'lose + * the interrupt edge' if the channel is masked. + */ + if (evtchn_pending && + !sync_test_and_set_bit(port / BITS_PER_EVTCHN_WORD, + BM(&vcpu_info->evtchn_pending_sel))) + vcpu_info->evtchn_upcall_pending = 1; + } + + put_cpu(); +} + +static DEFINE_PER_CPU(unsigned int, current_word_idx); +static DEFINE_PER_CPU(unsigned int, current_bit_idx); + +/* + * Mask out the i least significant bits of w + */ +#define MASK_LSBS(w, i) (w & ((~((xen_ulong_t)0UL)) << i)) + +static inline xen_ulong_t active_evtchns(unsigned int cpu, + struct shared_info *sh, + unsigned int idx) +{ + return sh->evtchn_pending[idx] & + per_cpu(cpu_evtchn_mask, cpu)[idx] & + ~sh->evtchn_mask[idx]; +} + +/* + * Search the CPU's pending events bitmasks. For each one found, map + * the event number to an irq, and feed it into do_IRQ() for handling. + * + * Xen uses a two-level bitmap to speed searching. The first level is + * a bitset of words which contain pending event bits. The second + * level is a bitset of pending events themselves. 
+ */ +void xen_evtchn_handle_events(int cpu) +{ + int irq; + xen_ulong_t pending_words; + xen_ulong_t pending_bits; + int start_word_idx, start_bit_idx; + int word_idx, bit_idx; + int i; + struct irq_desc *desc; + struct shared_info *s = HYPERVISOR_shared_info; + struct vcpu_info *vcpu_info = __this_cpu_read(xen_vcpu); + + /* Timer interrupt has highest priority. */ + irq = irq_from_virq(cpu, VIRQ_TIMER); + if (irq != -1) { + unsigned int evtchn = evtchn_from_irq(irq); + word_idx = evtchn / BITS_PER_LONG; + bit_idx = evtchn % BITS_PER_LONG; + if (active_evtchns(cpu, s, word_idx) & (1ULL << bit_idx)) { + desc = irq_to_desc(irq); + if (desc) + generic_handle_irq_desc(irq, desc); + } + } + + /* + * Master flag must be cleared /before/ clearing + * selector flag. xchg_xen_ulong must contain an + * appropriate barrier. + */ + pending_words = xchg_xen_ulong(&vcpu_info->evtchn_pending_sel, 0); + + start_word_idx = __this_cpu_read(current_word_idx); + start_bit_idx = __this_cpu_read(current_bit_idx); + + word_idx = start_word_idx; + + for (i = 0; pending_words != 0; i++) { + xen_ulong_t words; + + words = MASK_LSBS(pending_words, word_idx); + + /* + * If we masked out all events, wrap to beginning. + */ + if (words == 0) { + word_idx = 0; + bit_idx = 0; + continue; + } + word_idx = EVTCHN_FIRST_BIT(words); + + pending_bits = active_evtchns(cpu, s, word_idx); + bit_idx = 0; /* usually scan entire word from start */ + /* + * We scan the starting word in two parts. + * + * 1st time: start in the middle, scanning the + * upper bits. + * + * 2nd time: scan the whole word (not just the + * parts skipped in the first pass) -- if an + * event in the previously scanned bits is + * pending again it would just be scanned on + * the next loop anyway. + */ + if (word_idx == start_word_idx) { + if (i == 0) + bit_idx = start_bit_idx; + } + + do { + xen_ulong_t bits; + int port; + + bits = MASK_LSBS(pending_bits, bit_idx); + + /* If we masked out all events, move on. 
*/ + if (bits == 0) + break; + + bit_idx = EVTCHN_FIRST_BIT(bits); + + /* Process port. */ + port = (word_idx * BITS_PER_EVTCHN_WORD) + bit_idx; + irq = evtchn_to_irq[port]; + + if (irq != -1) { + desc = irq_to_desc(irq); + if (desc) + generic_handle_irq_desc(irq, desc); + } + + bit_idx = (bit_idx + 1) % BITS_PER_EVTCHN_WORD; + + /* Next caller starts at last processed + 1 */ + __this_cpu_write(current_word_idx, + bit_idx ? word_idx : + (word_idx+1) % BITS_PER_EVTCHN_WORD); + __this_cpu_write(current_bit_idx, bit_idx); + } while (bit_idx != 0); + + /* Scan start_l1i twice; all others once. */ + if ((word_idx != start_word_idx) || (i != 0)) + pending_words &= ~(1UL << word_idx); + + word_idx = (word_idx + 1) % BITS_PER_EVTCHN_WORD; + } +} + +irqreturn_t xen_debug_interrupt(int irq, void *dev_id) +{ + struct shared_info *sh = HYPERVISOR_shared_info; + int cpu = smp_processor_id(); + xen_ulong_t *cpu_evtchn = per_cpu(cpu_evtchn_mask, cpu); + int i; + unsigned long flags; + static DEFINE_SPINLOCK(debug_lock); + struct vcpu_info *v; + + spin_lock_irqsave(&debug_lock, flags); + + printk("\nvcpu %d\n ", cpu); + + for_each_online_cpu(i) { + int pending; + v = per_cpu(xen_vcpu, i); + pending = (get_irq_regs() && i == cpu) + ? xen_irqs_disabled(get_irq_regs()) + : v->evtchn_upcall_mask; + printk("%d: masked=%d pending=%d event_sel %0*"PRI_xen_ulong"\n ", i, + pending, v->evtchn_upcall_pending, + (int)(sizeof(v->evtchn_pending_sel)*2), + v->evtchn_pending_sel); + } + v = per_cpu(xen_vcpu, cpu); + + printk("\npending:\n "); + for (i = ARRAY_SIZE(sh->evtchn_pending)-1; i >= 0; i--) + printk("%0*"PRI_xen_ulong"%s", + (int)sizeof(sh->evtchn_pending[0])*2, + sh->evtchn_pending[i], + i % 8 == 0 ? "\n " : " "); + printk("\nglobal mask:\n "); + for (i = ARRAY_SIZE(sh->evtchn_mask)-1; i >= 0; i--) + printk("%0*"PRI_xen_ulong"%s", + (int)(sizeof(sh->evtchn_mask[0])*2), + sh->evtchn_mask[i], + i % 8 == 0 ? 
"\n " : " "); + + printk("\nglobally unmasked:\n "); + for (i = ARRAY_SIZE(sh->evtchn_mask)-1; i >= 0; i--) + printk("%0*"PRI_xen_ulong"%s", + (int)(sizeof(sh->evtchn_mask[0])*2), + sh->evtchn_pending[i] & ~sh->evtchn_mask[i], + i % 8 == 0 ? "\n " : " "); + + printk("\nlocal cpu%d mask:\n ", cpu); + for (i = (NR_EVENT_CHANNELS/BITS_PER_EVTCHN_WORD)-1; i >= 0; i--) + printk("%0*"PRI_xen_ulong"%s", (int)(sizeof(cpu_evtchn[0])*2), + cpu_evtchn[i], + i % 8 == 0 ? "\n " : " "); + + printk("\nlocally unmasked:\n "); + for (i = ARRAY_SIZE(sh->evtchn_mask)-1; i >= 0; i--) { + xen_ulong_t pending = sh->evtchn_pending[i] + & ~sh->evtchn_mask[i] + & cpu_evtchn[i]; + printk("%0*"PRI_xen_ulong"%s", + (int)(sizeof(sh->evtchn_mask[0])*2), + pending, i % 8 == 0 ? "\n " : " "); + } + + printk("\npending list:\n"); + for (i = 0; i < NR_EVENT_CHANNELS; i++) { + if (sync_test_bit(i, BM(sh->evtchn_pending))) { + int word_idx = i / BITS_PER_EVTCHN_WORD; + printk(" %d: event %d -> irq %d%s%s%s\n", + cpu_from_evtchn(i), i, + evtchn_to_irq[i], + sync_test_bit(word_idx, BM(&v->evtchn_pending_sel)) + ? "" : " l2-clear", + !sync_test_bit(i, BM(sh->evtchn_mask)) + ? "" : " globally-masked", + sync_test_bit(i, BM(cpu_evtchn)) + ? "" : " locally-masked"); + } + } + + spin_unlock_irqrestore(&debug_lock, flags); + + return IRQ_HANDLED; +} diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c index fec5da4ff3a0..8771b740e30f 100644 --- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -59,6 +59,8 @@ #include #include +#include "events_internal.h" + /* * This lock protects updates to the following mapping and reference-count * arrays. The lock does not need to be acquired to read the mapping tables. @@ -73,72 +75,12 @@ static DEFINE_PER_CPU(int [NR_VIRQS], virq_to_irq) = {[0 ... NR_VIRQS-1] = -1}; /* IRQ <-> IPI mapping */ static DEFINE_PER_CPU(int [XEN_NR_IPIS], ipi_to_irq) = {[0 ... XEN_NR_IPIS-1] = -1}; -/* Interrupt types. 
*/ -enum xen_irq_type { - IRQT_UNBOUND = 0, - IRQT_PIRQ, - IRQT_VIRQ, - IRQT_IPI, - IRQT_EVTCHN -}; - -/* - * Packed IRQ information: - * type - enum xen_irq_type - * event channel - irq->event channel mapping - * cpu - cpu this event channel is bound to - * index - type-specific information: - * PIRQ - physical IRQ, GSI, flags, and owner domain - * VIRQ - virq number - * IPI - IPI vector - * EVTCHN - - */ -struct irq_info { - struct list_head list; - int refcnt; - enum xen_irq_type type; /* type */ - unsigned irq; - unsigned short evtchn; /* event channel */ - unsigned short cpu; /* cpu bound */ - - union { - unsigned short virq; - enum ipi_vector ipi; - struct { - unsigned short pirq; - unsigned short gsi; - unsigned char flags; - uint16_t domid; - } pirq; - } u; -}; -#define PIRQ_NEEDS_EOI (1 << 0) -#define PIRQ_SHAREABLE (1 << 1) - -static int *evtchn_to_irq; +int *evtchn_to_irq; #ifdef CONFIG_X86 static unsigned long *pirq_eoi_map; #endif static bool (*pirq_needs_eoi)(unsigned irq); -/* - * Note sizeof(xen_ulong_t) can be more than sizeof(unsigned long). Be - * careful to only use bitops which allow for this (e.g - * test_bit/find_first_bit and friends but not __ffs) and to pass - * BITS_PER_EVTCHN_WORD as the bitmask length. - */ -#define BITS_PER_EVTCHN_WORD (sizeof(xen_ulong_t)*8) -/* - * Make a bitmask (i.e. unsigned long *) of a xen_ulong_t - * array. Primarily to avoid long lines (hence the terse name). - */ -#define BM(x) (unsigned long *)(x) -/* Find the first set bit in a evtchn mask */ -#define EVTCHN_FIRST_BIT(w) find_first_bit(BM(&(w)), BITS_PER_EVTCHN_WORD) - -static DEFINE_PER_CPU(xen_ulong_t [NR_EVENT_CHANNELS/BITS_PER_EVTCHN_WORD], - cpu_evtchn_mask); - /* Xen will never allocate port zero for any purpose. 
*/ #define VALID_EVTCHN(chn) ((chn) != 0) @@ -149,7 +91,7 @@ static void enable_dynirq(struct irq_data *data); static void disable_dynirq(struct irq_data *data); /* Get info for IRQ */ -static struct irq_info *info_for_irq(unsigned irq) +struct irq_info *info_for_irq(unsigned irq) { return irq_get_handler_data(irq); } @@ -230,7 +172,7 @@ static void xen_irq_info_pirq_init(unsigned irq, /* * Accessors for packed IRQ information. */ -static unsigned int evtchn_from_irq(unsigned irq) +unsigned int evtchn_from_irq(unsigned irq) { if (unlikely(WARN(irq < 0 || irq >= nr_irqs, "Invalid irq %d!\n", irq))) return 0; @@ -244,6 +186,11 @@ unsigned irq_from_evtchn(unsigned int evtchn) } EXPORT_SYMBOL_GPL(irq_from_evtchn); +int irq_from_virq(unsigned int cpu, unsigned int virq) +{ + return per_cpu(virq_to_irq, cpu)[virq]; +} + static enum ipi_vector ipi_from_irq(unsigned irq) { struct irq_info *info = info_for_irq(irq); @@ -279,12 +226,12 @@ static enum xen_irq_type type_from_irq(unsigned irq) return info_for_irq(irq)->type; } -static unsigned cpu_from_irq(unsigned irq) +unsigned cpu_from_irq(unsigned irq) { return info_for_irq(irq)->cpu; } -static unsigned int cpu_from_evtchn(unsigned int evtchn) +unsigned int cpu_from_evtchn(unsigned int evtchn) { int irq = evtchn_to_irq[evtchn]; unsigned ret = 0; @@ -310,55 +257,21 @@ static bool pirq_needs_eoi_flag(unsigned irq) return info->u.pirq.flags & PIRQ_NEEDS_EOI; } -static inline xen_ulong_t active_evtchns(unsigned int cpu, - struct shared_info *sh, - unsigned int idx) -{ - return sh->evtchn_pending[idx] & - per_cpu(cpu_evtchn_mask, cpu)[idx] & - ~sh->evtchn_mask[idx]; -} - static void bind_evtchn_to_cpu(unsigned int chn, unsigned int cpu) { int irq = evtchn_to_irq[chn]; + struct irq_info *info = info_for_irq(irq); BUG_ON(irq == -1); #ifdef CONFIG_SMP cpumask_copy(irq_to_desc(irq)->irq_data.affinity, cpumask_of(cpu)); #endif - clear_bit(chn, BM(per_cpu(cpu_evtchn_mask, cpu_from_irq(irq)))); - set_bit(chn, 
BM(per_cpu(cpu_evtchn_mask, cpu))); + xen_evtchn_port_bind_to_cpu(info, cpu); - info_for_irq(irq)->cpu = cpu; -} - -static inline void clear_evtchn(int port) -{ - struct shared_info *s = HYPERVISOR_shared_info; - sync_clear_bit(port, BM(&s->evtchn_pending[0])); -} - -static inline void set_evtchn(int port) -{ - struct shared_info *s = HYPERVISOR_shared_info; - sync_set_bit(port, BM(&s->evtchn_pending[0])); -} - -static inline int test_evtchn(int port) -{ - struct shared_info *s = HYPERVISOR_shared_info; - return sync_test_bit(port, BM(&s->evtchn_pending[0])); -} - -static inline int test_and_set_mask(int port) -{ - struct shared_info *s = HYPERVISOR_shared_info; - return sync_test_and_set_bit(port, BM(&s->evtchn_mask[0])); + info->cpu = cpu; } - /** * notify_remote_via_irq - send event to remote end of event channel via irq * @irq: irq of event channel to send event to @@ -376,63 +289,6 @@ void notify_remote_via_irq(int irq) } EXPORT_SYMBOL_GPL(notify_remote_via_irq); -static void mask_evtchn(int port) -{ - struct shared_info *s = HYPERVISOR_shared_info; - sync_set_bit(port, BM(&s->evtchn_mask[0])); -} - -static void unmask_evtchn(int port) -{ - struct shared_info *s = HYPERVISOR_shared_info; - unsigned int cpu = get_cpu(); - int do_hypercall = 0, evtchn_pending = 0; - - BUG_ON(!irqs_disabled()); - - if (unlikely((cpu != cpu_from_evtchn(port)))) - do_hypercall = 1; - else { - /* - * Need to clear the mask before checking pending to - * avoid a race with an event becoming pending. - * - * EVTCHNOP_unmask will only trigger an upcall if the - * mask bit was set, so if a hypercall is needed - * remask the event. 
- */ - sync_clear_bit(port, BM(&s->evtchn_mask[0])); - evtchn_pending = sync_test_bit(port, BM(&s->evtchn_pending[0])); - - if (unlikely(evtchn_pending && xen_hvm_domain())) { - sync_set_bit(port, BM(&s->evtchn_mask[0])); - do_hypercall = 1; - } - } - - /* Slow path (hypercall) if this is a non-local port or if this is - * an hvm domain and an event is pending (hvm domains don't have - * their own implementation of irq_enable). */ - if (do_hypercall) { - struct evtchn_unmask unmask = { .port = port }; - (void)HYPERVISOR_event_channel_op(EVTCHNOP_unmask, &unmask); - } else { - struct vcpu_info *vcpu_info = __this_cpu_read(xen_vcpu); - - /* - * The following is basically the equivalent of - * 'hw_resend_irq'. Just like a real IO-APIC we 'lose - * the interrupt edge' if the channel is masked. - */ - if (evtchn_pending && - !sync_test_and_set_bit(port / BITS_PER_EVTCHN_WORD, - BM(&vcpu_info->evtchn_pending_sel))) - vcpu_info->evtchn_upcall_pending = 1; - } - - put_cpu(); -} - static void xen_irq_init(unsigned irq) { struct irq_info *info; @@ -1216,222 +1072,21 @@ void xen_send_IPI_one(unsigned int cpu, enum ipi_vector vector) notify_remote_via_irq(irq); } -irqreturn_t xen_debug_interrupt(int irq, void *dev_id) -{ - struct shared_info *sh = HYPERVISOR_shared_info; - int cpu = smp_processor_id(); - xen_ulong_t *cpu_evtchn = per_cpu(cpu_evtchn_mask, cpu); - int i; - unsigned long flags; - static DEFINE_SPINLOCK(debug_lock); - struct vcpu_info *v; - - spin_lock_irqsave(&debug_lock, flags); - - printk("\nvcpu %d\n ", cpu); - - for_each_online_cpu(i) { - int pending; - v = per_cpu(xen_vcpu, i); - pending = (get_irq_regs() && i == cpu) - ? 
xen_irqs_disabled(get_irq_regs()) - : v->evtchn_upcall_mask; - printk("%d: masked=%d pending=%d event_sel %0*"PRI_xen_ulong"\n ", i, - pending, v->evtchn_upcall_pending, - (int)(sizeof(v->evtchn_pending_sel)*2), - v->evtchn_pending_sel); - } - v = per_cpu(xen_vcpu, cpu); - - printk("\npending:\n "); - for (i = ARRAY_SIZE(sh->evtchn_pending)-1; i >= 0; i--) - printk("%0*"PRI_xen_ulong"%s", - (int)sizeof(sh->evtchn_pending[0])*2, - sh->evtchn_pending[i], - i % 8 == 0 ? "\n " : " "); - printk("\nglobal mask:\n "); - for (i = ARRAY_SIZE(sh->evtchn_mask)-1; i >= 0; i--) - printk("%0*"PRI_xen_ulong"%s", - (int)(sizeof(sh->evtchn_mask[0])*2), - sh->evtchn_mask[i], - i % 8 == 0 ? "\n " : " "); - - printk("\nglobally unmasked:\n "); - for (i = ARRAY_SIZE(sh->evtchn_mask)-1; i >= 0; i--) - printk("%0*"PRI_xen_ulong"%s", - (int)(sizeof(sh->evtchn_mask[0])*2), - sh->evtchn_pending[i] & ~sh->evtchn_mask[i], - i % 8 == 0 ? "\n " : " "); - - printk("\nlocal cpu%d mask:\n ", cpu); - for (i = (NR_EVENT_CHANNELS/BITS_PER_EVTCHN_WORD)-1; i >= 0; i--) - printk("%0*"PRI_xen_ulong"%s", (int)(sizeof(cpu_evtchn[0])*2), - cpu_evtchn[i], - i % 8 == 0 ? "\n " : " "); - - printk("\nlocally unmasked:\n "); - for (i = ARRAY_SIZE(sh->evtchn_mask)-1; i >= 0; i--) { - xen_ulong_t pending = sh->evtchn_pending[i] - & ~sh->evtchn_mask[i] - & cpu_evtchn[i]; - printk("%0*"PRI_xen_ulong"%s", - (int)(sizeof(sh->evtchn_mask[0])*2), - pending, i % 8 == 0 ? "\n " : " "); - } - - printk("\npending list:\n"); - for (i = 0; i < NR_EVENT_CHANNELS; i++) { - if (sync_test_bit(i, BM(sh->evtchn_pending))) { - int word_idx = i / BITS_PER_EVTCHN_WORD; - printk(" %d: event %d -> irq %d%s%s%s\n", - cpu_from_evtchn(i), i, - evtchn_to_irq[i], - sync_test_bit(word_idx, BM(&v->evtchn_pending_sel)) - ? "" : " l2-clear", - !sync_test_bit(i, BM(sh->evtchn_mask)) - ? "" : " globally-masked", - sync_test_bit(i, BM(cpu_evtchn)) - ? 
"" : " locally-masked"); - } - } - - spin_unlock_irqrestore(&debug_lock, flags); - - return IRQ_HANDLED; -} - static DEFINE_PER_CPU(unsigned, xed_nesting_count); -static DEFINE_PER_CPU(unsigned int, current_word_idx); -static DEFINE_PER_CPU(unsigned int, current_bit_idx); -/* - * Mask out the i least significant bits of w - */ -#define MASK_LSBS(w, i) (w & ((~((xen_ulong_t)0UL)) << i)) - -/* - * Search the CPUs pending events bitmasks. For each one found, map - * the event number to an irq, and feed it into do_IRQ() for - * handling. - * - * Xen uses a two-level bitmap to speed searching. The first level is - * a bitset of words which contain pending event bits. The second - * level is a bitset of pending events themselves. - */ static void __xen_evtchn_do_upcall(void) { - int start_word_idx, start_bit_idx; - int word_idx, bit_idx; - int i, irq; - int cpu = get_cpu(); - struct shared_info *s = HYPERVISOR_shared_info; struct vcpu_info *vcpu_info = __this_cpu_read(xen_vcpu); + int cpu = get_cpu(); unsigned count; do { - xen_ulong_t pending_words; - xen_ulong_t pending_bits; - struct irq_desc *desc; - vcpu_info->evtchn_upcall_pending = 0; if (__this_cpu_inc_return(xed_nesting_count) - 1) goto out; - /* - * Master flag must be cleared /before/ clearing - * selector flag. xchg_xen_ulong must contain an - * appropriate barrier. 
- */ - if ((irq = per_cpu(virq_to_irq, cpu)[VIRQ_TIMER]) != -1) { - int evtchn = evtchn_from_irq(irq); - word_idx = evtchn / BITS_PER_LONG; - pending_bits = evtchn % BITS_PER_LONG; - if (active_evtchns(cpu, s, word_idx) & (1ULL << pending_bits)) { - desc = irq_to_desc(irq); - if (desc) - generic_handle_irq_desc(irq, desc); - } - } - - pending_words = xchg_xen_ulong(&vcpu_info->evtchn_pending_sel, 0); - - start_word_idx = __this_cpu_read(current_word_idx); - start_bit_idx = __this_cpu_read(current_bit_idx); - - word_idx = start_word_idx; - - for (i = 0; pending_words != 0; i++) { - xen_ulong_t words; - - words = MASK_LSBS(pending_words, word_idx); - - /* - * If we masked out all events, wrap to beginning. - */ - if (words == 0) { - word_idx = 0; - bit_idx = 0; - continue; - } - word_idx = EVTCHN_FIRST_BIT(words); - - pending_bits = active_evtchns(cpu, s, word_idx); - bit_idx = 0; /* usually scan entire word from start */ - /* - * We scan the starting word in two parts. - * - * 1st time: start in the middle, scanning the - * upper bits. - * - * 2nd time: scan the whole word (not just the - * parts skipped in the first pass) -- if an - * event in the previously scanned bits is - * pending again it would just be scanned on - * the next loop anyway. - */ - if (word_idx == start_word_idx) { - if (i == 0) - bit_idx = start_bit_idx; - } - - do { - xen_ulong_t bits; - int port; - - bits = MASK_LSBS(pending_bits, bit_idx); - - /* If we masked out all events, move on. */ - if (bits == 0) - break; - - bit_idx = EVTCHN_FIRST_BIT(bits); - - /* Process port. */ - port = (word_idx * BITS_PER_EVTCHN_WORD) + bit_idx; - irq = evtchn_to_irq[port]; - - if (irq != -1) { - desc = irq_to_desc(irq); - if (desc) - generic_handle_irq_desc(irq, desc); - } - - bit_idx = (bit_idx + 1) % BITS_PER_EVTCHN_WORD; - - /* Next caller starts at last processed + 1 */ - __this_cpu_write(current_word_idx, - bit_idx ? 
word_idx : - (word_idx+1) % BITS_PER_EVTCHN_WORD); - __this_cpu_write(current_bit_idx, bit_idx); - } while (bit_idx != 0); - - /* Scan start_l1i twice; all others once. */ - if ((word_idx != start_word_idx) || (i != 0)) - pending_words &= ~(1UL << word_idx); - - word_idx = (word_idx + 1) % BITS_PER_EVTCHN_WORD; - } + xen_evtchn_handle_events(cpu); BUG_ON(!irqs_disabled()); diff --git a/drivers/xen/events/events_internal.h b/drivers/xen/events/events_internal.h new file mode 100644 index 000000000000..79ac70bbbd26 --- /dev/null +++ b/drivers/xen/events/events_internal.h @@ -0,0 +1,74 @@ +/* + * Xen Event Channels (internal header) + * + * Copyright (C) 2013 Citrix Systems R&D Ltd. + * + * This source code is licensed under the GNU General Public License, + * Version 2 or later. See the file COPYING for more details. + */ +#ifndef __EVENTS_INTERNAL_H__ +#define __EVENTS_INTERNAL_H__ + +/* Interrupt types. */ +enum xen_irq_type { + IRQT_UNBOUND = 0, + IRQT_PIRQ, + IRQT_VIRQ, + IRQT_IPI, + IRQT_EVTCHN +}; + +/* + * Packed IRQ information: + * type - enum xen_irq_type + * event channel - irq->event channel mapping + * cpu - cpu this event channel is bound to + * index - type-specific information: + * PIRQ - vector, with MSB being "needs EOI", or physical IRQ of the HVM + * guest, or GSI (real passthrough IRQ) of the device.
+ * VIRQ - virq number + * IPI - IPI vector + * EVTCHN - + */ +struct irq_info { + struct list_head list; + int refcnt; + enum xen_irq_type type; /* type */ + unsigned irq; + unsigned short evtchn; /* event channel */ + unsigned short cpu; /* cpu bound */ + + union { + unsigned short virq; + enum ipi_vector ipi; + struct { + unsigned short pirq; + unsigned short gsi; + unsigned char vector; + unsigned char flags; + uint16_t domid; + } pirq; + } u; +}; + +#define PIRQ_NEEDS_EOI (1 << 0) +#define PIRQ_SHAREABLE (1 << 1) + +extern int *evtchn_to_irq; + +struct irq_info *info_for_irq(unsigned irq); +unsigned cpu_from_irq(unsigned irq); +unsigned cpu_from_evtchn(unsigned int evtchn); + +void xen_evtchn_port_bind_to_cpu(struct irq_info *info, int cpu); + +void clear_evtchn(int port); +void set_evtchn(int port); +int test_evtchn(int port); +int test_and_set_mask(int port); +void mask_evtchn(int port); +void unmask_evtchn(int port); + +void xen_evtchn_handle_events(int cpu); + +#endif /* #ifndef __EVENTS_INTERNAL_H__ */ -- cgit v1.2.1 From ab9a1cca3d172876ae9d5edb63abce7986045597 Mon Sep 17 00:00:00 2001 From: David Vrabel Date: Thu, 14 Mar 2013 12:49:19 +0000 Subject: xen/events: add struct evtchn_ops for the low-level port operations evtchn_ops contains the low-level operations that access the shared data structures. This allows alternate ABIs to be supported. 
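As a rough illustration of the indirection described above (a user-space sketch only, not the code in this patch): generic code calls through a pointer to a table of function pointers, and each ABI supplies its own table. Every demo_* name below is hypothetical, and a single 64-bit word stands in for the shared pending bitmap.

```c
/* User-space sketch of an ops-table indirection; NOT kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_NR_PORTS 64u

/* Hypothetical, cut-down analogue of struct evtchn_ops. */
struct demo_evtchn_ops {
	void (*set_pending)(unsigned port);
	void (*clear_pending)(unsigned port);
	bool (*is_pending)(unsigned port);
};

/* One backend: a single word standing in for the shared pending bitmap. */
static unsigned long long demo_pending;

static void demo_2l_set_pending(unsigned port)
{
	demo_pending |= 1ULL << port;
}

static void demo_2l_clear_pending(unsigned port)
{
	demo_pending &= ~(1ULL << port);
}

static bool demo_2l_is_pending(unsigned port)
{
	return (demo_pending >> port) & 1ULL;
}

static const struct demo_evtchn_ops demo_ops_2l = {
	.set_pending   = demo_2l_set_pending,
	.clear_pending = demo_2l_clear_pending,
	.is_pending    = demo_2l_is_pending,
};

/* Selected once at init; generic code only ever uses this pointer. */
static const struct demo_evtchn_ops *demo_ops;

void demo_evtchn_init(void)
{
	demo_ops = &demo_ops_2l;
}

/* Generic wrappers: callers never touch the backend's data directly. */
void demo_set_evtchn(unsigned port)
{
	assert(demo_ops != NULL && port < DEMO_NR_PORTS);
	demo_ops->set_pending(port);
}

void demo_clear_evtchn(unsigned port)
{
	assert(demo_ops != NULL && port < DEMO_NR_PORTS);
	demo_ops->clear_pending(port);
}

bool demo_test_evtchn(unsigned port)
{
	assert(demo_ops != NULL && port < DEMO_NR_PORTS);
	return demo_ops->is_pending(port);
}
```

In this sketch a second ABI would only need to define another demo_evtchn_ops table and point demo_ops at it during init; the wrappers and their callers would stay unchanged.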
Signed-off-by: David Vrabel Reviewed-by: Konrad Rzeszutek Wilk Reviewed-by: Boris Ostrovsky --- drivers/xen/events/events_2l.c | 33 ++++++++++++++----- drivers/xen/events/events_base.c | 4 +++ drivers/xen/events/events_internal.h | 63 +++++++++++++++++++++++++++++++----- 3 files changed, 84 insertions(+), 16 deletions(-) (limited to 'drivers/xen') diff --git a/drivers/xen/events/events_2l.c b/drivers/xen/events/events_2l.c index a77e98d025fa..e55677cca745 100644 --- a/drivers/xen/events/events_2l.c +++ b/drivers/xen/events/events_2l.c @@ -41,43 +41,43 @@ static DEFINE_PER_CPU(xen_ulong_t [NR_EVENT_CHANNELS/BITS_PER_EVTCHN_WORD], cpu_evtchn_mask); -void xen_evtchn_port_bind_to_cpu(struct irq_info *info, int cpu) +static void evtchn_2l_bind_to_cpu(struct irq_info *info, unsigned cpu) { clear_bit(info->evtchn, BM(per_cpu(cpu_evtchn_mask, info->cpu))); set_bit(info->evtchn, BM(per_cpu(cpu_evtchn_mask, cpu))); } -void clear_evtchn(int port) +static void evtchn_2l_clear_pending(unsigned port) { struct shared_info *s = HYPERVISOR_shared_info; sync_clear_bit(port, BM(&s->evtchn_pending[0])); } -void set_evtchn(int port) +static void evtchn_2l_set_pending(unsigned port) { struct shared_info *s = HYPERVISOR_shared_info; sync_set_bit(port, BM(&s->evtchn_pending[0])); } -int test_evtchn(int port) +static bool evtchn_2l_is_pending(unsigned port) { struct shared_info *s = HYPERVISOR_shared_info; return sync_test_bit(port, BM(&s->evtchn_pending[0])); } -int test_and_set_mask(int port) +static bool evtchn_2l_test_and_set_mask(unsigned port) { struct shared_info *s = HYPERVISOR_shared_info; return sync_test_and_set_bit(port, BM(&s->evtchn_mask[0])); } -void mask_evtchn(int port) +static void evtchn_2l_mask(unsigned port) { struct shared_info *s = HYPERVISOR_shared_info; sync_set_bit(port, BM(&s->evtchn_mask[0])); } -void unmask_evtchn(int port) +static void evtchn_2l_unmask(unsigned port) { struct shared_info *s = HYPERVISOR_shared_info; unsigned int cpu = get_cpu(); @@ -153,7 
+153,7 @@ static inline xen_ulong_t active_evtchns(unsigned int cpu, * a bitset of words which contain pending event bits. The second * level is a bitset of pending events themselves. */ -void xen_evtchn_handle_events(int cpu) +static void evtchn_2l_handle_events(unsigned cpu) { int irq; xen_ulong_t pending_words; @@ -346,3 +346,20 @@ irqreturn_t xen_debug_interrupt(int irq, void *dev_id) return IRQ_HANDLED; } + +static const struct evtchn_ops evtchn_ops_2l = { + .bind_to_cpu = evtchn_2l_bind_to_cpu, + .clear_pending = evtchn_2l_clear_pending, + .set_pending = evtchn_2l_set_pending, + .is_pending = evtchn_2l_is_pending, + .test_and_set_mask = evtchn_2l_test_and_set_mask, + .mask = evtchn_2l_mask, + .unmask = evtchn_2l_unmask, + .handle_events = evtchn_2l_handle_events, +}; + +void __init xen_evtchn_2l_init(void) +{ + pr_info("Using 2-level ABI\n"); + evtchn_ops = &evtchn_ops_2l; +} diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c index 8771b740e30f..7c7b744cd13d 100644 --- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -61,6 +61,8 @@ #include "events_internal.h" +const struct evtchn_ops *evtchn_ops; + /* * This lock protects updates to the following mapping and reference-count * arrays. The lock does not need to be acquired to read the mapping tables. 
@@ -1523,6 +1525,8 @@ void __init xen_init_IRQ(void) { int i; + xen_evtchn_2l_init(); + evtchn_to_irq = kcalloc(NR_EVENT_CHANNELS, sizeof(*evtchn_to_irq), GFP_KERNEL); BUG_ON(!evtchn_to_irq); diff --git a/drivers/xen/events/events_internal.h b/drivers/xen/events/events_internal.h index 79ac70bbbd26..ba8142f0c635 100644 --- a/drivers/xen/events/events_internal.h +++ b/drivers/xen/events/events_internal.h @@ -54,21 +54,68 @@ struct irq_info { #define PIRQ_NEEDS_EOI (1 << 0) #define PIRQ_SHAREABLE (1 << 1) +struct evtchn_ops { + void (*bind_to_cpu)(struct irq_info *info, unsigned cpu); + + void (*clear_pending)(unsigned port); + void (*set_pending)(unsigned port); + bool (*is_pending)(unsigned port); + bool (*test_and_set_mask)(unsigned port); + void (*mask)(unsigned port); + void (*unmask)(unsigned port); + + void (*handle_events)(unsigned cpu); +}; + +extern const struct evtchn_ops *evtchn_ops; + extern int *evtchn_to_irq; struct irq_info *info_for_irq(unsigned irq); unsigned cpu_from_irq(unsigned irq); unsigned cpu_from_evtchn(unsigned int evtchn); -void xen_evtchn_port_bind_to_cpu(struct irq_info *info, int cpu); +static inline void xen_evtchn_port_bind_to_cpu(struct irq_info *info, + unsigned cpu) +{ + evtchn_ops->bind_to_cpu(info, cpu); +} + +static inline void clear_evtchn(unsigned port) +{ + evtchn_ops->clear_pending(port); +} + +static inline void set_evtchn(unsigned port) +{ + evtchn_ops->set_pending(port); +} + +static inline bool test_evtchn(unsigned port) +{ + return evtchn_ops->is_pending(port); +} + +static inline bool test_and_set_mask(unsigned port) +{ + return evtchn_ops->test_and_set_mask(port); +} + +static inline void mask_evtchn(unsigned port) +{ + return evtchn_ops->mask(port); +} + +static inline void unmask_evtchn(unsigned port) +{ + return evtchn_ops->unmask(port); +} -void clear_evtchn(int port); -void set_evtchn(int port); -int test_evtchn(int port); -int test_and_set_mask(int port); -void mask_evtchn(int port); -void unmask_evtchn(int 
port); +static inline void xen_evtchn_handle_events(unsigned cpu) +{ + return evtchn_ops->handle_events(cpu); +} -void xen_evtchn_handle_events(int cpu); +void xen_evtchn_2l_init(void); #endif /* #ifndef __EVENTS_INTERNAL_H__ */ -- cgit v1.2.1 From 96d4c5881806ebb993a3d84991af9c96fa9cd576 Mon Sep 17 00:00:00 2001 From: David Vrabel Date: Mon, 18 Mar 2013 15:50:17 +0000 Subject: xen/events: allow setup of irq_info to fail The FIFO-based event ABI requires additional setup of newly bound events (it may need to expand the event array) and this setup may fail. xen_irq_info_common_init() is a useful place to put this setup so allow this call to fail. This call and the other similar calls are renamed to be *_setup() to reflect that they may now fail. This failure can only occur with new event channels not on rebind. Signed-off-by: David Vrabel Reviewed-by: Konrad Rzeszutek Wilk Reviewed-by: Boris Ostrovsky --- drivers/xen/events/events_base.c | 156 +++++++++++++++++++++++---------------- 1 file changed, 91 insertions(+), 65 deletions(-) (limited to 'drivers/xen') diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c index 7c7b744cd13d..4f7d94abe82c 100644 --- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -99,7 +99,7 @@ struct irq_info *info_for_irq(unsigned irq) } /* Constructors for packed IRQ information. 
*/ -static void xen_irq_info_common_init(struct irq_info *info, +static int xen_irq_info_common_setup(struct irq_info *info, unsigned irq, enum xen_irq_type type, unsigned short evtchn, @@ -116,45 +116,47 @@ static void xen_irq_info_common_init(struct irq_info *info, evtchn_to_irq[evtchn] = irq; irq_clear_status_flags(irq, IRQ_NOREQUEST|IRQ_NOAUTOEN); + + return 0; } -static void xen_irq_info_evtchn_init(unsigned irq, +static int xen_irq_info_evtchn_setup(unsigned irq, unsigned short evtchn) { struct irq_info *info = info_for_irq(irq); - xen_irq_info_common_init(info, irq, IRQT_EVTCHN, evtchn, 0); + return xen_irq_info_common_setup(info, irq, IRQT_EVTCHN, evtchn, 0); } -static void xen_irq_info_ipi_init(unsigned cpu, +static int xen_irq_info_ipi_setup(unsigned cpu, unsigned irq, unsigned short evtchn, enum ipi_vector ipi) { struct irq_info *info = info_for_irq(irq); - xen_irq_info_common_init(info, irq, IRQT_IPI, evtchn, 0); - info->u.ipi = ipi; per_cpu(ipi_to_irq, cpu)[ipi] = irq; + + return xen_irq_info_common_setup(info, irq, IRQT_IPI, evtchn, 0); } -static void xen_irq_info_virq_init(unsigned cpu, +static int xen_irq_info_virq_setup(unsigned cpu, unsigned irq, unsigned short evtchn, unsigned short virq) { struct irq_info *info = info_for_irq(irq); - xen_irq_info_common_init(info, irq, IRQT_VIRQ, evtchn, 0); - info->u.virq = virq; per_cpu(virq_to_irq, cpu)[virq] = irq; + + return xen_irq_info_common_setup(info, irq, IRQT_VIRQ, evtchn, 0); } -static void xen_irq_info_pirq_init(unsigned irq, +static int xen_irq_info_pirq_setup(unsigned irq, unsigned short evtchn, unsigned short pirq, unsigned short gsi, @@ -163,12 +165,12 @@ static void xen_irq_info_pirq_init(unsigned irq, { struct irq_info *info = info_for_irq(irq); - xen_irq_info_common_init(info, irq, IRQT_PIRQ, evtchn, 0); - info->u.pirq.pirq = pirq; info->u.pirq.gsi = gsi; info->u.pirq.domid = domid; info->u.pirq.flags = flags; + + return xen_irq_info_common_setup(info, irq, IRQT_PIRQ, evtchn, 0); } /* @@ 
-521,6 +523,47 @@ int xen_irq_from_gsi(unsigned gsi) } EXPORT_SYMBOL_GPL(xen_irq_from_gsi); +static void __unbind_from_irq(unsigned int irq) +{ + struct evtchn_close close; + int evtchn = evtchn_from_irq(irq); + struct irq_info *info = irq_get_handler_data(irq); + + if (info->refcnt > 0) { + info->refcnt--; + if (info->refcnt != 0) + return; + } + + if (VALID_EVTCHN(evtchn)) { + close.port = evtchn; + if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0) + BUG(); + + switch (type_from_irq(irq)) { + case IRQT_VIRQ: + per_cpu(virq_to_irq, cpu_from_evtchn(evtchn)) + [virq_from_irq(irq)] = -1; + break; + case IRQT_IPI: + per_cpu(ipi_to_irq, cpu_from_evtchn(evtchn)) + [ipi_from_irq(irq)] = -1; + break; + default: + break; + } + + /* Closed ports are implicitly re-bound to VCPU0. */ + bind_evtchn_to_cpu(evtchn, 0); + + evtchn_to_irq[evtchn] = -1; + } + + BUG_ON(info_for_irq(irq)->type == IRQT_UNBOUND); + + xen_free_irq(irq); +} + /* * Do not make any assumptions regarding the relationship between the * IRQ number returned here and the Xen pirq argument. @@ -536,6 +579,7 @@ int xen_bind_pirq_gsi_to_irq(unsigned gsi, { int irq = -1; struct physdev_irq irq_op; + int ret; mutex_lock(&irq_mapping_update_lock); @@ -563,8 +607,13 @@ int xen_bind_pirq_gsi_to_irq(unsigned gsi, goto out; } - xen_irq_info_pirq_init(irq, 0, pirq, gsi, DOMID_SELF, + ret = xen_irq_info_pirq_setup(irq, 0, pirq, gsi, DOMID_SELF, shareable ? 
PIRQ_SHAREABLE : 0); + if (ret < 0) { + __unbind_from_irq(irq); + irq = ret; + goto out; + } pirq_query_unmask(irq); /* We try to use the handler with the appropriate semantic for the @@ -624,7 +673,9 @@ int xen_bind_pirq_msi_to_irq(struct pci_dev *dev, struct msi_desc *msidesc, irq_set_chip_and_handler_name(irq, &xen_pirq_chip, handle_edge_irq, name); - xen_irq_info_pirq_init(irq, 0, pirq, 0, domid, 0); + ret = xen_irq_info_pirq_setup(irq, 0, pirq, 0, domid, 0); + if (ret < 0) + goto error_irq; ret = irq_set_msi_desc(irq, msidesc); if (ret < 0) goto error_irq; @@ -632,8 +683,8 @@ out: mutex_unlock(&irq_mapping_update_lock); return irq; error_irq: + __unbind_from_irq(irq); mutex_unlock(&irq_mapping_update_lock); - xen_free_irq(irq); return ret; } #endif @@ -703,9 +754,11 @@ int xen_pirq_from_irq(unsigned irq) return pirq_from_irq(irq); } EXPORT_SYMBOL_GPL(xen_pirq_from_irq); + int bind_evtchn_to_irq(unsigned int evtchn) { int irq; + int ret; mutex_lock(&irq_mapping_update_lock); @@ -719,7 +772,12 @@ int bind_evtchn_to_irq(unsigned int evtchn) irq_set_chip_and_handler_name(irq, &xen_dynamic_chip, handle_edge_irq, "event"); - xen_irq_info_evtchn_init(irq, evtchn); + ret = xen_irq_info_evtchn_setup(irq, evtchn); + if (ret < 0) { + __unbind_from_irq(irq); + irq = ret; + goto out; + } } else { struct irq_info *info = info_for_irq(irq); WARN_ON(info == NULL || info->type != IRQT_EVTCHN); @@ -736,6 +794,7 @@ static int bind_ipi_to_irq(unsigned int ipi, unsigned int cpu) { struct evtchn_bind_ipi bind_ipi; int evtchn, irq; + int ret; mutex_lock(&irq_mapping_update_lock); @@ -755,8 +814,12 @@ static int bind_ipi_to_irq(unsigned int ipi, unsigned int cpu) BUG(); evtchn = bind_ipi.port; - xen_irq_info_ipi_init(cpu, irq, evtchn, ipi); - + ret = xen_irq_info_ipi_setup(cpu, irq, evtchn, ipi); + if (ret < 0) { + __unbind_from_irq(irq); + irq = ret; + goto out; + } bind_evtchn_to_cpu(evtchn, cpu); } else { struct irq_info *info = info_for_irq(irq); @@ -835,7 +898,12 @@ int 
bind_virq_to_irq(unsigned int virq, unsigned int cpu) evtchn = ret; } - xen_irq_info_virq_init(cpu, irq, evtchn, virq); + ret = xen_irq_info_virq_setup(cpu, irq, evtchn, virq); + if (ret < 0) { + __unbind_from_irq(irq); + irq = ret; + goto out; + } bind_evtchn_to_cpu(evtchn, cpu); } else { @@ -851,50 +919,8 @@ out: static void unbind_from_irq(unsigned int irq) { - struct evtchn_close close; - int evtchn = evtchn_from_irq(irq); - struct irq_info *info = irq_get_handler_data(irq); - - if (WARN_ON(!info)) - return; - mutex_lock(&irq_mapping_update_lock); - - if (info->refcnt > 0) { - info->refcnt--; - if (info->refcnt != 0) - goto done; - } - - if (VALID_EVTCHN(evtchn)) { - close.port = evtchn; - if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0) - BUG(); - - switch (type_from_irq(irq)) { - case IRQT_VIRQ: - per_cpu(virq_to_irq, cpu_from_evtchn(evtchn)) - [virq_from_irq(irq)] = -1; - break; - case IRQT_IPI: - per_cpu(ipi_to_irq, cpu_from_evtchn(evtchn)) - [ipi_from_irq(irq)] = -1; - break; - default: - break; - } - - /* Closed ports are implicitly re-bound to VCPU0. */ - bind_evtchn_to_cpu(evtchn, 0); - - evtchn_to_irq[evtchn] = -1; - } - - BUG_ON(info_for_irq(irq)->type == IRQT_UNBOUND); - - xen_free_irq(irq); - - done: + __unbind_from_irq(irq); mutex_unlock(&irq_mapping_update_lock); } @@ -1142,7 +1168,7 @@ void rebind_evtchn_irq(int evtchn, int irq) so there should be a proper type */ BUG_ON(info->type == IRQT_UNBOUND); - xen_irq_info_evtchn_init(irq, evtchn); + (void)xen_irq_info_evtchn_setup(irq, evtchn); mutex_unlock(&irq_mapping_update_lock); @@ -1317,7 +1343,7 @@ static void restore_cpu_virqs(unsigned int cpu) evtchn = bind_virq.port; /* Record the new mapping. */ - xen_irq_info_virq_init(cpu, irq, evtchn, virq); + (void)xen_irq_info_virq_setup(cpu, irq, evtchn, virq); bind_evtchn_to_cpu(evtchn, cpu); } } @@ -1341,7 +1367,7 @@ static void restore_cpu_ipis(unsigned int cpu) evtchn = bind_ipi.port; /* Record the new mapping. 
*/ - xen_irq_info_ipi_init(cpu, irq, evtchn, ipi); + (void)xen_irq_info_ipi_setup(cpu, irq, evtchn, ipi); bind_evtchn_to_cpu(evtchn, cpu); } } -- cgit v1.2.1 From 083858758f67bb20ef6be5bc8442be91cca8ee2d Mon Sep 17 00:00:00 2001 From: David Vrabel Date: Mon, 18 Mar 2013 16:54:57 +0000 Subject: xen/events: add a evtchn_op for port setup Add a hook for port-specific setup and call it from xen_irq_info_common_setup(). The FIFO-based ABIs may need to perform additional setup (expanding the event array) before a bound event channel can start to receive events. Signed-off-by: David Vrabel Reviewed-by: Konrad Rzeszutek Wilk Reviewed-by: Boris Ostrovsky --- drivers/xen/events/events_base.c | 2 +- drivers/xen/events/events_internal.h | 12 ++++++++++++ 2 files changed, 13 insertions(+), 1 deletion(-) (limited to 'drivers/xen') diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c index 4f7d94abe82c..929eccb77270 100644 --- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -117,7 +117,7 @@ static int xen_irq_info_common_setup(struct irq_info *info, irq_clear_status_flags(irq, IRQ_NOREQUEST|IRQ_NOAUTOEN); - return 0; + return xen_evtchn_port_setup(info); } static int xen_irq_info_evtchn_setup(unsigned irq, diff --git a/drivers/xen/events/events_internal.h b/drivers/xen/events/events_internal.h index ba8142f0c635..dc9650265e04 100644 --- a/drivers/xen/events/events_internal.h +++ b/drivers/xen/events/events_internal.h @@ -55,6 +55,7 @@ struct irq_info { #define PIRQ_SHAREABLE (1 << 1) struct evtchn_ops { + int (*setup)(struct irq_info *info); void (*bind_to_cpu)(struct irq_info *info, unsigned cpu); void (*clear_pending)(unsigned port); @@ -75,6 +76,17 @@ struct irq_info *info_for_irq(unsigned irq); unsigned cpu_from_irq(unsigned irq); unsigned cpu_from_evtchn(unsigned int evtchn); +/* + * Do any ABI specific setup for a bound event channel before it can + * be unmasked and used. 
+ */
+static inline int xen_evtchn_port_setup(struct irq_info *info)
+{
+	if (evtchn_ops->setup)
+		return evtchn_ops->setup(info);
+	return 0;
+}
+
 static inline void xen_evtchn_port_bind_to_cpu(struct irq_info *info,
 					       unsigned cpu)
 {
-- 
cgit v1.2.1

From d0b075ffeede257342c3afdbeadd2fda8504ecee Mon Sep 17 00:00:00 2001
From: David Vrabel
Date: Thu, 17 Oct 2013 15:23:15 +0100
Subject: xen/events: Refactor evtchn_to_irq array to be dynamically allocated

Refactor the static evtchn_to_irq array to be dynamically allocated by
implementing get and set functions for accesses to the array.

Two new port ops are added: max_channels (maximum supported number of
event channels) and nr_channels (number of currently usable event
channels). For the 2-level ABI, these numbers are the same because the
shared data structure has a fixed size. For the FIFO ABI, they will
differ because the event array is expanded dynamically.

This allows more than 65000 event channels, so an unsigned short is no
longer sufficient for an event channel port number and unsigned int is
used instead.
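
As a rough user-space sketch of the get/set scheme described above: the
table is split into rows that are allocated only when a port in that row is
first bound. The demo_* names, the 4096-channel limit and the 1024-entry
row size are assumptions made for the example, and malloc() stands in for
the kernel page allocator.

```c
#include <errno.h>
#include <stdlib.h>

#define DEMO_MAX_CHANNELS 4096u
#define DEMO_PER_ROW      1024u
#define DEMO_NR_ROWS      (DEMO_MAX_CHANNELS / DEMO_PER_ROW)

/* Row pointers start out NULL; a NULL row means "all entries are -1". */
static int *demo_evtchn_to_irq[DEMO_NR_ROWS];

static int demo_set_evtchn_to_irq(unsigned evtchn, int irq)
{
	unsigned row, col, i;

	if (evtchn >= DEMO_MAX_CHANNELS)
		return -EINVAL;

	row = evtchn / DEMO_PER_ROW;
	col = evtchn % DEMO_PER_ROW;

	if (demo_evtchn_to_irq[row] == NULL) {
		/* Clearing an entry in an unallocated row is a no-op. */
		if (irq == -1)
			return 0;

		demo_evtchn_to_irq[row] = malloc(DEMO_PER_ROW * sizeof(int));
		if (demo_evtchn_to_irq[row] == NULL)
			return -ENOMEM;

		/* A fresh row has no bound ports. */
		for (i = 0; i < DEMO_PER_ROW; i++)
			demo_evtchn_to_irq[row][i] = -1;
	}

	demo_evtchn_to_irq[row][col] = irq;
	return 0;
}

static int demo_get_evtchn_to_irq(unsigned evtchn)
{
	if (evtchn >= DEMO_MAX_CHANNELS)
		return -1;
	if (demo_evtchn_to_irq[evtchn / DEMO_PER_ROW] == NULL)
		return -1;
	return demo_evtchn_to_irq[evtchn / DEMO_PER_ROW][evtchn % DEMO_PER_ROW];
}
```

Because lookups of unbound or out-of-range ports simply return -1, callers
in this sketch never need to know whether a row has been allocated.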
Signed-off-by: Malcolm Crossley Signed-off-by: David Vrabel Reviewed-by: Konrad Rzeszutek Wilk Reviewed-by: Boris Ostrovsky --- drivers/xen/events/events_2l.c | 11 ++- drivers/xen/events/events_base.c | 175 +++++++++++++++++++++++++---------- drivers/xen/events/events_internal.h | 18 +++- 3 files changed, 149 insertions(+), 55 deletions(-) (limited to 'drivers/xen') diff --git a/drivers/xen/events/events_2l.c b/drivers/xen/events/events_2l.c index e55677cca745..ecb402a149e3 100644 --- a/drivers/xen/events/events_2l.c +++ b/drivers/xen/events/events_2l.c @@ -41,6 +41,11 @@ static DEFINE_PER_CPU(xen_ulong_t [NR_EVENT_CHANNELS/BITS_PER_EVTCHN_WORD], cpu_evtchn_mask); +static unsigned evtchn_2l_max_channels(void) +{ + return NR_EVENT_CHANNELS; +} + static void evtchn_2l_bind_to_cpu(struct irq_info *info, unsigned cpu) { clear_bit(info->evtchn, BM(per_cpu(cpu_evtchn_mask, info->cpu))); @@ -238,7 +243,7 @@ static void evtchn_2l_handle_events(unsigned cpu) /* Process port. */ port = (word_idx * BITS_PER_EVTCHN_WORD) + bit_idx; - irq = evtchn_to_irq[port]; + irq = get_evtchn_to_irq(port); if (irq != -1) { desc = irq_to_desc(irq); @@ -332,7 +337,7 @@ irqreturn_t xen_debug_interrupt(int irq, void *dev_id) int word_idx = i / BITS_PER_EVTCHN_WORD; printk(" %d: event %d -> irq %d%s%s%s\n", cpu_from_evtchn(i), i, - evtchn_to_irq[i], + get_evtchn_to_irq(i), sync_test_bit(word_idx, BM(&v->evtchn_pending_sel)) ? 
"" : " l2-clear", !sync_test_bit(i, BM(sh->evtchn_mask)) @@ -348,6 +353,8 @@ irqreturn_t xen_debug_interrupt(int irq, void *dev_id) } static const struct evtchn_ops evtchn_ops_2l = { + .max_channels = evtchn_2l_max_channels, + .nr_channels = evtchn_2l_max_channels, .bind_to_cpu = evtchn_2l_bind_to_cpu, .clear_pending = evtchn_2l_clear_pending, .set_pending = evtchn_2l_set_pending, diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c index 929eccb77270..a6906665de53 100644 --- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -77,12 +77,16 @@ static DEFINE_PER_CPU(int [NR_VIRQS], virq_to_irq) = {[0 ... NR_VIRQS-1] = -1}; /* IRQ <-> IPI mapping */ static DEFINE_PER_CPU(int [XEN_NR_IPIS], ipi_to_irq) = {[0 ... XEN_NR_IPIS-1] = -1}; -int *evtchn_to_irq; +int **evtchn_to_irq; #ifdef CONFIG_X86 static unsigned long *pirq_eoi_map; #endif static bool (*pirq_needs_eoi)(unsigned irq); +#define EVTCHN_ROW(e) (e / (PAGE_SIZE/sizeof(**evtchn_to_irq))) +#define EVTCHN_COL(e) (e % (PAGE_SIZE/sizeof(**evtchn_to_irq))) +#define EVTCHN_PER_ROW (PAGE_SIZE / sizeof(**evtchn_to_irq)) + /* Xen will never allocate port zero for any purpose. 
*/ #define VALID_EVTCHN(chn) ((chn) != 0) @@ -92,6 +96,61 @@ static struct irq_chip xen_pirq_chip; static void enable_dynirq(struct irq_data *data); static void disable_dynirq(struct irq_data *data); +static void clear_evtchn_to_irq_row(unsigned row) +{ + unsigned col; + + for (col = 0; col < EVTCHN_PER_ROW; col++) + evtchn_to_irq[row][col] = -1; +} + +static void clear_evtchn_to_irq_all(void) +{ + unsigned row; + + for (row = 0; row < EVTCHN_ROW(xen_evtchn_max_channels()); row++) { + if (evtchn_to_irq[row] == NULL) + continue; + clear_evtchn_to_irq_row(row); + } +} + +static int set_evtchn_to_irq(unsigned evtchn, unsigned irq) +{ + unsigned row; + unsigned col; + + if (evtchn >= xen_evtchn_max_channels()) + return -EINVAL; + + row = EVTCHN_ROW(evtchn); + col = EVTCHN_COL(evtchn); + + if (evtchn_to_irq[row] == NULL) { + /* Unallocated irq entries return -1 anyway */ + if (irq == -1) + return 0; + + evtchn_to_irq[row] = (int *)get_zeroed_page(GFP_KERNEL); + if (evtchn_to_irq[row] == NULL) + return -ENOMEM; + + clear_evtchn_to_irq_row(row); + } + + evtchn_to_irq[EVTCHN_ROW(evtchn)][EVTCHN_COL(evtchn)] = irq; + return 0; +} + +int get_evtchn_to_irq(unsigned evtchn) +{ + if (evtchn >= xen_evtchn_max_channels()) + return -1; + if (evtchn_to_irq[EVTCHN_ROW(evtchn)] == NULL) + return -1; + return evtchn_to_irq[EVTCHN_ROW(evtchn)][EVTCHN_COL(evtchn)]; +} + /* Get info for IRQ */ struct irq_info *info_for_irq(unsigned irq) { @@ -102,9 +161,10 @@ struct irq_info *info_for_irq(unsigned irq) static int xen_irq_info_common_setup(struct irq_info *info, unsigned irq, enum xen_irq_type type, - unsigned short evtchn, + unsigned evtchn, unsigned short cpu) { + int ret; BUG_ON(info->type != IRQT_UNBOUND && info->type != type); @@ -113,7 +173,9 @@ static int xen_irq_info_common_setup(struct irq_info *info, info->evtchn = evtchn; info->cpu = cpu; - evtchn_to_irq[evtchn] = irq; + ret = set_evtchn_to_irq(evtchn, irq); + if (ret < 0) + return ret; irq_clear_status_flags(irq, 
IRQ_NOREQUEST|IRQ_NOAUTOEN); @@ -121,7 +183,7 @@ static int xen_irq_info_common_setup(struct irq_info *info, } static int xen_irq_info_evtchn_setup(unsigned irq, - unsigned short evtchn) + unsigned evtchn) { struct irq_info *info = info_for_irq(irq); @@ -130,7 +192,7 @@ static int xen_irq_info_evtchn_setup(unsigned irq, static int xen_irq_info_ipi_setup(unsigned cpu, unsigned irq, - unsigned short evtchn, + unsigned evtchn, enum ipi_vector ipi) { struct irq_info *info = info_for_irq(irq); @@ -144,8 +206,8 @@ static int xen_irq_info_ipi_setup(unsigned cpu, static int xen_irq_info_virq_setup(unsigned cpu, unsigned irq, - unsigned short evtchn, - unsigned short virq) + unsigned evtchn, + unsigned virq) { struct irq_info *info = info_for_irq(irq); @@ -157,9 +219,9 @@ static int xen_irq_info_virq_setup(unsigned cpu, } static int xen_irq_info_pirq_setup(unsigned irq, - unsigned short evtchn, - unsigned short pirq, - unsigned short gsi, + unsigned evtchn, + unsigned pirq, + unsigned gsi, uint16_t domid, unsigned char flags) { @@ -173,6 +235,12 @@ static int xen_irq_info_pirq_setup(unsigned irq, return xen_irq_info_common_setup(info, irq, IRQT_PIRQ, evtchn, 0); } +static void xen_irq_info_cleanup(struct irq_info *info) +{ + set_evtchn_to_irq(info->evtchn, -1); + info->evtchn = 0; +} + /* * Accessors for packed IRQ information. 
*/ @@ -186,7 +254,7 @@ unsigned int evtchn_from_irq(unsigned irq) unsigned irq_from_evtchn(unsigned int evtchn) { - return evtchn_to_irq[evtchn]; + return get_evtchn_to_irq(evtchn); } EXPORT_SYMBOL_GPL(irq_from_evtchn); @@ -237,7 +305,7 @@ unsigned cpu_from_irq(unsigned irq) unsigned int cpu_from_evtchn(unsigned int evtchn) { - int irq = evtchn_to_irq[evtchn]; + int irq = get_evtchn_to_irq(evtchn); unsigned ret = 0; if (irq != -1) @@ -263,7 +331,7 @@ static bool pirq_needs_eoi_flag(unsigned irq) static void bind_evtchn_to_cpu(unsigned int chn, unsigned int cpu) { - int irq = evtchn_to_irq[chn]; + int irq = get_evtchn_to_irq(chn); struct irq_info *info = info_for_irq(irq); BUG_ON(irq == -1); @@ -386,6 +454,18 @@ static void xen_free_irq(unsigned irq) irq_free_desc(irq); } +static void xen_evtchn_close(unsigned int port) +{ + struct evtchn_close close; + + close.port = port; + if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0) + BUG(); + + /* Closed ports are implicitly re-bound to VCPU0. 
*/ + bind_evtchn_to_cpu(port, 0); +} + static void pirq_query_unmask(int irq) { struct physdev_irq_status_query irq_status; @@ -458,7 +538,13 @@ static unsigned int __startup_pirq(unsigned int irq) pirq_query_unmask(irq); - evtchn_to_irq[evtchn] = irq; + rc = set_evtchn_to_irq(evtchn, irq); + if (rc != 0) { + pr_err("irq%d: Failed to set port to irq mapping (%d)\n", + irq, rc); + xen_evtchn_close(evtchn); + return 0; + } bind_evtchn_to_cpu(evtchn, 0); info->evtchn = evtchn; @@ -476,10 +562,9 @@ static unsigned int startup_pirq(struct irq_data *data) static void shutdown_pirq(struct irq_data *data) { - struct evtchn_close close; unsigned int irq = data->irq; struct irq_info *info = info_for_irq(irq); - int evtchn = evtchn_from_irq(irq); + unsigned evtchn = evtchn_from_irq(irq); BUG_ON(info->type != IRQT_PIRQ); @@ -487,14 +572,8 @@ static void shutdown_pirq(struct irq_data *data) return; mask_evtchn(evtchn); - - close.port = evtchn; - if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0) - BUG(); - - bind_evtchn_to_cpu(evtchn, 0); - evtchn_to_irq[evtchn] = -1; - info->evtchn = 0; + xen_evtchn_close(evtchn); + xen_irq_info_cleanup(info); } static void enable_pirq(struct irq_data *data) @@ -525,7 +604,6 @@ EXPORT_SYMBOL_GPL(xen_irq_from_gsi); static void __unbind_from_irq(unsigned int irq) { - struct evtchn_close close; int evtchn = evtchn_from_irq(irq); struct irq_info *info = irq_get_handler_data(irq); @@ -536,27 +614,22 @@ static void __unbind_from_irq(unsigned int irq) } if (VALID_EVTCHN(evtchn)) { - close.port = evtchn; - if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0) - BUG(); + unsigned int cpu = cpu_from_irq(irq); + + xen_evtchn_close(evtchn); switch (type_from_irq(irq)) { case IRQT_VIRQ: - per_cpu(virq_to_irq, cpu_from_evtchn(evtchn)) - [virq_from_irq(irq)] = -1; + per_cpu(virq_to_irq, cpu)[virq_from_irq(irq)] = -1; break; case IRQT_IPI: - per_cpu(ipi_to_irq, cpu_from_evtchn(evtchn)) - [ipi_from_irq(irq)] = -1; + per_cpu(ipi_to_irq, 
cpu)[ipi_from_irq(irq)] = -1; break; default: break; } - /* Closed ports are implicitly re-bound to VCPU0. */ - bind_evtchn_to_cpu(evtchn, 0); - - evtchn_to_irq[evtchn] = -1; + xen_irq_info_cleanup(info); } BUG_ON(info_for_irq(irq)->type == IRQT_UNBOUND); @@ -760,9 +833,12 @@ int bind_evtchn_to_irq(unsigned int evtchn) int irq; int ret; + if (evtchn >= xen_evtchn_max_channels()) + return -ENOMEM; + mutex_lock(&irq_mapping_update_lock); - irq = evtchn_to_irq[evtchn]; + irq = get_evtchn_to_irq(evtchn); if (irq == -1) { irq = xen_allocate_irq_dynamic(); @@ -852,7 +928,7 @@ static int find_virq(unsigned int virq, unsigned int cpu) int port, rc = -ENOENT; memset(&status, 0, sizeof(status)); - for (port = 0; port <= NR_EVENT_CHANNELS; port++) { + for (port = 0; port < xen_evtchn_max_channels(); port++) { status.dom = DOMID_SELF; status.port = port; rc = HYPERVISOR_event_channel_op(EVTCHNOP_status, &status); @@ -1022,7 +1098,7 @@ EXPORT_SYMBOL_GPL(unbind_from_irqhandler); int evtchn_make_refcounted(unsigned int evtchn) { - int irq = evtchn_to_irq[evtchn]; + int irq = get_evtchn_to_irq(evtchn); struct irq_info *info; if (irq == -1) @@ -1047,12 +1123,12 @@ int evtchn_get(unsigned int evtchn) struct irq_info *info; int err = -ENOENT; - if (evtchn >= NR_EVENT_CHANNELS) + if (evtchn >= xen_evtchn_max_channels()) return -EINVAL; mutex_lock(&irq_mapping_update_lock); - irq = evtchn_to_irq[evtchn]; + irq = get_evtchn_to_irq(evtchn); if (irq == -1) goto done; @@ -1076,7 +1152,7 @@ EXPORT_SYMBOL_GPL(evtchn_get); void evtchn_put(unsigned int evtchn) { - int irq = evtchn_to_irq[evtchn]; + int irq = get_evtchn_to_irq(evtchn); if (WARN_ON(irq == -1)) return; unbind_from_irq(irq); @@ -1163,7 +1239,7 @@ void rebind_evtchn_irq(int evtchn, int irq) mutex_lock(&irq_mapping_update_lock); /* After resume the irq<->evtchn mappings are all cleared out */ - BUG_ON(evtchn_to_irq[evtchn] != -1); + BUG_ON(get_evtchn_to_irq(evtchn) != -1); /* Expect irq to have been bound before, so there should be 
a proper type */ BUG_ON(info->type == IRQT_UNBOUND); @@ -1448,15 +1524,14 @@ void xen_irq_resume(void) struct irq_info *info; /* New event-channel space is not 'live' yet. */ - for (evtchn = 0; evtchn < NR_EVENT_CHANNELS; evtchn++) + for (evtchn = 0; evtchn < xen_evtchn_nr_channels(); evtchn++) mask_evtchn(evtchn); /* No IRQ <-> event-channel mappings. */ list_for_each_entry(info, &xen_irq_list_head, list) info->evtchn = 0; /* zap event-channel binding */ - for (evtchn = 0; evtchn < NR_EVENT_CHANNELS; evtchn++) - evtchn_to_irq[evtchn] = -1; + clear_evtchn_to_irq_all(); for_each_possible_cpu(cpu) { restore_cpu_virqs(cpu); @@ -1553,14 +1628,12 @@ void __init xen_init_IRQ(void) xen_evtchn_2l_init(); - evtchn_to_irq = kcalloc(NR_EVENT_CHANNELS, sizeof(*evtchn_to_irq), - GFP_KERNEL); + evtchn_to_irq = kcalloc(EVTCHN_ROW(xen_evtchn_max_channels()), + sizeof(*evtchn_to_irq), GFP_KERNEL); BUG_ON(!evtchn_to_irq); - for (i = 0; i < NR_EVENT_CHANNELS; i++) - evtchn_to_irq[i] = -1; /* No event channels are 'live' right now. 
*/ - for (i = 0; i < NR_EVENT_CHANNELS; i++) + for (i = 0; i < xen_evtchn_nr_channels(); i++) mask_evtchn(i); pirq_needs_eoi = pirq_needs_eoi_flag; diff --git a/drivers/xen/events/events_internal.h b/drivers/xen/events/events_internal.h index dc9650265e04..a3d9aeceda1a 100644 --- a/drivers/xen/events/events_internal.h +++ b/drivers/xen/events/events_internal.h @@ -35,7 +35,7 @@ struct irq_info { int refcnt; enum xen_irq_type type; /* type */ unsigned irq; - unsigned short evtchn; /* event channel */ + unsigned int evtchn; /* event channel */ unsigned short cpu; /* cpu bound */ union { @@ -55,6 +55,9 @@ struct irq_info { #define PIRQ_SHAREABLE (1 << 1) struct evtchn_ops { + unsigned (*max_channels)(void); + unsigned (*nr_channels)(void); + int (*setup)(struct irq_info *info); void (*bind_to_cpu)(struct irq_info *info, unsigned cpu); @@ -70,12 +73,23 @@ struct evtchn_ops { extern const struct evtchn_ops *evtchn_ops; -extern int *evtchn_to_irq; +extern int **evtchn_to_irq; +int get_evtchn_to_irq(unsigned int evtchn); struct irq_info *info_for_irq(unsigned irq); unsigned cpu_from_irq(unsigned irq); unsigned cpu_from_evtchn(unsigned int evtchn); +static inline unsigned xen_evtchn_max_channels(void) +{ + return evtchn_ops->max_channels(); +} + +static inline unsigned xen_evtchn_nr_channels(void) +{ + return evtchn_ops->nr_channels(); +} + /* * Do any ABI specific setup for a bound event channel before it can * be unmasked and used. 
-- cgit v1.2.1 From fd21069dfe31a4b20f5ef580006abe72d1660f5b Mon Sep 17 00:00:00 2001 From: David Vrabel Date: Thu, 5 Sep 2013 18:11:38 +0100 Subject: xen/events: add xen_evtchn_mask_all() Signed-off-by: David Vrabel Reviewed-by: Konrad Rzeszutek Wilk Reviewed-by: Boris Ostrovsky --- drivers/xen/events/events_base.c | 18 +++++++++++------- 1 file changed, 11 insertions(+), 7 deletions(-) (limited to 'drivers/xen') diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c index a6906665de53..c6d64f1e191c 100644 --- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -344,6 +344,14 @@ static void bind_evtchn_to_cpu(unsigned int chn, unsigned int cpu) info->cpu = cpu; } +static void xen_evtchn_mask_all(void) +{ + unsigned int evtchn; + + for (evtchn = 0; evtchn < xen_evtchn_nr_channels(); evtchn++) + mask_evtchn(evtchn); +} + /** * notify_remote_via_irq - send event to remote end of event channel via irq * @irq: irq of event channel to send event to @@ -1520,12 +1528,11 @@ EXPORT_SYMBOL_GPL(xen_test_irq_shared); void xen_irq_resume(void) { - unsigned int cpu, evtchn; + unsigned int cpu; struct irq_info *info; /* New event-channel space is not 'live' yet. */ - for (evtchn = 0; evtchn < xen_evtchn_nr_channels(); evtchn++) - mask_evtchn(evtchn); + xen_evtchn_mask_all(); /* No IRQ <-> event-channel mappings. */ list_for_each_entry(info, &xen_irq_list_head, list) @@ -1624,8 +1631,6 @@ void xen_callback_vector(void) {} void __init xen_init_IRQ(void) { - int i; - xen_evtchn_2l_init(); evtchn_to_irq = kcalloc(EVTCHN_ROW(xen_evtchn_max_channels()), @@ -1633,8 +1638,7 @@ void __init xen_init_IRQ(void) BUG_ON(!evtchn_to_irq); /* No event channels are 'live' right now. 
*/ - for (i = 0; i < xen_evtchn_nr_channels(); i++) - mask_evtchn(i); + xen_evtchn_mask_all(); pirq_needs_eoi = pirq_needs_eoi_flag; -- cgit v1.2.1 From 0dc0064add422bc0ef5165ebe9ece3052bbd457d Mon Sep 17 00:00:00 2001 From: David Vrabel Date: Mon, 23 Sep 2013 21:03:38 +0100 Subject: xen/evtchn: support more than 4096 ports Remove the check during unbind for NR_EVENT_CHANNELS as this limits support to less than 4096 ports. Signed-off-by: David Vrabel Reviewed-by: Konrad Rzeszutek Wilk Reviewed-by: Boris Ostrovsky --- drivers/xen/events/events_base.c | 13 +++++++++++++ drivers/xen/events/events_internal.h | 5 ----- drivers/xen/evtchn.c | 2 +- 3 files changed, 14 insertions(+), 6 deletions(-) (limited to 'drivers/xen') diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c index c6d64f1e191c..9d0d88cf74af 100644 --- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -952,6 +952,19 @@ static int find_virq(unsigned int virq, unsigned int cpu) return rc; } +/** + * xen_evtchn_nr_channels - number of usable event channel ports + * + * This may be less than the maximum supported by the current + * hypervisor ABI. Use xen_evtchn_max_channels() for the maximum + * supported. + */ +unsigned xen_evtchn_nr_channels(void) +{ + return evtchn_ops->nr_channels(); +} +EXPORT_SYMBOL_GPL(xen_evtchn_nr_channels); + int bind_virq_to_irq(unsigned int virq, unsigned int cpu) { struct evtchn_bind_virq bind_virq; diff --git a/drivers/xen/events/events_internal.h b/drivers/xen/events/events_internal.h index a3d9aeceda1a..2862e1cccf1c 100644 --- a/drivers/xen/events/events_internal.h +++ b/drivers/xen/events/events_internal.h @@ -85,11 +85,6 @@ static inline unsigned xen_evtchn_max_channels(void) return evtchn_ops->max_channels(); } -static inline unsigned xen_evtchn_nr_channels(void) -{ - return evtchn_ops->nr_channels(); -} - /* * Do any ABI specific setup for a bound event channel before it can * be unmasked and used. 
diff --git a/drivers/xen/evtchn.c b/drivers/xen/evtchn.c index 5de2063e16d3..00f40f051d95 100644 --- a/drivers/xen/evtchn.c +++ b/drivers/xen/evtchn.c @@ -417,7 +417,7 @@ static long evtchn_ioctl(struct file *file, break; rc = -EINVAL; - if (unbind.port >= NR_EVENT_CHANNELS) + if (unbind.port >= xen_evtchn_nr_channels()) break; rc = -ENOTCONN; -- cgit v1.2.1 From bf2bbe07f13846a90d4447521d87566d6f87bc0e Mon Sep 17 00:00:00 2001 From: David Vrabel Date: Fri, 15 Mar 2013 10:55:41 +0000 Subject: xen/events: Add the hypervisor interface for the FIFO-based event channels Add the hypercall sub-ops and the structures for the shared data used in the FIFO-based event channel ABI. The design document for this new ABI is available here: http://xenbits.xen.org/people/dvrabel/event-channels-H.pdf In summary, events are reported using a per-domain shared event array of event words. Each event word has PENDING, LINKED and MASKED bits and a LINK field for pointing to the next event in the event queue. There are 16 event queues (with different priorities) per-VCPU. Key advantages of this new ABI include: - Support for over 100,000 events (2^17). - 16 different event priorities. - Improved fairness in event latency through the use of FIFOs. The ABI is available in Xen 4.4 and later. 
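A note on the event word described above: the following is a minimal user-space sketch, not code from this series. The EW_* names and the exact bit positions are assumptions made for illustration (flags in the top bits of a 32-bit word, a 17-bit LINK field at the bottom, which matches the 2^17 event limit quoted above); the hypervisor's public interface header is authoritative. The sketch is deliberately non-atomic, whereas a real guest would have to update the word with an atomic compare-and-exchange loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout for this sketch only: flag bits on top, LINK field below. */
#define EW_PENDING   (UINT32_C(1) << 31)
#define EW_MASKED    (UINT32_C(1) << 30)
#define EW_LINKED    (UINT32_C(1) << 29)
#define EW_LINK_BITS 17
#define EW_LINK_MASK ((UINT32_C(1) << EW_LINK_BITS) - 1)

/* An event is handled only when it is pending and not masked. */
static bool ew_deliverable(uint32_t word)
{
	return (word & EW_PENDING) && !(word & EW_MASKED);
}

/*
 * Take an event off its queue: clear LINKED and the LINK field, and
 * return the old LINK value (the next port, or 0 at the end of the queue).
 * Not atomic; shown this way only to keep the bit manipulation visible.
 */
static uint32_t ew_unlink(uint32_t *word)
{
	uint32_t next;

	assert(word != NULL);
	next = *word & EW_LINK_MASK;
	*word &= ~(EW_LINKED | EW_LINK_MASK);
	return next;
}
```

With this layout the LINK field can name 2^17 distinct ports, and unlinking an event leaves its PENDING and MASKED bits untouched.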
Signed-off-by: David Vrabel Reviewed-by: Konrad Rzeszutek Wilk Reviewed-by: Boris Ostrovsky --- drivers/xen/events/events_2l.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) (limited to 'drivers/xen') diff --git a/drivers/xen/events/events_2l.c b/drivers/xen/events/events_2l.c index ecb402a149e3..d7ff91757307 100644 --- a/drivers/xen/events/events_2l.c +++ b/drivers/xen/events/events_2l.c @@ -38,12 +38,12 @@ /* Find the first set bit in a evtchn mask */ #define EVTCHN_FIRST_BIT(w) find_first_bit(BM(&(w)), BITS_PER_EVTCHN_WORD) -static DEFINE_PER_CPU(xen_ulong_t [NR_EVENT_CHANNELS/BITS_PER_EVTCHN_WORD], +static DEFINE_PER_CPU(xen_ulong_t [EVTCHN_2L_NR_CHANNELS/BITS_PER_EVTCHN_WORD], cpu_evtchn_mask); static unsigned evtchn_2l_max_channels(void) { - return NR_EVENT_CHANNELS; + return EVTCHN_2L_NR_CHANNELS; } static void evtchn_2l_bind_to_cpu(struct irq_info *info, unsigned cpu) @@ -316,7 +316,7 @@ irqreturn_t xen_debug_interrupt(int irq, void *dev_id) i % 8 == 0 ? "\n " : " "); printk("\nlocal cpu%d mask:\n ", cpu); - for (i = (NR_EVENT_CHANNELS/BITS_PER_EVTCHN_WORD)-1; i >= 0; i--) + for (i = (EVTCHN_2L_NR_CHANNELS/BITS_PER_EVTCHN_WORD)-1; i >= 0; i--) printk("%0*"PRI_xen_ulong"%s", (int)(sizeof(cpu_evtchn[0])*2), cpu_evtchn[i], i % 8 == 0 ? "\n " : " "); @@ -332,7 +332,7 @@ irqreturn_t xen_debug_interrupt(int irq, void *dev_id) } printk("\npending list:\n"); - for (i = 0; i < NR_EVENT_CHANNELS; i++) { + for (i = 0; i < EVTCHN_2L_NR_CHANNELS; i++) { if (sync_test_bit(i, BM(sh->evtchn_pending))) { int word_idx = i / BITS_PER_EVTCHN_WORD; printk(" %d: event %d -> irq %d%s%s%s\n", -- cgit v1.2.1 From 6ccecb0fbc0494c7221459e6358a016f3281a0ca Mon Sep 17 00:00:00 2001 From: David Vrabel Date: Mon, 23 Sep 2013 12:47:26 +0100 Subject: xen/events: allow event channel priority to be set Add xen_set_irq_priority() to set an event channel's priority. This function will only work with event channel ABIs that support priority (i.e., the FIFO-based ABI).
Signed-off-by: David Vrabel Reviewed-by: Konrad Rzeszutek Wilk Reviewed-by: Boris Ostrovsky --- drivers/xen/events/events_base.c | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) (limited to 'drivers/xen') diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c index 9d0d88cf74af..e9001fef4ffd 100644 --- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -1117,6 +1117,23 @@ void unbind_from_irqhandler(unsigned int irq, void *dev_id) } EXPORT_SYMBOL_GPL(unbind_from_irqhandler); +/** + * xen_set_irq_priority() - set an event channel priority. + * @irq:irq bound to an event channel. + * @priority: priority between XEN_IRQ_PRIORITY_MAX and XEN_IRQ_PRIORITY_MIN. + */ +int xen_set_irq_priority(unsigned irq, unsigned priority) +{ + struct evtchn_set_priority set_priority; + + set_priority.port = evtchn_from_irq(irq); + set_priority.priority = priority; + + return HYPERVISOR_event_channel_op(EVTCHNOP_set_priority, + &set_priority); +} +EXPORT_SYMBOL_GPL(xen_set_irq_priority); + int evtchn_make_refcounted(unsigned int evtchn) { int irq = get_evtchn_to_irq(evtchn); -- cgit v1.2.1 From 1fe565517b57676884349dccfd6ce853ec338636 Mon Sep 17 00:00:00 2001 From: David Vrabel Date: Fri, 15 Mar 2013 13:02:35 +0000 Subject: xen/events: use the FIFO-based ABI if available Implement all the event channel port ops for the FIFO-based ABI. If the hypervisor supports the FIFO-based ABI, enable it by initializing the control block for the boot VCPU and subsequent VCPUs as they are brought up and on resume. The event array is expanded as required when event ports are setup. The 'xen.fifo_events=0' command line option may be used to disable use of the FIFO-based ABI. 
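The selection described above (try the FIFO-based ABI unless it has been disabled, otherwise fall back to the 2-level ABI) can be sketched in user space as follows. This is only an illustration under stated assumptions: select_abi(), the enum and the two stub probe functions are hypothetical names, and the probe callback stands in for whatever hypercall-based initialization the kernel performs.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum evtchn_abi { ABI_2L, ABI_FIFO };

/*
 * fifo_enabled models the 'xen.fifo_events' option; init_fifo is a
 * stand-in probe that returns 0 on success or a negative errno value.
 */
static enum evtchn_abi select_abi(bool fifo_enabled, int (*init_fifo)(void))
{
	int ret = -EINVAL;

	assert(init_fifo != NULL);
	if (fifo_enabled)
		ret = init_fifo();

	/* Any failure, or the option being off, selects the 2-level ABI. */
	return ret < 0 ? ABI_2L : ABI_FIFO;
}

/* Stub probes used only to exercise the sketch. */
static int fifo_ok(void) { return 0; }
static int fifo_unsupported(void) { return -ENOSYS; }
```

Starting from an error value means the fallback path is taken both when the option is off and when the probe fails, so only one branch is needed.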
Signed-off-by: David Vrabel Reviewed-by: Konrad Rzeszutek Wilk Reviewed-by: Boris Ostrovsky --- drivers/xen/events/Makefile | 1 + drivers/xen/events/events_base.c | 14 +- drivers/xen/events/events_fifo.c | 426 +++++++++++++++++++++++++++++++++++ drivers/xen/events/events_internal.h | 8 + 4 files changed, 448 insertions(+), 1 deletion(-) create mode 100644 drivers/xen/events/events_fifo.c (limited to 'drivers/xen') diff --git a/drivers/xen/events/Makefile b/drivers/xen/events/Makefile index 08179fe04612..62be55cd981d 100644 --- a/drivers/xen/events/Makefile +++ b/drivers/xen/events/Makefile @@ -2,3 +2,4 @@ obj-y += events.o events-y += events_base.o events-y += events_2l.o +events-y += events_fifo.o diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c index e9001fef4ffd..1d16185e82b2 100644 --- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -1563,6 +1563,7 @@ void xen_irq_resume(void) /* New event-channel space is not 'live' yet. */ xen_evtchn_mask_all(); + xen_evtchn_resume(); /* No IRQ <-> event-channel mappings. */ list_for_each_entry(info, &xen_irq_list_head, list) @@ -1659,9 +1660,20 @@ void xen_callback_vector(void) void xen_callback_vector(void) {} #endif +#undef MODULE_PARAM_PREFIX +#define MODULE_PARAM_PREFIX "xen." + +static bool fifo_events = true; +module_param(fifo_events, bool, 0); + void __init xen_init_IRQ(void) { - xen_evtchn_2l_init(); + int ret = -EINVAL; + + if (fifo_events) + ret = xen_evtchn_fifo_init(); + if (ret < 0) + xen_evtchn_2l_init(); evtchn_to_irq = kcalloc(EVTCHN_ROW(xen_evtchn_max_channels()), sizeof(*evtchn_to_irq), GFP_KERNEL); diff --git a/drivers/xen/events/events_fifo.c b/drivers/xen/events/events_fifo.c new file mode 100644 index 000000000000..e2bf9571f7fe --- /dev/null +++ b/drivers/xen/events/events_fifo.c @@ -0,0 +1,426 @@ +/* + * Xen event channels (FIFO-based ABI) + * + * Copyright (C) 2013 Citrix Systems R&D ltd. 
+ * + * This source code is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License as + * published by the Free Software Foundation; either version 2 of the + * License, or (at your option) any later version. + * + * Or, when distributed separately from the Linux kernel or + * incorporated into other software packages, subject to the following + * license: + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this source file (the "Software"), to deal in the Software without + * restriction, including without limitation the rights to use, copy, modify, + * merge, publish, distribute, sublicense, and/or sell copies of the Software, + * and to permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. 
+ */ + +#define pr_fmt(fmt) "xen:" KBUILD_MODNAME ": " fmt + +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include + +#include +#include +#include +#include +#include + +#include "events_internal.h" + +#define EVENT_WORDS_PER_PAGE (PAGE_SIZE / sizeof(event_word_t)) +#define MAX_EVENT_ARRAY_PAGES (EVTCHN_FIFO_NR_CHANNELS / EVENT_WORDS_PER_PAGE) + +struct evtchn_fifo_queue { + uint32_t head[EVTCHN_FIFO_MAX_QUEUES]; +}; + +static DEFINE_PER_CPU(struct evtchn_fifo_control_block *, cpu_control_block); +static DEFINE_PER_CPU(struct evtchn_fifo_queue, cpu_queue); +static event_word_t *event_array[MAX_EVENT_ARRAY_PAGES] __read_mostly; +static unsigned event_array_pages __read_mostly; + +#define BM(w) ((unsigned long *)(w)) + +static inline event_word_t *event_word_from_port(unsigned port) +{ + unsigned i = port / EVENT_WORDS_PER_PAGE; + + return event_array[i] + port % EVENT_WORDS_PER_PAGE; +} + +static unsigned evtchn_fifo_max_channels(void) +{ + return EVTCHN_FIFO_NR_CHANNELS; +} + +static unsigned evtchn_fifo_nr_channels(void) +{ + return event_array_pages * EVENT_WORDS_PER_PAGE; +} + +static void free_unused_array_pages(void) +{ + unsigned i; + + for (i = event_array_pages; i < MAX_EVENT_ARRAY_PAGES; i++) { + if (!event_array[i]) + break; + free_page((unsigned long)event_array[i]); + event_array[i] = NULL; + } +} + +static void init_array_page(event_word_t *array_page) +{ + unsigned i; + + for (i = 0; i < EVENT_WORDS_PER_PAGE; i++) + array_page[i] = 1 << EVTCHN_FIFO_MASKED; +} + +static int evtchn_fifo_setup(struct irq_info *info) +{ + unsigned port = info->evtchn; + unsigned new_array_pages; + int ret = -ENOMEM; + + new_array_pages = port / EVENT_WORDS_PER_PAGE + 1; + + if (new_array_pages > MAX_EVENT_ARRAY_PAGES) + return -EINVAL; + + while (event_array_pages < new_array_pages) { + void *array_page; + struct evtchn_expand_array expand_array; + + /* Might already have a page if we've resumed. 
*/ + array_page = event_array[event_array_pages]; + if (!array_page) { + array_page = (void *)__get_free_page(GFP_KERNEL); + if (array_page == NULL) + goto error; + event_array[event_array_pages] = array_page; + } + + /* Mask all events in this page before adding it. */ + init_array_page(array_page); + + expand_array.array_gfn = virt_to_mfn(array_page); + + ret = HYPERVISOR_event_channel_op(EVTCHNOP_expand_array, &expand_array); + if (ret < 0) + goto error; + + event_array_pages++; + } + return 0; + + error: + if (event_array_pages == 0) + panic("xen: unable to expand event array with initial page (%d)\n", ret); + else + pr_err("unable to expand event array (%d)\n", ret); + free_unused_array_pages(); + return ret; +} + +static void evtchn_fifo_bind_to_cpu(struct irq_info *info, unsigned cpu) +{ + /* no-op */ +} + +static void evtchn_fifo_clear_pending(unsigned port) +{ + event_word_t *word = event_word_from_port(port); + sync_clear_bit(EVTCHN_FIFO_PENDING, BM(word)); +} + +static void evtchn_fifo_set_pending(unsigned port) +{ + event_word_t *word = event_word_from_port(port); + sync_set_bit(EVTCHN_FIFO_PENDING, BM(word)); +} + +static bool evtchn_fifo_is_pending(unsigned port) +{ + event_word_t *word = event_word_from_port(port); + return sync_test_bit(EVTCHN_FIFO_PENDING, BM(word)); +} + +static bool evtchn_fifo_test_and_set_mask(unsigned port) +{ + event_word_t *word = event_word_from_port(port); + return sync_test_and_set_bit(EVTCHN_FIFO_MASKED, BM(word)); +} + +static void evtchn_fifo_mask(unsigned port) +{ + event_word_t *word = event_word_from_port(port); + sync_set_bit(EVTCHN_FIFO_MASKED, BM(word)); +} + +/* + * Clear MASKED, spinning if BUSY is set. 
+ */ +static void clear_masked(volatile event_word_t *word) +{ + event_word_t new, old, w; + + w = *word; + + do { + old = w & ~(1 << EVTCHN_FIFO_BUSY); + new = old & ~(1 << EVTCHN_FIFO_MASKED); + w = sync_cmpxchg(word, old, new); + } while (w != old); +} + +static void evtchn_fifo_unmask(unsigned port) +{ + event_word_t *word = event_word_from_port(port); + + BUG_ON(!irqs_disabled()); + + clear_masked(word); + if (sync_test_bit(EVTCHN_FIFO_PENDING, BM(word))) { + struct evtchn_unmask unmask = { .port = port }; + (void)HYPERVISOR_event_channel_op(EVTCHNOP_unmask, &unmask); + } +} + +static uint32_t clear_linked(volatile event_word_t *word) +{ + event_word_t new, old, w; + + w = *word; + + do { + old = w; + new = (w & ~((1 << EVTCHN_FIFO_LINKED) + | EVTCHN_FIFO_LINK_MASK)); + } while ((w = sync_cmpxchg(word, old, new)) != old); + + return w & EVTCHN_FIFO_LINK_MASK; +} + +static void handle_irq_for_port(unsigned port) +{ + int irq; + struct irq_desc *desc; + + irq = get_evtchn_to_irq(port); + if (irq != -1) { + desc = irq_to_desc(irq); + if (desc) + generic_handle_irq_desc(irq, desc); + } +} + +static void consume_one_event(unsigned cpu, + struct evtchn_fifo_control_block *control_block, + unsigned priority, uint32_t *ready) +{ + struct evtchn_fifo_queue *q = &per_cpu(cpu_queue, cpu); + uint32_t head; + unsigned port; + event_word_t *word; + + head = q->head[priority]; + + /* + * Reached the tail last time? Read the new HEAD from the + * control block. + */ + if (head == 0) { + rmb(); /* Ensure word is up-to-date before reading head. */ + head = control_block->head[priority]; + } + + port = head; + word = event_word_from_port(port); + head = clear_linked(word); + + /* + * If the link is non-zero, there are more events in the + * queue, otherwise the queue is empty. + * + * If the queue is empty, clear this priority from our local + * copy of the ready word. 
+ */ + if (head == 0) + clear_bit(priority, BM(ready)); + + if (sync_test_bit(EVTCHN_FIFO_PENDING, BM(word)) + && !sync_test_bit(EVTCHN_FIFO_MASKED, BM(word))) + handle_irq_for_port(port); + + q->head[priority] = head; +} + +static void evtchn_fifo_handle_events(unsigned cpu) +{ + struct evtchn_fifo_control_block *control_block; + uint32_t ready; + unsigned q; + + control_block = per_cpu(cpu_control_block, cpu); + + ready = xchg(&control_block->ready, 0); + + while (ready) { + q = find_first_bit(BM(&ready), EVTCHN_FIFO_MAX_QUEUES); + consume_one_event(cpu, control_block, q, &ready); + ready |= xchg(&control_block->ready, 0); + } +} + +static void evtchn_fifo_resume(void) +{ + unsigned cpu; + + for_each_possible_cpu(cpu) { + void *control_block = per_cpu(cpu_control_block, cpu); + struct evtchn_init_control init_control; + int ret; + + if (!control_block) + continue; + + /* + * If this CPU is offline, take the opportunity to + * free the control block while it is not being + * used. + */ + if (!cpu_online(cpu)) { + free_page((unsigned long)control_block); + per_cpu(cpu_control_block, cpu) = NULL; + continue; + } + + init_control.control_gfn = virt_to_mfn(control_block); + init_control.offset = 0; + init_control.vcpu = cpu; + + ret = HYPERVISOR_event_channel_op(EVTCHNOP_init_control, + &init_control); + if (ret < 0) + BUG(); + } + + /* + * The event array starts out as empty again and is extended + * as normal when events are bound. The existing pages will + * be reused. 
+ */ + event_array_pages = 0; +} + +static const struct evtchn_ops evtchn_ops_fifo = { + .max_channels = evtchn_fifo_max_channels, + .nr_channels = evtchn_fifo_nr_channels, + .setup = evtchn_fifo_setup, + .bind_to_cpu = evtchn_fifo_bind_to_cpu, + .clear_pending = evtchn_fifo_clear_pending, + .set_pending = evtchn_fifo_set_pending, + .is_pending = evtchn_fifo_is_pending, + .test_and_set_mask = evtchn_fifo_test_and_set_mask, + .mask = evtchn_fifo_mask, + .unmask = evtchn_fifo_unmask, + .handle_events = evtchn_fifo_handle_events, + .resume = evtchn_fifo_resume, +}; + +static int __cpuinit evtchn_fifo_init_control_block(unsigned cpu) +{ + struct page *control_block = NULL; + struct evtchn_init_control init_control; + int ret = -ENOMEM; + + control_block = alloc_page(GFP_KERNEL|__GFP_ZERO); + if (control_block == NULL) + goto error; + + init_control.control_gfn = virt_to_mfn(page_address(control_block)); + init_control.offset = 0; + init_control.vcpu = cpu; + + ret = HYPERVISOR_event_channel_op(EVTCHNOP_init_control, &init_control); + if (ret < 0) + goto error; + + per_cpu(cpu_control_block, cpu) = page_address(control_block); + + return 0; + + error: + __free_page(control_block); + return ret; +} + +static int __cpuinit evtchn_fifo_cpu_notification(struct notifier_block *self, + unsigned long action, + void *hcpu) +{ + int cpu = (long)hcpu; + int ret = 0; + + switch (action) { + case CPU_UP_PREPARE: + if (!per_cpu(cpu_control_block, cpu)) + ret = evtchn_fifo_init_control_block(cpu); + break; + default: + break; + } + return ret < 0 ? 
NOTIFY_BAD : NOTIFY_OK; +} + +static struct notifier_block evtchn_fifo_cpu_notifier __cpuinitdata = { + .notifier_call = evtchn_fifo_cpu_notification, +}; + +int __init xen_evtchn_fifo_init(void) +{ + int cpu = get_cpu(); + int ret; + + ret = evtchn_fifo_init_control_block(cpu); + if (ret < 0) + goto out; + + pr_info("Using FIFO-based ABI\n"); + + evtchn_ops = &evtchn_ops_fifo; + + register_cpu_notifier(&evtchn_fifo_cpu_notifier); +out: + put_cpu(); + return ret; +} diff --git a/drivers/xen/events/events_internal.h b/drivers/xen/events/events_internal.h index 2862e1cccf1c..677f41a0fff9 100644 --- a/drivers/xen/events/events_internal.h +++ b/drivers/xen/events/events_internal.h @@ -69,6 +69,7 @@ struct evtchn_ops { void (*unmask)(unsigned port); void (*handle_events)(unsigned cpu); + void (*resume)(void); }; extern const struct evtchn_ops *evtchn_ops; @@ -137,6 +138,13 @@ static inline void xen_evtchn_handle_events(unsigned cpu) return evtchn_ops->handle_events(cpu); } +static inline void xen_evtchn_resume(void) +{ + if (evtchn_ops->resume) + evtchn_ops->resume(); +} + void xen_evtchn_2l_init(void); +int xen_evtchn_fifo_init(void); #endif /* #ifndef __EVENTS_INTERNAL_H__ */ -- cgit v1.2.1 From 2771374d47220c7ec271281437625e9519505bb2 Mon Sep 17 00:00:00 2001 From: Mukesh Rathor Date: Wed, 11 Dec 2013 15:36:51 -0500 Subject: xen/pvh: Piggyback on PVHVM for event channels (v2) PVH is a PV guest with a twist - there are certain things that work in it like HVM and some like PV. There is a similar mode - PVHVM where we run in HVM mode with PV code enabled - and this patch explores that. The most notable PV interfaces are the XenBus and event channels. We will piggyback on how the event channel mechanism is used in PVHVM - that is we want the normal native IRQ mechanism and we will install a vector (hvm callback) for which we will call the event channel mechanism. This means that from a pvops perspective, we can use native_irq_ops instead of the Xen PV specific. 
Albeit in the future we could support pirq_eoi_map. But that is a feature request that can be shared with PVHVM. Signed-off-by: Mukesh Rathor Signed-off-by: Konrad Rzeszutek Wilk Reviewed-by: David Vrabel Acked-by: Stefano Stabellini --- drivers/xen/events/events_base.c | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-) (limited to 'drivers/xen') diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c index 1d16185e82b2..4672e003c0ad 100644 --- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -1685,8 +1685,15 @@ void __init xen_init_IRQ(void) pirq_needs_eoi = pirq_needs_eoi_flag; #ifdef CONFIG_X86 - if (xen_hvm_domain()) { + if (xen_pv_domain()) { + irq_ctx_init(smp_processor_id()); + if (xen_initial_domain()) + pci_xen_initial_domain(); + } + if (xen_feature(XENFEAT_hvm_callback_vector)) xen_callback_vector(); + + if (xen_hvm_domain()) { native_init_IRQ(); /* pci_xen_hvm_init must be called after native_init_IRQ so that * __acpi_register_gsi can point at the right function */ @@ -1695,13 +1702,10 @@ void __init xen_init_IRQ(void) int rc; struct physdev_pirq_eoi_gmfn eoi_gmfn; - irq_ctx_init(smp_processor_id()); - if (xen_initial_domain()) - pci_xen_initial_domain(); - pirq_eoi_map = (void *)__get_free_page(GFP_KERNEL|__GFP_ZERO); eoi_gmfn.gmfn = virt_to_mfn(pirq_eoi_map); rc = HYPERVISOR_physdev_op(PHYSDEVOP_pirq_eoi_gmfn_v2, &eoi_gmfn); + /* TODO: No PVH support for PIRQ EOI */ if (rc != 0) { free_page((unsigned long) pirq_eoi_map); pirq_eoi_map = NULL; -- cgit v1.2.1 From 7f256020cc599bc0b736c57d702b864dbbefcefb Mon Sep 17 00:00:00 2001 From: Konrad Rzeszutek Wilk Date: Tue, 31 Dec 2013 15:55:39 -0500 Subject: xen/grants: Remove gnttab_max_grant_frames dependency on gnttab_init. The function gnttab_max_grant_frames() returns the maximum amount of frames (pages) of grants we can have. 
Unfortunately it was dependent on gnttab_init() having been run before to initialize the boot max value (boot_max_nr_grant_frames). This meant that users of gnttab_max_grant_frames would always get a zero value if they called it before gnttab_init() - such as 'platform_pci_init' (drivers/xen/platform-pci.c). Signed-off-by: Konrad Rzeszutek Wilk Reviewed-by: David Vrabel Acked-by: Stefano Stabellini --- drivers/xen/grant-table.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) (limited to 'drivers/xen') diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c index aa846a48f400..99399cb0fd1c 100644 --- a/drivers/xen/grant-table.c +++ b/drivers/xen/grant-table.c @@ -62,7 +62,6 @@ static grant_ref_t **gnttab_list; static unsigned int nr_grant_frames; -static unsigned int boot_max_nr_grant_frames; static int gnttab_free_count; static grant_ref_t gnttab_free_head; static DEFINE_SPINLOCK(gnttab_list_lock); @@ -827,6 +826,11 @@ static unsigned int __max_nr_grant_frames(void) unsigned int gnttab_max_grant_frames(void) { unsigned int xen_max = __max_nr_grant_frames(); + static unsigned int boot_max_nr_grant_frames; + + /* First time, initialize it properly. */ + if (!boot_max_nr_grant_frames) + boot_max_nr_grant_frames = __max_nr_grant_frames(); if (xen_max > boot_max_nr_grant_frames) return boot_max_nr_grant_frames; @@ -1227,13 +1231,12 @@ int gnttab_init(void) gnttab_request_version(); nr_grant_frames = 1; - boot_max_nr_grant_frames = __max_nr_grant_frames(); /* Determine the maximum number of frames required for the * grant reference free list on the current hypervisor.
*/ BUG_ON(grefs_per_grant_frame == 0); - max_nr_glist_frames = (boot_max_nr_grant_frames * + max_nr_glist_frames = (gnttab_max_grant_frames() * grefs_per_grant_frame / RPP); gnttab_list = kmalloc(max_nr_glist_frames * sizeof(grant_ref_t *), -- cgit v1.2.1 From 456847533b9ad18baa6685946a2f1e1fa9c05c34 Mon Sep 17 00:00:00 2001 From: Konrad Rzeszutek Wilk Date: Tue, 31 Dec 2013 16:33:31 -0500 Subject: xen/grant-table: Refactor gnttab_init We have this odd scenario where for PV paths we take a shortcut but for the HVM paths we first ioremap xen_hvm_resume_frames, then assign it to gnttab_shared.addr. This is needed because gnttab_map uses gnttab_shared.addr. Instead of having: if (pv) return gnttab_map if (hvm) ... gnttab_map Let's move the HVM part before the gnttab_map and remove the first call to gnttab_map. Signed-off-by: Konrad Rzeszutek Wilk Reviewed-by: David Vrabel Acked-by: Stefano Stabellini --- drivers/xen/grant-table.c | 10 ++-------- 1 file changed, 2 insertions(+), 8 deletions(-) (limited to 'drivers/xen') diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c index 99399cb0fd1c..e69c7780c208 100644 --- a/drivers/xen/grant-table.c +++ b/drivers/xen/grant-table.c @@ -1173,10 +1173,7 @@ static int gnttab_setup(void) if (max_nr_gframes < nr_grant_frames) return -ENOSYS; - if (xen_pv_domain()) - return gnttab_map(0, nr_grant_frames - 1); - - if (gnttab_shared.addr == NULL) { + if (xen_feature(XENFEAT_auto_translated_physmap) && gnttab_shared.addr == NULL) { gnttab_shared.addr = xen_remap(xen_hvm_resume_frames, PAGE_SIZE * max_nr_gframes); if (gnttab_shared.addr == NULL) { @@ -1185,10 +1182,7 @@ static int gnttab_setup(void) return -ENOMEM; } } - - gnttab_map(0, nr_grant_frames - 1); - - return 0; + return gnttab_map(0, nr_grant_frames - 1); } int gnttab_resume(void) -- cgit v1.2.1 From efaf30a3357872cf0fc7d555b1f9968ec71535d3 Mon Sep 17 00:00:00 2001 From: Konrad Rzeszutek Wilk Date: Mon, 6 Jan 2014 10:40:36 -0500 Subject: xen/grant: Implement
a grant frame array struct (v3).

The 'xen_hvm_resume_frames' used to be an 'unsigned long' and contain the virtual address of the grants. That was OK for most architectures (PVHVM, ARM) where the grants are contiguous in memory. That however is not the case for PVH - in which case we will have to do a lookup for each virtual address for the PFN.

Instead of doing that, let's make it a structure which will contain the array of PFNs, the virtual address and the count of said PFNs.

Also provide generic functions: gnttab_setup_auto_xlat_frames and gnttab_free_auto_xlat_frames to populate said structure with appropriate values for PVHVM and ARM.

To round it off, change the name from 'xen_hvm_resume_frames' to a more descriptive one - 'xen_auto_xlat_grant_frames'.

For PVH, in patch "xen/pvh: Piggyback on PVHVM for grant driver" we will populate the 'xen_auto_xlat_grant_frames' by ourselves.

v2 moves the xen_remap in the gnttab_setup_auto_xlat_frames and also introduces xen_unmap for gnttab_free_auto_xlat_frames.
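To illustrate the idea of recording one PFN per grant frame instead of deriving each from a single base address, here is a small user-space sketch. It is not the kernel code: struct frames_sketch, frames_setup(), frames_free() and SKETCH_PAGE_SHIFT are hypothetical names, 4 KiB pages are assumed, and the virtual-address mapping step is left out entirely. It only shows the contiguous case, where entry i is the base frame number plus i.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_PAGE_SHIFT 12 /* assumes 4 KiB pages */

struct frames_sketch {
	uint64_t *pfn;      /* one entry per grant frame */
	unsigned int count; /* number of entries in pfn[] */
};

/* Fill the array for a contiguous region starting at addr. */
static int frames_setup(struct frames_sketch *f, uint64_t addr, unsigned int nr)
{
	uint64_t *pfn;
	unsigned int i;

	assert(f != NULL);
	if (f->count)
		return -1; /* already populated, refuse to overwrite */

	pfn = calloc(nr, sizeof(pfn[0]));
	if (!pfn)
		return -1;

	for (i = 0; i < nr; i++)
		pfn[i] = (addr >> SKETCH_PAGE_SHIFT) + i;

	f->pfn = pfn;
	f->count = nr;
	return 0;
}

/* Release the array and reset the structure so it can be set up again. */
static void frames_free(struct frames_sketch *f)
{
	assert(f != NULL);
	if (!f->count)
		return;
	free(f->pfn);
	f->pfn = NULL;
	f->count = 0;
}
```

Because consumers index pfn[i] rather than computing base + i themselves, a caller with non-contiguous frames could fill the same array with arbitrary values.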
Suggested-by: Stefano Stabellini Signed-off-by: Konrad Rzeszutek Wilk [v3: Based on top of 'asm/xen/page.h: remove redundant semicolon'] Acked-by: Stefano Stabellini --- drivers/xen/grant-table.c | 58 ++++++++++++++++++++++++++++++++++++++++------ drivers/xen/platform-pci.c | 10 +++++--- 2 files changed, 58 insertions(+), 10 deletions(-) (limited to 'drivers/xen') diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c index e69c7780c208..44b75ccfbbff 100644 --- a/drivers/xen/grant-table.c +++ b/drivers/xen/grant-table.c @@ -65,8 +65,7 @@ static unsigned int nr_grant_frames; static int gnttab_free_count; static grant_ref_t gnttab_free_head; static DEFINE_SPINLOCK(gnttab_list_lock); -unsigned long xen_hvm_resume_frames; -EXPORT_SYMBOL_GPL(xen_hvm_resume_frames); +struct grant_frames xen_auto_xlat_grant_frames; static union { struct grant_entry_v1 *v1; @@ -838,6 +837,51 @@ unsigned int gnttab_max_grant_frames(void) } EXPORT_SYMBOL_GPL(gnttab_max_grant_frames); +int gnttab_setup_auto_xlat_frames(unsigned long addr) +{ + xen_pfn_t *pfn; + unsigned int max_nr_gframes = __max_nr_grant_frames(); + unsigned int i; + void *vaddr; + + if (xen_auto_xlat_grant_frames.count) + return -EINVAL; + + vaddr = xen_remap(addr, PAGE_SIZE * max_nr_gframes); + if (vaddr == NULL) { + pr_warn("Failed to ioremap gnttab share frames (addr=0x%08lx)!\n", + addr); + return -ENOMEM; + } + pfn = kcalloc(max_nr_gframes, sizeof(pfn[0]), GFP_KERNEL); + if (!pfn) { + xen_unmap(vaddr); + return -ENOMEM; + } + for (i = 0; i < max_nr_gframes; i++) + pfn[i] = PFN_DOWN(addr) + i; + + xen_auto_xlat_grant_frames.vaddr = vaddr; + xen_auto_xlat_grant_frames.pfn = pfn; + xen_auto_xlat_grant_frames.count = max_nr_gframes; + + return 0; +} +EXPORT_SYMBOL_GPL(gnttab_setup_auto_xlat_frames); + +void gnttab_free_auto_xlat_frames(void) +{ + if (!xen_auto_xlat_grant_frames.count) + return; + kfree(xen_auto_xlat_grant_frames.pfn); + xen_unmap(xen_auto_xlat_grant_frames.vaddr); + + 
xen_auto_xlat_grant_frames.pfn = NULL; + xen_auto_xlat_grant_frames.count = 0; + xen_auto_xlat_grant_frames.vaddr = NULL; +} +EXPORT_SYMBOL_GPL(gnttab_free_auto_xlat_frames); + /* Handling of paged out grant targets (GNTST_eagain) */ #define MAX_DELAY 256 static inline void @@ -1068,6 +1112,7 @@ static int gnttab_map(unsigned int start_idx, unsigned int end_idx) struct xen_add_to_physmap xatp; unsigned int i = end_idx; rc = 0; + BUG_ON(xen_auto_xlat_grant_frames.count < nr_gframes); /* * Loop backwards, so that the first hypercall has the largest * index, ensuring that the table will grow only once. @@ -1076,7 +1121,7 @@ static int gnttab_map(unsigned int start_idx, unsigned int end_idx) xatp.domid = DOMID_SELF; xatp.idx = i; xatp.space = XENMAPSPACE_grant_table; - xatp.gpfn = (xen_hvm_resume_frames >> PAGE_SHIFT) + i; + xatp.gpfn = xen_auto_xlat_grant_frames.pfn[i]; rc = HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp); if (rc != 0) { pr_warn("grant table add_to_physmap failed, err=%d\n", @@ -1174,11 +1219,10 @@ static int gnttab_setup(void) return -ENOSYS; if (xen_feature(XENFEAT_auto_translated_physmap) && gnttab_shared.addr == NULL) { - gnttab_shared.addr = xen_remap(xen_hvm_resume_frames, - PAGE_SIZE * max_nr_gframes); + gnttab_shared.addr = xen_auto_xlat_grant_frames.vaddr; if (gnttab_shared.addr == NULL) { - pr_warn("Failed to ioremap gnttab share frames (addr=0x%08lx)!\n", - xen_hvm_resume_frames); + pr_warn("gnttab share frames (addr=0x%08lx) is not mapped!\n", + (unsigned long)xen_auto_xlat_grant_frames.vaddr); return -ENOMEM; } } diff --git a/drivers/xen/platform-pci.c b/drivers/xen/platform-pci.c index 2f3528e93cb9..f1947ac218d9 100644 --- a/drivers/xen/platform-pci.c +++ b/drivers/xen/platform-pci.c @@ -108,6 +108,7 @@ static int platform_pci_init(struct pci_dev *pdev, long ioaddr; long mmio_addr, mmio_len; unsigned int max_nr_gframes; + unsigned long grant_frames; if (!xen_domain()) return -ENODEV; @@ -154,13 +155,16 @@ static int 
platform_pci_init(struct pci_dev *pdev, } max_nr_gframes = gnttab_max_grant_frames(); - xen_hvm_resume_frames = alloc_xen_mmio(PAGE_SIZE * max_nr_gframes); + grant_frames = alloc_xen_mmio(PAGE_SIZE * max_nr_gframes); + if (gnttab_setup_auto_xlat_frames(grant_frames)) + goto out; ret = gnttab_init(); if (ret) - goto out; + goto grant_out; xenbus_probe(NULL); return 0; - +grant_out: + gnttab_free_auto_xlat_frames(); out: pci_release_region(pdev, 0); mem_out: -- cgit v1.2.1 From 6926f6d6109714aab7b26df7099b12555e36676f Mon Sep 17 00:00:00 2001 From: Konrad Rzeszutek Wilk Date: Fri, 3 Jan 2014 10:20:18 -0500 Subject: xen/pvh: Piggyback on PVHVM for grant driver (v4) In PVH the shared grant frame is the PFN and not MFN, hence it's mapped via the same code path as HVM. The allocation of the grant frame is done differently - we do not use the early platform-pci driver and have an ioremap area - instead we use balloon memory and stitch all of the non-contiguous pages in a virtualized area. That means when we call the hypervisor to replace the GMFN with a XENMAPSPACE_grant_table type, we need to look up the old PFN for every iteration instead of assuming a flat contiguous PFN allocation. Lastly, we only use v1 for grants. This is because PVHVM is not able to use v2 due to no XENMEM_add_to_physmap calls on the error status page (see commit 69e8f430e243d657c2053f097efebc2e2cd559f0 xen/granttable: Disable grant v2 for HVM domains.) Until that is implemented this workaround has to be in place. Also per suggestions by Stefano utilize the PVHVM paths as they share common functionality. v2 of this patch moves most of the PVH code out in the arch/x86/xen/grant-table driver and touches only minimally the generic driver. v3, v4: fixes up some of the code due to earlier patches.
Signed-off-by: Konrad Rzeszutek Wilk
Acked-by: Stefano Stabellini
---
 drivers/xen/gntdev.c      | 2 +-
 drivers/xen/grant-table.c | 9 +++++----
 2 files changed, 6 insertions(+), 5 deletions(-)

(limited to 'drivers/xen')

diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index e41c79c986ea..073b4a19a8b0 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -846,7 +846,7 @@ static int __init gntdev_init(void)
 	if (!xen_domain())
 		return -ENODEV;
 
-	use_ptemod = xen_pv_domain();
+	use_ptemod = !xen_feature(XENFEAT_auto_translated_physmap);
 
 	err = misc_register(&gntdev_miscdev);
 	if (err != 0) {
diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index 44b75ccfbbff..1d5fbce4acb7 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -1108,7 +1108,7 @@ static int gnttab_map(unsigned int start_idx, unsigned int end_idx)
 	unsigned int nr_gframes = end_idx + 1;
 	int rc;
 
-	if (xen_hvm_domain()) {
+	if (xen_feature(XENFEAT_auto_translated_physmap)) {
 		struct xen_add_to_physmap xatp;
 		unsigned int i = end_idx;
 		rc = 0;
@@ -1184,7 +1184,7 @@ static void gnttab_request_version(void)
 	int rc;
 	struct gnttab_set_version gsv;
 
-	if (xen_hvm_domain())
+	if (xen_feature(XENFEAT_auto_translated_physmap))
 		gsv.version = 1;
 	else
 		gsv.version = 2;
@@ -1327,5 +1327,6 @@ static int __gnttab_init(void)
 
 	return gnttab_init();
 }
-
-core_initcall(__gnttab_init);
+/* Starts after core_initcall so that xen_pvh_gnttab_setup can be called
+ * beforehand to initialize xen_auto_xlat_grant_frames. */
+core_initcall_sync(__gnttab_init);
-- 
cgit v1.2.1

From be3e9cf33094210a0723fdc841e1abfd0ddc1007 Mon Sep 17 00:00:00 2001
From: Mukesh Rathor
Date: Tue, 31 Dec 2013 13:57:35 -0500
Subject: xen/pvh: Piggyback on PVHVM XenBus.

PVH is a PV guest with a twist - there are certain things that work in
it like HVM and some like PV. For the XenBus mechanism we want to use
the PVHVM mechanism.
Signed-off-by: Mukesh Rathor
Signed-off-by: Konrad Rzeszutek Wilk
Reviewed-by: David Vrabel
Acked-by: Stefano Stabellini
---
 drivers/xen/xenbus/xenbus_client.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

(limited to 'drivers/xen')

diff --git a/drivers/xen/xenbus/xenbus_client.c b/drivers/xen/xenbus/xenbus_client.c
index ec097d6f964d..01d59e66565d 100644
--- a/drivers/xen/xenbus/xenbus_client.c
+++ b/drivers/xen/xenbus/xenbus_client.c
@@ -45,6 +45,7 @@
 #include
 #include
 #include
+#include
 
 #include "xenbus_probe.h"
 
@@ -743,7 +744,7 @@ static const struct xenbus_ring_ops ring_ops_hvm = {
 
 void __init xenbus_ring_ops_init(void)
 {
-	if (xen_pv_domain())
+	if (!xen_feature(XENFEAT_auto_translated_physmap))
 		ring_ops = &ring_ops_pv;
 	else
 		ring_ops = &ring_ops_hvm;
-- 
cgit v1.2.1

From 11c7ff17c9b6dbf3a4e4f36be30ad531a6cf0ec9 Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk
Date: Mon, 6 Jan 2014 10:44:39 -0500
Subject: xen/grant-table: Force to use v1 of grants.

We have the framework to use v2, but there are no backends that
actually use it. The end result is that on PV we use v2 grants and on
PVHVM v1. The v1 has a capacity of 512 grants per page while the v2
has 256 grants per page. This means we lose about 50% capacity - and
if we want more than 16 VIFs (each VIF takes 512 grants), then we are
hitting the max per guest of 32.
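
A rough sketch of the arithmetic behind the capacity figures above. It
assumes 4 KiB pages, an 8-byte v1 entry and a 16-byte v2 entry (which
give the 512 and 256 grants per page quoted above), and it reads the
"max per guest of 32" as a limit of 32 grant frames. All sketch_* names
are made up for the example.

```c
#include <assert.h>

/* Assumed constants; only the 512/256/16/32 figures come from the text. */
enum {
	SKETCH_PAGE_SIZE      = 4096, /* assumed 4 KiB page */
	SKETCH_V1_ENTRY_SIZE  = 8,    /* assumed size of a v1 grant entry */
	SKETCH_V2_ENTRY_SIZE  = 16,   /* assumed size of a v2 grant entry */
	SKETCH_MAX_FRAMES     = 32,   /* assumed per-guest grant frame limit */
	SKETCH_GRANTS_PER_VIF = 512   /* per-VIF cost quoted in the message */
};

/* Number of grant entries that fit in one grant frame. */
static unsigned int sketch_grants_per_frame(unsigned int entry_size)
{
	assert(entry_size != 0);
	return SKETCH_PAGE_SIZE / entry_size;
}

/* Number of VIFs a guest can back before running out of grant entries. */
static unsigned int sketch_max_vifs(unsigned int entry_size)
{
	unsigned int total = sketch_grants_per_frame(entry_size) *
			     SKETCH_MAX_FRAMES;

	return total / SKETCH_GRANTS_PER_VIF;
}
```

Under these assumptions v2 tops out at 16 VIFs and v1 at 32, which is
the 50% difference the message refers to.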
Oracle-bug: 16039922
CC: annie.li@oracle.com
CC: msw@amazon.com
Signed-off-by: Konrad Rzeszutek Wilk
Reviewed-by: David Vrabel
---
 drivers/xen/grant-table.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

(limited to 'drivers/xen')

diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index 1d5fbce4acb7..1ce1c40331f3 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -1184,10 +1184,8 @@ static void gnttab_request_version(void)
 	int rc;
 	struct gnttab_set_version gsv;
 
-	if (xen_feature(XENFEAT_auto_translated_physmap))
-		gsv.version = 1;
-	else
-		gsv.version = 2;
+	gsv.version = 1;
+
 	rc = HYPERVISOR_grant_table_op(GNTTABOP_set_version, &gsv, 1);
 	if (rc == 0 && gsv.version == 2) {
 		grant_table_version = 2;
-- 
cgit v1.2.1

From 89c3cf52c76ff3d7b129156f4b8943af517d9db2 Mon Sep 17 00:00:00 2001
From: Yijing Wang
Date: Thu, 5 Dec 2013 19:34:05 +0800
Subject: xen: Use dev_is_pci() to check whether it is pci device

Use the PCI standard macro dev_is_pci() instead of directly comparing
against pci_bus_type to check whether it is a PCI device.
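
As a rough illustration of what the helper does: dev_is_pci() is, to the
best of my knowledge, a thin wrapper around the same pointer comparison
against pci_bus_type. The userspace mock below uses made-up sketch_*
types in place of struct device and struct bus_type; it is not the
kernel definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct bus_type and struct device. */
struct sketch_bus_type {
	const char *name;
};

struct sketch_device {
	const struct sketch_bus_type *bus;
};

static const struct sketch_bus_type sketch_pci_bus_type = { "pci" };
static const struct sketch_bus_type sketch_usb_bus_type = { "usb" };

/* A device is "PCI" when its bus pointer is the one PCI bus type object. */
static bool sketch_dev_is_pci(const struct sketch_device *dev)
{
	assert(dev != NULL);
	return dev->bus == &sketch_pci_bus_type;
}
```

Callers then test the device through the helper instead of repeating the
pointer comparison at each call site.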
Signed-off-by: Yijing Wang
Signed-off-by: Konrad Rzeszutek Wilk
Acked-by: Jan Beulich
---
 drivers/xen/dbgp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

(limited to 'drivers/xen')

diff --git a/drivers/xen/dbgp.c b/drivers/xen/dbgp.c
index f3ccc80a455f..8145a59fd9f6 100644
--- a/drivers/xen/dbgp.c
+++ b/drivers/xen/dbgp.c
@@ -19,7 +19,7 @@ static int xen_dbgp_op(struct usb_hcd *hcd, int op)
 	dbgp.op = op;
 
 #ifdef CONFIG_PCI
-	if (ctrlr->bus == &pci_bus_type) {
+	if (dev_is_pci(ctrlr)) {
 		const struct pci_dev *pdev = to_pci_dev(ctrlr);
 
 		dbgp.u.pci.seg = pci_domain_nr(pdev->bus);
-- 
cgit v1.2.1

From 89b9e08f186a203c250872a663c9eab09cdc583a Mon Sep 17 00:00:00 2001
From: Wei Yongjun
Date: Tue, 7 Jan 2014 21:11:05 +0800
Subject: xen-platform: fix error return code in platform_pci_init()

Fix to return a negative error code from the error handling case
instead of 0, otherwise the error condition can't be reflected from
the return value.

Signed-off-by: Wei Yongjun
Signed-off-by: Konrad Rzeszutek Wilk
---
 drivers/xen/platform-pci.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

(limited to 'drivers/xen')

diff --git a/drivers/xen/platform-pci.c b/drivers/xen/platform-pci.c
index f1947ac218d9..a1361c312c06 100644
--- a/drivers/xen/platform-pci.c
+++ b/drivers/xen/platform-pci.c
@@ -156,7 +156,8 @@ static int platform_pci_init(struct pci_dev *pdev,
 
 	max_nr_gframes = gnttab_max_grant_frames();
 	grant_frames = alloc_xen_mmio(PAGE_SIZE * max_nr_gframes);
-	if (gnttab_setup_auto_xlat_frames(grant_frames))
+	ret = gnttab_setup_auto_xlat_frames(grant_frames);
+	if (ret)
 		goto out;
 	ret = gnttab_init();
 	if (ret)
-- 
cgit v1.2.1

From be1403b9e66bea9d64db9198256cb27532a870b1 Mon Sep 17 00:00:00 2001
From: Wei Yongjun
Date: Tue, 7 Jan 2014 21:11:25 +0800
Subject: xen/evtchn_fifo: fix error return code in evtchn_fifo_setup()

Fix to return -ENOMEM from the error handling case instead of 0
(overwritten to 0 by the HYPERVISOR_event_channel_op call),
otherwise the error condition
can't be reflected from the return value.

Signed-off-by: Wei Yongjun
Signed-off-by: Konrad Rzeszutek Wilk
Reviewed-by: David Vrabel
---
 drivers/xen/events/events_fifo.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

(limited to 'drivers/xen')

diff --git a/drivers/xen/events/events_fifo.c b/drivers/xen/events/events_fifo.c
index e2bf9571f7fe..5b2c039f16c5 100644
--- a/drivers/xen/events/events_fifo.c
+++ b/drivers/xen/events/events_fifo.c
@@ -109,7 +109,7 @@ static int evtchn_fifo_setup(struct irq_info *info)
 {
 	unsigned port = info->evtchn;
 	unsigned new_array_pages;
-	int ret = -ENOMEM;
+	int ret;
 
 	new_array_pages = port / EVENT_WORDS_PER_PAGE + 1;
 
@@ -124,8 +124,10 @@ static int evtchn_fifo_setup(struct irq_info *info)
 		array_page = event_array[event_array_pages];
 		if (!array_page) {
 			array_page = (void *)__get_free_page(GFP_KERNEL);
-			if (array_page == NULL)
+			if (array_page == NULL) {
+				ret = -ENOMEM;
 				goto error;
+			}
 			event_array[event_array_pages] = array_page;
 		}
 
-- 
cgit v1.2.1

From 0db6991dd233396da766076caef71f36b4f96c21 Mon Sep 17 00:00:00 2001
From: Paul Gortmaker
Date: Fri, 10 Jan 2014 09:50:08 -0500
Subject: xen: delete new instances of __cpuinit usage

Commit 1fe565517b57676884349dccfd6ce853ec338636 ("xen/events: use
the FIFO-based ABI if available") added new instances of __cpuinit
macro usage.

We removed this a couple versions ago; we now want to remove the
compat no-op stubs. Introducing new users is not what we want to see
at this point in time, as it will break once the stubs are gone.
Cc: David Vrabel
Cc: Konrad Rzeszutek Wilk
Cc: Boris Ostrovsky
Signed-off-by: Paul Gortmaker
Signed-off-by: Konrad Rzeszutek Wilk
---
 drivers/xen/events/events_fifo.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

(limited to 'drivers/xen')

diff --git a/drivers/xen/events/events_fifo.c b/drivers/xen/events/events_fifo.c
index 5b2c039f16c5..1de2a191b395 100644
--- a/drivers/xen/events/events_fifo.c
+++ b/drivers/xen/events/events_fifo.c
@@ -359,7 +359,7 @@ static const struct evtchn_ops evtchn_ops_fifo = {
 	.resume = evtchn_fifo_resume,
 };
 
-static int __cpuinit evtchn_fifo_init_control_block(unsigned cpu)
+static int evtchn_fifo_init_control_block(unsigned cpu)
 {
 	struct page *control_block = NULL;
 	struct evtchn_init_control init_control;
@@ -386,7 +386,7 @@ static int __cpuinit evtchn_fifo_init_control_block(unsigned cpu)
 	return ret;
 }
 
-static int __cpuinit evtchn_fifo_cpu_notification(struct notifier_block *self,
+static int evtchn_fifo_cpu_notification(struct notifier_block *self,
 						  unsigned long action,
 						  void *hcpu)
 {
@@ -404,7 +404,7 @@ static int __cpuinit evtchn_fifo_cpu_notification(struct notifier_block *self,
 	return ret < 0 ? NOTIFY_BAD : NOTIFY_OK;
 }
 
-static struct notifier_block evtchn_fifo_cpu_notifier __cpuinitdata = {
+static struct notifier_block evtchn_fifo_cpu_notifier = {
 	.notifier_call = evtchn_fifo_cpu_notification,
 };
 
-- 
cgit v1.2.1