<feed xmlns='http://www.w3.org/2005/Atom'>
<title>talos-obmc-linux/kernel/irq, branch dev-4.13</title>
<subtitle>Talos™ II Linux sources for OpenBMC</subtitle>
<id>https://git.raptorcs.com/git/talos-obmc-linux/atom?h=dev-4.13</id>
<link rel='self' href='https://git.raptorcs.com/git/talos-obmc-linux/atom?h=dev-4.13'/>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/'/>
<updated>2017-10-18T07:38:31+00:00</updated>
<entry>
<title>genirq/cpuhotplug: Add sanity check for effective affinity mask</title>
<updated>2017-10-18T07:38:31+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2017-10-09T10:47:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/commit/?id=709d1ac12dc68dbd48a49a809409e64b1c2f0a9e'/>
<id>urn:sha1:709d1ac12dc68dbd48a49a809409e64b1c2f0a9e</id>
<content type='text'>
commit 60b09c51bb4fb46e2331fdbb39f91520f31d35f7 upstream.

The effective affinity mask handling has no safety net when the mask is not
updated by the interrupt chip or the mask contains offline CPUs.

If that happens the CPU unplug code fails to migrate interrupts.

Add sanity checks and emit a warning when the mask contains only offline
CPUs.

Fixes: 415fcf1a2293 ("genirq/cpuhotplug: Use effective affinity mask")
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Marc Zyngier &lt;marc.zyngier@arm.com&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1710042208400.2406@nanos
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>genirq/cpuhotplug: Enforce affinity setting on startup of managed irqs</title>
<updated>2017-10-18T07:38:31+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2017-10-04T19:07:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/commit/?id=c2e2b0db395b7ea655213495240a366c7dbe04f6'/>
<id>urn:sha1:c2e2b0db395b7ea655213495240a366c7dbe04f6</id>
<content type='text'>
commit e43b3b58548051f8809391eb7bec7a27ed3003ea upstream.

Managed interrupts can end up in a stale state on CPU hotplug. If the
interrupt is not targeting a single CPU, i.e. the affinity mask spawns
multiple CPUs then the following can happen:

After boot:

dstate:   0x01601200
            IRQD_ACTIVATED
            IRQD_IRQ_STARTED
            IRQD_SINGLE_TARGET
            IRQD_AFFINITY_SET
            IRQD_AFFINITY_MANAGED
node:     0
affinity: 24-31
effectiv: 24
pending:  0

After offlining CPU 31 - 24

dstate:   0x01a31000
            IRQD_IRQ_DISABLED
            IRQD_IRQ_MASKED
            IRQD_SINGLE_TARGET
            IRQD_AFFINITY_SET
            IRQD_AFFINITY_MANAGED
            IRQD_MANAGED_SHUTDOWN
node:     0
affinity: 24-31
effectiv: 24
pending:  0

Now CPU 25 gets onlined again, so it should get the effective interrupt
affinity for this interruopt, but due to the x86 interrupt affinity setter
restrictions this ends up after restarting the interrupt with:

dstate:   0x01601300
            IRQD_ACTIVATED
            IRQD_IRQ_STARTED
            IRQD_SINGLE_TARGET
            IRQD_AFFINITY_SET
            IRQD_SETAFFINITY_PENDING
            IRQD_AFFINITY_MANAGED
node:     0
affinity: 24-31
effectiv: 24
pending:  24-31

So the interrupt is still affine to CPU 24, which was the last CPU to go
offline of that affinity set and the move to an online CPU within 24-31,
in this case 25, is pending. This mechanism is x86/ia64 specific as those
architectures cannot move interrupts from thread context and do this when
an interrupt is actually handled. So the move is set to pending.

Whats worse is that offlining CPU 25 again results in:

dstate:   0x01601300
            IRQD_ACTIVATED
            IRQD_IRQ_STARTED
            IRQD_SINGLE_TARGET
            IRQD_AFFINITY_SET
            IRQD_SETAFFINITY_PENDING
            IRQD_AFFINITY_MANAGED
node:     0
affinity: 24-31
effectiv: 24
pending:  24-31

This means the interrupt has not been shut down, because the outgoing CPU
is not in the effective affinity mask, but of course nothing notices that
the effective affinity mask is pointing at an offline CPU.

In the case of restarting a managed interrupt the move restriction does not
apply, so the affinity setting can be made unconditional. This needs to be
done _before_ the interrupt is started up as otherwise the condition for
moving it from thread context would not longer be fulfilled.

With that change applied onlining CPU 25 after offlining 31-24 results in:

dstate:   0x01600200
            IRQD_ACTIVATED
            IRQD_IRQ_STARTED
            IRQD_SINGLE_TARGET
            IRQD_AFFINITY_MANAGED
node:     0
affinity: 24-31
effectiv: 25
pending:

And after offlining CPU 25:

dstate:   0x01a30000
            IRQD_IRQ_DISABLED
            IRQD_IRQ_MASKED
            IRQD_SINGLE_TARGET
            IRQD_AFFINITY_MANAGED
            IRQD_MANAGED_SHUTDOWN
node:     0
affinity: 24-31
effectiv: 25
pending:

which is the correct and expected result.

Fixes: 761ea388e8c4 ("genirq: Handle managed irqs gracefully in irq_startup()")
Reported-by: YASUAKI ISHIMATSU &lt;yasu.isimatu@gmail.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: axboe@kernel.dk
Cc: linux-scsi@vger.kernel.org
Cc: Sumit Saxena &lt;sumit.saxena@broadcom.com&gt;
Cc: Marc Zyngier &lt;marc.zyngier@arm.com&gt;
Cc: mpe@ellerman.id.au
Cc: Shivasharan Srikanteshwara &lt;shivasharan.srikanteshwara@broadcom.com&gt;
Cc: Kashyap Desai &lt;kashyap.desai@broadcom.com&gt;
Cc: keith.busch@intel.com
Cc: peterz@infradead.org
Cc: Hannes Reinecke &lt;hare@suse.de&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1710042208400.2406@nanos
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>irq/generic-chip: Don't replace domain's name</title>
<updated>2017-10-05T07:47:34+00:00</updated>
<author>
<name>Jeffy Chen</name>
<email>jeffy.chen@rock-chips.com</email>
</author>
<published>2017-09-28T04:37:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/commit/?id=2cfa35c2f21437421d6afeda3be7d7341d2b15af'/>
<id>urn:sha1:2cfa35c2f21437421d6afeda3be7d7341d2b15af</id>
<content type='text'>
commit 72364d320644c12948786962673772f271039a4a upstream.

When generic irq chips are allocated for an irq domain the domain name is
set to the irq chip name. That was done to have named domains before the
recent changes which enforce domain naming were done.

Since then the overwrite causes a memory leak when the domain name is
dynamically allocated and even worse it would cause the domain free code to
free the wrong name pointer, which might point to a constant.

Remove the name assignment to prevent this.

Fixes: d59f6617eef0 ("genirq: Allow fwnode to carry name information only")
Signed-off-by: Jeffy Chen &lt;jeffy.chen@rock-chips.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: https://lkml.kernel.org/r/20170928043731.4764-1-jeffy.chen@rock-chips.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>genirq: Fix cpumask check in __irq_startup_managed()</title>
<updated>2017-10-05T07:47:26+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2017-09-13T21:29:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/commit/?id=bbbbdfcb53297a7bce3d328f393710f413b39d47'/>
<id>urn:sha1:bbbbdfcb53297a7bce3d328f393710f413b39d47</id>
<content type='text'>
commit 9cb067ef8a10bb13112e4d1c0ea996ec96527422 upstream.

The result of cpumask_any_and() is invalid when result greater or equal
nr_cpu_ids. The current check is checking for greater only. Fix it.

Fixes: 761ea388e8c4 ("genirq: Handle managed irqs gracefully in irq_startup()")
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Boris Ostrovsky &lt;boris.ostrovsky@oracle.com&gt;
Cc: Juergen Gross &lt;jgross@suse.com&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: Chen Yu &lt;yu.c.chen@intel.com&gt;
Cc: Marc Zyngier &lt;marc.zyngier@arm.com&gt;
Cc: Alok Kataria &lt;akataria@vmware.com&gt;
Cc: Joerg Roedel &lt;joro@8bytes.org&gt;
Cc: "Rafael J. Wysocki" &lt;rjw@rjwysocki.net&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Cc: Rui Zhang &lt;rui.zhang@intel.com&gt;
Cc: "K. Y. Srinivasan" &lt;kys@microsoft.com&gt;
Cc: Arjan van de Ven &lt;arjan@linux.intel.com&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: Len Brown &lt;lenb@kernel.org&gt;
Link: http://lkml.kernel.org/r/20170913213152.272283444@linutronix.de
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>genirq/msi: Fix populating multiple interrupts</title>
<updated>2017-10-05T07:47:26+00:00</updated>
<author>
<name>John Keeping</name>
<email>john@metanate.com</email>
</author>
<published>2017-09-06T09:35:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/commit/?id=c43ceff9ed70be01e82e763415234ee87b5fa5d4'/>
<id>urn:sha1:c43ceff9ed70be01e82e763415234ee87b5fa5d4</id>
<content type='text'>
commit 596a7a1d0989c621c3ae49be73a1d1f9de22eb5a upstream.

On allocating the interrupts routed via a wire-to-MSI bridge, the allocator
iterates over the MSI descriptors to build the hierarchy, but fails to use
the descriptor interrupt number, and instead uses the base number,
generating the wrong IRQ domain mappings.

The fix is to use the MSI descriptor interrupt number when setting up
the interrupt instead of the base interrupt for the allocation range.

The only saving grace is that although the MSI descriptors are allocated
in bulk, the wired interrupts are only allocated one by one (so
desc-&gt;irq == virq) and the bug went unnoticed so far.

Fixes: 2145ac9310b60 ("genirq/msi: Add msi_domain_populate_irqs")
Signed-off-by: John Keeping &lt;john@metanate.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Marc Zyngier &lt;marc.zyngier@arm.com&gt;
Link: http://lkml.kernel.org/r/20170906103540.373864a2.john@metanate.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>genirq: Make sparse_irq_lock protect what it should protect</title>
<updated>2017-10-05T07:47:25+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2017-09-05T08:12:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/commit/?id=693057fcf9c6ec84dd4cce032485c502e1485c6f'/>
<id>urn:sha1:693057fcf9c6ec84dd4cce032485c502e1485c6f</id>
<content type='text'>
commit 12ac1d0f6c3e95732d144ffa65c8b20fbd9aa462 upstream.

for_each_active_irq() iterates the sparse irq allocation bitmap. The caller
must hold sparse_irq_lock. Several code pathes expect that an active bit in
the sparse bitmap also has a valid interrupt descriptor.

Unfortunately that's not true. The (de)allocation is a two step process,
which holds the sparse_irq_lock only across the queue/remove from the radix
tree and the set/clear in the allocation bitmap.

If a iteration locks sparse_irq_lock between the two steps, then it might
see an active bit but the corresponding irq descriptor is NULL. If that is
dereferenced unconditionally, then the kernel oopses. Of course, all
iterator sites could be audited and fixed, but....

There is no reason why the sparse_irq_lock needs to be dropped between the
two steps, in fact the code becomes simpler when the mutex is held across
both and the semantics become more straight forward, so future problems of
missing NULL pointer checks in the iteration are avoided and all existing
sites are fixed in one go.

Expand the lock held sections so both operations are covered and the bitmap
and the radixtree are in sync.

Fixes: a05a900a51c7 ("genirq: Make sparse_lock a mutex")
Reported-and-tested-by: Huang Ying &lt;ying.huang@intel.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>genirq/ipi: Fixup checks against nr_cpu_ids</title>
<updated>2017-08-20T08:49:05+00:00</updated>
<author>
<name>Alexey Dobriyan</name>
<email>adobriyan@gmail.com</email>
</author>
<published>2017-08-19T09:57:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/commit/?id=8fbbe2d7cc478d1544f41f2271787c993c23a4f6'/>
<id>urn:sha1:8fbbe2d7cc478d1544f41f2271787c993c23a4f6</id>
<content type='text'>
Valid CPU ids are [0, nr_cpu_ids-1] inclusive.

Fixes: 3b8e29a82dd1 ("genirq: Implement ipi_send_mask/single()")
Fixes: f9bce791ae2a ("genirq: Add a new function to get IPI reverse mapping")
Signed-off-by: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20170819095751.GB27864@avx2

</content>
</entry>
<entry>
<title>genirq: Restore trigger settings in irq_modify_status()</title>
<updated>2017-08-18T10:04:14+00:00</updated>
<author>
<name>Marc Zyngier</name>
<email>marc.zyngier@arm.com</email>
</author>
<published>2017-08-18T09:53:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/commit/?id=e8f241893dfbbebe2813c01eac54f263e6a5e59c'/>
<id>urn:sha1:e8f241893dfbbebe2813c01eac54f263e6a5e59c</id>
<content type='text'>
irq_modify_status starts by clearing the trigger settings from
irq_data before applying the new settings, but doesn't restore them,
leaving them to IRQ_TYPE_NONE.

That's pretty confusing to the potential request_irq() that could
follow. Instead, snapshot the settings before clearing them, and restore
them if the irq_modify_status() invocation was not changing the trigger.

Fixes: 1e2a7d78499e ("irqdomain: Don't set type when mapping an IRQ")
Reported-and-tested-by: jeffy &lt;jeffy.chen@rock-chips.com&gt;
Signed-off-by: Marc Zyngier &lt;marc.zyngier@arm.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Jon Hunter &lt;jonathanh@nvidia.com&gt;
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20170818095345.12378-1-marc.zyngier@arm.com

</content>
</entry>
<entry>
<title>genirq/cpuhotplug: Revert "Set force affinity flag on hotplug migration"</title>
<updated>2017-07-27T13:40:02+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2017-07-27T10:21:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/commit/?id=8397913303abc9333f376a518a8368fa22ca5e6e'/>
<id>urn:sha1:8397913303abc9333f376a518a8368fa22ca5e6e</id>
<content type='text'>
That commit was part of the changes moving x86 to the generic CPU hotplug
interrupt migration code. The force flag was required on x86 before the
hierarchical irqdomain rework, but invoking set_affinity() with force=true
stayed and had no side effects.

At some point in the past, the force flag got repurposed to support the
exynos timer interrupt affinity setting to a not yet online CPU, so the
interrupt controller callback does not verify the supplied affinity mask
against cpu_online_mask.

Setting the flag in the CPU hotplug code causes the cpu online masking to
be blocked on these irq controllers and results in potentially affining an
interrupt to the CPU which is unplugged, i.e. instead of moving it away,
it's just reassigned to it.

As the force flags is not longer needed on x86, it's safe to revert that
patch so the ARM irqchips which use the force flag work again.

Add comments to that effect, so this won't happen again.

Note: The online mask handling should be done in the generic code and the
force flag and the masking in the irq chips removed all together, but
that's not a change possible for 4.13. 

Fixes: 77f85e66aa8b ("genirq/cpuhotplug: Set force affinity flag on hotplug migration")
Reported-by: Will Deacon &lt;will.deacon@arm.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: Marc Zyngier &lt;marc.zyngier@arm.com&gt;
Cc: Russell King &lt;linux@arm.linux.org.uk&gt;
Cc: LAK &lt;linux-arm-kernel@lists.infradead.org&gt;
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1707271217590.3109@nanos
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;

</content>
</entry>
<entry>
<title>genirq/PM: Properly pretend disabled state when force resuming interrupts</title>
<updated>2017-07-17T20:32:20+00:00</updated>
<author>
<name>Juergen Gross</name>
<email>jgross@suse.com</email>
</author>
<published>2017-07-17T17:47:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/commit/?id=a696712c3dd54eb58d2c5a807b4aaa27782d80d6'/>
<id>urn:sha1:a696712c3dd54eb58d2c5a807b4aaa27782d80d6</id>
<content type='text'>
Interrupts with the IRQF_FORCE_RESUME flag set have also the
IRQF_NO_SUSPEND flag set. They are not disabled in the suspend path, but
must be forcefully resumed. That's used by XEN to keep IPIs enabled beyond
the suspension of device irqs. Force resume works by pretending that the
interrupt was disabled and then calling __irq_enable().

Incrementing the disabled depth counter was enough to do that, but with the
recent changes which use state flags to avoid unnecessary hardware access,
this is not longer sufficient. If the state flags are not set, then the
hardware callbacks are not invoked and the interrupt line stays disabled in
"hardware".

Set the disabled and masked state when pretending that an interrupt got
disabled by suspend.

Fixes: bf22ff45bed6 ("genirq: Avoid unnecessary low level irq function calls")
Suggested-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Juergen Gross &lt;jgross@suse.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: xen-devel@lists.xenproject.org
Cc: boris.ostrovsky@oracle.com
Link: http://lkml.kernel.org/r/20170717174703.4603-2-jgross@suse.com
</content>
</entry>
</feed>
