<feed xmlns='http://www.w3.org/2005/Atom'>
<title>blackbird-op-linux/arch/x86/kernel, branch master</title>
<subtitle>Blackbird™ Linux sources for OpenPOWER</subtitle>
<id>https://git.raptorcs.com/git/blackbird-op-linux/atom?h=master</id>
<link rel='self' href='https://git.raptorcs.com/git/blackbird-op-linux/atom?h=master'/>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/blackbird-op-linux/'/>
<updated>2020-02-09T20:11:12+00:00</updated>
<entry>
<title>Merge tag 'x86-urgent-2020-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2020-02-09T20:11:12+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2020-02-09T20:11:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/blackbird-op-linux/commit/?id=1a2a76c2685a29e46d7b37e752ccea7b15aa8e24'/>
<id>urn:sha1:1a2a76c2685a29e46d7b37e752ccea7b15aa8e24</id>
<content type='text'>
Pull x86 fixes from Thomas Gleixner:
 "A set of fixes for X86:

   - Ensure that the PIT is set up when the local APIC is disable or
     configured in legacy mode. This is caused by an ordering issue
     introduced in the recent changes which skip PIT initialization when
     the TSC and APIC frequencies are already known.

   - Handle malformed SRAT tables during early ACPI parsing which caused
     an infinite loop anda boot hang.

   - Fix a long standing race in the affinity setting code which affects
     PCI devices with non-maskable MSI interrupts. The problem is caused
     by the non-atomic writes of the MSI address (destination APIC id)
     and data (vector) fields which the device uses to construct the MSI
     message. The non-atomic writes are mandated by PCI.

     If both fields change and the device raises an interrupt after
     writing address and before writing data, then the MSI block
     constructs a inconsistent message which causes interrupts to be
     lost and subsequent malfunction of the device.

     The fix is to redirect the interrupt to the new vector on the
     current CPU first and then switch it over to the new target CPU.
     This allows to observe an eventually raised interrupt in the
     transitional stage (old CPU, new vector) to be observed in the APIC
     IRR and retriggered on the new target CPU and the new vector.

     The potential spurious interrupts caused by this are harmless and
     can in the worst case expose a buggy driver (all handlers have to
     be able to deal with spurious interrupts as they can and do happen
     for various reasons).

   - Add the missing suspend/resume mechanism for the HYPERV hypercall
     page which prevents resume hibernation on HYPERV guests. This
     change got lost before the merge window.

   - Mask the IOAPIC before disabling the local APIC to prevent
     potentially stale IOAPIC remote IRR bits which cause stale
     interrupt lines after resume"

* tag 'x86-urgent-2020-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/apic: Mask IOAPIC entries when disabling the local APIC
  x86/hyperv: Suspend/resume the hypercall page for hibernation
  x86/apic/msi: Plug non-maskable MSI affinity race
  x86/boot: Handle malformed SRAT tables during early ACPI parsing
  x86/timer: Don't skip PIT setup when APIC is disabled or in legacy mode
</content>
</entry>
<entry>
<title>Merge branch 'merge.nfs-fs_parse.1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs</title>
<updated>2020-02-08T21:26:41+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2020-02-08T21:26:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/blackbird-op-linux/commit/?id=c9d35ee049b40f1d73e890bf88dd55f83b1e9be8'/>
<id>urn:sha1:c9d35ee049b40f1d73e890bf88dd55f83b1e9be8</id>
<content type='text'>
Pull vfs file system parameter updates from Al Viro:
 "Saner fs_parser.c guts and data structures. The system-wide registry
  of syntax types (string/enum/int32/oct32/.../etc.) is gone and so is
  the horror switch() in fs_parse() that would have to grow another case
  every time something got added to that system-wide registry.

  New syntax types can be added by filesystems easily now, and their
  namespace is that of functions - not of system-wide enum members. IOW,
  they can be shared or kept private and if some turn out to be widely
  useful, we can make them common library helpers, etc., without having
  to do anything whatsoever to fs_parse() itself.

  And we already get that kind of requests - the thing that finally
  pushed me into doing that was "oh, and let's add one for timeouts -
  things like 15s or 2h". If some filesystem really wants that, let them
  do it. Without somebody having to play gatekeeper for the variants
  blessed by direct support in fs_parse(), TYVM.

  Quite a bit of boilerplate is gone. And IMO the data structures make a
  lot more sense now. -200LoC, while we are at it"

* 'merge.nfs-fs_parse.1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (25 commits)
  tmpfs: switch to use of invalfc()
  cgroup1: switch to use of errorfc() et.al.
  procfs: switch to use of invalfc()
  hugetlbfs: switch to use of invalfc()
  cramfs: switch to use of errofc() et.al.
  gfs2: switch to use of errorfc() et.al.
  fuse: switch to use errorfc() et.al.
  ceph: use errorfc() and friends instead of spelling the prefix out
  prefix-handling analogues of errorf() and friends
  turn fs_param_is_... into functions
  fs_parse: handle optional arguments sanely
  fs_parse: fold fs_parameter_desc/fs_parameter_spec
  fs_parser: remove fs_parameter_description name field
  add prefix to fs_context-&gt;log
  ceph_parse_param(), ceph_parse_mon_ips(): switch to passing fc_log
  new primitive: __fs_parse()
  switch rbd and libceph to p_log-based primitives
  struct p_log, variants of warnf() et.al. taking that one instead
  teach logfc() to handle prefices, give it saner calling conventions
  get rid of cg_invalf()
  ...
</content>
</entry>
<entry>
<title>fs_parse: fold fs_parameter_desc/fs_parameter_spec</title>
<updated>2020-02-07T19:48:37+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2019-09-07T11:23:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/blackbird-op-linux/commit/?id=d7167b149943e38ad610191ecbb0800c78bbced9'/>
<id>urn:sha1:d7167b149943e38ad610191ecbb0800c78bbced9</id>
<content type='text'>
The former contains nothing but a pointer to an array of the latter...

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>fs_parser: remove fs_parameter_description name field</title>
<updated>2020-02-07T19:48:36+00:00</updated>
<author>
<name>Eric Sandeen</name>
<email>sandeen@sandeen.net</email>
</author>
<published>2019-12-06T16:45:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/blackbird-op-linux/commit/?id=96cafb9ccb153f6a82ff2c9bde68916d9d65501e'/>
<id>urn:sha1:96cafb9ccb153f6a82ff2c9bde68916d9d65501e</id>
<content type='text'>
Unused now.

Signed-off-by: Eric Sandeen &lt;sandeen@redhat.com&gt;
Acked-by: David Howells &lt;dhowells@redhat.com&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>x86/apic: Mask IOAPIC entries when disabling the local APIC</title>
<updated>2020-02-07T14:32:16+00:00</updated>
<author>
<name>Tony W Wang-oc</name>
<email>TonyWWang-oc@zhaoxin.com</email>
</author>
<published>2020-01-15T08:22:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/blackbird-op-linux/commit/?id=0f378d73d429d5f73fe2f00be4c9a15dbe9779ee'/>
<id>urn:sha1:0f378d73d429d5f73fe2f00be4c9a15dbe9779ee</id>
<content type='text'>
When a system suspends, the local APIC is disabled in the suspend sequence,
but the IOAPIC is left in the current state. This means unmasked interrupt
lines stay unmasked. This is usually the case for IOAPIC pin 9 to which the
ACPI interrupt is connected.

That means that in suspended state the IOAPIC can respond to an external
interrupt, e.g. the wakeup via keyboard/RTC/ACPI, but the interrupt message
cannot be handled by the disabled local APIC. As a consequence the Remote
IRR bit is set, but the local APIC does not send an EOI to acknowledge
it. This causes the affected interrupt line to become stale and the stale
Remote IRR bit will cause a hang when __synchronize_hardirq() is invoked
for that interrupt line.

To prevent this, mask all IOAPIC entries before disabling the local
APIC. The resume code already has the unmask operation inside.

[ tglx: Massaged changelog ]

Signed-off-by: Tony W Wang-oc &lt;TonyWWang-oc@zhaoxin.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: https://lore.kernel.org/r/1579076539-7267-1-git-send-email-TonyWWang-oc@zhaoxin.com

</content>
</entry>
<entry>
<title>Merge tag 'kvm-5.6-2' of git://git.kernel.org/pub/scm/virt/kvm/kvm</title>
<updated>2020-02-06T17:07:45+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2020-02-06T17:07:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/blackbird-op-linux/commit/?id=90568ecf561540fa330511e21fcd823b0c3829c6'/>
<id>urn:sha1:90568ecf561540fa330511e21fcd823b0c3829c6</id>
<content type='text'>
Pull more KVM updates from Paolo Bonzini:
 "s390:
   - fix register corruption
   - ENOTSUPP/EOPNOTSUPP mixed
   - reset cleanups/fixes
   - selftests

  x86:
   - Bug fixes and cleanups
   - AMD support for APIC virtualization even in combination with
     in-kernel PIT or IOAPIC.

  MIPS:
   - Compilation fix.

  Generic:
   - Fix refcount overflow for zero page"

* tag 'kvm-5.6-2' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (42 commits)
  KVM: vmx: delete meaningless vmx_decache_cr0_guest_bits() declaration
  KVM: x86: Mark CR4.UMIP as reserved based on associated CPUID bit
  x86: vmxfeatures: rename features for consistency with KVM and manual
  KVM: SVM: relax conditions for allowing MSR_IA32_SPEC_CTRL accesses
  KVM: x86: Fix perfctr WRMSR for running counters
  x86/kvm/hyper-v: don't allow to turn on unsupported VMX controls for nested guests
  x86/kvm/hyper-v: move VMX controls sanitization out of nested_enable_evmcs()
  kvm: mmu: Separate generating and setting mmio ptes
  kvm: mmu: Replace unsigned with unsigned int for PTE access
  KVM: nVMX: Remove stale comment from nested_vmx_load_cr3()
  KVM: MIPS: Fold comparecount_func() into comparecount_wakeup()
  KVM: MIPS: Fix a build error due to referencing not-yet-defined function
  x86/kvm: do not setup pv tlb flush when not paravirtualized
  KVM: fix overflow of zero page refcount with ksm running
  KVM: x86: Take a u64 when checking for a valid dr7 value
  KVM: x86: use raw clock values consistently
  KVM: x86: reorganize pvclock_gtod_data members
  KVM: nVMX: delete meaningless nested_vmx_run() declaration
  KVM: SVM: allow AVIC without split irqchip
  kvm: ioapic: Lazy update IOAPIC EOI
  ...
</content>
</entry>
<entry>
<title>x86/kvm: do not setup pv tlb flush when not paravirtualized</title>
<updated>2020-02-05T14:28:07+00:00</updated>
<author>
<name>Thadeu Lima de Souza Cascardo</name>
<email>cascardo@canonical.com</email>
</author>
<published>2020-01-31T15:56:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/blackbird-op-linux/commit/?id=64b38bd1906bb62a040b4e91815e56005db4784d'/>
<id>urn:sha1:64b38bd1906bb62a040b4e91815e56005db4784d</id>
<content type='text'>
kvm_setup_pv_tlb_flush will waste memory and print a misguiding message
when KVM paravirtualization is not available.

Intel SDM says that the when cpuid is used with EAX higher than the
maximum supported value for basic of extended function, the data for the
highest supported basic function will be returned.

So, in some systems, kvm_arch_para_features will return bogus data,
causing kvm_setup_pv_tlb_flush to detect support for pv tlb flush.

Testing for kvm_para_available will work as it checks for the hypervisor
signature.

Besides, when the "nopv" command line parameter is used, it should not
continue as well, as kvm_guest_init will no be called in that case.

Signed-off-by: Thadeu Lima de Souza Cascardo &lt;cascardo@canonical.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
</entry>
<entry>
<title>proc: convert everything to "struct proc_ops"</title>
<updated>2020-02-04T03:05:26+00:00</updated>
<author>
<name>Alexey Dobriyan</name>
<email>adobriyan@gmail.com</email>
</author>
<published>2020-02-04T01:37:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/blackbird-op-linux/commit/?id=97a32539b9568bb653683349e5a76d02ff3c3e2c'/>
<id>urn:sha1:97a32539b9568bb653683349e5a76d02ff3c3e2c</id>
<content type='text'>
The most notable change is DEFINE_SHOW_ATTRIBUTE macro split in
seq_file.h.

Conversion rule is:

	llseek		=&gt; proc_lseek
	unlocked_ioctl	=&gt; proc_ioctl

	xxx		=&gt; proc_xxx

	delete ".owner = THIS_MODULE" line

[akpm@linux-foundation.org: fix drivers/isdn/capi/kcapi_proc.c]
[sfr@canb.auug.org.au: fix kernel/sched/psi.c]
  Link: http://lkml.kernel.org/r/20200122180545.36222f50@canb.auug.org.au
Link: http://lkml.kernel.org/r/20191225172546.GB13378@avx2
Signed-off-by: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Signed-off-by: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>x86/apic/msi: Plug non-maskable MSI affinity race</title>
<updated>2020-02-01T08:31:47+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2020-01-31T14:26:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/blackbird-op-linux/commit/?id=6f1a4891a5928a5969c87fa5a584844c983ec823'/>
<id>urn:sha1:6f1a4891a5928a5969c87fa5a584844c983ec823</id>
<content type='text'>
Evan tracked down a subtle race between the update of the MSI message and
the device raising an interrupt internally on PCI devices which do not
support MSI masking. The update of the MSI message is non-atomic and
consists of either 2 or 3 sequential 32bit wide writes to the PCI config
space.

   - Write address low 32bits
   - Write address high 32bits (If supported by device)
   - Write data

When an interrupt is migrated then both address and data might change, so
the kernel attempts to mask the MSI interrupt first. But for MSI masking is
optional, so there exist devices which do not provide it. That means that
if the device raises an interrupt internally between the writes then a MSI
message is sent built from half updated state.

On x86 this can lead to spurious interrupts on the wrong interrupt
vector when the affinity setting changes both address and data. As a
consequence the device interrupt can be lost causing the device to
become stuck or malfunctioning.

Evan tried to handle that by disabling MSI accross an MSI message
update. That's not feasible because disabling MSI has issues on its own:

 If MSI is disabled the PCI device is routing an interrupt to the legacy
 INTx mechanism. The INTx delivery can be disabled, but the disablement is
 not working on all devices.

 Some devices lose interrupts when both MSI and INTx delivery are disabled.

Another way to solve this would be to enforce the allocation of the same
vector on all CPUs in the system for this kind of screwed devices. That
could be done, but it would bring back the vector space exhaustion problems
which got solved a few years ago.

Fortunately the high address (if supported by the device) is only relevant
when X2APIC is enabled which implies interrupt remapping. In the interrupt
remapping case the affinity setting is happening at the interrupt remapping
unit and the PCI MSI message is programmed only once when the PCI device is
initialized.

That makes it possible to solve it with a two step update:

  1) Target the MSI msg to the new vector on the current target CPU

  2) Target the MSI msg to the new vector on the new target CPU

In both cases writing the MSI message is only changing a single 32bit word
which prevents the issue of inconsistency.

After writing the final destination it is necessary to check whether the
device issued an interrupt while the intermediate state #1 (new vector,
current CPU) was in effect.

This is possible because the affinity change is always happening on the
current target CPU. The code runs with interrupts disabled, so the
interrupt can be detected by checking the IRR of the local APIC. If the
vector is pending in the IRR then the interrupt is retriggered on the new
target CPU by sending an IPI for the associated vector on the target CPU.

This can cause spurious interrupts on both the local and the new target
CPU.

 1) If the new vector is not in use on the local CPU and the device
    affected by the affinity change raised an interrupt during the
    transitional state (step #1 above) then interrupt entry code will
    ignore that spurious interrupt. The vector is marked so that the
    'No irq handler for vector' warning is supressed once.

 2) If the new vector is in use already on the local CPU then the IRR check
    might see an pending interrupt from the device which is using this
    vector. The IPI to the new target CPU will then invoke the handler of
    the device, which got the affinity change, even if that device did not
    issue an interrupt

 3) If the new vector is in use already on the local CPU and the device
    affected by the affinity change raised an interrupt during the
    transitional state (step #1 above) then the handler of the device which
    uses that vector on the local CPU will be invoked.

expose issues in device driver interrupt handlers which are not prepared to
handle a spurious interrupt correctly. This not a regression, it's just
exposing something which was already broken as spurious interrupts can
happen for a lot of reasons and all driver handlers need to be able to deal
with them.

Reported-by: Evan Green &lt;evgreen@chromium.org&gt;
Debugged-by: Evan Green &lt;evgreen@chromium.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Tested-by: Evan Green &lt;evgreen@chromium.org&gt;
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/87imkr4s7n.fsf@nanos.tec.linutronix.de
</content>
</entry>
<entry>
<title>Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2020-01-31T19:05:33+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2020-01-31T19:05:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/blackbird-op-linux/commit/?id=b70a2d6b29f7c5b621bf83f903f26fee5fe28efc'/>
<id>urn:sha1:b70a2d6b29f7c5b621bf83f903f26fee5fe28efc</id>
<content type='text'>
Pull x86 fixes from Ingo Molnar:
 "Misc fixes:

   - three fixes and a cleanup for the resctrl code

   - a HyperV fix

   - a fix to /proc/kcore contents in live debugging sessions

   - a fix for the x86 decoder opcode map"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/decoder: Add TEST opcode to Group3-2
  x86/resctrl: Clean up unused function parameter in mkdir path
  x86/resctrl: Fix a deadlock due to inaccurate reference
  x86/resctrl: Fix use-after-free due to inaccurate refcount of rdtgroup
  x86/resctrl: Fix use-after-free when deleting resource groups
  x86/hyper-v: Add "polling" bit to hv_synic_sint
  x86/crash: Define arch_crash_save_vmcoreinfo() if CONFIG_CRASH_CORE=y
</content>
</entry>
</feed>
