| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
If phy_power_on() fails in rk_gmac_powerup(), clocks are left enabled.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Peng Li says:
====================
net: hns: code optimizations & bugfixes for HNS driver
This patchset includes bugfixes and code optimizations for the HNS
ethernet controller driver
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When reading phy registers via Clause 45 MDIO protocol, after write
address operation, the driver use another write address operation, so
can not read the right value of any phy registers. This patch fixes it.
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The hns driver of earlier devices, when autoneg off, restart autoneg
will return -EINVAL, so make the hns driver for the latest devices
do the same.
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In hns enet driver, we use of_parse_handle() to get hold of the
device node related to "ae-handle" but we have missed to put
the node reference using of_node_put() after we are done using
the node. This patch fixes it.
Note:
This problem is stated in Link: https://lkml.org/lkml/2018/12/22/217
Fixes: 48189d6aaf1e ("net: hns: enet specifies a reference to dsaf")
Reported-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
Mellanox, mlx5 fixes 2019-01-25
This series introduces some fixes to mlx5 driver.
For more information please see tag log below.
Please pull and let me know if there is any problem.
For -stable v4.13
('net/mlx5e: Allow MAC invalidation while spoofchk is ON')
For -stable v4.18
('Revert "net/mlx5e: E-Switch, Initialize eswitch only if eswitch manager"')
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
It turns out that libvirt uses 0-vid as a default if no vlan was
set for the guest (which is the case for switchdev mode) and errs
if we disallow that:
error: Failed to start domain vm75
error: Cannot set interface MAC/vlanid to 6a:66:2d:48:92:c2/0 \
for ifname enp59s0f0 vf 0: Operation not supported
So allow this in order not to break existing systems.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reported-by: Maor Dickman <maord@mellanox.com>
Reviewed-by: Gavi Teitz <gavi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
With VF LAG commit 491c37e49b48 "net/mlx5e: In case of LAG, one switch
parent id is used for all representors", both uplinks and all the VFs
(on both of them) get the same switchdev id.
This cause the provisioning system method to identify the rep of a given
VF from the parent PF PCI device using switchev id and physical port
name to break, since VFm of PF0 will have the (id, name) as VFm of PF1.
To fix that, we align to use the framework agreed upstream and set by
nfp commit 168c478e107e "nfp: wire get_phys_port_name on representors":
$ cat /sys/class/net/eth4_*/phys_port_name
p0
pf0vf0
pf0vf1
Now, the names will be different, e.g. pf0vf0 vs. pf1vf0.
Fixes: 491c37e49b48 ("net/mlx5e: In case of LAG, one switch parent id is used for all representors")
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reported-by: Waleed Musa <waleedm@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Prior to this patch the driver prohibited spoof checking on invalid MAC.
Now the user can set this configuration if it wishes to.
This is required since libvirt might invalidate the VF Mac by setting it
to zero, while spoofcheck is ON.
Fixes: 1ab2068a4c66 ("net/mlx5: Implement vports admin state backup/restore")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The lock in qp_table might be taken from process context or from
interrupt context. This may lead to a deadlock unless it is taken with
IRQs disabled.
Discovered by lockdep
================================
WARNING: inconsistent lock state
4.20.0-rc6
--------------------------------
inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W}
python/12572 [HC1[1]:SC0[0]:HE0:SE1] takes:
00000000052a4df4 (&(&table->lock)->rlock#2){?.+.}, /0x50 [mlx5_core]
{HARDIRQ-ON-W} state was registered at:
_raw_spin_lock+0x33/0x70
mlx5_get_rsc+0x1a/0x50 [mlx5_core]
mlx5_ib_eqe_pf_action+0x493/0x1be0 [mlx5_ib]
process_one_work+0x90c/0x1820
worker_thread+0x87/0xbb0
kthread+0x320/0x3e0
ret_from_fork+0x24/0x30
irq event stamp: 103928
hardirqs last enabled at (103927): [] nk+0x1a/0x1c
hardirqs last disabled at (103928): [] unk+0x1a/0x1c
softirqs last enabled at (103924): [] tcp_sendmsg+0x31/0x40
softirqs last disabled at (103922): [] 80
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&(&table->lock)->rlock#2);
lock(&(&table->lock)->rlock#2);
*** DEADLOCK ***
Fixes: 032080ab43ac ("IB/mlx5: Lock QP during page fault handling")
Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
MLX5E_PFLAG_* definitions were changed from bitmask to enumerated
values. However, in mlx5e_open_rq(), the proper API (MLX5E_GET_PFLAG macro)
was not used to read the flag value of MLX5E_PFLAG_RX_NO_CSUM_COMPLETE.
Fixed it.
Fixes: 8ff57c18e9f6 ("net/mlx5e: Improve ethtool private-flags code structure")
Signed-off-by: Shay Agroskin <shayag@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This reverts commit 5f5991f36dce1e69dd8bd7495763eec2e28f08e7.
With the original commit, eswitch instance will not be initialized for
a function which is vport group manager but not eswitch manager such as
host PF on SmartNIC (BlueField) card. This will result in a kernel crash
when such a vport group manager is trying to access vports in its group.
E.g, PF vport manager (not eswitch manager) tries to configure the MAC
of its VF vport, a kernel trace will happen similar as bellow:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
...
RIP: 0010:mlx5_eswitch_get_vport_config+0xc/0x180 [mlx5_core]
...
Fixes: 5f5991f36dce ("net/mlx5e: E-Switch, Initialize eswitch only if eswitch manager")
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Reported-by: Yuval Avnery <yuvalav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When an internally generated frame is handled by rose_xmit(),
rose_route_frame() is called:
if (!rose_route_frame(skb, NULL)) {
dev_kfree_skb(skb);
stats->tx_errors++;
return NETDEV_TX_OK;
}
We have the same code sequence in Net/Rom where an internally generated
frame is handled by nr_xmit() calling nr_route_frame(skb, NULL).
However, in this function NULL argument is tested while it is not in
rose_route_frame().
Then kernel panic occurs later on when calling ax25cmp() with a NULL
ax25_cb argument as reported many times and recently with syzbot.
We need to test if ax25 is NULL before using it.
Testing:
Built kernel with CONFIG_ROSE=y.
Signed-off-by: Bernard Pidoux <f6bvp@free.fr>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Reported-by: syzbot+1a2c456a1ea08fa5b5f7@syzkaller.appspotmail.com
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Bernard Pidoux <f6bvp@free.fr>
Cc: linux-hams@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
If fill_level was not zero and status was not BUSY,
result of "tx_prod - tx_cons - inuse" might be zero.
Subtracting 1 unconditionally results invalid negative return value
on this case.
Make sure not to return an negative value.
Signed-off-by: Tomonori Sakita <tomonori.sakita@sord.co.jp>
Signed-off-by: Atsushi Nemoto <atsushi.nemoto@sord.co.jp>
Reviewed-by: Dalon L Westergreen <dalon.westergreen@linux.intel.com>
Acked-by: Thor Thayer <thor.thayer@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
sk_reset_timer() and sk_stop_timer() properly handle
sock refcnt for timer function. Switching to them
could fix a refcounting bug reported by syzbot.
Reported-and-tested-by: syzbot+defa700d16f1bd1b9a05@syzkaller.appspotmail.com
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-hams@vger.kernel.org
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:
====================
pull request (net): ipsec 2019-01-25
1) Several patches to fix the fallout from the recent
tree based policy lookup work. From Florian Westphal.
2) Fix VTI for IPCOMP for 'not compressed' IPCOMP packets.
We need an extra IPIP handler to process these packets
correctly. From Su Yanjun.
3) Fix validation of template and selector families for
MODE_ROUTEOPTIMIZATION with ipv4-in-ipv6 packets.
This can lead to a stack-out-of-bounds because
flowi4 struct is treated as flowi6 struct.
Fix from Florian Westphal.
4) Restore the default behaviour of the xfrm set-mark
in the output path. This was changed accidentally
when mark setting was extended to the input path.
From Benedict Wong.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Fixes 9b42c1f179a6, which changed the default route lookup behavior for
tunnel mode SAs in the outbound direction to use the skb mark, whereas
previously mark=0 was used if the output mark was unspecified. In
mark-based routing schemes such as Android’s, this change in default
behavior causes routing loops or lookup failures.
This patch restores the default behavior of using a 0 mark while still
incorporating the skb mark if the SET_MARK (and SET_MARK_MASK) is
specified.
Tested with additions to Android's kernel unit test suite:
https://android-review.googlesource.com/c/kernel/tests/+/860150
Fixes: 9b42c1f179a6 ("xfrm: Extend the output_mark to support input direction and masking")
Signed-off-by: Benedict Wong <benedictwong@google.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The check assumes that in transport mode, the first templates family
must match the address family of the policy selector.
Syzkaller managed to build a template using MODE_ROUTEOPTIMIZATION,
with ipv4-in-ipv6 chain, leading to following splat:
BUG: KASAN: stack-out-of-bounds in xfrm_state_find+0x1db/0x1854
Read of size 4 at addr ffff888063e57aa0 by task a.out/2050
xfrm_state_find+0x1db/0x1854
xfrm_tmpl_resolve+0x100/0x1d0
xfrm_resolve_and_create_bundle+0x108/0x1000 [..]
Problem is that addresses point into flowi4 struct, but xfrm_state_find
treats them as being ipv6 because it uses templ->encap_family is used
(AF_INET6 in case of reproducer) rather than family (AF_INET).
This patch inverts the logic: Enforce 'template family must match
selector' EXCEPT for tunnel and BEET mode.
In BEET and Tunnel mode, xfrm_tmpl_resolve_one will have remote/local
address pointers changed to point at the addresses found in the template,
rather than the flowi ones, so no oob read will occur.
Reported-by: 3ntr0py1337@gmail.com
Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Recently we run a network test over ipcomp virtual tunnel.We find that
if a ipv4 packet needs fragment, then the peer can't receive
it.
We deep into the code and find that when packet need fragment the smaller
fragment will be encapsulated by ipip not ipcomp. So when the ipip packet
goes into xfrm, it's skb->dev is not properly set. The ipv4 reassembly code
always set skb'dev to the last fragment's dev. After ipv4 defrag processing,
when the kernel rp_filter parameter is set, the skb will be drop by -EXDEV
error.
This patch adds compatible support for the ipip process in ipcomp virtual tunnel.
Signed-off-by: Su Yanjun <suyj.fnst@cn.fujitsu.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
With very small change to test script we can trigger softlockup due to
bogus assignment of 'p' (policy to be examined) on restart.
Previously the two to-be-merged nodes had same address/prefixlength pair,
so no erase/reinsert was necessary, we only had to append the list from
node a to b.
If prefix lengths are different, the node has to be deleted and re-inserted
into the tree, with the updated prefix length. This was broken; due to
bogus update to 'p' this loops forever.
Add a 'restart' label and use that instead.
While at it, don't perform the unneeded reinserts of the policies that
are already sorted into the 'new' node.
A previous patch in this series made xfrm_policy_inexact_list_reinsert()
use the relative position indicator to sort policies according to age in
case priorities are identical.
Fixes: 6ac098b2a9d30 ("xfrm: policy: add 2nd-level saddr trees for inexact policies")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
... and back to inexact tree.
Repeat ping test after each htresh change: lookup results must not change.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
"newpos" has wrong scope. It must be NULL on each iteration of the loop.
Otherwise, when policy is to be inserted at the start, we would instead
insert at point found by the previous loop-iteration instead.
Also, we need to unlink the policy before we reinsert it to the new node,
else we can get next-points-to-self loops.
Because policies are only ordered by priority it is irrelevant which policy
is "more recent" except when two policies have same priority.
(the more recent one is placed after the older one).
In these cases, we can use the ->pos id number to know which one is the
'older': the higher the id, the more recent the policy.
So we only need to unlink all policies from the node that is about to be
removed, and insert them to the replacement node.
Fixes: 9cf545ebd591da ("xfrm: policy: store inexact policies in a tree ordered by destination address")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
An xfrm hash rebuild has to reset the inexact policy list before the
policies get re-inserted: A change of hash thresholds will result in
policies to get moved from inexact tree to the policy hash table.
If the thresholds are increased again later, they get moved from hash
table to inexact tree.
We must unlink all policies from the inexact tree before re-insertion.
Otherwise 'migrate' may find policies that are in main hash table a
second time, when it searches the inexact lists.
Furthermore, re-insertion without deletion can cause elements ->next to
point back to itself, causing soft lockups or double-frees.
Reported-by: syzbot+9d971dd21eb26567036b@syzkaller.appspotmail.com
Fixes: 9cf545ebd591da ("xfrm: policy: store inexact policies in a tree ordered by destination address")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Hash rebuild will re-set all the inexact entries, then re-insert them.
Lookups that can occur in parallel will therefore not find any policies.
This was safe when lookups were still guarded by rwlock.
After rcu-ification, lookups check the hash_generation seqcount to detect
when a hash resize takes place. Hash rebuild missed the needed increment.
Hash resizes and hash rebuilds cannot occur in parallel (both acquire
hash_resize_mutex), so just increment xfrm_hash_generation, like resize.
Fixes: a7c44247f704e3 ("xfrm: policy: make xfrm_policy_lookup_bytype lockless")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This function was modeled on the 'exact' insert one, which did not use
the rcu variant either.
When I fixed the 'exact' insert I forgot to propagate this to my
development tree, so the inexact variant retained the bug.
Fixes: 9cf545ebd591d ("xfrm: policy: store inexact policies in a tree ordered by destination address")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The existing script lacks a policy pattern that triggers 'tree node
merges' in the kernel.
Consider adding policy affecting following subnet:
pol1: dst 10.0.0.0/22
pol2: dst 10.0.0.0/23 # adds to existing 10.0.0.0/22 node
-> no problems here. But now, lets consider reverse order:
pol1: dst 10.0.0.0/24
pol2: dst 10.0.0.0/23 # CANNOT add to existing node
When second policy gets added, the kernel must check that the new node
("10.0.0.0/23") doesn't overlap with any existing subnet.
Example:
dst 10.0.0.0/24
dst 10.0.0.1/24
dst 10.0.0.0/23
When the third policy gets added, the kernel must replace the nodes for
the 10.0.0.0/24 and 10.0.0.1/24 policies with a single one and must merge
all the subtrees/lists stored in those nodes into the new node.
The existing test cases only have overlaps with a single node, so no
merging takes place (we can always remove the 'old' node and replace
it with the new subnet prefix).
Add a few 'block policies' in a pattern that triggers this, with a priority
that will make kernel prefer the 'esp' rules.
Make sure the 'tunnel ping' tests still pass after they have been added.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|\ \ \
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Pull KVM fixes from Paolo Bonzini:
"Quite a few fixes for x86: nested virtualization save/restore, AMD
nested virtualization and virtual APIC, 32-bit fixes, an important fix
to restore operation on older processors, and a bunch of hyper-v
bugfixes. Several are marked stable.
There are also fixes for GCC warnings and for a GCC/objtool interaction"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: Mark expected switch fall-throughs
KVM: x86: fix TRACE_INCLUDE_PATH and remove -I. header search paths
KVM: selftests: check returned evmcs version range
x86/kvm/hyper-v: nested_enable_evmcs() sets vmcs_version incorrectly
KVM: VMX: Move vmx_vcpu_run()'s VM-Enter asm blob to a helper function
kvm: selftests: Fix region overlap check in kvm_util
kvm: vmx: fix some -Wmissing-prototypes warnings
KVM: nSVM: clear events pending from svm_complete_interrupts() when exiting to L1
svm: Fix AVIC incomplete IPI emulation
svm: Add warning message for AVIC IPI invalid target
KVM: x86: WARN_ONCE if sending a PV IPI returns a fatal error
KVM: x86: Fix PV IPIs for 32-bit KVM host
x86/kvm/hyper-v: recommend using eVMCS only when it is enabled
x86/kvm/hyper-v: don't recommend doing reset via synthetic MSR
kvm: x86/vmx: Use kzalloc for cached_vmcs12
KVM: VMX: Use the correct field var when clearing VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL
KVM: x86: Fix single-step debugging
x86/kvm/hyper-v: don't announce GUEST IDLE MSR support
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
In preparation to enabling -Wimplicit-fallthrough, mark switch
cases where we are expecting to fall through.
This patch fixes the following warnings:
arch/x86/kvm/lapic.c:1037:27: warning: this statement may fall through [-Wimplicit-fallthrough=]
arch/x86/kvm/lapic.c:1876:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
arch/x86/kvm/hyperv.c:1637:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
arch/x86/kvm/svm.c:4396:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
arch/x86/kvm/mmu.c:4372:36: warning: this statement may fall through [-Wimplicit-fallthrough=]
arch/x86/kvm/x86.c:3835:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
arch/x86/kvm/x86.c:7938:23: warning: this statement may fall through [-Wimplicit-fallthrough=]
arch/x86/kvm/vmx/vmx.c:2015:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
arch/x86/kvm/vmx/vmx.c:1773:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
Warning level 3 was used: -Wimplicit-fallthrough=3
This patch is part of the ongoing efforts to enabling -Wimplicit-fallthrough.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The header search path -I. in kernel Makefiles is very suspicious;
it allows the compiler to search for headers in the top of $(srctree),
where obviously no header file exists.
The reason of having -I. here is to make the incorrectly set
TRACE_INCLUDE_PATH working.
As the comment block in include/trace/define_trace.h says,
TRACE_INCLUDE_PATH should be a relative path to the define_trace.h
Fix the TRACE_INCLUDE_PATH, and remove the iffy include paths.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Check that KVM_CAP_HYPERV_ENLIGHTENED_VMCS returns correct version range.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Commit e2e871ab2f02 ("x86/kvm/hyper-v: Introduce nested_get_evmcs_version()
helper") broke EVMCS enablement: to set vmcs_version we now call
nested_get_evmcs_version() but this function checks
enlightened_vmcs_enabled flag which is not yet set so we end up returning
zero.
Fix the issue by re-arranging things in nested_enable_evmcs().
Fixes: e2e871ab2f02 ("x86/kvm/hyper-v: Introduce nested_get_evmcs_version() helper")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
...along with the function's STACK_FRAME_NON_STANDARD tag. Moving the
asm blob results in a significantly smaller amount of code that is
marked with STACK_FRAME_NON_STANDARD, which makes it far less likely
that gcc will split the function and trigger a spurious objtool warning.
As a bonus, removing STACK_FRAME_NON_STANDARD from vmx_vcpu_run() allows
the bulk of code to be properly checked by objtool.
Because %rbp is not loaded via VMCS fields, vmx_vcpu_run() must manually
save/restore the host's RBP and load the guest's RBP prior to calling
vmx_vmenter(). Modifying %rbp triggers objtool's stack validation code,
and so vmx_vcpu_run() is tagged with STACK_FRAME_NON_STANDARD since it's
impossible to avoid modifying %rbp.
Unfortunately, vmx_vcpu_run() is also a gigantic function that gcc will
split into separate functions, e.g. so that pieces of the function can
be inlined. Splitting the function means that the compiled Elf file
will contain one or more vmx_vcpu_run.part.* functions in addition to
a vmx_vcpu_run function. Depending on where the function is split,
objtool may warn about a "call without frame pointer save/setup" in
vmx_vcpu_run.part.* since objtool's stack validation looks for exact
names when whitelisting functions tagged with STACK_FRAME_NON_STANDARD.
Up until recently, the undesirable function splitting was effectively
blocked because vmx_vcpu_run() was tagged with __noclone. At the time,
__noclone had an unintended side effect that put vmx_vcpu_run() into a
separate optimization unit, which in turn prevented gcc from inlining
the function (or any of its own function calls) and thus eliminated gcc's
motivation to split the function. Removing the __noclone attribute
allowed gcc to optimize vmx_vcpu_run(), exposing the objtool warning.
Kudos to Qian Cai for root causing that the fnsplit optimization is what
caused objtool to complain.
Fixes: 453eafbe65f7 ("KVM: VMX: Move VM-Enter + VM-Exit handling to non-inline sub-routines")
Tested-by: Qian Cai <cai@lca.pw>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Fix a call to userspace_mem_region_find to conform to its spec of
taking an inclusive, inclusive range. It was previously being called
with an inclusive, exclusive range. Also remove a redundant region bounds
check in vm_userspace_mem_region_add. Region overlap checking is already
performed by the call to userspace_mem_region_find.
Tested: Compiled tools/testing/selftests/kvm with -static
Ran all resulting test binaries on an Intel Haswell test machine
All tests passed
Signed-off-by: Ben Gardon <bgardon@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
We get some warnings when building kernel with W=1:
arch/x86/kvm/vmx/vmx.c:426:5: warning: no previous prototype for ‘kvm_fill_hv_flush_list_func’ [-Wmissing-prototypes]
arch/x86/kvm/vmx/nested.c:58:6: warning: no previous prototype for ‘init_vmcs_shadow_fields’ [-Wmissing-prototypes]
Make them static to fix this.
Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
to L1
kvm-unit-tests' eventinj "NMI failing on IDT" test results in NMI being
delivered to the host (L1) when it's running nested. The problem seems to
be: svm_complete_interrupts() raises 'nmi_injected' flag but later we
decide to reflect EXIT_NPF to L1. The flag remains pending and we do NMI
injection upon entry so it got delivered to L1 instead of L2.
It seems that VMX code solves the same issue in prepare_vmcs12(), this was
introduced with code refactoring in commit 5f3d5799974b ("KVM: nVMX: Rework
event injection and recovery").
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
In case of incomplete IPI with invalid interrupt type, the current
SVM driver does not properly emulate the IPI, and fails to boot
FreeBSD guests with multiple vcpus when enabling AVIC.
Fix this by update APIC ICR high/low registers, which also
emulate sending the IPI.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Print warning message when IPI target ID is invalid due to one of
the following reasons:
* In logical mode: cluster > max_cluster (64)
* In physical mode: target > max_physical (512)
* Address is not present in the physical or logical ID tables
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
KVM hypercalls return a negative value error code in case of a fatal
error, e.g. when the hypercall isn't supported or was made with invalid
parameters. WARN_ONCE on fatal errors when sending PV IPIs as any such
error all but guarantees an SMP system will hang due to a missing IPI.
Fixes: aaffcfd1e82d ("KVM: X86: Implement PV IPIs in linux guest")
Cc: stable@vger.kernel.org
Cc: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The recognition of the KVM_HC_SEND_IPI hypercall was unintentionally
wrapped in "#ifdef CONFIG_X86_64", causing 32-bit KVM hosts to reject
any and all PV IPI requests despite advertising the feature. This
results in all KVM paravirtualized guests hanging during SMP boot due
to IPIs never being delivered.
Fixes: 4180bf1b655a ("KVM: X86: Implement "send IPI" hypercall")
Cc: stable@vger.kernel.org
Cc: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
We shouldn't probably be suggesting using Enlightened VMCS when it's not
enabled (not supported from guest's point of view). Hyper-V on KVM seems
to be fine either way but let's be consistent.
Fixes: 2bc39970e932 ("x86/kvm/hyper-v: Introduce KVM_GET_SUPPORTED_HV_CPUID")
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
System reset through synthetic MSR is not recommended neither by genuine
Hyper-V nor my QEMU.
Fixes: 2bc39970e932 ("x86/kvm/hyper-v: Introduce KVM_GET_SUPPORTED_HV_CPUID")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
This changes the allocation of cached_vmcs12 to use kzalloc instead of
kmalloc. This removes the information leak found by Syzkaller (see
Reported-by) in this case and prevents similar leaks from happening
based on cached_vmcs12.
It also changes vmx_get_nested_state to copy out the full 4k VMCS12_SIZE
in copy_to_user rather than only the size of the struct.
Tested: rebuilt against head, booted, and ran the syszkaller repro
https://syzkaller.appspot.com/text?tag=ReproC&x=174efca3400000 without
observing any problems.
Reported-by: syzbot+ded1696f6b50b615b630@syzkaller.appspotmail.com
Fixes: 8fcc4b5923af5de58b80b53a069453b135693304
Cc: stable@vger.kernel.org
Signed-off-by: Tom Roeder <tmroeder@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL
Fix a recently introduced bug that results in the wrong VMCS control
field being updated when applying a IA32_PERF_GLOBAL_CTRL errata.
Fixes: c73da3fcab43 ("KVM: VMX: Properly handle dynamic VM Entry/Exit controls")
Reported-by: Harald Arnesen <harald@skogtun.org>
Tested-by: Harald Arnesen <harald@skogtun.org>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The single-step debugging of KVM guests on x86 is broken: if we run
gdb 'stepi' command at the breakpoint when the guest interrupts are
enabled, RIP always jumps to native_apic_mem_write(). Then other
nasty effects follow.
Long investigation showed that on Jun 7, 2017 the
commit c8401dda2f0a00cd25c0 ("KVM: x86: fix singlestepping over syscall")
introduced the kvm_run.debug corruption: kvm_vcpu_do_singlestep() can
be called without X86_EFLAGS_TF set.
Let's fix it. Please consider that for -stable.
Signed-off-by: Alexander Popov <alex.popov@linux.com>
Cc: stable@vger.kernel.org
Fixes: c8401dda2f0a00cd25c0 ("KVM: x86: fix singlestepping over syscall")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
HV_X64_MSR_GUEST_IDLE_AVAILABLE appeared in kvm_vcpu_ioctl_get_hv_cpuid()
by mistake: it announces support for HV_X64_MSR_GUEST_IDLE (0x400000F0)
which we don't support in KVM (yet).
Fixes: 2bc39970e932 ("x86/kvm/hyper-v: Introduce KVM_GET_SUPPORTED_HV_CPUID")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|\ \ \ \
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Pull dma-mapping fix from Christoph Hellwig:
"Fix a xen-swiotlb regression on arm64"
* tag 'dma-mapping-5.0-2' of git://git.infradead.org/users/hch/dma-mapping:
arm64/xen: fix xen-swiotlb cache flushing
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Xen-swiotlb hooks into the arm/arm64 arch code through a copy of the DMA
DMA mapping operations stored in the struct device arch data.
Switching arm64 to use the direct calls for the merged DMA direct /
swiotlb code broke this scheme. Replace the indirect calls with
direct-calls in xen-swiotlb as well to fix this problem.
Fixes: 356da6d0cde3 ("dma-mapping: bypass indirect calls for dma-direct")
Reported-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
|
|\ \ \ \ \
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm fixes from Dan Williams:
"A fix for namespace label support for non-Intel NVDIMMs that implement
the ACPI standard label method.
This has apparently never worked and could wait for v5.1. However it
has enough visibility with hardware vendors [1] and distro bug
trackers [2], and low enough risk that I decided it should go in for
-rc4. The other fixups target the new, for v5.0, nvdimm security
functionality. The larger init path fixup closes a memory leak and a
potential userspace lockup due to missed notifications.
[1] https://github.com/pmem/ndctl/issues/78
[2] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1811785
These have all soaked in -next for a week with no reported issues.
Summary:
- Fix support for NVDIMMs that implement the ACPI standard label
methods.
- Fix error handling for security overwrite (memory leak / userspace
hang condition), and another one-line security cleanup"
* tag 'libnvdimm-fixes-5.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
acpi/nfit: Fix command-supported detection
acpi/nfit: Block function zero DSMs
libnvdimm/security: Require nvdimm_security_setup_events() to succeed
nfit_test: fix security state pull for nvdimm security nfit_test
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
The _DSM function number validation only happens to succeed when the
generic Linux command number translation corresponds with a
DSM-family-specific function number. This breaks NVDIMM-N
implementations that correctly implement _LSR, _LSW, and _LSI, but do
not happen to publish support for DSM function numbers 4, 5, and 6.
Recall that the support for _LS{I,R,W} family of methods results in the
DIMM being marked as supporting those command numbers at
acpi_nfit_register_dimms() time. The DSM function mask is only used for
ND_CMD_CALL support of non-NVDIMM_FAMILY_INTEL devices.
Fixes: 31eca76ba2fc ("nfit, libnvdimm: limited/whitelisted dimm command...")
Cc: <stable@vger.kernel.org>
Link: https://github.com/pmem/ndctl/issues/78
Reported-by: Sujith Pandel <sujith_pandel@dell.com>
Tested-by: Sujith Pandel <sujith_pandel@dell.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
In preparation for using function number 0 as an error value, prevent it
from being considered a valid function value by acpi_nfit_ctl().
Cc: <stable@vger.kernel.org>
Cc: stuart hayes <stuart.w.hayes@gmail.com>
Fixes: e02fb7264d8a ("nfit: add Microsoft NVDIMM DSM command set...")
Reported-by: Jeff Moyer <jmoyer@redhat.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|