| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When support for bonding of RoCE devices was added, there was
necessarily a link between the RoCE device and the paired netdevice that
was part of the bond. If you remove the mlx4_en module, that paired
association is broken (the RoCE device is still present but the paired
netdevice has been released). We need to account for this in
is_upper_ndev_bond_master_filter() and filter out those links with a
broken pairing or else we later oops in netdev_next_upper_dev_rcu().
Fixes: 408f1242d940 ("IB/core: Delete lower netdevice default GID entries in bonding scenario")
Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Blocks creating a DEVX UMEM with the non applicable access flags
as of ODP, MW_BIND, etc.
Specifically when an ODP flag is used below WARN call trace is issued.
[ 2510.404131] RIP: 0010:__mlx5_ib_populate_pas+0x207/0x220 [mlx5_ib]
...
[ 2510.404143] Call Trace:
[ 2510.404150] ? __kmalloc_node+0x1b3/0x280
[ 2510.404156] ? _uverbs_alloc+0x63/0x90 [ib_uverbs]
[ 2510.404158] ? _uverbs_alloc+0x63/0x90 [ib_uverbs]
[ 2510.404162] mlx5_ib_populate_pas+0x53/0x60 [mlx5_ib]
[ 2510.404167] mlx5_ib_handler_MLX5_IB_METHOD_DEVX_UMEM_REG+0x273/0x3f0 [mlx5_ib]
Fixes: aeae94579caf ("IB/mlx5: Add DEVX support for memory registration")
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since any page fault may be interrupted by a MMU invalidation and implicit
leaf MR may be released during this process. The check for parent value
is unreliable condition for an implicit MR.
Use other condition that we can rely on to determine if MR is implicit.
Fixes: b4cfe447d47b ("IB/mlx5: Implement on demand paging by adding support for MMU notifiers")
Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When running with KASAN, the following trace is produced:
[ 62.535888]
==================================================================
[ 62.544930] BUG: KASAN: slab-out-of-bounds in
gut_hw_stats+0x122/0x230 [hfi1]
[ 62.553856] Write of size 8 at addr ffff88080e8d6330 by task
kworker/0:1/14
[ 62.565333] CPU: 0 PID: 14 Comm: kworker/0:1 Not tainted
4.19.0-test-build-kasan+ #8
[ 62.575087] Hardware name: Intel Corporation S2600KPR/S2600KPR, BIOS
SE5C610.86B.01.01.0019.101220160604 10/12/2016
[ 62.587951] Workqueue: events work_for_cpu_fn
[ 62.594050] Call Trace:
[ 62.598023] dump_stack+0xc6/0x14c
[ 62.603089] ? dump_stack_print_info.cold.1+0x2f/0x2f
[ 62.610041] ? kmsg_dump_rewind_nolock+0x59/0x59
[ 62.616615] ? get_hw_stats+0x122/0x230 [hfi1]
[ 62.622985] print_address_description+0x6c/0x23c
[ 62.629744] ? get_hw_stats+0x122/0x230 [hfi1]
[ 62.636108] kasan_report.cold.6+0x241/0x308
[ 62.642365] get_hw_stats+0x122/0x230 [hfi1]
[ 62.648703] ? hfi1_alloc_rn+0x40/0x40 [hfi1]
[ 62.655088] ? __kmalloc+0x110/0x240
[ 62.660695] ? hfi1_alloc_rn+0x40/0x40 [hfi1]
[ 62.667142] setup_hw_stats+0xd8/0x430 [ib_core]
[ 62.673972] ? show_hfi+0x50/0x50 [hfi1]
[ 62.680026] ib_device_register_sysfs+0x165/0x180 [ib_core]
[ 62.687995] ib_register_device+0x5a2/0xa10 [ib_core]
[ 62.695340] ? show_hfi+0x50/0x50 [hfi1]
[ 62.701421] ? ib_unregister_device+0x2e0/0x2e0 [ib_core]
[ 62.709222] ? __vmalloc_node_range+0x2d0/0x380
[ 62.716131] ? rvt_driver_mr_init+0x11f/0x2d0 [rdmavt]
[ 62.723735] ? vmalloc_node+0x5c/0x70
[ 62.729697] ? rvt_driver_mr_init+0x11f/0x2d0 [rdmavt]
[ 62.737347] ? rvt_driver_mr_init+0x1f5/0x2d0 [rdmavt]
[ 62.744998] ? __rvt_alloc_mr+0x110/0x110 [rdmavt]
[ 62.752315] ? rvt_rc_error+0x140/0x140 [rdmavt]
[ 62.759434] ? rvt_vma_open+0x30/0x30 [rdmavt]
[ 62.766364] ? mutex_unlock+0x1d/0x40
[ 62.772445] ? kmem_cache_create_usercopy+0x15d/0x230
[ 62.780115] rvt_register_device+0x1f6/0x360 [rdmavt]
[ 62.787823] ? rvt_get_port_immutable+0x180/0x180 [rdmavt]
[ 62.796058] ? __get_txreq+0x400/0x400 [hfi1]
[ 62.802969] ? memcpy+0x34/0x50
[ 62.808611] hfi1_register_ib_device+0xde6/0xeb0 [hfi1]
[ 62.816601] ? hfi1_get_npkeys+0x10/0x10 [hfi1]
[ 62.823760] ? hfi1_init+0x89f/0x9a0 [hfi1]
[ 62.830469] ? hfi1_setup_eagerbufs+0xad0/0xad0 [hfi1]
[ 62.838204] ? pcie_capability_clear_and_set_word+0xcd/0xe0
[ 62.846429] ? pcie_capability_read_word+0xd0/0xd0
[ 62.853791] ? hfi1_pcie_init+0x187/0x4b0 [hfi1]
[ 62.860958] init_one+0x67f/0xae0 [hfi1]
[ 62.867301] ? hfi1_init+0x9a0/0x9a0 [hfi1]
[ 62.873876] ? wait_woken+0x130/0x130
[ 62.879860] ? read_word_at_a_time+0xe/0x20
[ 62.886329] ? strscpy+0x14b/0x280
[ 62.891998] ? hfi1_init+0x9a0/0x9a0 [hfi1]
[ 62.898405] local_pci_probe+0x70/0xd0
[ 62.904295] ? pci_device_shutdown+0x90/0x90
[ 62.910833] work_for_cpu_fn+0x29/0x40
[ 62.916750] process_one_work+0x584/0x960
[ 62.922974] ? rcu_work_rcufn+0x40/0x40
[ 62.928991] ? __schedule+0x396/0xdc0
[ 62.934806] ? __sched_text_start+0x8/0x8
[ 62.941020] ? pick_next_task_fair+0x68b/0xc60
[ 62.947674] ? run_rebalance_domains+0x260/0x260
[ 62.954471] ? __list_add_valid+0x29/0xa0
[ 62.960607] ? move_linked_works+0x1c7/0x230
[ 62.967077] ?
trace_event_raw_event_workqueue_execute_start+0x140/0x140
[ 62.976248] ? mutex_lock+0xa6/0x100
[ 62.982029] ? __mutex_lock_slowpath+0x10/0x10
[ 62.988795] ? __switch_to+0x37a/0x710
[ 62.994731] worker_thread+0x62e/0x9d0
[ 63.000602] ? max_active_store+0xf0/0xf0
[ 63.006828] ? __switch_to_asm+0x40/0x70
[ 63.012932] ? __switch_to_asm+0x34/0x70
[ 63.019013] ? __switch_to_asm+0x40/0x70
[ 63.025042] ? __switch_to_asm+0x34/0x70
[ 63.031030] ? __switch_to_asm+0x40/0x70
[ 63.037006] ? __schedule+0x396/0xdc0
[ 63.042660] ? kmem_cache_alloc_trace+0xf3/0x1f0
[ 63.049323] ? kthread+0x59/0x1d0
[ 63.054594] ? ret_from_fork+0x35/0x40
[ 63.060257] ? __sched_text_start+0x8/0x8
[ 63.066212] ? schedule+0xcf/0x250
[ 63.071529] ? __wake_up_common+0x110/0x350
[ 63.077794] ? __schedule+0xdc0/0xdc0
[ 63.083348] ? wait_woken+0x130/0x130
[ 63.088963] ? finish_task_switch+0x1f1/0x520
[ 63.095258] ? kasan_unpoison_shadow+0x30/0x40
[ 63.101792] ? __init_waitqueue_head+0xa0/0xd0
[ 63.108183] ? replenish_dl_entity.cold.60+0x18/0x18
[ 63.115151] ? _raw_spin_lock_irqsave+0x25/0x50
[ 63.121754] ? max_active_store+0xf0/0xf0
[ 63.127753] kthread+0x1ae/0x1d0
[ 63.132894] ? kthread_bind+0x30/0x30
[ 63.138422] ret_from_fork+0x35/0x40
[ 63.146973] Allocated by task 14:
[ 63.152077] kasan_kmalloc+0xbf/0xe0
[ 63.157471] __kmalloc+0x110/0x240
[ 63.162804] init_cntrs+0x34d/0xdf0 [hfi1]
[ 63.168883] hfi1_init_dd+0x29a3/0x2f90 [hfi1]
[ 63.175244] init_one+0x551/0xae0 [hfi1]
[ 63.181065] local_pci_probe+0x70/0xd0
[ 63.186759] work_for_cpu_fn+0x29/0x40
[ 63.192310] process_one_work+0x584/0x960
[ 63.198163] worker_thread+0x62e/0x9d0
[ 63.203843] kthread+0x1ae/0x1d0
[ 63.208874] ret_from_fork+0x35/0x40
[ 63.217203] Freed by task 1:
[ 63.221844] __kasan_slab_free+0x12e/0x180
[ 63.227844] kfree+0x92/0x1a0
[ 63.232570] single_release+0x3a/0x60
[ 63.238024] __fput+0x1d9/0x480
[ 63.242911] task_work_run+0x139/0x190
[ 63.248440] exit_to_usermode_loop+0x191/0x1a0
[ 63.254814] do_syscall_64+0x301/0x330
[ 63.260283] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 63.270199] The buggy address belongs to the object at
ffff88080e8d5500
which belongs to the cache kmalloc-4096 of size 4096
[ 63.287247] The buggy address is located 3632 bytes inside of
4096-byte region [ffff88080e8d5500, ffff88080e8d6500)
[ 63.303564] The buggy address belongs to the page:
[ 63.310447] page:ffffea00203a3400 count:1 mapcount:0
mapping:ffff88081380e840 index:0x0 compound_mapcount: 0
[ 63.323102] flags: 0x2fffff80008100(slab|head)
[ 63.329775] raw: 002fffff80008100 0000000000000000 0000000100000001
ffff88081380e840
[ 63.340175] raw: 0000000000000000 0000000000070007 00000001ffffffff
0000000000000000
[ 63.350564] page dumped because: kasan: bad access detected
[ 63.361974] Memory state around the buggy address:
[ 63.369137] ffff88080e8d6200: 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00
[ 63.379082] ffff88080e8d6280: 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00
[ 63.389032] >ffff88080e8d6300: 00 00 00 00 00 00 fc fc fc fc fc fc fc
fc fc fc
[ 63.398944] ^
[ 63.406141] ffff88080e8d6380: fc fc fc fc fc fc fc fc fc fc fc fc fc
fc fc fc
[ 63.416109] ffff88080e8d6400: fc fc fc fc fc fc fc fc fc fc fc fc fc
fc fc fc
[ 63.426099]
==================================================================
The trace happens because get_hw_stats() assumes there is room in the
memory allocated in init_cntrs() to accommodate the driver counters.
Unfortunately, that routine only allocated space for the device
counters.
Fix by insuring the allocation has room for the additional driver
counters.
Cc: <Stable@vger.kernel.org> # v4.14+
Fixes: b7481944b06e9 ("IB/hfi1: Show statistics counters under IB stats interface")
Reviewed-by: Mike Marciniczyn <mike.marciniszyn@intel.com>
Reviewed-by: Mike Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Piotr Stankiewicz <piotr.stankiewicz@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A recent performance enhancement introduced a latency issue in the
HFI message path. The new algorithm removed a forced call send for
PIO messages and added a forced schedule event for messages larger
than the MTU.
For PIO, the schedule path can introduce thrashing that can
significantly impact the throughput for small messages.
If a message size is within the PIO threshold, always take the send
path.
Fixes: 0b79b27748cb ("IB/{hfi1, qib, rdmavt}: Schedule multi RC/UC packets instead of posting")
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Pagefaults occurred in non-ODP MR are completely valid events, so
initialize return variable to 0.
Fixes: 4d5422a309de ("IB/mlx5: Skip non-ODP MR when handling a page fault")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Memory windows are implemented with an indirect MKey, when a page fault
event comes for a MW Mkey we need to find the MR at the end of the list of
the indirect MKeys by iterating on all items from the first to the last.
The offset calculated during this process has to be zeroed after the first
iteration or the next iteration will start from a wrong address, resulting
incorrect ODP faulting behavior.
Fixes: db570d7deafb ("IB/mlx5: Add ODP support to MW")
Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The invalidate range was using PAGE_SIZE instead of the computed 'end',
and had the wrong transformation of page_index due the weird
construction. This can trigger during error unwind and would cause
malfunction.
Inline the code and correct the math.
Fixes: 403cd12e2cf7 ("IB/umem: Add contiguous ODP support")
Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It is possible that we call pagefault_single_data_segment() with a MKey
that belongs to a memory region which is not on demand (i.e. pinned
pages). This can happen if, for instance, a WQE that points to multiple
MRs where some of them are ODP MRs and some are not. In this case we
don't need to handle this MR in the ODP context besides reporting success.
Otherwise the code will call pagefault_mr() which will do to_ib_umem_odp()
on a non-ODP MR and thus access out of bounds.
Fixes: 7bdf65d411c1 ("IB/mlx5: Handle page faults")
Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Current hns driver assigned the first two PBL page addresses from previous
registered MR to the hardware when reregister MR changing the memory
locations occurred. This will lead to PBL addressing error as the PBL has
already been released. This patch fixes this wrong assignment by using the
page address from new allocated PBL.
Fixes: a2c80b7b4119 ("RDMA/hns: Add rereg mr support for hip08")
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
If for some reason we failed to query the mr status, we need to make sure
to provide sufficient information for an ambiguous error (guard error on
sector 0).
Fixes: 0a7a08ad6f5f ("IB/iser: Implement check_protection")
Cc: <stable@vger.kernel.org>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
rdmavt uses a crazy system that looses the type checking when assinging
functions to struct ib_device function pointers. Because of this the
signature to this function was not changed when the below commit revised
things.
Fix the signature so we are not calling a function pointer with a
mismatched signature.
Fixes: 477864c8fcd9 ("IB/core: Let create_ah return extended response to user")
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the firmware reports a connection width that is not 1x, 4x, 8x or 12x
it causes the driver to fail during initialization.
To prevent this failure every time a new width is introduced to the RDMA
stack, we will set a default 4x width for these widths which ar unknown to
the driver.
This is needed to allow to run old kernels with new firmware.
Cc: <stable@vger.kernel.org> # 4.1
Fixes: 1b5daf11b015 ("IB/mlx5: Avoid using the MAD_IFC command under ISSI > 0 mode")
Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Extended atomics are supported with RC and XRC QP types, but the commit
citied in the Fixes line added an unneeded check to
to_mlx5_access_flags. This broke XRC QPs.
The following ib_atomic_bw invocation over XRC reproduces the issue:
ib_atomic_bw -d mlx5_1 --connection=XRC --atomic_type=FETCH_AND_ADD
It is safe to remove such checks because the QP type was already checked
in ib_modify_qp_is_ok(), which was previously called from
mlx5_ib_modify_qp.
Fixes: a60109dc9a95 ("IB/mlx5: Add support for extended atomic operations")
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When bnxt_re_ib_reg returns failure, the device structure gets
freed. Driver tries to access the device pointer
after it is freed.
[ 4871.034744] Failed to register with netedev: 0xffffffa1
[ 4871.034765] infiniband (null): Failed to register with IB: 0xffffffea
[ 4871.046430] ==================================================================
[ 4871.046437] BUG: KASAN: use-after-free in bnxt_re_task+0x63/0x180 [bnxt_re]
[ 4871.046439] Write of size 4 at addr ffff880fa8406f48 by task kworker/u48:2/17813
[ 4871.046443] CPU: 20 PID: 17813 Comm: kworker/u48:2 Kdump: loaded Tainted: G B OE 4.20.0-rc1+ #42
[ 4871.046444] Hardware name: Dell Inc. PowerEdge R730/0599V5, BIOS 1.0.4 08/28/2014
[ 4871.046447] Workqueue: bnxt_re bnxt_re_task [bnxt_re]
[ 4871.046449] Call Trace:
[ 4871.046454] dump_stack+0x91/0xeb
[ 4871.046458] print_address_description+0x6a/0x2a0
[ 4871.046461] kasan_report+0x176/0x2d0
[ 4871.046463] ? bnxt_re_task+0x63/0x180 [bnxt_re]
[ 4871.046466] bnxt_re_task+0x63/0x180 [bnxt_re]
[ 4871.046470] process_one_work+0x216/0x5b0
[ 4871.046471] ? process_one_work+0x189/0x5b0
[ 4871.046475] worker_thread+0x4e/0x3d0
[ 4871.046479] kthread+0x10e/0x140
[ 4871.046480] ? process_one_work+0x5b0/0x5b0
[ 4871.046482] ? kthread_stop+0x220/0x220
[ 4871.046486] ret_from_fork+0x3a/0x50
[ 4871.046492] The buggy address belongs to the page:
[ 4871.046494] page:ffffea003ea10180 count:0 mapcount:0 mapping:0000000000000000 index:0x0
[ 4871.046495] flags: 0x57ffffc0000000()
[ 4871.046498] raw: 0057ffffc0000000 0000000000000000 ffffea003ea10188 0000000000000000
[ 4871.046500] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
[ 4871.046501] page dumped because: kasan: bad access detected
Avoid accessing the device structure once it is freed.
Fixes: 497158aa5f52 ("RDMA/bnxt_re: Fix the ib_reg failure cleanup")
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Driver doesn't release rtnl lock if registration with
L2 driver (bnxt_re_register_netdev) fais and this causes
hang while requesting for the next lock.
[ 371.635416] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 371.635417] kworker/u48:1 D 0 634 2 0x80000000
[ 371.635423] Workqueue: bnxt_re bnxt_re_task [bnxt_re]
[ 371.635424] Call Trace:
[ 371.635426] ? __schedule+0x36b/0xbd0
[ 371.635429] schedule+0x39/0x90
[ 371.635430] schedule_preempt_disabled+0x11/0x20
[ 371.635431] __mutex_lock+0x45b/0x9c0
[ 371.635433] ? __mutex_lock+0x16d/0x9c0
[ 371.635435] ? bnxt_re_ib_reg+0x2b/0xb30 [bnxt_re]
[ 371.635438] ? wake_up_klogd+0x37/0x40
[ 371.635442] bnxt_re_ib_reg+0x2b/0xb30 [bnxt_re]
[ 371.635447] bnxt_re_task+0xfd/0x180 [bnxt_re]
[ 371.635449] process_one_work+0x216/0x5b0
[ 371.635450] ? process_one_work+0x189/0x5b0
[ 371.635453] worker_thread+0x4e/0x3d0
[ 371.635455] kthread+0x10e/0x140
[ 371.635456] ? process_one_work+0x5b0/0x5b0
[ 371.635458] ? kthread_stop+0x220/0x220
[ 371.635460] ret_from_fork+0x3a/0x50
[ 371.635477] INFO: task NetworkManager:1228 blocked for more than 120 seconds.
[ 371.635478] Tainted: G B OE 4.20.0-rc1+ #42
[ 371.635479] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Release the rtnl_lock correctly in the failure path.
Fixes: de5c95d0f518 ("RDMA/bnxt_re: Fix system crash during RDMA resource initialization")
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently when MAC address is changed, regardless of the netdev reg_state,
GID entries are removed and added to reflect the new MAC address and new
default GID entries.
When a bonding device is used and the underlying PCI device is removed
several netdevice events are generated. Two events of the interest are
CHANGEADDR and UNREGISTER event on lower(slave) netdevice of the bond
netdevice.
Sometimes CHANGEADDR event is generated when netdev state is
UNREGISTERING (after UNREGISTER event is generated). In this scenario, GID
entries for default GIDs are added and never deleted because GID entries
are deleted only when netdev state is < UNREGISTERED.
This leads to non zero reference count on the netdevice. Due to this, PCI
device unbind operation is getting stuck.
To avoid it, when changing mac address, add GID entries only if netdev is
in REGISTERED state.
Fixes: 03db3a2d81e6 ("IB/core: Add RoCE GID table management")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, for IB_WR_LOCAL_INV WR, when the next fence is None, the
current fence will be SMALL instead of Normal Fence.
Without this patch krping doesn't work on CX-5 devices and throws
following error:
The error messages are from CX5 driver are: (from server side)
[ 710.434014] mlx5_0:dump_cqe:278:(pid 2712): dump error cqe
[ 710.434016] 00000000 00000000 00000000 00000000
[ 710.434016] 00000000 00000000 00000000 00000000
[ 710.434017] 00000000 00000000 00000000 00000000
[ 710.434018] 00000000 93003204 100000b8 000524d2
[ 710.434019] krping: cq completion failed with wr_id 0 status 4 opcode 128 vender_err 32
Fixed the logic to set the correct fence type.
Fixes: 6e8484c5cf07 ("RDMA/mlx5: set UMR wqe fence according to HCA cap")
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
callbacks"
Revert 5ff7091f5a2ca ("mm, mmu_notifier: annotate mmu notifiers with
blockable invalidate callbacks").
MMU_INVALIDATE_DOES_NOT_BLOCK flags was the only one used and it is no
longer needed since 93065ac753e4 ("mm, oom: distinguish blockable mode for
mmu notifiers"). We now have a full support for per range !blocking
behavior so we can drop the stop gap workaround which the per notifier
flag was used for.
Link: http://lkml.kernel.org/r/20180827112623.8992-4-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Pull rdma updates from Jason Gunthorpe:
"This has been a smaller cycle with many of the commits being smallish
code fixes and improvements across the drivers.
- Driver updates for bnxt_re, cxgb4, hfi1, hns, mlx5, nes, qedr, and
rxe
- Memory window support in hns
- mlx5 user API 'flow mutate/steering' allows accessing the full
packet mangling and matching machinery from user space
- Support inter-working with verbs API calls in the 'devx' mlx5 user
API, and provide options to use devx with less privilege
- Modernize the use of syfs and the device interface to use attribute
groups and cdev properly for uverbs, and clean up some of the core
code's device list management
- More progress on net namespaces for RDMA devices
- Consolidate driver BAR mmapping support into core code helpers and
rework how RDMA holds poitners to mm_struct for get_user_pages
cases
- First pass to use 'dev_name' instead of ib_device->name
- Device renaming for RDMA devices"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (242 commits)
IB/mlx5: Add support for extended atomic operations
RDMA/core: Fix comment for hw stats init for port == 0
RDMA/core: Refactor ib_register_device() function
RDMA/core: Fix unwinding flow in case of error to register device
ib_srp: Remove WARN_ON in srp_terminate_io()
IB/mlx5: Allow scatter to CQE without global signaled WRs
IB/mlx5: Verify that driver supports user flags
IB/mlx5: Support scatter to CQE for DC transport type
RDMA/drivers: Use core provided API for registering device attributes
RDMA/core: Allow existing drivers to set one sysfs group per device
IB/rxe: Remove unnecessary enum values
RDMA/umad: Use kernel API to allocate umad indexes
RDMA/uverbs: Use kernel API to allocate uverbs indexes
RDMA/core: Increase total number of RDMA ports across all devices
IB/mlx4: Add port and TID to MAD debug print
IB/mlx4: Enable debug print of SMPs
RDMA/core: Rename ports_parent to ports_kobj
RDMA/core: Do not expose unsupported counters
IB/mlx4: Refer to the device kobject instead of ports_parent
RDMA/nldev: Allow IB device rename through RDMA netlink
...
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Extended atomic operations cmp&swp and fetch&add is a Mellanox
feature extending the standard atomic operation to use, varied
operand sizes, as apposed to normal atomic operation that use
an 8 byte operand only.
Extended atomics allows masking the results and arguments.
This patch configures QP to support extended atomic operation
with the maximum size possible, as exposed by HCA capabilities.
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Reviewed-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When add_port() is done for port == 0, it indicates that ports hardware
counters initialization should be skipped. Reflect so in the comment.
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
ib_register_device() does several allocation and initialization
steps. Split it into smaller more readable functions for easy
review and maintenance.
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
If port pkey list initialization fails, free the port_immutable memory
during cleanup path. Currently it is missed out.
If cache setup fails, free the pkey list during cleanup path.
Fixes: d291f1a65 ("IB/core: Enforce PKey security on QPs")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The WARN_ON() is pointless as the rport is placed in SDEV_TRANSPORT_OFFLINE
at that time, so no new commands can be submitted via srp_queuecommand()
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.com>
Acked-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Requester scatter to CQE is restricted to QPs configured to signal
all WRs.
This patch adds ability to enable scatter to cqe (force enable)
in the requester without sig_all, for users who do not want all WRs
signaled but rather just the ones whose data found in the CQE.
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Reviewed-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Flags sent down from user might not be supported by
running driver.
This might lead to unwanted bugs.
To solve this, added macro to test for unsupported flags.
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Scatter to CQE is a HW offload that saves PCI writes by scattering the
payload to the CQE.
This patch extends already existing functionality to support DC
transport type.
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Reviewed-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Use rdma_set_device_sysfs_group() to register device attributes and
simplify the driver.
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Clang warns when an emumerated type is implicitly converted to another.
drivers/infiniband/sw/rxe/rxe.c:106:27: warning: implicit conversion
from enumeration type 'enum rxe_device_param' to different enumeration
type 'enum ib_atomic_cap' [-Wenum-conversion]
rxe->attr.atomic_cap = RXE_ATOMIC_CAP;
~ ^~~~~~~~~~~~~~
drivers/infiniband/sw/rxe/rxe.c:131:22: warning: implicit conversion
from enumeration type 'enum rxe_port_param' to different enumeration
type 'enum ib_port_state' [-Wenum-conversion]
port->attr.state = RXE_PORT_STATE;
~ ^~~~~~~~~~~~~~
drivers/infiniband/sw/rxe/rxe.c:132:24: warning: implicit conversion
from enumeration type 'enum rxe_port_param' to different enumeration
type 'enum ib_mtu' [-Wenum-conversion]
port->attr.max_mtu = RXE_PORT_MAX_MTU;
~ ^~~~~~~~~~~~~~~~
drivers/infiniband/sw/rxe/rxe.c:133:27: warning: implicit conversion
from enumeration type 'enum rxe_port_param' to different enumeration
type 'enum ib_mtu' [-Wenum-conversion]
port->attr.active_mtu = RXE_PORT_ACTIVE_MTU;
~ ^~~~~~~~~~~~~~~~~~~
drivers/infiniband/sw/rxe/rxe.c:151:24: warning: implicit conversion
from enumeration type 'enum rxe_port_param' to different enumeration
type 'enum ib_mtu' [-Wenum-conversion]
ib_mtu_enum_to_int(RXE_PORT_ACTIVE_MTU);
~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~~~~~~~~~
5 warnings generated.
Use the appropriate values from the expected enumerated type so no
conversion needs to happen then remove the unneeded definitions.
Reported-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
| |
| |
| |
| |
| |
| |
| | |
Replace custom code to allocate indexes to generic kernel API.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
| |
| |
| |
| |
| |
| |
| | |
Replace custom code to allocate indexes to generic kernel API.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
IDA adds overhead to store IDs bitmap with maximal value of IDA
can be upto 2099202 (IDA_MAX = 0x80000000U / IDA_BITMAP_BITS - 1).
However, there is no need to add such enormous number of devices
and it is enough for now to limit it to be 8192.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| | |
Add said information and make the debug print format consistent.
Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
IB Subnet Management Packets (SMPs) were excluded from debug prints.
Fixed by enabling print even on QP0 MADs.
Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Normally kobj objects have kobj suffix to reflect it.
Rename ports_parent to ports_kobj.
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
If the provider driver (such as rdma_rxe) doesn't support pma counters,
avoid exposing its directory similar to optional hw_counters directory.
If core fails to read the PMA counter, return an error so that user can
retry later if needed.
Fixes: 35c4cbb17811 ("IB/core: Create get_perf_mad function in sysfs.c")
Reported-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
iov sysfs tree is created under ib device at
/sys/class/infiniband/mlx4_0/iov.
And,
ibdev->ports_parent->parent = &ibdev->dev.
Therefore, refer to device's kobject directly instead of
indirect access to it.
Additionally, iov entries are created under device kobject and deleted
before device is removed. There is no need to hold additional reference
to device kobject in provider driver.
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Provide an option to rename IB device name through RDMA netlink and
limit it to users with ADMIN capability only.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
| |
| |
| |
| |
| |
| |
| | |
Generic implementation of IB device rename function.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The ucma users supply timeout in u32 format, it means that any number
with most significant bit set will be converted to negative value
by various rdma_*, cma_* and sa_query functions, which treat timeout
as int.
In the lowest level, the timeout is converted back to be unsigned long.
Remove this ambiguous conversion by updating all function signatures to
receive unsigned long.
Reported-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This patch changes the small number of functions to be aligned to kernel
coding style. It is needed to minimize the diffstat of the following
patch. It doesn't change any functionality.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
| |
| |
| |
| |
| |
| |
| | |
cma_resolve_iw_route() doesn't use timeout_ms parameter, so let's remove it.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Schedule MR cache work only after bucket was initialized.
Cc: <stable@vger.kernel.org> # 4.10
Fixes: 49780d42dfc9 ("IB/mlx5: Expose MR cache for mlx5_ib")
Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Add missing check for failure of cm_init_av_by_path
Fixes: e1444b5a163e ("IB/cm: Fix automatic path migration support")
Reported-by: Slava Shwartsman <slavash@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
IPCB should be cleared before icmp_send, since it may contain data from
previous layers and the data could be misinterpreted as ip header options,
which later caused the ihl to be set to an invalid value and resulted in
the following stack corruption:
[ 1083.031512] ib0: packet len 57824 (> 2048) too long to send, dropping
[ 1083.031843] ib0: packet len 37904 (> 2048) too long to send, dropping
[ 1083.032004] ib0: packet len 4040 (> 2048) too long to send, dropping
[ 1083.032253] ib0: packet len 63800 (> 2048) too long to send, dropping
[ 1083.032481] ib0: packet len 23960 (> 2048) too long to send, dropping
[ 1083.033149] ib0: packet len 63800 (> 2048) too long to send, dropping
[ 1083.033439] ib0: packet len 63800 (> 2048) too long to send, dropping
[ 1083.033700] ib0: packet len 63800 (> 2048) too long to send, dropping
[ 1083.034124] ib0: packet len 63800 (> 2048) too long to send, dropping
[ 1083.034387] ==================================================================
[ 1083.034602] BUG: KASAN: stack-out-of-bounds in __ip_options_echo+0xf08/0x1310
[ 1083.034798] Write of size 4 at addr ffff880353457c5f by task kworker/u16:0/7
[ 1083.034990]
[ 1083.035104] CPU: 7 PID: 7 Comm: kworker/u16:0 Tainted: G O 4.19.0-rc5+ #1
[ 1083.035316] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu2 04/01/2014
[ 1083.035573] Workqueue: ipoib_wq ipoib_cm_skb_reap [ib_ipoib]
[ 1083.035750] Call Trace:
[ 1083.035888] dump_stack+0x9a/0xeb
[ 1083.036031] print_address_description+0xe3/0x2e0
[ 1083.036213] kasan_report+0x18a/0x2e0
[ 1083.036356] ? __ip_options_echo+0xf08/0x1310
[ 1083.036522] __ip_options_echo+0xf08/0x1310
[ 1083.036688] icmp_send+0x7b9/0x1cd0
[ 1083.036843] ? icmp_route_lookup.constprop.9+0x1070/0x1070
[ 1083.037018] ? netif_schedule_queue+0x5/0x200
[ 1083.037180] ? debug_show_all_locks+0x310/0x310
[ 1083.037341] ? rcu_dynticks_curr_cpu_in_eqs+0x85/0x120
[ 1083.037519] ? debug_locks_off+0x11/0x80
[ 1083.037673] ? debug_check_no_obj_freed+0x207/0x4c6
[ 1083.037841] ? check_flags.part.27+0x450/0x450
[ 1083.037995] ? debug_check_no_obj_freed+0xc3/0x4c6
[ 1083.038169] ? debug_locks_off+0x11/0x80
[ 1083.038318] ? skb_dequeue+0x10e/0x1a0
[ 1083.038476] ? ipoib_cm_skb_reap+0x2b5/0x650 [ib_ipoib]
[ 1083.038642] ? netif_schedule_queue+0xa8/0x200
[ 1083.038820] ? ipoib_cm_skb_reap+0x544/0x650 [ib_ipoib]
[ 1083.038996] ipoib_cm_skb_reap+0x544/0x650 [ib_ipoib]
[ 1083.039174] process_one_work+0x912/0x1830
[ 1083.039336] ? wq_pool_ids_show+0x310/0x310
[ 1083.039491] ? lock_acquire+0x145/0x3a0
[ 1083.042312] worker_thread+0x87/0xbb0
[ 1083.045099] ? process_one_work+0x1830/0x1830
[ 1083.047865] kthread+0x322/0x3e0
[ 1083.050624] ? kthread_create_worker_on_cpu+0xc0/0xc0
[ 1083.053354] ret_from_fork+0x3a/0x50
For instance __ip_options_echo is failing to proceed with invalid srr and
optlen passed from another layer via IPCB
[ 762.139568] IPv4: __ip_options_echo rr=0 ts=0 srr=43 cipso=0
[ 762.139720] IPv4: ip_options_build: IPCB 00000000f3cd969e opt 000000002ccb3533
[ 762.139838] IPv4: __ip_options_echo in srr: optlen 197 soffset 84
[ 762.139852] IPv4: ip_options_build srr=0 is_frag=0 rr_needaddr=0 ts_needaddr=0 ts_needtime=0 rr=0 ts=0
[ 762.140269] ==================================================================
[ 762.140713] IPv4: __ip_options_echo rr=0 ts=0 srr=0 cipso=0
[ 762.141078] BUG: KASAN: stack-out-of-bounds in __ip_options_echo+0x12ec/0x1680
[ 762.141087] Write of size 4 at addr ffff880353457c7f by task kworker/u16:0/7
Signed-off-by: Denis Drozdov <denisd@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Nullify the resource task struct pointer to ensure that subsequent calls
won't try to release task_struct again.
------------[ cut here ]------------
ODEBUG: free active (active state 1) object type: rcu_head hint:
(null)
WARNING: CPU: 0 PID: 6048 at lib/debugobjects.c:329
debug_print_object+0x16a/0x210 lib/debugobjects.c:326
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 6048 Comm: syz-executor022 Not tainted
4.19.0-rc7-next-20181008+ #89
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x244/0x3ab lib/dump_stack.c:113
panic+0x238/0x4e7 kernel/panic.c:184
__warn.cold.8+0x163/0x1ba kernel/panic.c:536
report_bug+0x254/0x2d0 lib/bug.c:186
fixup_bug arch/x86/kernel/traps.c:178 [inline]
do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:271
do_invalid_op+0x36/0x40 arch/x86/kernel/traps.c:290
invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:969
RIP: 0010:debug_print_object+0x16a/0x210 lib/debugobjects.c:326
Code: 41 88 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 92 00 00 00 48 8b 14
dd
60 02 41 88 4c 89 fe 48 c7 c7 00 f8 40 88 e8 36 2f b4 fd <0f> 0b 83 05
a9
f4 5e 06 01 48 83 c4 18 5b 41 5c 41 5d 41 5e 41 5f
RSP: 0018:ffff8801d8c3eda8 EFLAGS: 00010086
RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff8164d235 RDI: 0000000000000005
RBP: ffff8801d8c3ede8 R08: ffff8801d70aa280 R09: ffffed003b5c3eda
R10: ffffed003b5c3eda R11: ffff8801dae1f6d7 R12: 0000000000000001
R13: ffffffff8939a760 R14: 0000000000000000 R15: ffffffff8840fca0
__debug_check_no_obj_freed lib/debugobjects.c:786 [inline]
debug_check_no_obj_freed+0x3ae/0x58d lib/debugobjects.c:818
kmem_cache_free+0x202/0x290 mm/slab.c:3759
free_task_struct kernel/fork.c:163 [inline]
free_task+0x16e/0x1f0 kernel/fork.c:457
__put_task_struct+0x2e6/0x620 kernel/fork.c:730
put_task_struct include/linux/sched/task.h:96 [inline]
finish_task_switch+0x66c/0x900 kernel/sched/core.c:2715
context_switch kernel/sched/core.c:2834 [inline]
__schedule+0x8d7/0x21d0 kernel/sched/core.c:3480
schedule+0xfe/0x460 kernel/sched/core.c:3524
freezable_schedule include/linux/freezer.h:172 [inline]
futex_wait_queue_me+0x3f9/0x840 kernel/futex.c:2530
futex_wait+0x45c/0xa50 kernel/futex.c:2645
do_futex+0x31a/0x26d0 kernel/futex.c:3528
__do_sys_futex kernel/futex.c:3589 [inline]
__se_sys_futex kernel/futex.c:3557 [inline]
__x64_sys_futex+0x472/0x6a0 kernel/futex.c:3557
do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x446549
Code: e8 2c b3 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
ff 0f 83 2b 09 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f3a998f5da8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: ffffffffffffffda RBX: 00000000006dbc38 RCX: 0000000000446549
RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00000000006dbc38
RBP: 00000000006dbc30 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006dbc3c
R13: 2f646e6162696e69 R14: 666e692f7665642f R15: 00000000006dbd2c
Kernel Offset: disabled
Reported-by: syzbot+71aff6ea121ffefc280f@syzkaller.appspotmail.com
Fixes: ed7a01fd3fd7 ("RDMA/restrack: Release task struct which was hold by CM_ID object")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
A user can provide a hint which will be attached to the packet and written
to the CQE on receive. This can be used as a way to offload operations
into the HW, for example parsing a packet which is a tunneled packet, and
if so, pass 0x1 as the hint. The software can use that hint to decapsulate
the packet and parse only the inner headers thus saving CPU cycles.
Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Remove double error check from create user RQ error flow.
Fixes: 79b20a6c3014 ("IB/mlx5: Add receive Work Queue verbs")
Signed-off-by: Gal Pressman <pressmangal@gmail.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Verify that the input DEVX object type matches the created object.
As the obj_id in the firmware is not globally unique the object type must
be considered upon checking for a valid object id.
Once both the type and the id match we know that the lock was taken on the
correct object by the uverbs layer.
Fixes: e662e14d801b ("IB/mlx5: Add DEVX support for modify and query commands")
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|