summaryrefslogtreecommitdiffstats
path: root/drivers/infiniband/core
Commit message (Collapse)AuthorAgeFilesLines
...
* | | RDMA/mlx5: Fix MR npages calculation for IB_ACCESS_HUGETLBJason Gunthorpe2019-08-201-6/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When ODP is enabled with IB_ACCESS_HUGETLB then the required pages should be calculated based on the extent of the MR, which is rounded to the nearest huge page alignment. Fixes: d2183c6f1958 ("RDMA/umem: Move page_shift from ib_umem to ib_odp_umem") Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Link: https://lore.kernel.org/r/20190815083834.9245-5-leon@kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com>
* | | RDMA/core: Fix error code in stat_get_doit_qp()Dan Carpenter2019-08-121-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We need to set the error codes on these paths. Currently the only possible error code is -EMSGSIZE so that's what the patch uses. Fixes: 83c2c1fcbd08 ("RDMA/nldev: Allow get counter mode through RDMA netlink") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Link: https://lore.kernel.org/r/20190809101311.GA17867@mwanda Signed-off-by: Doug Ledford <dledford@redhat.com>
* | | RDMA/counter: Prevent QP counter binding if counters unsupportedMark Zhang2019-08-071-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In case of rdma_counter_init() fails, counter allocation and QP bind should not be allowed. Fixes: 413d3347503b ("RDMA/counter: Add set/clear per-port auto mode support") Fixes: 1bd8e0a9d0fd ("RDMA/counter: Allow manual mode configuration support") Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Link: https://lore.kernel.org/r/20190807101819.7581-1-leon@kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com>
* | | IB/mlx5: Fix implicit MR release flowYishai Hadas2019-08-071-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Once implicit MR is being called to be released by ib_umem_notifier_release() its leaves were marked as "dying". However, when dereg_mr()->mlx5_ib_free_implicit_mr()->mr_leaf_free() is called, it skips running the mr_leaf_free_action (i.e. umem_odp->work) when those leaves were marked as "dying". As such ib_umem_release() for the leaves won't be called and their MRs will be leaked as well. When an application exits/killed without calling dereg_mr we might hit the above flow. This fatal scenario is reported by WARN_ON() upon mlx5_ib_dealloc_ucontext() as ibcontext->per_mm_list is not empty, the call trace can be seen below. Originally the "dying" mark as part of ib_umem_notifier_release() was introduced to prevent pagefault_mr() from returning a success response once this happened. However, we already have today the completion mechanism so no need for that in those flows any more. Even in case a success response will be returned the firmware will not find the pages and an error will be returned in the following call as a released mm will cause ib_umem_odp_map_dma_pages() to permanently fail mmget_not_zero(). Fix the above issue by dropping the "dying" from the above flows. The other flows that are using "dying" are still needed it for their synchronization purposes. WARNING: CPU: 1 PID: 7218 at drivers/infiniband/hw/mlx5/main.c:2004 mlx5_ib_dealloc_ucontext+0x84/0x90 [mlx5_ib] CPU: 1 PID: 7218 Comm: ibv_rc_pingpong Tainted: G E 5.2.0-rc6+ #13 Call Trace: uverbs_destroy_ufile_hw+0xb5/0x120 [ib_uverbs] ib_uverbs_close+0x1f/0x80 [ib_uverbs] __fput+0xbe/0x250 task_work_run+0x88/0xa0 do_exit+0x2cb/0xc30 ? __fput+0x14b/0x250 do_group_exit+0x39/0xb0 get_signal+0x191/0x920 ? _raw_spin_unlock_bh+0xa/0x20 ? inet_csk_accept+0x229/0x2f0 do_signal+0x36/0x5e0 ? put_unused_fd+0x5b/0x70 ? __sys_accept4+0x1a6/0x1e0 ? inet_hash+0x35/0x40 ? release_sock+0x43/0x90 ? _raw_spin_unlock_bh+0xa/0x20 ? inet_listen+0x9f/0x120 exit_to_usermode_loop+0x5c/0xc6 do_syscall_64+0x182/0x1b0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: 81713d3788d2 ("IB/mlx5: Add implicit MR support") Link: https://lore.kernel.org/r/20190805083010.21777-1-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | IB/mad: Fix use-after-free in ib mad completion handlingJack Morgenstein2019-08-011-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We encountered a use-after-free bug when unloading the driver: [ 3562.116059] BUG: KASAN: use-after-free in ib_mad_post_receive_mads+0xddc/0xed0 [ib_core] [ 3562.117233] Read of size 4 at addr ffff8882ca5aa868 by task kworker/u13:2/23862 [ 3562.118385] [ 3562.119519] CPU: 2 PID: 23862 Comm: kworker/u13:2 Tainted: G OE 5.1.0-for-upstream-dbg-2019-05-19_16-44-30-13 #1 [ 3562.121806] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu2 04/01/2014 [ 3562.123075] Workqueue: ib-comp-unb-wq ib_cq_poll_work [ib_core] [ 3562.124383] Call Trace: [ 3562.125640] dump_stack+0x9a/0xeb [ 3562.126911] print_address_description+0xe3/0x2e0 [ 3562.128223] ? ib_mad_post_receive_mads+0xddc/0xed0 [ib_core] [ 3562.129545] __kasan_report+0x15c/0x1df [ 3562.130866] ? ib_mad_post_receive_mads+0xddc/0xed0 [ib_core] [ 3562.132174] kasan_report+0xe/0x20 [ 3562.133514] ib_mad_post_receive_mads+0xddc/0xed0 [ib_core] [ 3562.134835] ? find_mad_agent+0xa00/0xa00 [ib_core] [ 3562.136158] ? qlist_free_all+0x51/0xb0 [ 3562.137498] ? mlx4_ib_sqp_comp_worker+0x1970/0x1970 [mlx4_ib] [ 3562.138833] ? quarantine_reduce+0x1fa/0x270 [ 3562.140171] ? kasan_unpoison_shadow+0x30/0x40 [ 3562.141522] ib_mad_recv_done+0xdf6/0x3000 [ib_core] [ 3562.142880] ? _raw_spin_unlock_irqrestore+0x46/0x70 [ 3562.144277] ? ib_mad_send_done+0x1810/0x1810 [ib_core] [ 3562.145649] ? mlx4_ib_destroy_cq+0x2a0/0x2a0 [mlx4_ib] [ 3562.147008] ? _raw_spin_unlock_irqrestore+0x46/0x70 [ 3562.148380] ? debug_object_deactivate+0x2b9/0x4a0 [ 3562.149814] __ib_process_cq+0xe2/0x1d0 [ib_core] [ 3562.151195] ib_cq_poll_work+0x45/0xf0 [ib_core] [ 3562.152577] process_one_work+0x90c/0x1860 [ 3562.153959] ? pwq_dec_nr_in_flight+0x320/0x320 [ 3562.155320] worker_thread+0x87/0xbb0 [ 3562.156687] ? __kthread_parkme+0xb6/0x180 [ 3562.158058] ? process_one_work+0x1860/0x1860 [ 3562.159429] kthread+0x320/0x3e0 [ 3562.161391] ? kthread_park+0x120/0x120 [ 3562.162744] ret_from_fork+0x24/0x30 ... [ 3562.187615] Freed by task 31682: [ 3562.188602] save_stack+0x19/0x80 [ 3562.189586] __kasan_slab_free+0x11d/0x160 [ 3562.190571] kfree+0xf5/0x2f0 [ 3562.191552] ib_mad_port_close+0x200/0x380 [ib_core] [ 3562.192538] ib_mad_remove_device+0xf0/0x230 [ib_core] [ 3562.193538] remove_client_context+0xa6/0xe0 [ib_core] [ 3562.194514] disable_device+0x14e/0x260 [ib_core] [ 3562.195488] __ib_unregister_device+0x79/0x150 [ib_core] [ 3562.196462] ib_unregister_device+0x21/0x30 [ib_core] [ 3562.197439] mlx4_ib_remove+0x162/0x690 [mlx4_ib] [ 3562.198408] mlx4_remove_device+0x204/0x2c0 [mlx4_core] [ 3562.199381] mlx4_unregister_interface+0x49/0x1d0 [mlx4_core] [ 3562.200356] mlx4_ib_cleanup+0xc/0x1d [mlx4_ib] [ 3562.201329] __x64_sys_delete_module+0x2d2/0x400 [ 3562.202288] do_syscall_64+0x95/0x470 [ 3562.203277] entry_SYSCALL_64_after_hwframe+0x49/0xbe The problem was that the MAD PD was deallocated before the MAD CQ. There was completion work pending for the CQ when the PD got deallocated. When the mad completion handling reached procedure ib_mad_post_receive_mads(), we got a use-after-free bug in the following line of code in that procedure: sg_list.lkey = qp_info->port_priv->pd->local_dma_lkey; (the pd pointer in the above line is no longer valid, because the pd has been deallocated). We fix this by allocating the PD before the CQ in procedure ib_mad_port_open(), and deallocating the PD after freeing the CQ in procedure ib_mad_port_close(). Since the CQ completion work queue is flushed during ib_free_cq(), no completions will be pending for that CQ when the PD is later deallocated. Note that freeing the CQ before deallocating the PD is the practice in the ULPs. Fixes: 4be90bc60df4 ("IB/mad: Remove ib_get_dma_mr calls") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Link: https://lore.kernel.org/r/20190801121449.24973-1-leon@kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com>
* | | RDMA/restrack: Track driver QP types in resource trackerGal Pressman2019-08-011-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The check for QP type different than XRC has excluded driver QP types from the resource tracker. As a result, "rdma resource show" user command would not show opened driver QPs which does not reflect the real state of the system. Check QP type explicitly instead of assuming enum values/ordering. Fixes: 40909f664d27 ("RDMA/efa: Add EFA verbs implementation") Signed-off-by: Gal Pressman <galpress@amazon.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Link: https://lore.kernel.org/r/20190801104354.11417-1-galpress@amazon.com Signed-off-by: Doug Ledford <dledford@redhat.com>
* | | RDMA/devices: Remove the lock around remove_client_contextJason Gunthorpe2019-08-011-21/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Due to the complexity of client->remove() callbacks it is desirable to not hold any locks while calling them. Remove the last one by tracking only the highest client ID and running backwards from there over the xarray. Since the only purpose of that lock was to protect the linked list, we can drop the lock. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Link: https://lore.kernel.org/r/20190731081841.32345-3-leon@kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com>
* | | RDMA/devices: Do not deadlock during client removalJason Gunthorpe2019-08-011-13/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | lockdep reports: WARNING: possible circular locking dependency detected modprobe/302 is trying to acquire lock: 0000000007c8919c ((wq_completion)ib_cm){+.+.}, at: flush_workqueue+0xdf/0x990 but task is already holding lock: 000000002d3d2ca9 (&device->client_data_rwsem){++++}, at: remove_client_context+0x79/0xd0 [ib_core] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (&device->client_data_rwsem){++++}: down_read+0x3f/0x160 ib_get_net_dev_by_params+0xd5/0x200 [ib_core] cma_ib_req_handler+0x5f6/0x2090 [rdma_cm] cm_process_work+0x29/0x110 [ib_cm] cm_req_handler+0x10f5/0x1c00 [ib_cm] cm_work_handler+0x54c/0x311d [ib_cm] process_one_work+0x4aa/0xa30 worker_thread+0x62/0x5b0 kthread+0x1ca/0x1f0 ret_from_fork+0x24/0x30 -> #1 ((work_completion)(&(&work->work)->work)){+.+.}: process_one_work+0x45f/0xa30 worker_thread+0x62/0x5b0 kthread+0x1ca/0x1f0 ret_from_fork+0x24/0x30 -> #0 ((wq_completion)ib_cm){+.+.}: lock_acquire+0xc8/0x1d0 flush_workqueue+0x102/0x990 cm_remove_one+0x30e/0x3c0 [ib_cm] remove_client_context+0x94/0xd0 [ib_core] disable_device+0x10a/0x1f0 [ib_core] __ib_unregister_device+0x5a/0xe0 [ib_core] ib_unregister_device+0x21/0x30 [ib_core] mlx5_ib_stage_ib_reg_cleanup+0x9/0x10 [mlx5_ib] __mlx5_ib_remove+0x3d/0x70 [mlx5_ib] mlx5_ib_remove+0x12e/0x140 [mlx5_ib] mlx5_remove_device+0x144/0x150 [mlx5_core] mlx5_unregister_interface+0x3f/0xf0 [mlx5_core] mlx5_ib_cleanup+0x10/0x3a [mlx5_ib] __x64_sys_delete_module+0x227/0x350 do_syscall_64+0xc3/0x6a4 entry_SYSCALL_64_after_hwframe+0x49/0xbe Which is due to the read side of the client_data_rwsem being obtained recursively through a work queue flush during cm client removal. The lock is being held across the remove in remove_client_context() so that the function is a fence, once it returns the client is removed. This is required so that the two callers do not proceed with destruction until the client completes removal. Instead of using client_data_rwsem use the existing device unregistration refcount and add a similar client unregistration (client->uses) refcount. This will fence the two unregistration paths without holding any locks. Cc: <stable@vger.kernel.org> Fixes: 921eab1143aa ("RDMA/devices: Re-organize device.c locking") Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Link: https://lore.kernel.org/r/20190731081841.32345-2-leon@kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com>
* | | IB/core: Add mitigation for Spectre V1Luck, Tony2019-08-011-1/+5
| |/ |/| | | | | | | | | | | | | | | | | | | | | Some processors may mispredict an array bounds check and speculatively access memory that they should not. With a user supplied array index we like to play things safe by masking the value with the array size before it is used as an index. Signed-off-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20190731043957.GA1600@agluck-desk2.amr.corp.intel.com Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/counters: Always initialize the port counter objectParav Pandit2019-07-251-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Port counter objects should be initialized even if alloc_stats is unsupported, otherwise QP bind operations in user space can trigger a NULL pointer deference if they try to bind QP on RDMA device which doesn't support counters. Fixes: f34a55e497e8 ("RDMA/core: Get sum value of all counters when perform a sysfs stat read") Link: https://lore.kernel.org/r/20190723065733.4899-11-leon@kernel.org Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Mark Zhang <markz@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | IB/core: Fix querying total rdma statsParav Pandit2019-07-251-0/+3
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | rdma_counter_init() may fail for a device. In such case while calculating total sum, ignore NULL hstats. This fixes below observed call trace. BUG: kernel NULL pointer dereference, address: 00000000000000a0 PGD 8000001009b30067 P4D 8000001009b30067 PUD 10549c9067 PMD 0 Oops: 0000 [#1] SMP PTI CPU: 55 PID: 20887 Comm: cat Kdump: loaded Not tainted 5.2.0-rc6-jdc+ #13 RIP: 0010:rdma_counter_get_hwstat_value+0xf2/0x150 [ib_core] Call Trace: show_hw_stats+0x5e/0x130 [ib_core] dev_attr_show+0x15/0x50 sysfs_kf_seq_show+0xc6/0x1a0 seq_read+0x132/0x370 vfs_read+0x89/0x140 ksys_read+0x5c/0xd0 do_syscall_64+0x5a/0x240 entry_SYSCALL_64_after_hwframe+0x49/0xbe Fixes: f34a55e497e8 ("RDMA/core: Get sum value of all counters when perform a sysfs stat read") Link: https://lore.kernel.org/r/20190723065733.4899-10-leon@kernel.org Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Mark Zhang <markz@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds2019-07-1523-1749/+2117
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull rdma updates from Jason Gunthorpe: "A smaller cycle this time. Notably we see another new driver, 'Soft iWarp', and the deletion of an ancient unused driver for nes. - Revise and simplify the signature offload RDMA MR APIs - More progress on hoisting object allocation boiler plate code out of the drivers - Driver bug fixes and revisions for hns, hfi1, efa, cxgb4, qib, i40iw - Tree wide cleanups: struct_size, put_user_page, xarray, rst doc conversion - Removal of obsolete ib_ucm chardev and nes driver - netlink based discovery of chardevs and autoloading of the modules providing them - Move more of the rdamvt/hfi1 uapi to include/uapi/rdma - New driver 'siw' for software based iWarp running on top of netdev, much like rxe's software RoCE. - mlx5 feature to report events in their raw devx format to userspace - Expose per-object counters through rdma tool - Adaptive interrupt moderation for RDMA (DIM), sharing the DIM core from netdev" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (194 commits) RMDA/siw: Require a 64 bit arch RDMA/siw: Mark expected switch fall-throughs RDMA/core: Fix -Wunused-const-variable warnings rdma/siw: Remove set but not used variable 's' rdma/siw: Add missing dependencies on LIBCRC32C and DMA_VIRT_OPS RDMA/siw: Add missing rtnl_lock around access to ifa rdma/siw: Use proper enumerated type in map_cqe_status RDMA/siw: Remove unnecessary kthread create/destroy printouts IB/rdmavt: Fix variable shadowing issue in rvt_create_cq RDMA/core: Fix race when resolving IP address RDMA/core: Make rdma_counter.h compile stand alone IB/core: Work on the caller socket net namespace in nldev_newlink() RDMA/rxe: Fill in wc byte_len with IB_WC_RECV_RDMA_WITH_IMM RDMA/mlx5: Set RDMA DIM to be enabled by default RDMA/nldev: Added configuration of RDMA dynamic interrupt moderation to netlink RDMA/core: Provide RDMA DIM support for ULPs linux/dim: Implement RDMA adaptive moderation (DIM) IB/mlx5: Report correctly tag matching rendezvous capability docs: infiniband: add it to the driver-api bookset IB/mlx5: Implement VHCA tunnel mechanism in DEVX ...
| * RDMA/core: Fix -Wunused-const-variable warningsQian Cai2019-07-111-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The commit below introduced a few compilation warnings. In file included from ./include/rdma/ib_verbs.h:64, from ./include/linux/mlx5/device.h:37, from ./include/linux/mlx5/driver.h:51, from drivers/net/ethernet/mellanox/mlx5/core/uar.c:36: ./include/linux/dim.h:378:1: warning: 'rdma_dim_prof' defined but not used [-Wunused-const-variable=] rdma_dim_prof[RDMA_DIM_PARAMS_NUM_PROFILES] = { ^~~~~~~~~~~~~ In file included from ./include/rdma/ib_verbs.h:64, from ./include/linux/mlx5/device.h:37, from ./include/linux/mlx5/driver.h:51, from drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c:37: ./include/linux/dim.h:378:1: warning: 'rdma_dim_prof' defined but not used [-Wunused-const-variable=] rdma_dim_prof[RDMA_DIM_PARAMS_NUM_PROFILES] = { ^~~~~~~~~~~~~ Since only ib_cq_rdma_dim_work() in drivers/infiniband/core/cq.c uses it, just move the definition over there. Fixes: f4915455dcf0 ("linux/dim: Implement RDMA adaptive moderation (DIM)") Signed-off-by: Qian Cai <cai@lca.pw> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * RDMA/core: Fix race when resolving IP addressDag Moxnes2019-07-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use the neighbour lock when copying the MAC address from the neighbour data struct in dst_fetch_ha. When not using the lock, it is possible for the function to race with neigh_update(), causing it to copy an torn MAC address: rdma_resolve_addr() rdma_resolve_ip() addr_resolve() addr_resolve_neigh() fetch_ha() dst_fetch_ha() memcpy(dev_addr->dst_dev_addr, n->ha, MAX_ADDR_LEN) and net_ioctl() arp_ioctl() arp_rec_delete() arp_invalidate() neigh_update() __neigh_update() memcpy(&neigh->ha, lladdr, dev->addr_len) It is possible to provoke this error by calling rdma_resolve_addr() in a tight loop, while deleting the corresponding ARP entry in another tight loop. Fixes: 51d45974515c ("infiniband: addr: Consolidate code to fetch neighbour hardware address from dst.") Signed-off-by: Dag Moxnes <dag.moxnes@oracle.com> Signed-off-by: HĂĄkon Bugge <haakon.bugge@oracle.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * IB/core: Work on the caller socket net namespace in nldev_newlink()Parav Pandit2019-07-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | While creating new RDMA devices based on netdevice name, consider the net namespace of the caller skb's socket similar to rest of the doit() callbacks and nldev_dellink() which deletes the RDMA device created using nldev_newlink(). Fixes: 3856ec4b93c94 ("RDMA/core: Add RDMA_NLDEV_CMD_NEWLINK/DELLINK support") Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * RDMA/nldev: Added configuration of RDMA dynamic interrupt moderation to netlinkYamin Friedman2019-07-083-0/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Added parameter in ib_device for enabling dynamic interrupt moderation so that it can be configured in userspace using rdma tool. In order to set adaptive-moderation for an ib device the command is: rdma dev set [DEV] adaptive-moderation [on|off] Please set on/off. rdma dev show 0: mlx5_0: node_type ca fw 16.26.0055 node_guid 248a:0703:00a5:29d0 sys_image_guid 248a:0703:00a5:29d0 adaptive-moderation on rdma resource show cq dev mlx5_0 cqn 0 cqe 1023 users 4 poll-ctx UNBOUND_WORKQUEUE adaptive-moderation off comm [ib_core] Signed-off-by: Yamin Friedman <yaminf@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * RDMA/core: Provide RDMA DIM support for ULPsYamin Friedman2019-07-081-0/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Added the interface in the infiniband driver that applies the rdma_dim adaptive moderation. There is now a special function for allocating an ib_cq that uses rdma_dim. Performance improvement (ConnectX-5 100GbE, x86) running FIO benchmark over NVMf between two equal end-hosts with 56 cores across a Mellanox switch using null_blk device: READS without DIM: blk size | BW | IOPS | 99th percentile latency | 99.99th latency 512B | 3.8GiB/s | 7.7M | 1401 usec | 2442 usec 4k | 7.0GiB/s | 1.8M | 4817 usec | 6587 usec 64k | 10.7GiB/s| 175k | 9896 usec | 10028 usec IO WRITES without DIM: blk size | BW | IOPS | 99th percentile latency | 99.99th latency 512B | 3.6GiB/s | 7.5M | 1434 usec | 2474 usec 4k | 6.3GiB/s | 1.6M | 938 usec | 1221 usec 64k | 10.7GiB/s| 175k | 8979 usec | 12780 usec IO READS with DIM: blk size | BW | IOPS | 99th percentile latency | 99.99th latency 512B | 4GiB/s | 8.2M | 816 usec | 889 usec 4k | 10.1GiB/s| 2.65M| 3359 usec | 5080 usec 64k | 10.7GiB/s| 175k | 9896 usec | 10028 usec IO WRITES with DIM: blk size | BW | IOPS | 99th percentile latency | 99.99th latency 512B | 3.9GiB/s | 8.1M | 799 usec | 922 usec 4k | 9.6GiB/s | 2.5M | 717 usec | 1004 usec 64k | 10.7GiB/s| 176k | 8586 usec | 12256 usec The rdma_dim algorithm was designed to measure the effectiveness of moderation on the flow in a general way and thus should be appropriate for all RDMA storage protocols. rdma_dim is configured to be the default option based on performance improvement seen after extensive tests. Signed-off-by: Yamin Friedman <yaminf@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * RDMA/nldev: Allow get default counter statistics through RDMA netlinkMark Zhang2019-07-052-1/+103
| | | | | | | | | | | | | | | | | | This patch adds the ability to return the hwstats of per-port default counters (which can also be queried through sysfs nodes). Signed-off-by: Mark Zhang <markz@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * RDMA/nldev: Allow get counter mode through RDMA netlinkMark Zhang2019-07-052-1/+78
| | | | | | | | | | | | | | | | | | Provide an option to get current counter mode through RDMA netlink. Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * RDMA/nldev: Allow counter manual mode configration through RDMA netlinkMark Zhang2019-07-051-13/+98
| | | | | | | | | | | | | | | | | | | | Provide an option to allow users to manually bind a qp with a counter through RDMA netlink. Limit it to users with ADMIN capability only. Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * RDMA/counter: Allow manual mode configuration supportMark Zhang2019-07-051-3/+216
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In manual mode a QP is bound to a counter manually. If counter is not specified then a new one will be allocated. Manual mode is enabled when user binds a QP, and disabled when the last manually bound QP is unbound. When auto-mode is turned off and there are counters left, manual mode is enabled so that the user is able to access these counters. Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * RDMA/core: Get sum value of all counters when perform a sysfs stat readMark Zhang2019-07-052-3/+96
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Since a QP can only be bound to one counter, then if it is bound to a separate counter, for backward compatibility purpose, the statistic value must be: * stat of default counter + stat of all running allocated counters + stat of all deallocated counters (history stats) Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * RDMA/netlink: Implement counter dumpit calbackMark Zhang2019-07-053-1/+240
| | | | | | | | | | | | | | | | | | | | This patch adds the ability to return all available counters together with their properties and hwstats. Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * RDMA/nldev: Allow counter auto mode configration through RDMA netlinkMark Zhang2019-07-051-0/+78
| | | | | | | | | | | | | | | | | | | | Provide an option to enable/disable per-port counter auto mode through RDMA netlink. Limit it to users with ADMIN capability only. Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * RDMA/counter: Add "auto" configuration mode supportMark Zhang2019-07-053-0/+233
| | | | | | | | | | | | | | | | | | | | | | | | | | In auto mode all QPs belong to one category are bind automatically to a single counter set. Currently only "qp type" is supported. In this mode the qp counter is set in RST2INIT modification, and when a qp is destroyed the counter is unbound. Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * RDMA/counter: Add set/clear per-port auto mode supportMark Zhang2019-07-053-1/+80
| | | | | | | | | | | | | | | | | | Add an API to support set/clear per-port auto mode. Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * RDMA/restrack: Make is_visible_in_pid_ns() as an APIMark Zhang2019-07-053-13/+16
| | | | | | | | | | | | | | | | | | | | Remove is_visible_in_pid_ns() from nldev.c and make it as a restrack API, so that it can be taken advantage by other parts like counter. Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * RDMA/restrack: Add an API to attach a task to a resourceMark Zhang2019-07-052-0/+16
| | | | | | | | | | | | | | | | | | | | Add rdma_restrack_attach_task() which is able to attach a task other then "current" to a resource. Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * RDMA/restrack: Introduce statistic counterMark Zhang2019-07-051-5/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | Introduce statistic counter as a new resource. It allows a user to monitor specific objects (e.g., QPs) by binding to a counter. In some cases a user counter resource is created with task other then "current", because its creation is done as part of rdmatool call. Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * RDMA/uverbs: remove redundant assignment to variable retColin Ian King2019-07-041-1/+1
| | | | | | | | | | | | | | | | | | | | | | The variable ret is being initialized with a value that is never read and it is being updated later with a new value. The initialization is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * Merge tag 'v5.2-rc6' into rdma.git for-nextJason Gunthorpe2019-06-289-61/+64
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For dependencies in next patches. Resolve conflicts: - Use uverbs_get_cleared_udata() with new cq allocation flow - Continue to delete nes despite SPDX conflict - Resolve list appends in mlx5_command_str() - Use u16 for vport_rule stuff - Resolve list appends in struct ib_client Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | RDMA/netlink: Audit policy settings for netlink attributesDoug Ledford2019-06-251-13/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For all string attributes for which we don't currently accept the element as input, we only use it as output, set the string length to RDMA_NLDEV_ATTR_EMPTY_STRING which is defined as 1. That way we will only accept a null string for that element. This will prevent someone from writing a new input routine that uses the element without also updating the policy to have a valid value. Also while there, make sure the existing entries that are valid have the correct policy, if not, correct the policy. Remove unnecessary checks for nla_strlcpy() overflow once the policy has been set correctly. Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | docs: infiniband: convert docs to ReST and rename to *.rstMauro Carvalho Chehab2019-06-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The InfiniBand docs are plain text with no markups. So, all we needed to do were to add the title markups and some markup sequences in order to properly parse tables, lists and literal blocks. At its new index.rst, let's add a :orphan: while this is not linked to the main index.rst file, in order to avoid build warnings. Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | RDMA/rw: Use IB_WR_REG_MR_INTEGRITY for PI handoverIsrael Rukshin2019-06-241-97/+70
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace the old signature handover API with the new one. The new API simplifes PI handover code complexity for ULPs and improve performance. For RW API it will reduce the maximum number of work requests per task and the need of dealing with multiple MRs (and their registrations and invalidations) per task. All the mappings and registration of the data and the protection buffers is done by the LLD using a single WR and a special MR type (IB_MR_TYPE_INTEGRITY) for the PI handover operation. The setup of the tested benchmark (using iSER ULP): - 2 servers with 24 cores (1 initiator and 1 target) - ConnectX-4/ConnectX-5 adapters - 24 target sessions with 1 LUN each - ramdisk backstore - PI active Performance results running fio (24 jobs, 128 iodepth) using write_generate=1 and read_verify=1 (w/w.o patch): bs IOPS(read) IOPS(write) ---- ---------- ---------- 512 1243.3K/1182.3K 1725.1K/1680.2K 4k 571233/528835 743293/748259 32k 72388/71086 71789/93573 Using write_generate=0 and read_verify=0 (w/w.o patch): bs IOPS(read) IOPS(write) ---- ---------- ---------- 512 1572.1K/1427.2K 1823.5K/1724.3K 4k 921992/916194 753772/768267 32k 75052/73960 73180/95484 There is a performance degradation when writing big block sizes. Degradation is caused by the complexity of combining multiple indirections and perform RDMA READ operation from it. This will be fixed in the following patches by reducing the indirections if possible. Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | RDMA/rw: Introduce rdma_rw_inv_key helperIsrael Rukshin2019-06-241-8/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a preparation for adding new signature API to the rw-API. Signed-off-by: Israel Rukshin <israelr@mellanox.com> Suggested-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | RDMA/core: Validate integrity handover device capMax Gurtovoy2019-06-241-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Protect the case that a ULP tries to allocate a QP with signature enabled flag while the LLD doesn't support this feature. While we're here, also move integrity_en attribute from mlx5_qp to ib_qp as a preparation for adding new integrity API to the rw-API (that is part of ib_core module). Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | RDMA/core: Rename signature qp create flag and signature device capabilityIsrael Rukshin2019-06-241-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rename IB_QP_CREATE_SIGNATURE_EN to IB_QP_CREATE_INTEGRITY_EN and IB_DEVICE_SIGNATURE_HANDOVER to IB_DEVICE_INTEGRITY_HANDOVER. Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | RDMA/core: Add an integrity MR pool supportIsrael Rukshin2019-06-242-4/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a preparation for adding new signature API to the rw-API. Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | RDMA/core: Add signature attrs element for ib_mr structureMax Gurtovoy2019-06-242-1/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This element will describe the needed characteristics for the signature operation per signature enabled memory region (type IB_MR_TYPE_INTEGRITY). Also add meta_length attribute to ib_sig_attrs structure for saving the mapped metadata length (needed for the new API implementation). Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | RDMA/core: Introduce ib_map_mr_sg_pi to map data/protection sgl'sMax Gurtovoy2019-06-242-1/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This function will map the previously dma mapped SG lists for PI (protection information) and data to an appropriate memory region for future registration. The given MR must be allocated as IB_MR_TYPE_INTEGRITY. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | RDMA/core: Introduce IB_MR_TYPE_INTEGRITY and ib_alloc_mr_integrity APIIsrael Rukshin2019-06-242-0/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a preparation for signature verbs API re-design. In the new design a single MR with IB_MR_TYPE_INTEGRITY type will be used to perform the needed mapping for data integrity operations. Signed-off-by: Israel Rukshin <israelr@mellanox.com> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | RDMA/core: Save the MR type in the ib_mr structureMax Gurtovoy2019-06-243-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a preparation for the signature verbs API change. This change is needed since the MR type will define, in the upcoming patches, the need for allocating internal resources in LLD for signature handover related operations. It will also help to make sure that signature related functions are called with an appropriate MR type and fail otherwise. Also introduce new mr types IB_MR_TYPE_USER, IB_MR_TYPE_DMA and IB_MR_TYPE_DM for correctness. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | RDMA/odp: Do not leak dma maps when working with huge pagesJason Gunthorpe2019-06-201-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | The ib_dma_unmap_page() must match the length of the ib_dma_map_page(), which is based on odp_shift. Otherwise iommu resources under this API will not be properly freed. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | RDMA/uverbs: Use offsetofend instead of opencodingJason Gunthorpe2019-06-201-6/+3
| | | | | | | | | | | | | | | | | | | | | Discovered this was available already. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | RDMA: Check umem pointer validity prior to releaseLeon Romanovsky2019-06-201-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | Update ib_umem_release() to behave similarly to kfree() and allow submitting NULL pointer as safe input to this function. Fixes: a52c8e2469c3 ("RDMA: Clean destroy CQ in drivers do not return errors") Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | RDMA: Convert destroy_wq to be voidLeon Romanovsky2019-06-201-7/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | All callers of destroy WQ are always success and there is no need to check their return value, so convert destroy_wq to be void. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | RDMA/netlink: Resort policy arrayDoug Ledford2019-06-191-75/+78
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Sort the netlink policy array by netlink attribute name. This will make it easier in the future to find the entry you are looking for when you need to make changes, or to make sure you don't add the same entry twice. Fix the whitespace while we are there. Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | RDMA/odp: Fix missed unlock in non-blocking invalidate_startJason Gunthorpe2019-06-181-5/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If invalidate_start returns with EAGAIN then the umem_rwsem needs to be unlocked as no invalidate_end will be called. Cc: <stable@vger.kernel.org> Fixes: ca748c39ea3f ("RDMA/umem: Get rid of per_mm->notifier_count") Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | RDMA: Report available cdevs through RDMA_NLDEV_CMD_GET_CHARDEVJason Gunthorpe2019-06-185-5/+105
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update the struct ib_client for all modules exporting cdevs related to the ibdevice to also implement RDMA_NLDEV_CMD_GET_CHARDEV. All cdevs are now autoloadable and discoverable by userspace over netlink instead of relying on sysfs. uverbs also exposes the DRIVER_ID for drivers that are able to support driver id binding in rdma-core. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | RDMA: Add NLDEV_GET_CHARDEV to allow char dev discovery and autoloadJason Gunthorpe2019-06-183-0/+201
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allow userspace to issue a netlink query against the ib_device for something like "uverbs" and get back the char dev name, inode major/minor, and interface ABI information for "uverbs0". Since we are now in netlink this can also trigger a module autoload to make the uverbs device come into existence. Largely this will let us replace searching and reading inside sysfs to setup devices, and provides an alternative (using driver_id) to device name based provider binding for things like rxe. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
OpenPOWER on IntegriCloud