path: root/drivers/infiniband/core
Commit message | Author | Age | Files | Lines
* RDMA/core: Add a netlink command to change net namespace of rdma device | Parav Pandit | 2019-04-22 | 3 | -6/+65

    Provide an option to change the net namespace of an rdma device through
    a netlink command. When multiple rdma devices exist in a system and
    containers are used, this limits rdma device visibility to a specified
    net namespace.

    An example command to change the net namespace of the mlx5_1 device to
    the previously created net namespace 'foo' is:

    $ ip netns add foo
    $ rdma dev set mlx5_1 netns foo

    Signed-off-by: Parav Pandit <parav@mellanox.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/core: Introduce a helper function to change net namespace of rdma device | Parav Pandit | 2019-04-22 | 1 | -0/+77

    Introduce a helper function that changes rdma device's net namespace
    which performs mini disable/enable sequence to have device visible only
    in assigned net namespace.

    Device unregistration, device rename and device change net namespace
    may be invoked concurrently.

    (a) device unregistration needs to wait if a device change (rename or
        net namespace change) operation is in progress.
    (b) device net namespace change should not proceed if the
        unregistration has started.
    (c) while one cpu is changing device net namespace, other cpu should
        not be able to rename or change net namespace.

    To address above concurrency,

    (a) Use unreg_mutex to synchronize between ib_unregister_device() and
        net namespace change operation
    (b) In cases where unregister_device() has started unregistration
        before change_netns got chance to acquire unreg_mutex, validate the
        refcount - if it dropped to zero, abort the net namespace change
        operation.

    Finally use the helper function to change net namespace of ib device to
    move the device back to init_net when such net is deleted.

    Signed-off-by: Parav Pandit <parav@mellanox.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
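The locking rule in points (a) and (b) can be sketched as a small user-space model. This is Python rather than kernel C, and it is only an illustration: the Device class and both functions are hypothetical, the return value is a plain boolean instead of a kernel error code, and only the names unreg_mutex and refcount are taken from the commit message.

```python
import threading


class Device:
    """Toy stand-in for an rdma device (hypothetical, not the kernel struct)."""

    def __init__(self, name, netns="init_net"):
        self.name = name
        self.netns = netns
        self.refcount = 1                    # registration holds one reference
        self.unreg_mutex = threading.Lock()  # serializes unregister vs. changes


def unregister_device(dev):
    # Taking unreg_mutex makes unregistration wait for an in-flight
    # rename or net namespace change to finish.
    with dev.unreg_mutex:
        dev.refcount = 0                     # models the reference dropping to zero


def change_netns(dev, new_netns):
    # Holding unreg_mutex keeps other CPUs from renaming or moving the
    # device while this change is in progress.
    with dev.unreg_mutex:
        if dev.refcount == 0:
            return False                     # unregistration already started: abort
        # The mini disable/enable sequence would run here.
        dev.netns = new_netns
        return True
```

In this model a namespace change that loses the race against unregistration returns False and leaves the device where it was, which mirrors the abort described in (b).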
* RDMA/core: Avoid freeing netdevs in disable_device() | Parav Pandit | 2019-04-22 | 1 | -3/+4

    So we can use the disable_device() helper while changing the net
    namespace of the rdma device in a subsequent patch, move free_netdevs()
    out of it.

    Signed-off-by: Parav Pandit <parav@mellanox.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/umem: Use correct value for SG entries in sg_copy_to_buffer() | Shiraz Saleem | 2019-04-08 | 1 | -2/+2

    With page combining, the assumption that the number of SG entries in
    the umem SGL equals the number of system pages in the umem no longer
    holds.

    umem->sg_nents tracks the SG entries in umem SGL. Use it in
    sg_pcopy_to_buffer() as opposed to ib_umem_num_pages(umem).

    Fixes: d10bcf947a3e ("RDMA/umem: Combine contiguous PAGE_SIZE regions in SGEs")
    Reported-by: Jason Gunthorpe <jgg@mellanox.com>
    Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA: Handle SRQ allocations by IB/core | Leon Romanovsky | 2019-04-08 | 3 | -43/+48

    Convert SRQ allocation from drivers to be in the IB/core

    Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA: Handle AH allocations by IB/core | Leon Romanovsky | 2019-04-08 | 2 | -17/+25

    Simplify drivers by ensuring lifetime of ib_ah object. The changes in
    .create_ah() go hand in hand with relevant update in .destroy_ah().

    We will use this opportunity to convert .destroy_ah() so that it cannot
    fail, as was suggested a long time ago, because there is nothing to do
    in case of failure during destroy.

    Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* IB: When attrs.udata/ufile is available use that instead of uobject | Jason Gunthorpe | 2019-04-08 | 5 | -8/+8

    The ucontext and ufile should not be accessed via the uobject, all
    these cases have an attrs so use that instead.

    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/nldev: Return device protocol | Leon Romanovsky | 2019-04-08 | 1 | -1/+23

    Add a new RDMA_NLDEV_ATTR_DEV_PROTOCOL attribute so that udev rules can
    create stable IB device names based on link type protocol.

    The assumption is that devices like mlx4, with duality in their link
    type under one IB device struct, won't be allowed in the future.

    Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
    Reviewed-by: Parav Pandit <parav@mellanox.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/cm: Move debug counters to be under relevant IB device | Leon Romanovsky | 2019-04-08 | 3 | -38/+57

    The sysfs layout created by CM incorrectly presented RDMA devices with
    InfiniBand link layer. Layout of such devices represents device tree of
    connections.

    By moving CM statistics to be under relevant port of IB device, we will
    fix the following issues:
     * Symlink name - It used device name instead of specific identifier.
     * Target location - It was supposed to point to PCI-ID/infiniband_cm/
       instead of PCI-ID/infiniband/
     * Target name - It created extra device file under already existing
       device folder, e.g. mlx5_0/mlx5_0
     * Crash during boot with RDMA persistent naming patches.

    sysfs: cannot create duplicate filename '/class/infiniband_cm/mlx5_0'
    CPU: 29 PID: 433 Comm: modprobe Not tainted 5.0.0-rc5+ #178
    Call Trace:
     dump_stack+0xcc/0x180
     sysfs_warn_dup.cold.3+0x17/0x2d
     sysfs_do_create_link_sd.isra.2+0xd0/0xf0
     device_add+0x7cb/0x1450
     device_create_groups_vargs+0x1ae/0x220
     device_create+0x93/0xc0
     cm_add_one+0x38f/0xf60 [ib_cm]
     add_client_context+0x167/0x210 [ib_core]
     enable_device_and_get+0x230/0x3f0 [ib_core]
     ib_register_device+0x823/0xbf0 [ib_core]
     __mlx5_ib_add+0x45/0x150 [mlx5_ib]
     mlx5_ib_add+0x1b3/0x5e0 [mlx5_ib]
     mlx5_add_device+0x130/0x3a0 [mlx5_core]
     mlx5_register_interface+0x1a9/0x270 [mlx5_core]
     do_one_initcall+0x14f/0x5de
     do_init_module+0x247/0x7c0
     load_module+0x4c2f/0x60d0
     entry_SYSCALL_64_after_hwframe+0x49/0xbe

    After this change:
    [leonro@server ~]$ ls -al /sys/class/infiniband/ibp0s12f0/ports/1/
    drwxr-xr-x 2 root root 0 Mar 11 11:17 cm_rx_duplicates
    drwxr-xr-x 2 root root 0 Mar 11 11:17 cm_rx_msgs
    drwxr-xr-x 2 root root 0 Mar 11 11:17 cm_tx_msgs
    drwxr-xr-x 2 root root 0 Mar 11 11:17 cm_tx_retries

    Fixes: 110cf374a809 ("infiniband: make cm_device use a struct device and not a kobject.")
    Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/umem: Combine contiguous PAGE_SIZE regions in SGEs | Shiraz Saleem | 2019-04-08 | 3 | -22/+86

    Combine contiguous regions of PAGE_SIZE pages into single scatter list
    entry while building the scatter table for a umem. This minimizes the
    number of the entries in the scatter list and reduces the DMA mapping
    overhead, particularly with the IOMMU.

    Set default max_seg_size in core for IB devices to 2G and do not
    combine if we exceed this limit.

    Also, purge npages in struct ib_umem as we now DMA map the umem SGL
    with sg_nents and npage computation is not needed. Drivers should now
    be using ib_umem_num_pages(), so fix the last stragglers.

    Move npages tracking to ib_umem_odp as ODP drivers still need it.

    Suggested-by: Jason Gunthorpe <jgg@ziepe.ca>
    Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
    Reviewed-by: Ira Weiny <ira.weiny@intel.com>
    Acked-by: Adit Ranadive <aditr@vmware.com>
    Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
    Tested-by: Gal Pressman <galpress@amazon.com>
    Tested-by: Selvin Xavier <selvin.xavier@broadcom.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
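The combining step described in this commit can be illustrated with a short user-space sketch. It is written in Python and is not the kernel code: build_sg_entries, the (start, length) tuples and the 4096-byte PAGE_SIZE are assumptions made for the example, and only the 2G default limit comes from the commit message.

```python
PAGE_SIZE = 4096  # assumed page size for the illustration


def build_sg_entries(page_addrs, max_seg_size=2 * 1024**3):
    """Merge physically contiguous pages into (start, length) entries.

    page_addrs: page-aligned addresses in mapping order (hypothetical input).
    A page is appended to the previous entry only when it starts exactly
    where that entry ends and the merged entry stays within max_seg_size.
    """
    entries = []
    for addr in page_addrs:
        if entries:
            start, length = entries[-1]
            contiguous = start + length == addr
            if contiguous and length + PAGE_SIZE <= max_seg_size:
                entries[-1] = (start, length + PAGE_SIZE)
                continue
        entries.append((addr, PAGE_SIZE))
    return entries
```

For the four pages 0x1000, 0x2000, 0x3000 and 0x8000 this yields two entries, so the entry count no longer equals the page count. That is why code walking the list has to use the entry count rather than the number of pages.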
* RDMA/cm: Remove useless zeroing of static global variable | Leon Romanovsky | 2019-04-04 | 1 | -1/+0

    Static global variables are initialized to zero by C standard, there is
    no need to zero them again.

    Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/cma: Set proper port number as index | Leon Romanovsky | 2019-04-03 | 1 | -1/+1

    Conversion from IDR to XArray missed the fact that idr_alloc() returned
    the index as its return value; this index was saved in the port
    variable and used as the query index later on. This caused the
    following error.

    BUG: KASAN: use-after-free in cma_check_port+0x86a/0xa20 [rdma_cm]
    Read of size 8 at addr ffff888069fde998 by task ucmatose/387

    CPU: 3 PID: 387 Comm: ucmatose Not tainted 5.1.0-rc2+ #253
    Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
    Call Trace:
     dump_stack+0x7c/0xc0
     print_address_description+0x6c/0x23c
     ? cma_check_port+0x86a/0xa20 [rdma_cm]
     kasan_report.cold.3+0x1c/0x35
     ? cma_check_port+0x86a/0xa20 [rdma_cm]
     ? cma_check_port+0x86a/0xa20 [rdma_cm]
     cma_check_port+0x86a/0xa20 [rdma_cm]
     rdma_bind_addr+0x11bc/0x1b00 [rdma_cm]
     ? find_held_lock+0x33/0x1c0
     ? cma_ndev_work_handler+0x180/0x180 [rdma_cm]
     ? wait_for_completion+0x3d0/0x3d0
     ucma_bind+0x120/0x160 [rdma_ucm]
     ? ucma_resolve_addr+0x1a0/0x1a0 [rdma_ucm]
     ucma_write+0x1f8/0x2b0 [rdma_ucm]
     ? ucma_open+0x260/0x260 [rdma_ucm]
     vfs_write+0x157/0x460
     ksys_write+0xb8/0x170
     ? __ia32_sys_read+0xb0/0xb0
     ? trace_hardirqs_off_caller+0x5b/0x160
     ? do_syscall_64+0x18/0x3c0
     do_syscall_64+0x95/0x3c0
     entry_SYSCALL_64_after_hwframe+0x49/0xbe

    Allocated by task 381:
     __kasan_kmalloc.constprop.5+0xc1/0xd0
     cma_alloc_port+0x4d/0x160 [rdma_cm]
     rdma_bind_addr+0x14e7/0x1b00 [rdma_cm]
     ucma_bind+0x120/0x160 [rdma_ucm]
     ucma_write+0x1f8/0x2b0 [rdma_ucm]
     vfs_write+0x157/0x460
     ksys_write+0xb8/0x170
     do_syscall_64+0x95/0x3c0
     entry_SYSCALL_64_after_hwframe+0x49/0xbe

    Freed by task 381:
     __kasan_slab_free+0x12e/0x180
     kfree+0xed/0x290
     rdma_destroy_id+0x6b6/0x9e0 [rdma_cm]
     ucma_close+0x110/0x300 [rdma_ucm]
     __fput+0x25a/0x740
     task_work_run+0x10e/0x190
     do_exit+0x85e/0x29e0
     do_group_exit+0xf0/0x2e0
     get_signal+0x2e0/0x17e0
     do_signal+0x94/0x1570
     exit_to_usermode_loop+0xfa/0x130
     do_syscall_64+0x327/0x3c0
     entry_SYSCALL_64_after_hwframe+0x49/0xbe

    Reported-by: <syzbot+2e3e485d5697ea610460@syzkaller.appspotmail.com>
    Reported-by: Ran Rozenstein <ranro@mellanox.com>
    Fixes: 638267537ad9 ("cma: Convert portspace IDRs to XArray")
    Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
    Reviewed-by: Bart Van Assche <bvanassche@acm.org>
    Tested-by: Bart Van Assche <bvanassche@acm.org>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
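The class of mistake described in this commit, keeping a return value that used to be an index and is now only a status code, can be shown with a toy model. The sketch below is Python, not the kernel code: the XArray class, the bind dictionaries and both bind_port functions are hypothetical, and the only behaviour assumed is that an insert returns 0 on success and a negative errno on failure.

```python
import errno


class XArray:
    """Toy sparse array whose insert() returns a status, not an index."""

    def __init__(self):
        self.slots = {}

    def insert(self, index, item):
        if index in self.slots:
            return -errno.EBUSY  # slot already occupied
        self.slots[index] = item
        return 0                 # success carries no index information


def bind_port_buggy(xa, bind, snum):
    ret = xa.insert(snum, bind)
    if ret < 0:
        return ret
    bind["port"] = ret           # wrong: stores the status (0), not the port
    return 0


def bind_port_fixed(xa, bind, snum):
    ret = xa.insert(snum, bind)
    if ret < 0:
        return ret
    bind["port"] = snum          # the requested port number is the index
    return 0
```

With the buggy variant every binding records port 0, so a later lookup or erase keyed on the stored port no longer refers to the slot that was actually filled.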
* IB: Pass only ib_udata in function prototypes | Shamir Rabinovitch | 2019-04-01 | 4 | -11/+8

    Now that ib_udata is passed to all the driver's object create/destroy
    APIs, the ib_udata will carry the ib_ucontext for every user command.
    There is no need to also pass the ib_ucontext via the function
    prototypes.

    Make ib_udata the only argument passed.

    Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* IB: Remove 'uobject->context' dependency in object destroy APIs | Shamir Rabinovitch | 2019-04-01 | 4 | -11/+13

    Now that we have the udata passed to all the ib_xxx object destroy APIs
    and the additional macro 'rdma_udata_to_drv_context' to get the
    ib_ucontext from ib_udata stored in uverbs_attr_bundle, we can finally
    start to remove the dependency of the drivers in the
    ib_xxx->uobject->context.

    Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* IB: Pass uverbs_attr_bundle down ib_x destroy path | Shamir Rabinovitch | 2019-04-01 | 8 | -64/+73

    The uverbs_attr_bundle with the ucontext is sent down to the drivers
    ib_x destroy path as ib_udata. The next patch will use the ib_udata to
    free the drivers destroy path from the dependency in 'uobject->context'
    as we already did for the create path.

    Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* IB: Pass uverbs_attr_bundle down uobject destroy path | Shamir Rabinovitch | 2019-04-01 | 10 | -67/+96

    Pass uverbs_attr_bundle down the uobject destroy path. The next patch
    will use this to eliminate the dependency of the drivers in
    ib_x->uobject pointers.

    Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* IB: ucontext should be set properly for all cmd & ioctl paths | Shamir Rabinovitch | 2019-04-01 | 4 | -61/+28

    An attempt to use the commit below to initialize the ucontext for the
    uobject destroy path has shown that the commit is incomplete.

    Parts were reverted and the ucontext set up in the uverbs_attr_bundle
    was moved to rdma_lookup_get_uobject which is called from the
    uobj_get_XXX macros and rdma_alloc_begin_uobject which is called when
    uobject is created.

    Fixes: 3d9dfd060391 ("IB/uverbs: Add ib_ucontext to uverbs_attr_bundle sent from ioctl and cmd flows")
    Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/core: Add command to set ib_core device net namspace sharing mode | Parav Pandit | 2019-03-28 | 3 | -0/+114

    Add netlink command that enables/disables sharing rdma device among
    multiple net namespaces.

    Using rdma tool,
    $ rdma sys set netns shared (default mode)

    When rdma subsystem netns mode is set to shared mode, rdma devices will
    be accessible in all net namespaces.

    Using rdma tool,
    $ rdma sys set netns exclusive

    When rdma subsystem netns mode is set to exclusive mode, devices will
    be accessible in only one net namespace at any given point of time.

    If any net namespace other than the default init_net exists while this
    command is executed, it will fail and the mode cannot be changed.

    To change this mode, a netlink command is used instead of sysctl,
    because a netlink command allows a module to be auto-loaded.

    Signed-off-by: Parav Pandit <parav@mellanox.com>
    Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
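The mode-switch rule in this commit message can be modelled in a few lines. The sketch below is a Python illustration under stated assumptions, not the ib_core implementation: RdmaNetnsMode and its methods are hypothetical names, and treating a request for the already active mode as a successful no-op is an assumption of the example rather than something the commit message states.

```python
class RdmaNetnsMode:
    """Toy model of the subsystem-wide netns sharing mode (hypothetical)."""

    def __init__(self):
        self.shared = True               # shared is the default mode
        self.namespaces = {"init_net"}   # net namespaces that currently exist

    def add_netns(self, name):
        self.namespaces.add(name)

    def set_shared(self, shared):
        if shared == self.shared:
            return True                  # assumption: same mode is a no-op
        if self.namespaces - {"init_net"}:
            return False                 # another netns exists: refuse the change
        self.shared = shared
        return True
```

The point of the check is that the mode can only be flipped while init_net is the only namespace, so no namespace ever observes devices appearing or disappearing because of a mode change.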
* RDMA/core: Add interface to read device namespace sharing mode | Parav Pandit | 2019-03-28 | 3 | -1/+34

    Add an interface via netlink command to query whether rdma devices are
    shared among multiple net namespaces or not.

    When using the rdma tool, it can be queried as,
    $ rdma system show netns
    netns shared

    Signed-off-by: Parav Pandit <parav@mellanox.com>
    Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/core: Extend ib_device_get_by_index for net namespace | Parav Pandit | 2019-03-28 | 3 | -11/+21

    Extend ib_device_get_by_index() API to check device access for net
    namespace for serving netlink commands.

    Also enforce net ns check on dumpit commands which iterate over all
    registered rdma devices and which don't call ib_device_get_by_index().

    Signed-off-by: Parav Pandit <parav@mellanox.com>
    Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA: Check net namespace access for uverbs, umad, cma and nldev | Parav Pandit | 2019-03-28 | 4 | -0/+38

    Introduce an API rdma_dev_access_netns() to check whether a rdma device
    can be accessed from the specified net namespace or not.

    Use rdma_dev_access_netns() while opening character uverbs, umad
    network device and also check while rdma cm_id binds to rdma device.

    Signed-off-by: Parav Pandit <parav@mellanox.com>
    Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
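A plausible shape for such an access check, given the shared and exclusive modes used elsewhere in this series, is sketched below in Python. It is a guess at the decision rule for illustration only: dev_access_netns and its three parameters are hypothetical, and the real rdma_dev_access_netns() operates on kernel objects rather than strings.

```python
def dev_access_netns(shared_mode, dev_netns, caller_netns):
    """Return True when a caller in caller_netns may use the device.

    Assumed rule: in shared mode every namespace may access the device;
    in exclusive mode only the namespace the device lives in may.
    """
    return shared_mode or dev_netns == caller_netns
```

Under this rule an open of the uverbs or umad character device, or a cm_id bind, would be rejected whenever the function returns False.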
* RDMA/core: Add module param to disable device sharing among net ns | Parav Pandit | 2019-03-28 | 1 | -0/+7

    Add a module parameter to change the sharing mode of ib_core early in
    the boot process. This parameter helps those systems where a modern,
    up-to-date rdma tool (iproute2) package may not be available during the
    kernel upgrade cycle.

    Signed-off-by: Parav Pandit <parav@mellanox.com>
    Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/core: Support core port attributes in non init_net | Parav Pandit | 2019-03-28 | 3 | -7/+18

    Now that sysfs compatibility layer for non init_net exists, add core
    port attributes such as pkey and gid table to non init_net ns.

    Signed-off-by: Parav Pandit <parav@mellanox.com>
    Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/core: Implement compat device/sysfs tree in net namespace | Parav Pandit | 2019-03-28 | 1 | -4/+257

    Implement compatibility layer sysfs entries of ib_core so that non
    init_net net namespaces can also discover rdma devices.

    Each non init_net net namespace has ib_core_device created in it. Such
    ib_core_device sysfs tree resembles rdma devices found in init_net
    namespace.

    This allows discovering rdma devices in multiple non init_net net
    namespaces via sysfs entries and is helpful to rdma-core userspace.

    Signed-off-by: Parav Pandit <parav@mellanox.com>
    Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/core: Restrict sysfs entries view to init_net | Parav Pandit | 2019-03-28 | 1 | -1/+11

    This is a preparation patch to provide isolation of rdma device in a
    network namespace.

    As first step, make rdma device visible only in init net namespace.
    Subsequent patch will enable rdma device visibility back in multiple
    net namespaces using compat ib_core_device device/sysfs tree.

    Given that the IB subsystem depends on the net stack, it needs to be
    initialized after netdev, and since it supports devices, it needs to be
    initialized before the device subsystem; therefore, change the initcall
    sequence to fs_initcall, so that when ib_core is compiled in the kernel
    image, the right init sequence is followed.

    Signed-off-by: Parav Pandit <parav@mellanox.com>
    Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/core: Introduce ib_core_device to hold device | Parav Pandit | 2019-03-28 | 2 | -17/+36

    In order to support sysfs entries in multiple net namespaces for a rdma
    device, introduce a ib_core_device whose scope is limited to hold core
    device and per port sysfs related entries.

    This is a preparation patch so that multiple ib_core_devices, one in
    each net namespace, can be created in a subsequent patch and can all
    share the ib_device.

    (a) Move sysfs specific fields to ib_core_device.
    (b) Make sysfs and device life cycle related routines to work on
        ib_core_device.
    (c) Introduce and use rdma_init_coredev() helper to initialize coredev
        fields.

    Signed-off-by: Parav Pandit <parav@mellanox.com>
    Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/uverbs: Allow the compiler to verify declaration and definition consistency | Bart Van Assche | 2019-03-28 | 4 | -0/+4

    This patch avoids that sparse reports the following warnings:

    drivers/infiniband/core/uverbs_std_types_flow_action.c:442:30: warning: symbol 'uverbs_def_obj_flow_action' was not declared. Should it be static?
    drivers/infiniband/core/uverbs_std_types_dm.c:112:30: warning: symbol 'uverbs_def_obj_dm' was not declared. Should it be static?
    drivers/infiniband/core/uverbs_std_types_counters.c:153:30: warning: symbol 'uverbs_def_obj_counters' was not declared. Should it be static?
    drivers/infiniband/core/uverbs_std_types_mr.c:213:30: warning: symbol 'uverbs_def_obj_mr' was not declared. Should it be static?

    Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
    Fixes: 0bd01f3d0907 ("RDMA/uverbs: Require all objects to have a driver destroy function")
    Signed-off-by: Bart Van Assche <bvanassche@acm.org>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/uverbs: Annotate uverbs_request_next_ptr() return value as a __user pointer | Bart Van Assche | 2019-03-28 | 1 | -1/+1

    This patch avoids that sparse complains about a mismatch between the
    returned value and the function return type.

    Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
    Fixes: c3bea3d2dc53 ("RDMA/uverbs: Use the iterator for ib_uverbs_unmarshall_recv()")
    Signed-off-by: Bart Van Assche <bvanassche@acm.org>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/uverbs: Add a __user annotation to a pointer | Bart Van Assche | 2019-03-28 | 1 | -1/+1

    This patch avoids that sparse and smatch report the following:

    warning: cast removes address space of expression

    Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
    Fixes: 3a6532c9af1a ("RDMA/uverbs: Use uverbs_attr_bundle to pass udata for write")
    Signed-off-by: Bart Van Assche <bvanassche@acm.org>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* IB/MAD: Add SMP details to MAD tracing | Ira Weiny | 2019-03-27 | 1 | -0/+8

    Decode more information from the packet and include it in the trace.

    Reviewed-by: "Ruhl, Michael J" <michael.j.ruhl@intel.com>
    Signed-off-by: Ira Weiny <ira.weiny@intel.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* IB/UMAD: Add umad trace points | Ira Weiny | 2019-03-27 | 1 | -0/+12

    Trace MADs going to/from user space.

    Suggested-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    Signed-off-by: Ira Weiny <ira.weiny@intel.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* IB/MAD: Add agent trace points | Ira Weiny | 2019-03-27 | 1 | -0/+4

    Trace agent details when agents are [un]registered. In addition, report
    agent details on send/recv.

    Reviewed-by: "Ruhl, Michael J" <michael.j.ruhl@intel.com>
    Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    Signed-off-by: Ira Weiny <ira.weiny@intel.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* IB/MAD: Add recv path trace point | Ira Weiny | 2019-03-27 | 1 | -0/+3

    Trace received MAD details.

    Reviewed-by: "Ruhl, Michael J" <michael.j.ruhl@intel.com>
    Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    Signed-off-by: Ira Weiny <ira.weiny@intel.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* IB/MAD: Add send path trace points | Ira Weiny | 2019-03-27 | 1 | -1/+32

    Use the standard Linux trace mechanism to trace MADs being sent. Four
    trace points are added: when the MAD is posted to the qp, when the MAD
    is completed, if a MAD is resent, and when the MAD completes in error.

    Reviewed-by: "Ruhl, Michael J" <michael.j.ruhl@intel.com>
    Suggested-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    Signed-off-by: Ira Weiny <ira.weiny@intel.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* IB/core: Ensure an invalidate_range callback on ODP MR | Ira Weiny | 2019-03-26 | 2 | -10/+8

    No device supports ODP MR without an invalidate_range callback.

    Warn on any device which attempts to support ODP without supplying this
    callback.

    Then we can remove the checks for the callback within the code.

    This stems from the discussion

    https://www.spinics.net/lists/linux-rdma/msg76460.html

    ...which concluded this code was no longer necessary.

    Acked-by: John Hubbard <jhubbard@nvidia.com>
    Reviewed-by: Haggai Eran <haggaie@mellanox.com>
    Signed-off-by: Ira Weiny <ira.weiny@intel.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* cma: Convert portspace IDRs to XArray | Matthew Wilcox | 2019-03-26 | 1 | -20/+21

    Signed-off-by: Matthew Wilcox <willy@infradead.org>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* ucm: Convert ctx_id_table to XArray | Matthew Wilcox | 2019-03-26 | 1 | -22/+13

    Signed-off-by: Matthew Wilcox <willy@infradead.org>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* ib core: Convert query_idr to XArray | Matthew Wilcox | 2019-03-26 | 1 | -26/+18

    Signed-off-by: Matthew Wilcox <willy@infradead.org>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/cm: Convert local_id_table to XArray | Matthew Wilcox | 2019-03-26 | 1 | -24/+18

    Also introduce cm_local_id() to reduce the amount of boilerplate when
    converting a local ID to an XArray index.

    Signed-off-by: Matthew Wilcox <willy@infradead.org>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* IB/mad: Convert ib_mad_clients to XArray | Matthew Wilcox | 2019-03-26 | 1 | -25/+14

    Pull the allocation function out into its own function to reduce the
    length of ib_register_mad_agent() a little and keep all the allocation
    logic together.

    Signed-off-by: Matthew Wilcox <willy@infradead.org>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA: Use __packed annotation instead of __attribute__ ((packed)) | Erez Alfasi | 2019-03-25 | 2 | -13/+13

    The set of "__attribute__" shortcut macros was standardized, for
    potentially more portable and consistent code, back in v2.6.21 by
    commit 82ddcb040 ("[PATCH] extend the set of "__attribute__" shortcut
    macros").

    Moreover, nowadays checkpatch.pl warns about using
    __attribute__((packed)) instead of __packed.

    This patch converts all the "__attribute__ ((packed))" annotations to
    "__packed" within the RDMA subsystem.

    Signed-off-by: Erez Alfasi <ereza@mellanox.com>
    Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* Merge tag 'xarray-5.1-rc1' of git://git.infradead.org/users/willy/linux-dax | Linus Torvalds | 2019-03-11 | 2 | -43/+14

    Pull XArray updates from Matthew Wilcox:
     "This pull request changes the xa_alloc() API. I'm only aware of one
      subsystem that has started trying to use it, and we agree on the
      fixup as part of the merge.

      The xa_insert() error code also changed to match xa_alloc() (EEXIST
      to EBUSY), and I added xa_alloc_cyclic(). Beyond that, the usual
      bugfixes, optimisations and tweaking.

      I now have a git tree with all users of the radix tree and IDR
      converted over to the XArray that I'll be feeding to maintainers over
      the next few weeks"

    * tag 'xarray-5.1-rc1' of git://git.infradead.org/users/willy/linux-dax:
      XArray: Fix xa_reserve for 2-byte aligned entries
      XArray: Fix xa_erase of 2-byte aligned entries
      XArray: Use xa_cmpxchg to implement xa_reserve
      XArray: Fix xa_release in allocating arrays
      XArray: Mark xa_insert and xa_reserve as must_check
      XArray: Add cyclic allocation
      XArray: Redesign xa_alloc API
      XArray: Add support for 1s-based allocation
      XArray: Change xa_insert to return -EBUSY
      XArray: Update xa_erase family descriptions
      XArray tests: RCU lock prohibits GFP_KERNEL
* RDMA/umem: Revert broken 'off by one' fix | John Hubbard | 2019-03-06 | 1 | -3/+6

    The previous attempted bug fix overlooked the fact that
    ib_umem_odp_map_dma_single_page() was doing a put_page() upon hitting
    an error. So there was not really a bug there.

    Therefore, this reverts the off-by-one change, but keeps the change to
    use release_pages() in the error path.

    Fixes: 75a3e6a3c129 ("RDMA/umem: minor bug fix in error handling path")
    Suggested-by: Artemy Kovalyov <artemyko@mellanox.com>
    Signed-off-by: John Hubbard <jhubbard@nvidia.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/umem: minor bug fix in error handling path | John Hubbard | 2019-03-04 | 1 | -3/+6

    1. Bug fix: fix an off by one error in the code that cleans up if it
       fails to dma-map a page, after having done a
       get_user_pages_remote() on a range of pages.

    2. Refinement: for that same cleanup code, release_pages() is better
       than put_page() in a loop.

    Signed-off-by: John Hubbard <jhubbard@nvidia.com>
    Signed-off-by: Ira Weiny <ira.weiny@intel.com>
    Reviewed-by: Ira Weiny <ira.weiny@intel.com>
    Acked-by: Leon Romanovsky <leonro@mellanox.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/uverbs: Don't do double free of allocated PD | Leon Romanovsky | 2019-02-25 | 1 | -0/+1

    There is no need to call kfree(pd) because ib_dealloc_pd() internally
    frees PD.

    Fixes: 21a428a019c9 ("RDMA: Handle PD allocations by IB/core")
    Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
    Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA: Handle ucontext allocations by IB/core | Leon Romanovsky | 2019-02-22 | 3 | -17/+17

    Following the PD conversion patch, do the same for ucontext
    allocations.

    Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/core: Fix a WARN() message | Dan Carpenter | 2019-02-22 | 1 | -3/+1

    The first parameter of WARN_ONCE() is a condition, then following
    parameters are the message. In this case, we left out the condition so
    it will just print the ops->type string.

    Fixes: 3856ec4b93c9 ("RDMA/core: Add RDMA_NLDEV_CMD_NEWLINK/DELLINK support")
    Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
    Reviewed-by: Majd Dibbiny <majd@mellanox.com>
    Reviewed-by: Steve Wise <swise@opengridcomputing.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
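The effect of leaving out the condition can be reproduced with a small stand-in for a condition-first warning helper. The sketch is Python and purely illustrative: warn() is a hypothetical function that models only the argument order (condition, format, arguments), not the once-only behaviour or anything else about the kernel macro, and the message text and the "rxe" type string are made up for the example.

```python
def warn(condition, fmt, *args):
    """Toy condition-first warning helper (hypothetical).

    Returns the formatted message when condition is truthy, else None.
    """
    if not condition:
        return None
    return fmt % args


# Condition omitted: the format string lands in the condition slot, where a
# non-empty string is always truthy, and the type string becomes the format.
buggy_message = warn("Duplicate link ops! %s", "rxe")

# Condition supplied: format and argument end up where they belong.
fixed_message = warn(True, "Duplicate link ops! %s", "rxe")
```

Here buggy_message is just "rxe", matching the commit's description that only the type string gets printed, while fixed_message carries the full text.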
* IB/core: Abort page fault handler silently during owning process exit | Moni Shoua | 2019-02-21 | 1 | -1/+1

    It is possible that, while a page fault is being handled, the process
    that owns the MR is terminating. The indication for it is failure to
    get the task_struct or to take a reference on the mm_struct. In this
    case just abort the page-fault handler with an error but without a
    warning to the kernel log.

    Signed-off-by: Moni Shoua <monis@mellanox.com>
    Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/uverbs: Store PR pointer before it is overwritten | Leon Romanovsky | 2019-02-21 | 1 | -1/+1

    The IB_MR_REREG_PD command rewrites mr->pd after a successful
    rereg_user_mr(). This change loses the usecnt information and produces
    the following warning:

    WARNING: CPU: 1 PID: 1771 at drivers/infiniband/core/verbs.c:336 ib_dealloc_pd+0x4e/0x60 [ib_core]
    CPU: 1 PID: 1771 Comm: rereg_mr Tainted: G W OE 5.0.0-rc7-for-upstream-perf-2019-02-20_14-03-40-34 #1
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
    RIP: 0010:ib_dealloc_pd+0x4e/0x60 [ib_core]
    RSP: 0018:ffffc90003923dc0 EFLAGS: 00010286
    RAX: 00000000ffffffff RBX: ffff88821f7f0400 RCX: ffff888236a40c00
    RDX: ffff88821f7f0400 RSI: 0000000000000001 RDI: 0000000000000000
    RBP: 0000000000000001 R08: ffff88835f665d80 R09: ffff8882209c90d8
    R10: ffff88835ec003e0 R11: 0000000000000000 R12: ffff888221680ba0
    R13: ffff888221680b00 R14: 00000000ffffffea R15: ffff88821f53c318
    FS: 00007f70db11e740(0000) GS:ffff88835f640000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000001dfd030 CR3: 000000029d9d8000 CR4: 00000000000006e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
     uverbs_free_pd+0x2d/0x30 [ib_uverbs]
     destroy_hw_idr_uobject+0x16/0x40 [ib_uverbs]
     uverbs_destroy_uobject+0x28/0x170 [ib_uverbs]
     __uverbs_cleanup_ufile+0x6b/0x90 [ib_uverbs]
     uverbs_destroy_ufile_hw+0x8b/0x110 [ib_uverbs]
     ib_uverbs_close+0x1f/0x80 [ib_uverbs]
     __fput+0xb1/0x220
     task_work_run+0x7f/0xa0
     exit_to_usermode_loop+0x6b/0xb2
     do_syscall_64+0xc5/0x100
     entry_SYSCALL_64_after_hwframe+0x44/0xa9
    RIP: 0033:0x7f70dad00664

    Fixes: e278173fd19e ("RDMA/core: Cosmetic change - move member initialization to correct block")
    Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
    Reviewed-by: Majd Dibbiny <majd@mellanox.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
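Why the order of the pointer save matters can be seen in a toy reference-count model. The sketch below is Python and an illustration only: Pd, Mr and both rereg functions are hypothetical, and the exact statement order in the kernel may differ; the model only shows how reading the old owner after it has been overwritten leaves a stale use count behind.

```python
class Pd:
    """Toy protection domain with a use counter (hypothetical)."""

    def __init__(self):
        self.usecnt = 0


class Mr:
    """Toy memory region that holds one use of its PD."""

    def __init__(self, pd):
        self.pd = pd
        pd.usecnt += 1


def rereg_pd_buggy(mr, new_pd):
    mr.pd = new_pd       # the MR is switched to the new PD first
    new_pd.usecnt += 1
    old_pd = mr.pd       # too late: this already points at new_pd
    old_pd.usecnt -= 1   # the real old PD keeps its stale count


def rereg_pd_fixed(mr, new_pd):
    old_pd = mr.pd       # remember the old PD before it is overwritten
    mr.pd = new_pd
    new_pd.usecnt += 1
    old_pd.usecnt -= 1
```

In the buggy variant the original PD still shows one user after the switch, which is the kind of leftover count a deallocation check would flag.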
* RDMA/core: Verify that memory window type is legal | Noa Osherovich | 2019-02-19 | 1 | -0/+5

    Before calling the provider's alloc_mw function, verify that the given
    memory window type is either IB_MW_TYPE_1 or IB_MW_TYPE_2.

    Signed-off-by: Noa Osherovich <noaos@mellanox.com>
    Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
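The validate-before-dispatch pattern this commit describes looks roughly like the following Python sketch. It is an illustration under assumptions: alloc_mw and provider_alloc are hypothetical stand-ins, the numeric values 1 and 2 for the two type constants and the use of EINVAL as the error are assumptions of the example, and only the constant names come from the commit message.

```python
import errno

IB_MW_TYPE_1 = 1  # assumed numeric value
IB_MW_TYPE_2 = 2  # assumed numeric value


def alloc_mw(mw_type, provider_alloc):
    """Reject unknown window types before the provider callback runs."""
    if mw_type not in (IB_MW_TYPE_1, IB_MW_TYPE_2):
        return -errno.EINVAL  # assumed error code for an illegal type
    return provider_alloc(mw_type)
```

Checking in the core means each provider callback can rely on receiving one of the two known types instead of repeating the same validation.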