summaryrefslogtreecommitdiffstats
path: root/drivers/infiniband/hw/mlx5
Commit message (Collapse)AuthorAgeFilesLines
...
* | | RDMA: Fix return code check in rdma_set_cq_moderationKamal Heib2018-07-311-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The proper return code is "-EOPNOTSUPP" when the modify_cq() callback is not supported, all drivers should generate this and all users should check for it when detecting not supported functionality. Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> (for mlx5) Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | IB/uverbs: Add UVERBS_ATTR_FLAGS_IN to the specs languageJason Gunthorpe2018-07-302-15/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This clearly indicates that the input is a bitwise combination of values in an enum, and identifies which enum contains the definition of the bits. Special accessors are provided that handle the mandatory validation of the allowed bits and enforce the correct type for bitwise flags. If we had introduced this at the start then the kabi would have uniformly used u64 data to pass flags, however today there is a mixture of u64 and u32 flags. All places are converted to accept both sizes and the accessor fixes it. This allows all existing flags to grow to u64 in future without any hassle. Finally all flags are, by definition, optional. If flags are not passed the accessor does not fail, but provides a value of zero. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
* | | RDMA, core and ULPs: Declare ib_post_send() and ib_post_recv() arguments constBart Van Assche2018-07-305-27/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since neither ib_post_send() nor ib_post_recv() modify the data structure their second argument points at, declare that argument const. This change makes it necessary to declare the 'bad_wr' argument const too and also to modify all ULPs that call ib_post_send(), ib_post_recv() or ib_post_srq_recv(). This patch does not change any functionality but makes it possible for the compiler to verify whether the ib_post_(send|recv|srq_recv) really do not modify the posted work request. To make this possible, only one cast had to be introduce that casts away constness, namely in rpcrdma_post_recvs(). The only way I can think of to avoid that cast is to introduce an additional loop in that function or to change the data type of bad_wr from struct ib_recv_wr ** into int (an index that refers to an element in the work request list). However, both approaches would require even more extensive changes than this patch. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | IB/mlx5, ib_post_send(), IB_WR_REG_SIG_MR: Do not modify the 'wr' argumentBart Van Assche2018-07-301-12/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since the next patch will constify the wr pointer, do not modify the data that pointer points at. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Cc: Saeed Mahameed <saeedm@mellanox.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | RDMA: Constify the argument of the work request conversion functionsBart Van Assche2018-07-302-16/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When posting a send work request, the work request that is posted is not modified by any of the RDMA drivers. Make this explicit by constifying most ib_send_wr pointers in RDMA transport drivers. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | IB/mlx5: avoid excessive warning msgs when creating VFs on 2nd portQing Huang2018-07-261-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a CX5 device is configured in dual-port RoCE mode, after creating many VFs against port 1, creating the same number of VFs against port 2 will flood kernel/syslog with something like "mlx5_*:mlx5_ib_bind_slave_port:4266:(pid 5269): port 2 already affiliated." So basically, when traversing mlx5_ib_dev_list, mlx5_ib_add_slave_port() repeatedly attempts to bind the new mpi structure to every device on the list until it finds an unbound device. Change the log level from warn to dbg to avoid log flooding as the warning should be harmless. Signed-off-by: Qing Huang <qing.huang@oracle.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | IB/uverbs: Fix locking around struct ib_uverbs_file ucontextJason Gunthorpe2018-07-251-5/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have a parallel unlocked reader and writer with ib_uverbs_get_context() vs everything else, and nothing guarantees this works properly. Audit and fix all of the places that access ucontext to use one of the following locking schemes: - Call ib_uverbs_get_ucontext() under SRCU and check for failure - Access the ucontext through an struct ib_uobject context member while holding a READ or WRITE lock on the uobject. This value cannot be NULL and has no race. - Hold the ucontext_lock and check for ufile->ucontext !NULL This also re-implements ib_uverbs_get_ucontext() in a way that is safe against concurrent ib_uverbs_get_context() and disassociation. As a side effect, every access to ucontext in the commands is via ib_uverbs_get_context() with an error check, or via the uobject, so there is no longer any need for the core code to check ucontext on every command call. These checks are also removed. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | IB/mlx5: Use the ucontext from the uobj, not the fileJason Gunthorpe2018-07-251-16/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This approach matches the standard flow of the typical write method that relies on the HW object to store the device and the uobject to access the ucontext. Avoids the use of the devx_ufile2uctx in several places will make revising the semantics of ib_uverbs_get_ucontext() in the next patch simpler. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | IB/mlx5: Enable driver uapi commands for flow steeringYishai Hadas2018-07-243-7/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | Expose the mlx5 flow steering parsing trees, exposing the functionality to user space. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | IB/mlx5: Add support for a flow table destination for driver flow steeringYishai Hadas2018-07-241-5/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | Add support to set a destination that is a flow table, this can come from the DEVX destination. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | IB/mlx5: Support adding flow steering rule by raw descriptionYishai Hadas2018-07-242-17/+201
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support to set a public flow steering rule when its destination is a TIR by using raw specification data. The logic follows the verbs API but instead of using ib_spec(s) the raw, device specific, description is used. This allows supporting specialty matchers without having to define new matches in the verbs struct based language. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | IB/mlx5: Introduce driver create and destroy flow methodsYishai Hadas2018-07-244-0/+156
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce driver create and destroy flow methods on the uverbs flow object. This allows the driver to get its specific device attributes to match the underlay specification while still using the generic ib_flow object for cleanup and code sharing. The IB object's attributes are set via the ib_set_flow() helper function. The specific implementation for the given specification is added in downstream patches. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | IB/mlx5: Introduce flow steering matcher uapi objectYishai Hadas2018-07-243-0/+146
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce flow steering matcher object and its create and destroy methods. This matcher object holds some mlx5 specific driver properties that matches the underlay device specification when an mlx5 flow steering group is created. It will be used in downstream patches to be part of mlx5 specific create flow method. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | Merge branch 'mellanox/mlx5-next' into rdma.git for-nextJason Gunthorpe2018-07-241-1/+1
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | From git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git This is required to resolve dependencies of the next series of RDMA patches. * branch 'mellanox/mlx5-next': net/mlx5: Add support for flow table destination number net/mlx5: Add forward compatible support for the FTE match data net/mlx5: Fix tristate and description for MLX5 module net/mlx5: Better return types for CQE API net/mlx5: Use ERR_CAST() instead of coding it net/mlx5: Add missing SET_DRIVER_VERSION command translation net/mlx5: Add XRQ commands definitions net/mlx5: Add core support for double vlan push/pop steering action net/mlx5: Expose MPEGC (Management PCIe General Configuration) structures net/mlx5: FW tracer, add hardware structures net/mlx5: fix uaccess beyond "count" in debugfs read/write handlers Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | net/mlx5: Fix tristate and description for MLX5 moduleEran Ben Elisha2018-07-181-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current description did not include new devices. Fix that by proving the correct generic description. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | | | RDMA/mlx5: Remove set but not used variablesKamal Heib2018-07-232-5/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove "uctx" and "pa" variables that were set but not used. Fixes: a8b92ca1b0e5 ("IB/mlx5: Introduce DEVX") Fixes: 8f0622873358 ("RDMA/mlx5: Remove debug prints of VMA pointers") Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | | RDMA/mlx5: Check that supplied blue flame index doesn't overflowLeon Romanovsky2018-07-132-8/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | User's supplied index is checked again total number of system pages, but this number already includes num_static_sys_pages, so addition of that value to supplied index causes to below error while trying to access sys_pages[]. BUG: KASAN: slab-out-of-bounds in bfregn_to_uar_index+0x34f/0x400 Read of size 4 at addr ffff880065561904 by task syz-executor446/314 CPU: 0 PID: 314 Comm: syz-executor446 Not tainted 4.18.0-rc1+ #256 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014 Call Trace: dump_stack+0xef/0x17e print_address_description+0x83/0x3b0 kasan_report+0x18d/0x4d0 bfregn_to_uar_index+0x34f/0x400 create_user_qp+0x272/0x227d create_qp_common+0x32eb/0x43e0 mlx5_ib_create_qp+0x379/0x1ca0 create_qp.isra.5+0xc94/0x22d0 ib_uverbs_create_qp+0x21b/0x2a0 ib_uverbs_write+0xc2c/0x1010 vfs_write+0x1b0/0x550 ksys_write+0xc6/0x1a0 do_syscall_64+0xa7/0x590 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x433679 Code: fd ff 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 3b 91 fd ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007fff2b3d8e48 EFLAGS: 00000217 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 00000000004002f8 RCX: 0000000000433679 RDX: 0000000000000040 RSI: 0000000020000240 RDI: 0000000000000003 RBP: 00000000006d4018 R08: 00000000004002f8 R09: 00000000004002f8 R10: 00000000004002f8 R11: 0000000000000217 R12: 0000000000000000 R13: 000000000040cb00 R14: 000000000040cb90 R15: 0000000000000006 Allocated by task 314: kasan_kmalloc+0xa0/0xd0 __kmalloc+0x1a9/0x510 mlx5_ib_alloc_ucontext+0x966/0x2620 ib_uverbs_get_context+0x23f/0xa60 ib_uverbs_write+0xc2c/0x1010 __vfs_write+0x10d/0x720 vfs_write+0x1b0/0x550 ksys_write+0xc6/0x1a0 do_syscall_64+0xa7/0x590 entry_SYSCALL_64_after_hwframe+0x49/0xbe Freed by task 1: __kasan_slab_free+0x12e/0x180 kfree+0x159/0x630 kvfree+0x37/0x50 single_release+0x8e/0xf0 __fput+0x2d8/0x900 task_work_run+0x102/0x1f0 exit_to_usermode_loop+0x159/0x1c0 do_syscall_64+0x408/0x590 entry_SYSCALL_64_after_hwframe+0x49/0xbe The buggy address belongs to the object at ffff880065561100 which belongs to the cache kmalloc-4096 of size 4096 The buggy address is located 2052 bytes inside of 4096-byte region [ffff880065561100, ffff880065562100) The buggy address belongs to the page: page:ffffea0001955800 count:1 mapcount:0 mapping:ffff88006c402480 index:0x0 compound_mapcount: 0 flags: 0x4000000000008100(slab|head) raw: 4000000000008100 ffffea0001a7c000 0000000200000002 ffff88006c402480 raw: 0000000000000000 0000000080070007 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff880065561800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff880065561880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffff880065561900: 04 fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ^ ffff880065561980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff880065561a00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc Cc: <stable@vger.kernel.org> # 4.15 Fixes: 1ee47ab3e8d8 ("IB/mlx5: Enable QP creation with a given blue flame index") Reported-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | | RDMA/mlx5: Melt consecutive calls to alloc_bfreg() in one callLeon Romanovsky2018-07-132-41/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is no need for three consecutive calls to alloc_bfreg(). It can be implemented with one function. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | | IB: Enable uverbs_destroy_def_handler to be used by driversYishai Hadas2018-07-101-16/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Enable uverbs_destroy_def_handler to be used by drivers and replace current code to use it. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | | RDMA: Validate grh_required when handling AVsArtemy Kovalyov2018-07-101-7/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Extend the existing grh_required flag to check when AV's are handled that a GRH is present. Since we don't want to do query_port during the AV checks for performance reasons move the flag into the immutable_data. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | | RDMA: Fix storage of PortInfo CapabilityMask in the kernelJason Gunthorpe2018-07-101-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The internal flag IP_BASED_GIDS was added to a field that was being used to hold the port Info CapabilityMask without considering the effects this will have. Since most drivers just use the value from the HW MAD it means IP_BASED_GIDS will also become set on any HW that sets the IBA flag IsOtherLocalChangesNoticeSupported - which is not intended. Fix this by keeping port_cap_flags only for the IBA CapabilityMask value and store unrelated flags externally. Move the bit definitions for this to ib_mad.h to make it clear what is happening. To keep the uAPI unchanged define a new set of flags in the uapi header that are only used by ib_uverbs_query_port_resp.port_cap_flags which match the current flags supported in rdma-core, and the values exposed by the current kernel. Fixes: b4a26a27287a ("IB: Report using RoCE IP based gids in port caps") Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
* | | | IB/mlx5: Honor cnt_set_id_valid flag instead of set_idParav Pandit2018-07-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is incorrect to depend on set_id value to know if counters were allocated or not. set_id_valid field is set to true when counters were allocated. Therefore, use set_id_valid while deciding to free counters. Cc: <stable@vger.kernel.org> # 4.15 Fixes: aac4492ef23a ("IB/mlx5: Update counter implementation for dual port RoCE") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | | RDMA/mlx5: Remove unused port number parameterLeon Romanovsky2018-07-091-20/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Clean up a little bit code to drop unused port_num parameter. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | | IB/mlx5: fix uaccess beyond "count" in debugfs read/write handlersJann Horn2018-07-092-32/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In general, accessing userspace memory beyond the length of the supplied buffer in VFS read/write handlers can lead to both kernel memory corruption (via kernel_read()/kernel_write(), which can e.g. be triggered via sys_splice()) and privilege escalation inside userspace. In this case, the affected files are in debugfs (and should therefore only be accessible to root), and the read handlers check that *pos is zero (meaning that at least sys_splice() can't trigger kernel memory corruption). Because of the root requirement, this is not a security fix, but rather a cleanup. For the read handlers, fix it by using simple_read_from_buffer() instead of custom logic. Add min() calls to the write handlers. Fixes: 4a2da0b8c078 ("IB/mlx5: Add debug control parameters for congestion control") Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Jann Horn <jannh@google.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | | RDMA/uverbs: Use UVERBS_ATTR_MIN_SIZE correctly and uniformlyJason Gunthorpe2018-07-041-12/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This newer macro allows specifying a lower bound on the accepted size, and has an 'unlimited' upper bound. Due to this it never checks for trailing zeroing so it doesn't make any sense to combine it with MIN_SZ_OR_ZERO, so drop MIN_SZ_OR_ZERO when they are used together There were a couple of places that open coded this pattern, switch them to use the clearer UVERBS_ATTR_MIN_SIZE for clarity. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
* | | | RDMA/uverbs: Remove UA_FLAGSJason Gunthorpe2018-07-042-37/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This bit of boilerplate isn't really necessary, we can use bitfields instead of a flags enum and the macros can then individually initialize them through the __VA_ARGS__ like everything else. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
* | | | RDMA/uverbs: Get rid of the & in method specificationsJason Gunthorpe2018-07-042-122/+150
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Hide it inside the macros. The & is confusing and interferes with using this as a generic DSL in later patches. Since this also touches almost every line, also run the specs through clang-format (with 'BinPackParameters: false') to make the maintenance easier. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
* | | | RDMA/uverbs: Simplify UVERBS_OBJECT and _TREE family of macrosJason Gunthorpe2018-07-041-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of the large set of indirecting macros, define the few needed macros to directly instantiate the struct uverbs_oject_tree_def and associated objects list. This is small amount of code duplication but the readability is far better. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
* | | | RDMA/uverbs: Simplify method definition macrosJason Gunthorpe2018-07-041-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of the large set of indirecting macros, define the few needed macros to directly instantiate the struct uverbs_method_def and associated attributes list. This is small amount of code duplication but the readability is far better. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
* | | | RDMA/uverbs: Simplify UVERBS_ATTR family of macrosJason Gunthorpe2018-07-041-36/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of using a complex cascade of macros, just directly provide the initializer list each of the declarations is trying to create. Now that the macros are simplified this also reworks the uverbs_attr_spec to be friendly to older compilers by eliminating any unnamed structures/unions inside, and removing the duplication of some fields. The structure size remains at 16 bytes which was the original motivation for some of this oddness. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
* | | | RDMA/uverbs: Store the specs_root in the struct ib_uverbs_deviceJason Gunthorpe2018-07-041-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The specs are required to operate the uverbs file, so they belong inside the ib_uverbs_device, not inside the ib_device. The spec passed in the ib_device is just a communication from the driver and should not be used during runtime. This also changes the lifetime of the spec memory to match the ib_uverbs_device, however at this time the spec_root can still contain driver pointers after disassociation, so it cannot be used if ib_dev is NULL. This is preparation for another series. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
* | | | Merge branch 'mlx5-dump-fill-mkey' into rdma.git for-nextJason Gunthorpe2018-07-043-0/+32
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For dependencies, branch based on 'mellanox/mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git Pull Dump and fill MKEY from Leon Romanovsky: ==================== MLX5 IB HCA offers the memory key, dump_fill_mkey to increase performance, when used in a send or receive operations. It is used to force local HCA operations to skip the PCI bus access, while keeping track of the processed length in the ibv_sge handling. In this three patch series, we expose various bits in our HW spec file (mlx5_ifc.h), move unneeded for mlx5_core FW command and export such memory key to user space thought our mlx5-abi header file. ==================== Botched auto-merge in mlx5_ib_alloc_ucontext() resolved by hand. * branch 'mlx5-dump-fill-mkey': IB/mlx5: Expose dump and fill memory key net/mlx5: Add hardware definitions for dump_fill_mkey net/mlx5: Limit scope of dump_fill_mkey function net/mlx5: Rate limit errors in command interface Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | | IB/mlx5: Expose dump and fill memory keyYonatan Cohen2018-07-041-0/+16
| |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | MLX5 IB HCA offers the memory key, dump_fill_mkey to boost performance, when used in a send or receive operations. It is used to force local HCA operations to skip the PCI bus access, while keeping track of the processed length in the ibv_sge handling. Meaning, instead of a PCI write access the HCA leaves the target memory untouched, and skips filling that packet section. Similar behavior is done upon send, the HCA skips data in memory relevant to this key and saves PCI bus access. This functionality saves PCI read/write operations. Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Guy Levi <guyle@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * / / net/mlx5: Limit scope of dump_fill_mkey functionYonatan Cohen2018-07-042-0/+16
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | mlx5_core_dump_fill_mkey() is going to be used in next patch in IB and doesn't need to be visible to whole mlx5_core. Move that command to mlx5_ib. Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
* | | IB/mlx5: Fix GRE flow specificationMaor Gottlieb2018-07-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the driver sets the mask of the gre_protocol to 0xffff without consideration in the user request. Fix it by copy the mask from the verbs spec. Fixes: da2f22ae7707 ("IB/mlx5: Add support for GRE flow specification") Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Ariel Levkovich <lariel@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | IB/mlx5: Remove set-but-not-used variablesBart Van Assche2018-07-031-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Avoid that the compiler complains about set-but-not-used variables when building with W=1. This patch does not change any functionality. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Leon Romanovsky <leonro@mellanox.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | IB: Improve uverbs_cleanup_ucontext algorithmYishai Hadas2018-06-291-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Improve uverbs_cleanup_ucontext algorithm to work properly when the topology graph of the objects cannot be determined at compile time. This is the case with objects created via the devx interface in mlx5. Typically uverbs objects must be created in a strict topologically sorted order, so that LIFO ordering will generally cause them to be freed properly. There are only a few cases (eg memory windows) where objects can point to things out of the strict LIFO order. Instead of using an explicit ordering scheme where the HW destroy is not allowed to fail, go over the list multiple times and allow the destroy function to fail. If progress halts then a final, desperate, cleanup is done before leaking the memory. This indicates a driver bug. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | RDMA/mlx5: Don't leak UARs in case of free failsLeon Romanovsky2018-06-291-13/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The failure in releasing one UAR doesn't mean that we can't continue to release rest of system pages, so don't return too early. As part of cleanup, there is no need to print warning if mlx5_cmd_free_uar() fails because such warning will be printed as part of mlx5_cmd_exec(). Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | IB/mlx5: Add support for drain SQ & RQYishai Hadas2018-06-253-7/+150
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch follows the logic from ib_core but considers the internal device state upon executing the involved commands. Specifically, Upon internal error state modify QP to an error state can be assumed to be success as each in-progress WR going to be flushed in error in any case as expected by that modify command. In addition, As the drain should never fail the driver makes sure that post_send/recv will succeed even if the device is already in an internal error state. As such once the driver will supply the simulated/SW CQEs the CQE for the drain WR will be handled as well. In case of an internal error state the CQE for the drain WR may be completed as part of the main task that handled the error state or by the task that issued the drain WR. As the above depends on scheduling the code takes the relevant locks and actions to make sure that the completion handler for that WR will always be called after that the post_send/recv were issued but not in parallel to the other task that handles the error flow. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | Merge branch 'icrc-counter' into rdma.git for-nextJason Gunthorpe2018-06-224-3/+73
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For dependencies, branch based on 'mellanox/mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git Pull RoCE ICRC counters from Leon Romanovsky: ==================== This series exposes RoCE ICRC counter through existing RDMA hw_counters sysfs interface. The first patch has all HW definitions in mlx5_ifc.h file and second patch is the actual counter implementation. ==================== * branch 'icrc-counter': IB/mlx5: Support RoCE ICRC encapsulated error counter net/mlx5: Add RoCE RX ICRC encapsulated counter
| * | | IB/mlx5: Support RoCE ICRC encapsulated error counterTalat Batheesh2018-06-224-3/+73
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds support to query the counter that counts the RoCE packets with corrupted ICRC (Invariant Cyclic Redundancy Code). This counter will be under /sys/class/infiniband/<mlx5-dev>/ports/<port>/hw_counters/ rx_icrc_encapsulated - The number of RoCE packets with ICRC error. Signed-off-by: Talat Batheesh <talatb@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | RDMA/mlx5: Refactor transport domain checksLeon Romanovsky2018-06-191-9/+11
| | | | | | | | | | | | | | | | | | | | | | | | Put all relevant checks for transport domain in the mlx5_ib_alloc/dealloc_transport_domain functions. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | IB/mlx5: Expose DEVX treeYishai Hadas2018-06-193-1/+14
| | | | | | | | | | | | | | | | | | | | | | | | Expose DEVX tree to be used by upper layers. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | IB/mlx5: Add DEVX query EQN supportYishai Hadas2018-06-191-1/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Return the matching device EQN for a given user vector number via the DEVX interface. Note: EQs are owned by the kernel and shared by all user processes. Basically, a user CQ can point to any EQ. The kernel doesn't enforce any such limitation today either. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | IB/mlx5: Add DEVX support for memory registrationYishai Hadas2018-06-191-1/+198
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support to register a memory with the firmware via the DEVX interface. The driver translates a given user address to ib_umem then it will register the physical addresses with the firmware and get a unique id for this registration to be used for this virtual address. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | IB/mlx5: Add support for DEVX query UARYishai Hadas2018-06-193-4/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Return a device UAR index for a given user index via the DEVX interface. Security note: The hardware protection mechanism works like this: Each device object that is subject to UAR doorbells (QP/SQ/CQ) gets a UAR ID (called uar_page in the device specification manual) upon its creation. Then upon doorbell, hardware fetches the object context for which the doorbell was rang, and validates that the UAR through which the DB was rang matches the UAR ID of the object. If no match the doorbell is silently ignored by the hardware. Of course, the user cannot ring a doorbell on a UAR that was not mapped to it. Now in devx, as the devx kernel does not manipulate the QP/SQ/CQ command mailboxes (except tagging them with UID), we expose to the user its UAR ID, so it can embed it in these objects in the expected specification format. So the only thing the user can do is hurt itself by creating a QP/SQ/CQ with a UAR ID other than his, and then in this case other users may ring a doorbell on its objects. The consequence of that will be that another user can schedule a QP/SQ of the buggy user for execution (just insert it to the hardware schedule queue or arm its CQ for event generation), no further harm is expected. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | IB/mlx5: Add DEVX support for modify and query commandsYishai Hadas2018-06-191-2/+348
| | | | | | | | | | | | | | | | | | | | | | | | | | | Add support in DEVX for modify and query commands, the required lock is taken (i.e. READ/WRITE) by the KABI infrastructure accordingly. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | IB/mlx5: Add obj create and destroy functionalityYishai Hadas2018-06-191-3/+334
| | | | | | | | | | | | | | | | | | | | | | | | | | | Add support to create and destroy firmware objects via the DEVX interface. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | IB/mlx5: Add support for DEVX general commandYishai Hadas2018-06-191-0/+87
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support to run general firmware command via the DEVX interface. A command that works on some object (e.g. CQ, WQ, etc.) will be added in next patches while maintaining the required object lock. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | IB/mlx5: Introduce DEVXYishai Hadas2018-06-194-3/+93
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce DEVX to enable direct device commands in downstream patches from this series. In that mode of work the firmware manages the isolation between processes' resources and as such a DEVX user id is created and assigned to the given user context upon allocation request. A capability check is done to make sure that this feature is really supported by the firmware prior to creating the DEVX user id. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
OpenPOWER on IntegriCloud