summaryrefslogtreecommitdiffstats
path: root/drivers/infiniband/hw/mlx5/main.c
Commit message (Collapse)AuthorAgeFilesLines
...
| * | RDMA/mlx5: Enable attaching modify header to steering flowsMark Bloch2018-09-111-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | When creating a flow steering rule, allow the user to attach a modify header action. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | RDMA/mlx5: Add NIC TX steering supportMark Bloch2018-09-111-10/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Just like ingress steering, allow a user to create steering rules that match egress vport traffic. We expose the same number of priorities as the bypass (NIC RX) steering. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | Merge branch 'mlx5-flow-mutate' into rdma.git for-nextJason Gunthorpe2018-09-051-0/+3
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For dependencies, branch based on 'mellanox/mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git Pull Flow actions to mutate packets from Leon Romanovsky: ==================== This series exposes the ability to create flow actions which can mutate packet headers. We do that by exposing two new verbs: * modify header - can change existing packet headers. packet * reformat - can encapsulate or decapsulate a packet. Once created a flow action must be attached to a steering rule for it to take effect. The first 10 patches refactor mlx5_core code, rename internal structures to better reflect their operation and export needed functions so the RDMA side can allocate the action. The last 5 patches expose via the IOCTL infrastructure mlx5_ib methods which do the actual allocation of resources and return an handle to the user. A user of this API is expected to know how to work with the device's spec as the input to those function is HW depended. An example usage of the modify header action is routing, A user can create an action which edits the L2 header and decrease the TTL. An example usage of the packet reformat action is VXLAN encap/decap which is done by the HW. ==================== * branch 'mlx5-flow-mutate': RDMA/mlx5: Extend packet reformat verbs RDMA/mlx5: Add new flow action verb - packet reformat RDMA/uverbs: Add generic function to fill in flow action object RDMA/mlx5: Add a new flow action verb - modify header RDMA/uverbs: Add UVERBS_ATTR_CONST_IN to the specs language net/mlx5: Export packet reformat alloc/dealloc functions net/mlx5: Pass a namespace for packet reformat ID allocation net/mlx5: Expose new packet reformat capabilities {net, RDMA}/mlx5: Rename encap to reformat packet net/mlx5: Move header encap type to IFC header file net/mlx5: Break encap/decap into two separated flow table creation flags net/mlx5: Add support for more namespaces when allocating modify header net/mlx5: Export modify header alloc/dealloc functions net/mlx5: Add proper NIC TX steering flow tables support net/mlx5: Cleanup flow namespace getter switch logic Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| | * | RDMA/mlx5: Add a new flow action verb - modify headerMark Bloch2018-09-051-0/+3
| | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | Expose the ability to create a flow action which changes packet headers. The data passed from userspace should be modify header actions as defined by HW specification. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * / IB/mlx5: Change TX affinity assignment in RoCE LAG modeMajd Dibbiny2018-09-041-0/+8
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the current code, the TX affinity is per RoCE device, which can cause unfairness between different contexts. e.g. if we open two contexts, and each open 10 QPs concurrently, all of the QPs of the first context might end up on the first port instead of distributed on the two ports as expected To overcome this unfairness between processes, we maintain per device TX affinity, and per process TX affinity. The allocation algorithm is as follow: 1. Hold two tx_port_affinity atomic variables, one per RoCE device and one per ucontext. Both initialized to 0. 2. In mlx5_ib_alloc_ucontext do: 2.1. ucontext.tx_port_affinity = device.tx_port_affinity 2.2. device.tx_port_affinity += 1 3. In modify QP INIT2RST: 3.1. qp.tx_port_affinity = ucontext.tx_port_affinity % MLX5_PORT_NUM 3.2. ucontext.tx_port_affinity += 1 Signed-off-by: Majd Dibbiny <majd@mellanox.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | net/mlx5: Add a no-append flow insertion modePaul Blakey2018-10-171-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | If no-append flag is set, we will add a new FTE, instead of appending the actions of the inserted rule when the same match already exists. While here, move the has_flow_tag boolean indicator to be a flag too. This patch doesn't change any functionality. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanmox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | net/mlx5: Use flow counter IDs and not the wrapping cache objectMark Bloch2018-10-171-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, when a flow rule is created using the FS core layer, the caller has to pass the entire flow counter object and not just the counter HW handle (ID). This requires both the FS core and the caller to have knowledge about the inner implementation of the FS layer flow counters cache and limits the possible users. Move to use the counter ID across the place when dealing with flows. Doing this decoupling, now can we privatize the inner implementation of the flow counters. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | RDMA/netdev: Hoist alloc_netdev_mqs out of the driverDenis Drozdov2018-10-101-15/+8
|/ | | | | | | | | | | | | | | | | netdev has several interfaces that expect to call alloc_netdev_mqs from the core code, with the driver only providing the arguments. This is incompatible with the rdma_netdev interface that returns the netdev directly. Thus re-organize the API used by ipoib so that the verbs core code calls alloc_netdev_mqs for the driver. This is done by allowing the drivers to provide the allocation parameters via a 'get_params' callback and then initializing an allocated netdev as a second step. Fixes: cd565b4b51e5 ("IB/IPoIB: Support acceleration options callbacks") Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Denis Drozdov <denisd@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* Merge tag 'v4.18' into rdma.git for-nextJason Gunthorpe2018-08-161-16/+22
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | Resolve merge conflicts from the -rc cycle against the rdma.git tree: Conflicts: drivers/infiniband/core/uverbs_cmd.c - New ifs added to ib_uverbs_ex_create_flow in -rc and for-next - Merge removal of file->ucontext in for-next with new code in -rc drivers/infiniband/core/uverbs_main.c - for-next removed code from ib_uverbs_write() that was modified in for-rc Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * Merge tag 'mlx5-fixes-2018-06-26' of ↵David S. Miller2018-06-281-1/+1
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-fixes-2018-06-26 Fixes for mlx5 core and netdev driver: Two fixes from Alex Vesker to address command interface issues - Race in command interface polling mode - Incorrect raw command length parsing From Shay Agroskin, Fix wrong size allocation for QoS ETC TC regitster. From Or Gerlitz and Eli Cohin, Address backward compatability issues for when Eswitch capability is not advertised for the PF host driver - Fix required capability for manipulating MPFS - E-Switch, Disallow vlan/spoofcheck setup if not being esw manager - Avoid dealing with vport IB/eth representors if not being e-switch manager - E-Switch, Avoid setup attempt if not being e-switch manager - Don't attempt to dereference the ppriv struct if not being eswitch manager ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * IB/mlx5: Avoid dealing with vport representors if not being e-switch managerOr Gerlitz2018-06-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In smartnic env, the host (PF) driver might not be an e-switch manager, hence the switchdev mode representors are running on the embedded cpu (EC) and not at the host. As such, we should avoid dealing with vport representors if not being esw manager. Fixes: b5ca15ad7e61 ('IB/mlx5: Add proper representors support') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds2018-06-211-15/+21
| |\ \ | | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull rdma fixes from Jason Gunthorpe: "Here are eight fairly small fixes collected over the last two weeks. Regression and crashing bug fixes: - mlx4/5: Fixes for issues found from various checkers - A resource tracking and uverbs regression in the core code - qedr: NULL pointer regression found during testing - rxe: Various small bugs" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: IB/rxe: Fix missing completion for mem_reg work requests RDMA/core: Save kernel caller name when creating CQ using ib_create_cq() IB/uverbs: Fix ordering of ucontext check in ib_uverbs_write IB/mlx4: Fix an error handling path in 'mlx4_ib_rereg_user_mr()' RDMA/qedr: Fix NULL pointer dereference when running over iWARP without RDMA-CM IB/mlx5: Fix return value check in flow_counters_set_data() IB/mlx5: Fix memory leak in mlx5_ib_create_flow IB/rxe: avoid double kfree skb
| | * IB/mlx5: Fix return value check in flow_counters_set_data()weiyongjun (A)2018-06-111-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In case of error, the function mlx5_fc_create() returns ERR_PTR() and never returns NULL. The NULL test in the return value check should be replaced with IS_ERR(). Fixes: 3b3233fbf02e ("IB/mlx5: Add flow counters binding support") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| | * IB/mlx5: Fix memory leak in mlx5_ib_create_flowGustavo A. R. Silva2018-06-111-13/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In case memory resources for *ucmd* were allocated, release them before return. Addresses-Coverity-ID: 1469857 ("Resource leak") Fixes: 3b3233fbf02e ("IB/mlx5: Add flow counters binding support") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | IB/uverbs: Have the core code create the uverbs_root_specJason Gunthorpe2018-08-101-29/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is no reason for drivers to do this, the core code should take of everything. The drivers will provide their information from rodata to describe their modifications to the core's base uapi specification. The core uses this to build up the runtime uapi for each device. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
* | | RDMA/netdev: Use priv_destructor for netdev cleanupJason Gunthorpe2018-08-021-10/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that the unregister_netdev flow for IPoIB no longer relies on external code we can now introduce the use of priv_destructor and needs_free_netdev. The rdma_netdev flow is switched to use the netdev common priv_destructor instead of the special free_rdma_netdev and the IPOIB ULP adjusted: - priv_destructor needs to switch to point to the ULP's destructor which will then call the rdma_ndev's in the right order - We need to be careful around the error unwind of register_netdev as it sometimes calls priv_destructor on failure - ULPs need to use ndo_init/uninit to ensure proper ordering of failures around register_netdev Switching to priv_destructor is a necessary pre-requisite to using the rtnl new_link mechanism. The VNIC user for rdma_netdev should also be revised, but that is left for another patch. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Denis Drozdov <denisd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
* | | IB/uverbs: Add UVERBS_ATTR_FLAGS_IN to the specs languageJason Gunthorpe2018-07-301-9/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This clearly indicates that the input is a bitwise combination of values in an enum, and identifies which enum contains the definition of the bits. Special accessors are provided that handle the mandatory validation of the allowed bits and enforce the correct type for bitwise flags. If we had introduced this at the start then the kabi would have uniformly used u64 data to pass flags, however today there is a mixture of u64 and u32 flags. All places are converted to accept both sizes and the accessor fixes it. This allows all existing flags to grow to u64 in future without any hassle. Finally all flags are, by definition, optional. If flags are not passed the accessor does not fail, but provides a value of zero. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
* | | IB/mlx5: avoid excessive warning msgs when creating VFs on 2nd portQing Huang2018-07-261-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a CX5 device is configured in dual-port RoCE mode, after creating many VFs against port 1, creating the same number of VFs against port 2 will flood kernel/syslog with something like "mlx5_*:mlx5_ib_bind_slave_port:4266:(pid 5269): port 2 already affiliated." So basically, when traversing mlx5_ib_dev_list, mlx5_ib_add_slave_port() repeatedly attempts to bind the new mpi structure to every device on the list until it finds an unbound device. Change the log level from warn to dbg to avoid log flooding as the warning should be harmless. Signed-off-by: Qing Huang <qing.huang@oracle.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | IB/mlx5: Enable driver uapi commands for flow steeringYishai Hadas2018-07-241-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | Expose the mlx5 flow steering parsing trees, exposing the functionality to user space. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | IB/mlx5: Add support for a flow table destination for driver flow steeringYishai Hadas2018-07-241-5/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | Add support to set a destination that is a flow table, this can come from the DEVX destination. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | IB/mlx5: Support adding flow steering rule by raw descriptionYishai Hadas2018-07-241-17/+199
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support to set a public flow steering rule when its destination is a TIR by using raw specification data. The logic follows the verbs API but instead of using ib_spec(s) the raw, device specific, description is used. This allows supporting specialty matchers without having to define new matches in the verbs struct based language. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | IB/mlx5: Introduce driver create and destroy flow methodsYishai Hadas2018-07-241-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce driver create and destroy flow methods on the uverbs flow object. This allows the driver to get its specific device attributes to match the underlay specification while still using the generic ib_flow object for cleanup and code sharing. The IB object's attributes are set via the ib_set_flow() helper function. The specific implementation for the given specification is added in downstream patches. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | RDMA/mlx5: Remove set but not used variablesKamal Heib2018-07-231-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove "uctx" and "pa" variables that were set but not used. Fixes: a8b92ca1b0e5 ("IB/mlx5: Introduce DEVX") Fixes: 8f0622873358 ("RDMA/mlx5: Remove debug prints of VMA pointers") Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | RDMA: Validate grh_required when handling AVsArtemy Kovalyov2018-07-101-7/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Extend the existing grh_required flag to check when AV's are handled that a GRH is present. Since we don't want to do query_port during the AV checks for performance reasons move the flag into the immutable_data. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | RDMA: Fix storage of PortInfo CapabilityMask in the kernelJason Gunthorpe2018-07-101-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The internal flag IP_BASED_GIDS was added to a field that was being used to hold the port Info CapabilityMask without considering the effects this will have. Since most drivers just use the value from the HW MAD it means IP_BASED_GIDS will also become set on any HW that sets the IBA flag IsOtherLocalChangesNoticeSupported - which is not intended. Fix this by keeping port_cap_flags only for the IBA CapabilityMask value and store unrelated flags externally. Move the bit definitions for this to ib_mad.h to make it clear what is happening. To keep the uAPI unchanged define a new set of flags in the uapi header that are only used by ib_uverbs_query_port_resp.port_cap_flags which match the current flags supported in rdma-core, and the values exposed by the current kernel. Fixes: b4a26a27287a ("IB: Report using RoCE IP based gids in port caps") Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
* | | IB/mlx5: Honor cnt_set_id_valid flag instead of set_idParav Pandit2018-07-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is incorrect to depend on set_id value to know if counters were allocated or not. set_id_valid field is set to true when counters were allocated. Therefore, use set_id_valid while deciding to free counters. Cc: <stable@vger.kernel.org> # 4.15 Fixes: aac4492ef23a ("IB/mlx5: Update counter implementation for dual port RoCE") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | RDMA/mlx5: Remove unused port number parameterLeon Romanovsky2018-07-091-20/+11
| | | | | | | | | | | | | | | | | | | | | Clean up a little bit code to drop unused port_num parameter. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | RDMA/uverbs: Remove UA_FLAGSJason Gunthorpe2018-07-041-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | This bit of boilerplate isn't really necessary, we can use bitfields instead of a flags enum and the macros can then individually initialize them through the __VA_ARGS__ like everything else. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
* | | RDMA/uverbs: Get rid of the & in method specificationsJason Gunthorpe2018-07-041-14/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Hide it inside the macros. The & is confusing and interferes with using this as a generic DSL in later patches. Since this also touches almost every line, also run the specs through clang-format (with 'BinPackParameters: false') to make the maintenance easier. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
* | | RDMA/uverbs: Store the specs_root in the struct ib_uverbs_deviceJason Gunthorpe2018-07-041-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The specs are required to operate the uverbs file, so they belong inside the ib_uverbs_device, not inside the ib_device. The spec passed in the ib_device is just a communication from the driver and should not be used during runtime. This also changes the lifetime of the spec memory to match the ib_uverbs_device, however at this time the spec_root can still contain driver pointers after disassociation, so it cannot be used if ib_dev is NULL. This is preparation for another series. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
* | | Merge branch 'mlx5-dump-fill-mkey' into rdma.git for-nextJason Gunthorpe2018-07-041-0/+16
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For dependencies, branch based on 'mellanox/mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git Pull Dump and fill MKEY from Leon Romanovsky: ==================== MLX5 IB HCA offers the memory key, dump_fill_mkey to increase performance, when used in a send or receive operations. It is used to force local HCA operations to skip the PCI bus access, while keeping track of the processed length in the ibv_sge handling. In this three patch series, we expose various bits in our HW spec file (mlx5_ifc.h), move unneeded for mlx5_core FW command and export such memory key to user space thought our mlx5-abi header file. ==================== Botched auto-merge in mlx5_ib_alloc_ucontext() resolved by hand. * branch 'mlx5-dump-fill-mkey': IB/mlx5: Expose dump and fill memory key net/mlx5: Add hardware definitions for dump_fill_mkey net/mlx5: Limit scope of dump_fill_mkey function net/mlx5: Rate limit errors in command interface Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | IB/mlx5: Expose dump and fill memory keyYonatan Cohen2018-07-041-0/+16
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | MLX5 IB HCA offers the memory key, dump_fill_mkey to boost performance, when used in a send or receive operations. It is used to force local HCA operations to skip the PCI bus access, while keeping track of the processed length in the ibv_sge handling. Meaning, instead of a PCI write access the HCA leaves the target memory untouched, and skips filling that packet section. Similar behavior is done upon send, the HCA skips data in memory relevant to this key and saves PCI bus access. This functionality saves PCI read/write operations. Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Guy Levi <guyle@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | IB/mlx5: Fix GRE flow specificationMaor Gottlieb2018-07-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the driver sets the mask of the gre_protocol to 0xffff without consideration in the user request. Fix it by copy the mask from the verbs spec. Fixes: da2f22ae7707 ("IB/mlx5: Add support for GRE flow specification") Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Ariel Levkovich <lariel@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | IB/mlx5: Remove set-but-not-used variablesBart Van Assche2018-07-031-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Avoid that the compiler complains about set-but-not-used variables when building with W=1. This patch does not change any functionality. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Leon Romanovsky <leonro@mellanox.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | RDMA/mlx5: Don't leak UARs in case of free failsLeon Romanovsky2018-06-291-13/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The failure in releasing one UAR doesn't mean that we can't continue to release rest of system pages, so don't return too early. As part of cleanup, there is no need to print warning if mlx5_cmd_free_uar() fails because such warning will be printed as part of mlx5_cmd_exec(). Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | IB/mlx5: Add support for drain SQ & RQYishai Hadas2018-06-251-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch follows the logic from ib_core but considers the internal device state upon executing the involved commands. Specifically, Upon internal error state modify QP to an error state can be assumed to be success as each in-progress WR going to be flushed in error in any case as expected by that modify command. In addition, As the drain should never fail the driver makes sure that post_send/recv will succeed even if the device is already in an internal error state. As such once the driver will supply the simulated/SW CQEs the CQE for the drain WR will be handled as well. In case of an internal error state the CQE for the drain WR may be completed as part of the main task that handled the error state or by the task that issued the drain WR. As the above depends on scheduling the code takes the relevant locks and actions to make sure that the completion handler for that WR will always be called after that the post_send/recv were issued but not in parallel to the other task that handles the error flow. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | Merge branch 'icrc-counter' into rdma.git for-nextJason Gunthorpe2018-06-221-3/+59
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For dependencies, branch based on 'mellanox/mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git Pull RoCE ICRC counters from Leon Romanovsky: ==================== This series exposes RoCE ICRC counter through existing RDMA hw_counters sysfs interface. The first patch has all HW definitions in mlx5_ifc.h file and second patch is the actual counter implementation. ==================== * branch 'icrc-counter': IB/mlx5: Support RoCE ICRC encapsulated error counter net/mlx5: Add RoCE RX ICRC encapsulated counter
| * | | IB/mlx5: Support RoCE ICRC encapsulated error counterTalat Batheesh2018-06-221-3/+59
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds support to query the counter that counts the RoCE packets with corrupted ICRC (Invariant Cyclic Redundancy Code). This counter will be under /sys/class/infiniband/<mlx5-dev>/ports/<port>/hw_counters/ rx_icrc_encapsulated - The number of RoCE packets with ICRC error. Signed-off-by: Talat Batheesh <talatb@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | RDMA/mlx5: Refactor transport domain checksLeon Romanovsky2018-06-191-9/+11
| | | | | | | | | | | | | | | | | | | | | | | | Put all relevant checks for transport domain in the mlx5_ib_alloc/dealloc_transport_domain functions. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | IB/mlx5: Expose DEVX treeYishai Hadas2018-06-191-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | Expose DEVX tree to be used by upper layers. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | IB/mlx5: Introduce DEVXYishai Hadas2018-06-191-3/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce DEVX to enable direct device commands in downstream patches from this series. In that mode of work the firmware manages the isolation between processes' resources and as such a DEVX user id is created and assigned to the given user context upon allocation request. A capability check is done to make sure that this feature is really supported by the firmware prior to creating the DEVX user id. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | IB/core: add max_send_sge and max_recv_sge attributesSteve Wise2018-06-181-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch replaces the ib_device_attr.max_sge with max_send_sge and max_recv_sge. It allows ulps to take advantage of devices that have very different send and recv sge depths. For example cxgb4 has a max_recv_sge of 4, yet a max_send_sge of 16. Splitting out these attributes allows much more efficient use of the SQ for cxgb4 with ulps that use the RDMA_RW API. Consider a large RDMA WRITE that has 16 scattergather entries. With max_sge of 4, the ulp would send 4 WRITE WRs, but with max_sge of 16, it can be done with 1 WRITE WR. Acked-by: Sagi Grimberg <sagi@grimberg.me> Acked-by: Christoph Hellwig <hch@lst.de> Acked-by: Selvin Xavier <selvin.xavier@broadcom.com> Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | RDMA: Convert drivers to use sgid_attr instead of sgid_indexParav Pandit2018-06-181-29/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The core code now ensures that all driver callbacks that receive an rdma_ah_attrs will have a sgid_attr's pointer if there is a GRH present. Drivers can use this pointer instead of calling a query function with sgid_index. This simplifies the drivers and also avoids races where a gid_index lookup may return different data if it is changed. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
* | | RDMA: Use GID from the ib_gid_attr during the add_gid() callbackParav Pandit2018-06-181-3/+2
|/ / | | | | | | | | | | | | | | | | | | Now that ib_gid_attr contains the GID, make use of that in the add_gid() callback functions for the provider drivers to simplify the add_gid() implementations. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds2018-06-071-70/+441
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull rdma updates from Jason Gunthorpe: "This has been a quiet cycle for RDMA, the big bulk is the usual smallish driver updates and bug fixes. About four new uAPI related things. Not as much Szykaller patches this time, the bugs it finds are getting harder to fix. Summary: - More work cleaning up the RDMA CM code - Usual driver bug fixes and cleanups for qedr, qib, hfi1, hns, i40iw, iw_cxgb4, mlx5, rxe - Driver specific resource tracking and reporting via netlink - Continued work for name space support from Parav - MPLS support for the verbs flow steering uAPI - A few tricky IPoIB fixes improving robustness - HFI1 driver support for the '16B' management packet format - Some auditing to not print kernel pointers via %llx or similar - Mark the entire 'UCM' user-space interface as BROKEN with the intent to remove it entirely. The user space side of this was long ago replaced with RDMA-CM and syzkaller is finding bugs in the residual UCM interface nobody wishes to fix because nobody uses it. - Purge more bogus BUG_ON's from Leon - 'flow counters' verbs uAPI - T10 fixups for iser/isert, these are Acked by Martin but going through the RDMA tree due to dependencies" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (138 commits) RDMA/mlx5: Update SPDX tags to show proper license RDMA/restrack: Change SPDX tag to properly reflect license IB/hfi1: Fix comment on default hdr entry size IB/hfi1: Rename exp_lock to exp_mutex IB/hfi1: Add bypass register defines and replace blind constants IB/hfi1: Remove unused variable IB/hfi1: Ensure VL index is within bounds IB/hfi1: Fix user context tail allocation for DMA_RTAIL IB/hns: Use zeroing memory allocator instead of allocator/memset infiniband: fix a possible use-after-free bug iw_cxgb4: add INFINIBAND_ADDR_TRANS dependency IB/isert: use T10-PI check mask definitions from core layer IB/iser: use T10-PI check mask definitions from core layer RDMA/core: introduce check masks for T10-PI offload IB/isert: fix T10-pi check mask setting IB/mlx5: Add counters read support IB/mlx5: Add flow counters read support IB/mlx5: Add flow counters binding support IB/mlx5: Add counters create and destroy support IB/uverbs: Add support for flow counters ...
| * Merge tag 'verbs_flow_counters' of ↵Jason Gunthorpe2018-06-041-12/+293
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma.git into for-next Pull verbs counters series from Leon Romanovsky: ==================== Verbs flow counters support This series comes to allow user space applications to monitor real time traffic activity and events of the verbs objects it manages, e.g.: ibv_qp, ibv_wq, ibv_flow. The API enables generic counters creation and define mapping to association with a verbs object, the current mlx5 driver is using this API for flow counters. With this API, an application can monitor the entire life cycle of object activity, defined here as a static counters attachment. This API also allows dynamic counters monitoring of measurement points for a partial period in the verbs object life cycle. In addition it presents the implementation of the generic counters interface. This will be achieved by extending flow creation by adding a new flow count specification type which allows the user to associate a previously created flow counters using the generic verbs counters interface to the created flow, once associated the user could read statistics by using the read function of the generic counters interface. The API includes: 1. create and destroyed API of a new counters objects 2. read the counters values from HW Note: Attaching API to allow application to define the measurement points per objects is a user space only API and this data is passed to kernel when the counted object (e.g. flow) is created with the counters object. =================== * tag 'verbs_flow_counters': IB/mlx5: Add counters read support IB/mlx5: Add flow counters read support IB/mlx5: Add flow counters binding support IB/mlx5: Add counters create and destroy support IB/uverbs: Add support for flow counters IB/core: Add support for flow counters IB/core: Support passing uhw for create_flow IB/uverbs: Add read counters support IB/core: Introduce counters read verb IB/uverbs: Add create/destroy counters support IB/core: Introduce counters object and its create/destroy IB/uverbs: Add an ib_uobject getter to ioctl() infrastructure net/mlx5: Export flow counter related API net/mlx5: Use flow counter pointer as input to the query function
| | * IB/mlx5: Add counters read supportRaed Salem2018-06-021-0/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements the uverbs counters read API, it will use the specific read counters function to the given type to accomplish its task. Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Raed Salem <raeds@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| | * IB/mlx5: Add flow counters read supportRaed Salem2018-06-021-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | Implements the flow counters read wrapper. Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Raed Salem <raeds@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| | * IB/mlx5: Add flow counters binding supportRaed Salem2018-06-021-13/+209
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Associates a counters with a flow when IB_FLOW_SPEC_ACTION_COUNT is part of the flow specifications. The counters user space placements of location and description (index, description) pairs are passed as private data of the counters flow specification. Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Raed Salem <raeds@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| | * IB/mlx5: Add counters create and destroy supportRaed Salem2018-06-021-0/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements the device counters create and destroy APIs and introducing some internal management structures. Downstream patches in this series will add the functionality to support flow counters binding and reading. Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Raed Salem <raeds@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
OpenPOWER on IntegriCloud