summaryrefslogtreecommitdiffstats
path: root/drivers/infiniband/hw/cxgb4/iw_cxgb4.h
Commit message (Collapse)AuthorAgeFilesLines
* iw_cxgb4: invalidate the mr when posting a read_w_inv wrSteve Wise2016-11-161-1/+1
| | | | | | | | | Also, rearrange things a bit to have a common c4iw_invalidate_mr() function used everywhere that we need to invalidate. Fixes: 49b53a93a64a ("iw_cxgb4: add fast-path for small REG_MR operations") Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* Merge tag 'for-linus' of ↵Linus Torvalds2016-10-091-1/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull main rdma updates from Doug Ledford: "This is the main pull request for the rdma stack this release. The code has been through 0day and I had it tagged for linux-next testing for a couple days. Summary: - updates to mlx5 - updates to mlx4 (two conflicts, both minor and easily resolved) - updates to iw_cxgb4 (one conflict, not so obvious to resolve, proper resolution is to keep the code in cxgb4_main.c as it is in Linus' tree as attach_uld was refactored and moved into cxgb4_uld.c) - improvements to uAPI (moved vendor specific API elements to uAPI area) - add hns-roce driver and hns and hns-roce ACPI reset support - conversion of all rdma code away from deprecated create_singlethread_workqueue - security improvement: remove unsafe ib_get_dma_mr (breaks lustre in staging)" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (75 commits) staging/lustre: Disable InfiniBand support iw_cxgb4: add fast-path for small REG_MR operations cxgb4: advertise support for FR_NSMR_TPTE_WR IB/core: correctly handle rdma_rw_init_mrs() failure IB/srp: Fix infinite loop when FMR sg[0].offset != 0 IB/srp: Remove an unused argument IB/core: Improve ib_map_mr_sg() documentation IB/mlx4: Fix possible vl/sl field mismatch in LRH header in QP1 packets IB/mthca: Move user vendor structures IB/nes: Move user vendor structures IB/ocrdma: Move user vendor structures IB/mlx4: Move user vendor structures IB/cxgb4: Move user vendor structures IB/cxgb3: Move user vendor structures IB/mlx5: Move and decouple user vendor structures IB/{core,hw}: Add constant for node_desc ipoib: Make ipoib_warn ratelimited IB/mlx4/alias_GUID: Remove deprecated create_singlethread_workqueue IB/ipoib_verbs: Remove deprecated create_singlethread_workqueue IB/ipoib: Remove deprecated create_singlethread_workqueue ...
| * IB/cxgb4: Move user vendor structuresLeon Romanovsky2016-10-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch moves cxgb4 vendor's specific structures to common UAPI folder which will be visible to all consumers. These structures are used by user-space library driver (libcxgb4) and currently manually copied to that library. This move will allow cross-compile against these files and simplify introduction of vendor specific data. Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2016-09-231-0/+1
|\ \
| * \ Merge branch 'nvmf-4.8-rc' of git://git.infradead.org/nvme-fabrics into ↵Jens Axboe2016-09-131-0/+1
| |\ \ | | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | for-linus Sagi writes: Here we have: - Kconfig dependencies fix from Arnd - nvme-rdma device removal fixes from Steve - possible bad deref fix from Colin
| | * iw_cxgb4: block module unload until all ep resources are releasedSteve Wise2016-09-041-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Otherwise an endpoint can be still closing down causing a touch after free crash. Also WARN_ON if ulps have failed to destroy various resources during device removal. Fixes: ad61a4c7a9b7 ("iw_cxgb4: don't block in destroy_qp awaiting the last deref") Reviewed-by: Sagi Grimberg <sagi@grimbrg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
* | | libcxgb,iw_cxgb4,cxgbit: add cxgb_compute_wscale()Varun Prakash2016-09-151-9/+0
|/ / | | | | | | | | | | | | | | | | Add cxgb_compute_wscale() in libcxgb_cm.h to remove it's duplicate definitions from cxgb4/cm.c and cxgbit/cxgbit_cm.c. Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | iw_cxgb4: don't block in destroy_qp awaiting the last derefSteve Wise2016-08-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | Blocking in c4iw_destroy_qp() causes a deadlock when apps destroy a qp or disconnect a cm_id from their cm event handler function. There is no need to block here anyway, so just replace the refcnt atomic with a kref object and free the memory on the last put. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | RDMA/iw_cxgb4: Low resource fixes for Completion queueHariprasad S2016-06-231-0/+1
| | | | | | | | | | | | | | | | | | | | Pre-allocate buffers to deallocate completion queue, so that completion queue is deallocated during RDMA termination when system is running out of memory. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | RDMA/iw_cxgb4: Low resource fixes for Memory registrationHariprasad S2016-06-231-0/+2
| | | | | | | | | | | | | | | | | | Pre-allocate buffers for deregistering memory region and memory window during RDMA connection close, when system is running out of memory. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | RDMA/iw_cxgb4: Low resource fixes for connection managerHariprasad S2016-06-231-0/+19
|/ | | | | | | | | | Pre-allocate buffers for sending various control messages to close connection, abort connection, etc so that we gracefully handle connections when system is running out of memory. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
*-. Merge branches 'cxgb4-2', 'i40iw-2', 'ipoib', 'misc-4.7' and 'mlx5-fcs' into ↵Doug Ledford2016-05-131-1/+8
|\ \ | | | | | | | | | k.o/for-4.7
| * | RDMA/iw_cxgb4: stop_ep_timer() after MPA negotiationHariprasad S2016-05-131-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ->Stop the ep timer after MPA negotiation so that the arp failures during send_mpa_reply/reject will be handled by process_timeout() after the ep timer expires. ->Added case MPA_REP_SENT in process_timeout(). ->For MPA reject, c4iw_ep_disconnect tries to start an already started timer, which leads to warning message "timer already started". -> In case of mpa reject stop the timer and call send_mpa_reject(). -> Added new ep flag STOP_MPA_TIMER to tell fw4_ack() to stop the timer only for send_mpa_reply(), which is set in c4iw_accept_cr(). Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | RDMA/iw_cxgb4: Add few history bits for epHariprasad S2016-05-131-1/+7
| |/ | | | | | | | | | | | | | | | | | | | | - add EP_DISC_FAIL history bit - add QP_REFED/DEREFED history bits - Add functions to ref/deref the cm_id and add history bit for the same - add CLOSE_CON_RPL history Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/core: Enhance ib_map_mr_sg()Bart Van Assche2016-05-131-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The SRP initiator allows to set max_sectors to a value that exceeds the largest amount of data that can be mapped at once with an mlx4 HCA using fast registration and a page size of 4 KB. Hence modify ib_map_mr_sg() such that it can map partial sg-elements. If an sg-element has been mapped partially, let the caller know which fraction has been mapped by adjusting *sg_offset. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Tested-by: Laurence Oberman <loberman@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/core: Add passing an offset into the SG to ib_map_mr_sgChristoph Hellwig2016-05-131-3/+2
|/ | | | | | | | | Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
*-. Merge branches 'nes', 'cxgb4' and 'iwpm' into k.o/for-4.6Doug Ledford2016-03-161-42/+0
|\ \
| | * iw_cxgb4: remove port mapper related codeSteve Wise2016-03-161-42/+0
| |/ | | | | | | | | | | | | | | Now that most of the port mapper code been moved to iwcm, we can remove it from iw_cxgb4. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| |
| \
*-. \ Merge branches 'mlx4', 'mlx5' and 'ocrdma' into k.o/for-4.6Doug Ledford2016-03-161-1/+2
|\ \ \ | | |/ | |/|
| | * IB/core: Add vendor's specific data to alloc mwMatan Barak2016-03-011-1/+2
| |/ | | | | | | | | | | | | | | | | | | Passing udata to the vendor's driver in order to pass data from the user-space driver to the kernel-space driver. This data will be used in downstream patches. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | iw_cxgb4: add queue drain functionsSteve Wise2016-02-291-0/+4
|/ | | | | | | | | | | | | Add completion objects, named sq_drained and rq_drained, to the c4iw_qp struct. The queue-specific completion object is signaled when the last CQE is drained from the CQ for that queue. Add c4iw_drain_sq() to block until qp->rq_drained is completed. Add c4iw_drain_rq() to block until qp->sq_drained is completed. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB: remove in-kernel support for memory windowsChristoph Hellwig2015-12-231-2/+0
| | | | | | | | | | | | Remove the unused ib_allow_mw and ib_bind_mw functions, remove the unused IB_WR_BIND_MW and IB_WC_BIND_MW opcodes and move ib_dealloc_mw into the uverbs module. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> [core] Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB: remove support for phys MRsChristoph Hellwig2015-12-231-11/+0
| | | | | | | | | | | | We have stopped using phys MRs in the kernel a while ago, so let's remove all the cruft used to implement them. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> [core] Reviewed-By: Devesh Sharma<devesh.sharma@avagotech.com> [ocrdma] Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* iw_cxgb4: Remove old FRWR APISagi Grimberg2015-10-281-18/+0
| | | | | | | | No ULP uses it anymore, go ahead and remove it. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Doug Ledford <dledford@redhat.com>
* iw_cxgb4: Support the new memory registration APISagi Grimberg2015-10-281-0/+7
| | | | | | | | | | | | | | | | | | | Support the new memory registration API by allocating a private page list array in c4iw_mr and populate it when c4iw_map_mr_sg is invoked. Also, support IB_WR_REG_MR by duplicating build_fastreg just take the needed information from different places: - page_size, iova, length (ib_mr) - page array (c4iw_mr) - key, access flags (ib_reg_wr) The IB_WR_FAST_REG_MR handlers will be removed later when all the ULPs will be converted. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Acked-by: Christoph Hellwig <hch@lst.de> Tested-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* iw_cxgb4: Support ib_alloc_mr verbSagi Grimberg2015-08-301-1/+3
| | | | | Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/core: Change provider's API of create_cq to be extendibleMatan Barak2015-06-121-4/+4
| | | | | | | | | | | | | | | Add a new ib_cq_init_attr structure which contains the previous cqe (minimum number of CQ entries) and comp_vector (completion vector) in addition to a new flags field. All vendors' create_cq callbacks are changed in order to work with the new API. This commit does not change any functionality. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-By: Devesh Sharma <devesh.sharma@avagotech.com> to patch #2 Signed-off-by: Doug Ledford <dledford@redhat.com>
* iw_cxgb4: support for bar2 qid densities exceeding the page sizeHariprasad S2015-06-111-2/+3
| | | | | | | | | | | | | | | Handle this configuration: Queues Per Page * SGE BAR2 Queue Register Area Size > Page Size Use cxgb4_bar2_sge_qregs() to obtain the proper location within the bar2 region for a given qid. Rework the DB and GTS write functions to make use of this bar2 info. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* iw_cxgb4: Remove negative advice dmesg warningsHariprasad S2015-05-051-0/+7
| | | | | | | | | Remove these log messages in favor of per-endpoint counters as well as device-global counters that can be inspected via debugfs. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* RDMA/cxgb4: Don't hang threads forever waiting on WR repliesHariprasad S2015-02-181-15/+14
| | | | | | | | | | | | | In c4iw_wait_for_reply(), if a FW6_MSG WR reply is not received after C4IW_WR_TO seconds, fail the WR operation and mark the device as fatally dead. Further, if the device is marked fatally dead, then fail the WR wait immediately. Also change the timeout to 60 seconds. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2014-07-221-1/+1
|\ | | | | | | | | | | | | | | | | Conflicts: drivers/infiniband/hw/cxgb4/device.c The cxgb4 conflict was simply overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
| * RDMA/cxgb4: Call iwpm_init() only onceSteve Wise2014-07-131-1/+1
| | | | | | | | | | | | | | | | | | | | We need to only register with the iwpm core once. Currently it is being done for every adapter, which causes a failure for each adapter but the first, making multiple adapters unusable. Fixes: 9eccfe109b27 ("RDMA/cxgb4: Add support for iWARP Port Mapper user space service") Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
* | iw_cxgb4: Don't limit TPTE count to 32KBHariprasad Shenai2014-07-211-1/+1
| | | | | | | | | | | | | | | | Use the size advertised by FW Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | cxgb4/iw_cxgb4: work request logging featureHariprasad Shenai2014-07-151-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit enhances the iwarp driver to optionally keep a log of rdma work request timining data for kernel mode QPs. If iw_cxgb4 module option c4iw_wr_log is set to non-zero, each work request is tracked and timing data maintained in a rolling log that is 4096 entries deep by default. Module option c4iw_wr_log_size_order allows specifing a log2 size to use instead of the default order of 12 (4096 entries). Both module options are read-only and must be passed in at module load time to set them. IE: modprobe iw_cxgb4 c4iw_wr_log=1 c4iw_wr_log_size_order=10 The timing data is viewable via the iw_cxgb4 debugfs file "wr_log". Writing anything to this file will clear all the timing data. Data tracked includes: - The host time when the work request was posted, just before ringing the doorbell. The host time when the completion was polled by the application. This is also the time the log entry is created. The delta of these two times is the amount of time took processing the work request. - The qid of the EQ used to post the work request. - The work request opcode. - The cqe wr_id field. For sq completions requests this is the swsqe index. For recv completions this is the MSN of the ingress SEND. This value can be used to match log entries from this log with firmware flowc event entries. - The sge timestamp value just before ringing the doorbell when posting, the sge timestamp value just after polling the completion, and CQE.timestamp field from the completion itself. With these three timestamps we can track the latency from post to poll, and the amount of time the completion resided in the CQ before being reaped by the application. With debug firmware, the sge timestamp is also logged by firmware in its flowc history so that we can compute the latency from posting the work request until the firmware sees it. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | cxgb4/iw_cxgb4: use firmware ord/ird resource limitsHariprasad Shenai2014-07-151-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Advertise a larger max read queue depth for qps, and gather the resource limits from fw and use them to avoid exhaustinq all the resources. Design: cxgb4: Obtain the max_ordird_qp and max_ird_adapter device params from FW at init time and pass them up to the ULDs when they attach. If these parameters are not available, due to older firmware, then hard-code the values based on the known values for older firmware. iw_cxgb4: Fix the c4iw_query_device() to report these correct values based on adapter parameters. ibv_query_device() will always return: max_qp_rd_atom = max_qp_init_rd_atom = min(module_max, max_ordird_qp) max_res_rd_atom = max_ird_adapter Bump up the per qp max module option to 32, allowing it to be increased by the user up to the device max of max_ordird_qp. 32 seems to be sufficient to maximize throughput for streaming read benchmarks. Fail connection setup if the negotiated IRD exhausts the available adapter ird resources. So the driver will track the amount of ird resource in use and not send an RI_WR/INIT to FW that would reduce the available ird resources below zero. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | iw_cxgb4: Detect Ing. Padding Boundary at run-timeHariprasad Shenai2014-07-151-0/+12
|/ | | | | | | | | Updates iw_cxgb4 to determine the Ingress Padding Boundary from cxgb4_lld_info, and take subsequent actions. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds2014-06-121-0/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking updates from David Miller: 1) Seccomp BPF filters can now be JIT'd, from Alexei Starovoitov. 2) Multiqueue support in xen-netback and xen-netfront, from Andrew J Benniston. 3) Allow tweaking of aggregation settings in cdc_ncm driver, from Bjørn Mork. 4) BPF now has a "random" opcode, from Chema Gonzalez. 5) Add more BPF documentation and improve test framework, from Daniel Borkmann. 6) Support TCP fastopen over ipv6, from Daniel Lee. 7) Add software TSO helper functions and use them to support software TSO in mvneta and mv643xx_eth drivers. From Ezequiel Garcia. 8) Support software TSO in fec driver too, from Nimrod Andy. 9) Add Broadcom SYSTEMPORT driver, from Florian Fainelli. 10) Handle broadcasts more gracefully over macvlan when there are large numbers of interfaces configured, from Herbert Xu. 11) Allow more control over fwmark used for non-socket based responses, from Lorenzo Colitti. 12) Do TCP congestion window limiting based upon measurements, from Neal Cardwell. 13) Support busy polling in SCTP, from Neal Horman. 14) Allow RSS key to be configured via ethtool, from Venkata Duvvuru. 15) Bridge promisc mode handling improvements from Vlad Yasevich. 16) Don't use inetpeer entries to implement ID generation any more, it performs poorly, from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1522 commits) rtnetlink: fix userspace API breakage for iproute2 < v3.9.0 tcp: fixing TLP's FIN recovery net: fec: Add software TSO support net: fec: Add Scatter/gather support net: fec: Increase buffer descriptor entry number net: fec: Factorize feature setting net: fec: Enable IP header hardware checksum net: fec: Factorize the .xmit transmit function bridge: fix compile error when compiling without IPv6 support bridge: fix smatch warning / potential null pointer dereference via-rhine: fix full-duplex with autoneg disable bnx2x: Enlarge the dorq threshold for VFs bnx2x: Check for UNDI in uncommon branch bnx2x: Fix 1G-baseT link bnx2x: Fix link for KR with swapped polarity lane sctp: Fix sk_ack_backlog wrap-around problem net/core: Add VF link state control policy net/fsl: xgmac_mdio is dependent on OF_MDIO net/fsl: Make xgmac_mdio read error message useful net_sched: drr: warn when qdisc is not work conserving ...
| * iw_cxgb4: don't truncate the recv window sizeHariprasad Shenai2014-06-101-0/+2
| | | | | | | | | | | | | | | | | | | | | | Fixed a bug that shows up with recv window sizes that exceed the size of the RCV_BUFSIZ field in opt0 (>= 1024K). If the recv window exceeds this, then we specify the max possible in opt0, add add the rest in via a RX_DATA_ACK credits. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | RDMA/cxgb4: Add support for iWARP Port Mapper user space serviceSteve Wise2014-06-101-0/+44
|/ | | | | | | | | | Based on original work by Vipul Pandya. Signed-off-by: Steve Wise <swise@opengridcomputing.com> [ Fix htons -> ntohs to make sparse happy. - Roland ] Signed-off-by: Roland Dreier <roland@purestorage.com>
* RDMA/cxgb4: Fix endpoint mutex deadlocksSteve Wise2014-04-281-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In cases where the cm calls c4iw_modify_rc_qp() with the endpoint mutex held, they must be called with internal == 1. rx_data() and process_mpa_reply() are not doing this. This causes a deadlock because c4iw_modify_rc_qp() might call c4iw_ep_disconnect() in some !internal cases, and c4iw_ep_disconnect() acquires the endpoint mutex. The design was intended to only do the disconnect for !internal calls. Change rx_data(), FPDU_MODE case, to call c4iw_modify_rc_qp() with internal == 1, and then disconnect only after releasing the mutex. Change process_mpa_reply() to call c4iw_modify_rc_qp(TERMINATE) with internal == 1 and set a new attr flag telling it to send a TERMINATE message. Previously this was implied by !internal. Change process_mpa_reply() to return whether the caller should disconnect after releasing the endpoint mutex. Now rx_data() will do the disconnect in the cases where process_mpa_reply() wants to disconnect after the TERMINATE is sent. Change c4iw_modify_rc_qp() RTS->TERM to only disconnect if !internal, and to send a TERMINATE message if attrs->send_term is 1. Change abort_connection() to not aquire the ep mutex for setting the state, and make all calls to abort_connection() do so with the mutex held. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
* RDMA/cxgb4: Use the BAR2/WC path for kernel QPs and T5 devicesSteve Wise2014-04-111-0/+2
| | | | | | | | Signed-off-by: Steve Wise <swise@opengridcomputing.com> [ Fix cast from u64* to integer. - Roland ] Signed-off-by: Roland Dreier <roland@purestorage.com>
* Merge tag 'rdma-for-linus' of ↵Linus Torvalds2014-04-031-0/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband Pull infiniband updates from Roland Dreier: "Main batch of InfiniBand/RDMA changes for 3.15: - The biggest change is core API extensions and mlx5 low-level driver support for handling DIF/DIX-style protection information, and the addition of PI support to the iSER initiator. Target support will be arriving shortly through the SCSI target tree. - A nice simplification to the "umem" memory pinning library now that we have chained sg lists. Kudos to Yishai Hadas for realizing our code didn't have to be so crazy. - Another nice simplification to the sg wrappers used by qib, ipath and ehca to handle their mapping of memory to adapter. - The usual batch of fixes to bugs found by static checkers etc. from intrepid people like Dan Carpenter and Yann Droneaud. - A large batch of cxgb4, ocrdma, qib driver updates" * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (102 commits) RDMA/ocrdma: Unregister inet notifier when unloading ocrdma RDMA/ocrdma: Fix warnings about pointer <-> integer casts RDMA/ocrdma: Code clean-up RDMA/ocrdma: Display FW version RDMA/ocrdma: Query controller information RDMA/ocrdma: Support non-embedded mailbox commands RDMA/ocrdma: Handle CQ overrun error RDMA/ocrdma: Display proper value for max_mw RDMA/ocrdma: Use non-zero tag in SRQ posting RDMA/ocrdma: Memory leak fix in ocrdma_dereg_mr() RDMA/ocrdma: Increment abi version count RDMA/ocrdma: Update version string be2net: Add abi version between be2net and ocrdma RDMA/ocrdma: ABI versioning between ocrdma and be2net RDMA/ocrdma: Allow DPP QP creation RDMA/ocrdma: Read ASIC_ID register to select asic_gen RDMA/ocrdma: SQ and RQ doorbell offset clean up RDMA/ocrdma: EQ full catastrophe avoidance RDMA/cxgb4: Disable DSGL use by default RDMA/cxgb4: rx_data() needs to hold the ep mutex ...
| * RDMA/cxgb4: Save the correct map length for fast_reg_page_listsSteve Wise2014-03-201-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | We cannot save the mapped length using the rdma max_page_list_len field of the ib_fast_reg_page_list struct because the core code uses it. This results in an incorrect unmap of the page list in c4iw_free_fastreg_pbl(). I found this with dma mapping debugging enabled in the kernel. The fix is to save the length in the c4iw_fr_page_list struct. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
| * RDMA/cxgb4: Mind the sq_sig_all/sq_sig_type QP attributesSteve Wise2014-03-201-0/+1
| | | | | | | | | | Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
* | cxgb4/iw_cxgb4: Doorbell Drop Avoidance Bug FixesSteve Wise2014-03-141-2/+7
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current logic suffers from a slow response time to disable user DB usage, and also fails to avoid DB FIFO drops under heavy load. This commit fixes these deficiencies and makes the avoidance logic more optimal. This is done by more efficiently notifying the ULDs of potential DB problems, and implements a smoother flow control algorithm in iw_cxgb4, which is the ULD that puts the most load on the DB fifo. Design: cxgb4: Direct ULD callback from the DB FULL/DROP interrupt handler. This allows the ULD to stop doing user DB writes as quickly as possible. While user DB usage is disabled, the LLD will accumulate DB write events for its queues. Then once DB usage is reenabled, a single DB write is done for each queue with its accumulated write count. This reduces the load put on the DB fifo when reenabling. iw_cxgb4: Instead of marking each qp to indicate DB writes are disabled, we create a device-global status page that each user process maps. This allows iw_cxgb4 to only set this single bit to disable all DB writes for all user QPs vs traversing the idr of all the active QPs. If the libcxgb4 doesn't support this, then we fall back to the old approach of marking each QP. Thus we allow the new driver to work with an older libcxgb4. When the LLD upcalls iw_cxgb4 indicating DB FULL, we disable all DB writes via the status page and transition the DB state to STOPPED. As user processes see that DB writes are disabled, they call into iw_cxgb4 to submit their DB write events. Since the DB state is in STOPPED, the QP trying to write gets enqueued on a new DB "flow control" list. As subsequent DB writes are submitted for this flow controlled QP, the amount of writes are accumulated for each QP on the flow control list. So all the user QPs that are actively ringing the DB get put on this list and the number of writes they request are accumulated. When the LLD upcalls iw_cxgb4 indicating DB EMPTY, which is in a workq context, we change the DB state to FLOW_CONTROL, and begin resuming all the QPs that are on the flow control list. This logic runs on until the flow control list is empty or we exit FLOW_CONTROL mode (due to a DB DROP upcall, for example). QPs are removed from this list, and their accumulated DB write counts written to the DB FIFO. Sets of QPs, called chunks in the code, are removed at one time. The chunk size is 64. So 64 QPs are resumed at a time, and before the next chunk is resumed, the logic waits (blocks) for the DB FIFO to drain. This prevents resuming to quickly and overflowing the FIFO. Once the flow control list is empty, the db state transitions back to NORMAL and user QPs are again allowed to write directly to the user DB register. The algorithm is designed such that if the DB write load is high enough, then all the DB writes get submitted by the kernel using this flow controlled approach to avoid DB drops. As the load lightens though, we resume to normal DB writes directly by user applications. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* RDMA/cxgb4: Fix QP flush logicSteve Wise2013-08-131-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch makes following fixes in QP flush logic: - correctly flushes unsignaled WRs followed by a signaled WR - supports for flushing a CQ bound to multiple QPs - resets cidx_flush if a active queue starts getting HW CQEs again - marks WQ in error when we leave RTS. This was only being done for user queues, but we need it for kernel queues too so that post_send/post_recv will start returning the appropriate error synchronously - eats unsignaled read resp CQEs. HW always inserts CQEs so we must silently discard them if the read work request was unsignaled. - handles QP flushes with pending SW CQEs. The flush and out of order completion logic has a bug where if out of order completions are flushed but not yet polled by the consumer and the qp is then flushed then we end up inserting duplicate completions. - c4iw_flush_sq() should only flush wrs that have not already been flushed. Since we already track where in the SQ we've flushed via sq.cidx_flush, just start at that point and flush any remaining. This bug only caused a problem in the presence of unsignaled work requests. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Vipul Pandya <vipul@chelsio.com> [ Fixed sparse warning due to htonl/ntohl confusion. - Roland ] Signed-off-by: Roland Dreier <roland@purestorage.com>
* RDMA/cxgb4: Add support for active and passive open connection with IPv6 addressVipul Pandya2013-08-131-2/+2
| | | | | | | | | | | | | | | | | | | | Add new cpl messages, cpl_act_open_req6 and cpl_t5_act_open_req6, for initiating active open connections. Use LLD api cxgb4_create_server and cxgb4_create_server6 for initiating passive open connections. Similarly use cxgb4_remove_server to remove the passive open connections in place of listen_stop. Add support for iWARP over VLAN device and enable IPv6 support on VLAN device. Make use of import_ep in c4iw_reconnect. Signed-off-by: Vipul Pandya <vipul@chelsio.com> [ Fix build when IPv6 is disabled and make sure iw_cxgb4 is not built-in when ipv6 is a module. - Roland ] Signed-off-by: Roland Dreier <roland@purestorage.com>
* RDMA/cxgb4: Bump tcam_full stat and WR reply timeoutVipul Pandya2013-03-141-1/+1
| | | | | | | Always bump the tcam_full stat. Also, bump wr reply timeout to 30 seconds. Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* RDMA/cxgb4: Use DSGLs for fastreg and adapter memory writes for T5.Vipul Pandya2013-03-141-1/+1
| | | | | | | | | | | | | It enables direct DMA by HW to memory region PBL arrays and fast register PBL arrays from host memory, vs the T4 way of passing these arrays in the WR itself. The result is lower latency for memory registration, and larger PBL array support for fast register operations. This patch also updates ULP_TX_MEM_WRITE command fields for T5. Ordering bit of ULP_TX_MEM_WRITE is at bit position 22 in T5 and at 23 in T4. Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* RDMA/cxgb4: Add module_params to enable DB FC & Coalescing on T5Vipul Pandya2013-03-141-0/+1
| | | | | | | Both DB Flow-Control and DB Coalescing are disabled by default on T5 Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
OpenPOWER on IntegriCloud