summaryrefslogtreecommitdiffstats
path: root/drivers/infiniband/ulp
Commit message (Collapse)AuthorAgeFilesLines
...
* | | | IB/iser: Enable SG clusteringSagi Grimberg2015-10-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | iser is perfectly capable supporting SG clustering as it translates the SG list to a page vector. Enabling SG clustering can dramatically reduce the number of SG elements, which doesn't make much of a difference at this point, but with arbitrary SG list support, reducing the number of SG elements can benefit greatly as as it would reduce the length of the HW descriptors array. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | | | IB/iser: set block queue_virt_boundarySagi Grimberg2015-10-284-326/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The block layer can reliably guarantee that SG lists won't contain gaps (page unaligned) if a driver set the queue virt_boundary. With this setting the block layer will: - refuse merges if bios are not aligned to the virtual boundary - split bios/requests that are not aligned to the virtual boundary - or, bounce buffer SG_IOs that are not aligned to the virtual boundary Since iser is working in 4K page size, set the virt_boundary to 4K pages. With this setting, we can now safely remove the bounce buffering logic in iser. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | | | iser-target: Remove an unused variableBart Van Assche2015-10-221-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Detected this by compiling with W=1. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | | | IB/iser: Remove an unused variableBart Van Assche2015-10-221-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Detected this by compiling with W=1. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | | | IB/core: Add netdev and gid attributes paramteres to cacheMatan Barak2015-10-214-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adding an ability to query the IB cache by a netdev and get the attributes of a GID. These parameters are necessary in order to successfully resolve the required GID (when the netdevice is known) and get the Ethernet L2 attributes from a GID. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-By: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | | | IB/iser: fix a comment typoGeliang Tang2015-10-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Just fix a typo in the code comment. Signed-off-by: Geliang Tang <geliangtang@163.com> Acked-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | | | Merge branch 'k.o/for-4.3-v1' into k.o/for-4.4Doug Ledford2015-10-217-28/+70
|\ \ \ \ | |/ / / |/| | | | | | | | | | | Pick up the late fixes from the 4.3 cycle so we have them in our next branch.
| * | | IB/ipoib: For sendonly join free the multicast group on leaveChristoph Lameter2015-10-133-2/+5
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we leave the multicast group on expiration of a neighbor we do not free the mcast structure. This results in a memory leak that causes ib_dealloc_pd to fail and print a WARN_ON message and backtrace. Fixes: bd99b2e05c4d (IB/ipoib: Expire sendonly multicast joins) Signed-off-by: Christoph Lameter <cl@linux.com> Tested-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | IB/ipoib: increase the max mcast backlog queueDoug Ledford2015-09-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When performing sendonly joins, we queue the packets that trigger the join until the join completes. This may take on the order of hundreds of milliseconds. It is easy to have many more than three packets come in during that time. Expand the maximum queue depth in order to try and prevent dropped packets during the time it takes to join the multicast group. Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | IB/ipoib: Make sendonly multicast joins create the mcast groupDoug Ledford2015-09-251-10/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since IPoIB should, as much as possible, emulate how multicast sends work on Ethernet for regular TCP/IP apps, there should be no requirement to subscribe to a multicast group before your sends are properly sent. However, due to the difference in how multicast is handled on InfiniBand, we must join the appropriate multicast group before we can send to it. Previously we tried not to trigger the auto-create feature of the subnet manager when doing this because we didn't have tracking of these sendonly groups and the auto-creation might never get undone. The previous patch added timing to these sendonly joins and allows us to leave them after a reasonable idle expiration time. So supply all of the information needed to auto-create group. Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | IB/ipoib: Expire sendonly multicast joinsChristoph Lameter2015-09-253-2/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | On neighbor expiration, check to see if the neighbor was actually a sendonly multicast join, and if so, leave the multicast group as we expire the neighbor. Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | IB/iser: Add module parameter for always register memorySagi Grimberg2015-09-254-14/+31
| |/ | | | | | | | | | | | | | | | | This module parameter forces memory registration even for a continuous memory region. It is true by default as sending an all-physical rkey with remote permissions might be insecure. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | iser-target: Skip data copy if all the command data comes as immediateJenny Derzhavetz2015-09-152-8/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Given that supporting zcopy immediate data for all IOs requires iser driver to use its own buffer allocations, we settle with avoiding data copy for IOs with data length of up to 8K (which is more latency sensitive anyway). This trims IO write latency by up to 3us and increase IOPs by up to 40% by saving CPU time doing sg_copy_from_buffer (8K IO size is the obvious winner here). Signed-off-by: Jenny Derzhavetz <jennyf@mellanox.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* | iser-target: Change the recv buffers posting logicJenny Derzhavetz2015-09-152-48/+67
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | iser target batches post recv operations to avoid the overhead of acquiring the recv queue lock and posting a HW doorbell for each command. We change it to be per command in order to support zcopy immediate data for IOs that fits in the 8K transfer boundary (in the next patch). (Fix minor patch fuzz due to ib_mr removal - nab) Signed-off-by: Jenny Derzhavetz <jennyf@mellanox.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* | iser-target: Fix pending connections handling in target stack shutdown sequnceJenny Derzhavetz2015-09-152-31/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of handing a connection to the iscsi stack for processing right after accepting (rdma_accept) we only hand the connection to the iscsi core after we reached to a connected state (ESTABLISHED CM event). This will prevent two error scenrios: 1. race between rdma connection teardown and iscsi login sequence reported by Nic in: (ce9a9fc20a78a "iser-target: Fix REJECT CM event use-after-free OOPs") 2. target stack shutdown sequence race with constant login attempts by multiple initiators. We address this by maintaining two queues at the isert_np level: - accepted: connections that were accepted but have not reached connected state (might get rejected, unreachable or error). - pending: connections in connected state, but have yet to handed to the iscsi core for login processing. iser connections are promoted to the pending queue only from the accepted queue. This way the iscsi core now will only handle functional iser connections and once we shutdown the target stack, we look for any stales that got left behind so we can safely release them. Signed-off-by: Jenny Derzhavetz <jennyf@mellanox.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Cc: <stable@vger.kernel.org> # v3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* | iser-target: Remove np_ prefix from isert_np membersJenny Derzhavetz2015-09-152-33/+33
| | | | | | | | | | | | | | | | | | These are always referenced from np-> so no need for the prefix. Signed-off-by: Jenny Derzhavetz <jennyf@mellanox.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* | iser-target: Remove unused variablesJenny Derzhavetz2015-09-152-6/+0
| | | | | | | | | | | | Signed-off-by: Jenny Derzhavetz <jennyf@mellanox.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* | iser-target: Put the reference on commands waiting for unsol dataJenny Derzhavetz2015-09-151-1/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The iscsi target core teardown sequence calls wait_conn for all active commands to finish gracefully by: - move the queue-pair to error state - drain all the completions - wait for the core to finish handling all session commands However, when tearing down a session while there are sequenced commands that are still waiting for unsolicited data outs, we can block forever as these are missing an extra reference put. We basically need the equivalent of iscsit_free_queue_reqs_for_conn() which is called after wait_conn has returned. Address this by an explicit walk on conn_cmd_list and put the extra reference. Signed-off-by: Jenny Derzhavetz <jennyf@mellanox.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Cc: <stable@vger.kernel.org> # v3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* | iser-target: remove command with state ISTATE_REMOVEJenny Derzhavetz2015-09-151-1/+8
|/ | | | | | | | | | | | | | | | | | | As documented in iscsit_sequence_cmd: /* * Existing callers for iscsit_sequence_cmd() will silently * ignore commands with CMDSN_LOWER_THAN_EXP, so force this * return for CMDSN_MAXCMDSN_OVERRUN as well.. */ We need to silently finish a command when it's in ISTATE_REMOVE. This fixes an teardown hang we were seeing where a mis-behaved initiator (triggered by allocation error injections) sent us a cmdsn which was lower than expected. Signed-off-by: Jenny Derzhavetz <jennyf@mellanox.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Cc: <stable@vger.kernel.org> # v3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* Merge branch 'for-next' of ↵Linus Torvalds2015-09-111-25/+4
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending Pull SCSI target updates from Nicholas Bellinger: "Here are the outstanding target-pending updates for v4.3-rc1. Mostly bug-fixes and minor changes this round. The fallout from the big v4.2-rc1 RCU conversion have (thus far) been minimal. The highlights this round include: - Move sense handling routines into scsi_common code (Sagi) - Return ABORTED_COMMAND sense key for PI errors (Sagi) - Add tpg_enabled_sendtargets attribute for disabled iscsi-target discovery (David) - Shrink target struct se_cmd by rearranging fields (Roland) - Drop iSCSI use of mutex around max_cmd_sn increment (Roland) - Replace iSCSI __kernel_sockaddr_storage with sockaddr_storage (Andy + Chris) - Honor fabric max_data_sg_nents I/O transfer limit (Arun + Himanshu + nab) - Fix EXTENDED_COPY >= v4.1 regression OOPsen (Alex + nab)" * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (37 commits) target: use stringify.h instead of own definition target/user: Fix UFLAG_UNKNOWN_OP handling target: Remove no-op conditional target/user: Remove unused variable target: Fix max_cmd_sn increment w/o cmdsn mutex regressions target: Attach EXTENDED_COPY local I/O descriptors to xcopy_pt_sess target/qla2xxx: Honor max_data_sg_nents I/O transfer limit target/iscsi: Replace __kernel_sockaddr_storage with sockaddr_storage target/iscsi: Replace conn->login_ip with login_sockaddr target/iscsi: Keep local_ip as the actual sockaddr target/iscsi: Fix np_ip bracket issue by removing np_ip target: Drop iSCSI use of mutex around max_cmd_sn increment qla2xxx: Update tcm_qla2xxx module description to 24xx+ iscsi-target: Add tpg_enabled_sendtargets for disabled discovery drivers: target: Drop unlikely before IS_ERR(_OR_NULL) target: check DPO/FUA usage for COMPARE AND WRITE target: Shrink struct se_cmd by rearranging fields target: Remove cmd->se_ordered_id (unused except debug log lines) target: add support for START_STOP_UNIT SCSI opcode target: improve unsupported opcode message ...
| * target/iscsi: Replace __kernel_sockaddr_storage with sockaddr_storageAndy Grover2015-08-261-2/+2
| | | | | | | | | | | | | | It appears to be what the rest of the kernel does, so let's do it too. Signed-off-by: Andy Grover <agrover@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * target/iscsi: Replace conn->login_ip with login_sockaddrAndy Grover2015-08-261-19/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Very similar to how it went with local_sockaddr. It was embedded in iscsi_login_stats so some changes there, and we needed to copy in a sockaddr_storage comparison function. Hopefully the kernel will get a standard one soon, our implementation makes the 3rd. isert_set_conn_info() became much smaller. IPV6_ADDRESS_SPACE define goes away, had to modify a call to in6_pton(), can just use -1 since we are sure string is null-terminated. Signed-off-by: Andy Grover <agrover@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * target/iscsi: Keep local_ip as the actual sockaddrAndy Grover2015-08-261-6/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a more natural format that lets us format it with the appropriate printk specifier as needed. This also lets us handle v4-mapped ipv6 addresses a little more nicely, by storing the addr as an actual v4 sockaddr in conn->local_sockaddr. Finally, we no longer need to maintain variables for port, since this is contained in sockaddr. Remove iscsi_np.np_port and iscsi_conn.local_port. Signed-off-by: Andy Grover <agrover@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* | IB/ipoib: Suppress warning for send only join failuresJason Gunthorpe2015-09-031-2/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We expect send only joins to fail, it just means there are no listeners for the group. The correct thing to do is silently drop the packet at source. Eg avahi will full join 224.0.0.251 which causes a send only IGMP packet to 224.0.0.22, and then a warning level kmessage like this: ib0: sendonly multicast join failed for ff12:401b:ffff:0000:0000:0000:0000:0016, status -22 If there is no IP router listening to IGMP. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/ipoib: Clean up send-only multicast joinsDoug Ledford2015-09-031-11/+27
| | | | | | | | | | | | | | | | | | Even though we don't expect the group to be created by the SM we sill need to provide all the parameters to force the SM to validate they are correct. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/srp: Fix possible protection faultSagi Grimberg2015-09-031-6/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | srp_destroy_qp is designed to indicate we are safe to continue with freeing the channel resources by modifying the qp error state, posting a dummy wr on the queue-pair and waiting for it to flush. This also holds for the channel registration pool as we are unmapping the memory region when handling a scsi response. Destroying the channel registration pool before we make sure we processed all the inflight IO might introduce a use-after-free of the registration pool. This use-after-free is demonstrated in the stack trace below where srp is trying to unmap a used FMR after the fmr_pool was already destroyed. general protection fault: 0000 [#1] SMP RIP: 0010:[<ffffffff8151121b>] [<ffffffff8151121b>] _raw_spin_lock_irqsave+0x1b/0x50 Call Trace: [<ffffffffa055d88a>] ib_fmr_pool_unmap+0x1a/0xb0 [ib_core] [<ffffffffa06c00ed>] srp_unmap_data.isra.28+0x17d/0x250 [ib_srp] [<ffffffffa06c01eb>] srp_free_req+0x2b/0x60 [ib_srp] [<ffffffffa06c0c94>] srp_recv_completion+0x174/0x580 [ib_srp] [<ffffffffa04580fe>] mlx4_eq_int+0x4de/0xe50 [mlx4_core] [<ffffffffa0458b00>] mlx4_msi_x_interrupt+0x10/0x20 [mlx4_core] [<ffffffff810abc45>] handle_irq_event_percpu+0x35/0x1b0 [<ffffffff810abdf2>] handle_irq_event+0x32/0x50 [<ffffffff810ae5cf>] handle_edge_irq+0x6f/0x120 [<ffffffff8100455a>] handle_irq+0x1a/0x30 [<ffffffff8151b475>] do_IRQ+0x45/0xb0 [<ffffffff8151162d>] common_interrupt+0x6d/0x6d [<ffffffff813e4d2f>] cpuidle_enter_state+0x4f/0xc0 [<ffffffff813e4e6c>] cpuidle_idle_call+0xcc/0x210 [<ffffffff8100b9ea>] arch_cpu_idle+0xa/0x30 [<ffffffff810ab1e1>] cpu_startup_entry+0xe1/0x270 [<ffffffff81030b3a>] start_secondary+0x21a/0x2c0 Reported-by: Eliott Kespi <eliottk@mellanox.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/core: Make ib_dealloc_pd return voidJason Gunthorpe2015-08-302-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The majority of callers never check the return value, and even if they did, they can't do anything about a failure. All possible failure cases represent a bug in the caller, so just WARN_ON inside the function instead. This fixes a few random errors: net/rd/iw.c infinite loops while it fails. (racing with EBUSY?) This also lays the ground work to get rid of error return from the drivers. Most drivers do not error, the few that do are broken since it cannot be handled. Since uverbs can legitimately make use of EBUSY, open code the check. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/srp: Create an insecure all physical rkey only if neededBart Van Assche2015-08-302-19/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The SRP initiator only needs this if the insecure register_always=N performance optimization is enabled, or if FRWR/FMR is not supported in the driver. Do not create an all physical MR unless it is needed to support either of those modes. Default register_always to true so the out of the box configuration does not create an insecure all physical MR. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> [bvanassche: reworked and rebased this patch] Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/srp: Register the indirect data buffer descriptorBart Van Assche2015-08-302-3/+58
| | | | | | | | | | | | | | | | | | Instead of always using the global rkey for the indirect data buffer descriptor, register that descriptor with the HCA if the kernel module parameter register_always has been set to Y. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/srp: Introduce srp_device.use_fmrBart Van Assche2015-08-302-11/+12
| | | | | | | | | | | | | | | | | | | | Introduce the variable srp_device.use_fmr. Leave out the dev->has_fr / dev->has_fmr and ch->fr_pool / ch->fmr_pool checks since these are redundant. This patch does not change any functionality but makes the source code easier to read. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/srp: Remove use_mr argument from srp_map_sg_entry()Bart Van Assche2015-08-301-21/+15
| | | | | | | | | | | | | | | | | | Move the srp_map_desc() call from inside srp_map_sg_entry() to srp_map_sg() such that the use_mr argument can be removed from srp_map_sg_entry(). Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/srp: Remove the memory registration backtracking codeBart Van Assche2015-08-302-55/+13
| | | | | | | | | | | | | | | | | | | | | | Mapping a discontiguous sg-list requires multiple memory regions and hence can exhaust the memory region pool. The SRP initiator already handles this by temporarily reducing the queue depth. This means that it is safe to remove the memory registration backtracking code. This patch has been tested with direct I/O sizes up to 256 MB. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/srp: Add memory descriptor array pointer range checkingBart Van Assche2015-08-302-6/+20
| | | | | | | | | | | | | | | | | | | | Although most paths through which a request is submitted check block layer parameters like the max_segments limit, these are not checked when an SG_IO or direct I/O request is submitted. Hence add a range check for the memory descriptor array pointer. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/srp: Use multiple registrations for large memory regionsBart Van Assche2015-08-301-10/+0
| | | | | | | | | | | | | | | | | | Instead of using the global rkey for large memory regions, use multiple registrations. See also the while (dma_len) loop further down in srp_map_sg_entry(). Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/srp: Re-enable FMR for non-page aligned buffersBart Van Assche2015-08-301-9/+5
| | | | | | | | | | | | | | | | | | | | | | | | During a discussion in 2011 nobody recalled why FMR was not used for non-page aligned buffers (see also http://thread.gmane.org/gmane.linux.drivers.rdma/7149). Re-enable FMR for such buffers. For the reason why the srp_map_fmr() function needs to be modified, see also patch "IB/srp: rework mapping engine to use multiple FMR entries" (commit ID 8f26c9ff9cd0; January 2011). Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | ib_srpt: Remove ib_get_dma_mr callsJason Gunthorpe2015-08-302-12/+4
| | | | | | | | | | | | | | | | | | The pd now has a local_dma_lkey member which completely replaces ib_get_dma_mr, use it instead. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/srp: Use pd->local_dma_lkeyJason Gunthorpe2015-08-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | Replace all leys with pd->local_dma_lkey. This driver does not support iWarp, so this is safe. The insecure use of ib_get_dma_mr is thus isolated to an rkey, and will have to be fixed separately. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | iser-target: Remove ib_get_dma_mr callsJason Gunthorpe2015-08-302-23/+11
| | | | | | | | | | | | | | | | | | The pd now has a local_dma_lkey member which completely replaces ib_get_dma_mr, use it instead. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/iser: Use pd->local_dma_lkeyJason Gunthorpe2015-08-304-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | Replace all leys with pd->local_dma_lkey. This driver does not support iWarp, so this is safe. The insecure use of ib_get_dma_mr is thus isolated to an rkey, and this looks trivially fixed by forcing the use of registration in a future patch. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/ipoib: Remove ib_get_dma_mr callsJason Gunthorpe2015-08-303-17/+4
| | | | | | | | | | | | | | | | The pd now has a local_dma_lkey member which completely replaces ib_get_dma_mr, use it instead. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/iser: Chain all iser transaction send work requestsSagi Grimberg2015-08-304-77/+99
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Chaning of send work requests benefits performance by reducing the send queue lock contention (acquired in ib_post_send) and saves us HW doorbells which is posted only once. Currently, in normal IO flows iser does not chain the CDB send work request with the registration work request. Also in PI flows, signature work requests are not chained as well. Lets chain those and post only once. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/iser: Add debug prints to the various memory registration methodsSagi Grimberg2015-08-301-1/+9
| | | | | | | | | | | | | | | | | | | | Easier to debug when we have the registration details. This patch does not change any functionality. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Adir Lev <adirl@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/iser: Support up to 8MB data transfer in a single commandSagi Grimberg2015-08-305-7/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | iser support up to 512KB data transfer in a single scsi command. This means that larger IOs will split to different request. While iser can easily saturate FDR/EDR wires, some arrays are fine tuned for 1MB (or larger) IO sizes, hence add an option to support larger transfers (up to 8MB) if the device allows it. Given that a few target implementations don't support data transfers of more than 512KB by default and the fact that larger IO sizes require more resources, we introduce a module parameter to determine the maximum number of 512B sectors in a single scsi command. Users that are interested in larger transfers can change this value given that the target supports larger transfers. At the moment, iser works in 4K pages granularity, In a later stage we will get it to work with system page size instead. IO operations that consists of N pages will need a page vector of size N+1 in case the first SG element contains an offset. Given that some devices allocates memory regions in powers of 2, this means that allocating a region with N+1 pages, will result in region resources allocation of the next power of 2. Since we don't want that to happen, in case we are in the limit of IO size supported and the first SG element has an offset, we align the SG list using a bounce buffer (which is OK given that this is not likely to happen a lot). Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/iser: Pass registration pool a size parameterSagi Grimberg2015-08-303-25/+36
| | | | | | | | | | | | | | | | | | | | | | Hard coded for now. This will allow to allocate different sized MRs depending on the IO size needed (and device capabilities). This patch does not change any functionality. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/iser: Unify fast memory registration flowsSagi Grimberg2015-08-303-131/+113
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | iser_reg_rdma_mem_[fastreg|fmr] share a lot of code, and logically do the same thing other than the buffer registration method itself (iser_fast_reg_mr vs. iser_fast_reg_fmr). The DIF logic is not implemented in the FMR flow as there is no existing device that supports FMRs and Signature feature. This patch unifies the flow in a single routine iser_reg_rdma_mem and just split to fmr/frwr for the buffer registration itself. Also, for symmetry reasons, unify iser_unreg_rdma_mem (which will call the relevant device specific unreg routine). Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Adir Lev <adirl@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/iser: Make reg_desc_get a per device routineSagi Grimberg2015-08-302-13/+41
| | | | | | | | | | | | | | | | | | | | | | As for fmrs we will hold a single registration descriptor as no need for multiple like in the frwr mode (descriptor for each task). This change helps unifying the duplicate registration code paths. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Adir Lev <adirl@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/iser: Rename iser_reg_page_vec to iser_fast_reg_fmrSagi Grimberg2015-08-301-8/+8
| | | | | | | | | | | | | | | | | | | | Also, change a name of a local variable. This patch does not change any functionality. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Adir Lev <adirl@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/iser: Maintain connection fmr_pool under a single registration descriptorAdir Lev2015-08-303-54/+68
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This will allow us to unify the memory registration code path between the various methods which vary by the device capabilities. This change will make it easier and less intrusive to remove fmr_pools from the code when we'd want to. The reason we use a single descriptor is to avoid taking a redundant spinlock when working with FMRs. We also change the signature of iser_reg_page_vec to make it match iser_fast_reg_mr (and the future indirect registration method). Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Adir Lev <adirl@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/iser: Introduce iser registration pool structSagi Grimberg2015-08-303-59/+82
| | | | | | | | | | | | | | | | | | | | | | Instead of having it a part of the connection structure, have it be under a dedicated (embedded) structure in the connection. A logical separation of the registration pool and the connection structure. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Adir Lev <adirl@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/iser: Move fastreg descriptor allocation to iser_create_fastreg_descSagi Grimberg2015-08-301-21/+17
| | | | | | | | | | | | | | | | | | | | | | Don't have the caller allocate the structure and worry about freeing it in case the routine failed. This patch does not change any functionality. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Adir Lev <adirl@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
OpenPOWER on IntegriCloud