path: root/net/smc
Commit message    Author    Age    Files    Lines
...
* | net/smc: do a few smc_core.c cleanups    Hans Wippel    2018-05-18    1    -8/+6
| |     This patch consists of Christmas tree fixes and removal of an unneeded
| |     function parameter.
| |
| |     Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
| |     Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
* | net/smc: restructure CDC message reception    Hans Wippel    2018-05-18    1    -25/+22
| |     This patch moves a CDC sanity check from smc_cdc_msg_recv_action() to the
| |     other sanity checks in smc_cdc_rx_handler(). While doing this, it
| |     simplifies smc_cdc_msg_recv() and removes unneeded function parameters.
| |
| |     Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
| |     Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
* | net/smc: move smc_core specific code from smc.h to smc_core    Hans Wippel    2018-05-18    3    -41/+39
| |     SMC connection and buffer handling belong to smc_core. So, this patch
| |     moves this code from smc.h to smc_core.
| |
| |     Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
| |     Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
* | net/smc: calculate write offset in RMB only once per connection    Hans Wippel    2018-05-18    3    -2/+6
| |     Currently, the write offset within the RMB is calculated on each write
| |     operation although it is fixed for each connection. With this patch, the
| |     offset is calculated once and stored in a connection specific variable.
| |
| |     Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
| |     Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
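The idea of computing the fixed offset once can be sketched in plain C as follows. This is only an illustration: `struct demo_conn`, its fields, and the formula (element size times zero-based element index) are assumptions made for the example, not the kernel's actual definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified connection state; not the kernel's struct. */
struct demo_conn {
	uint32_t peer_rmbe_size;	/* size of one element in the peer's RMB */
	uint8_t  peer_rmbe_idx;		/* 1-based index of this connection's element */
	size_t   tx_off;		/* cached base offset, computed once */
};

/* Compute the fixed base offset once, when the connection is set up. */
void demo_conn_init(struct demo_conn *conn, uint32_t rmbe_size, uint8_t rmbe_idx)
{
	assert(rmbe_idx >= 1);
	conn->peer_rmbe_size = rmbe_size;
	conn->peer_rmbe_idx = rmbe_idx;
	conn->tx_off = (size_t)rmbe_size * (size_t)(rmbe_idx - 1);
}

/* Per-write path: only add the current cursor to the cached base. */
size_t demo_write_offset(const struct demo_conn *conn, size_t cursor)
{
	return conn->tx_off + cursor;
}
```

The write path then does a single addition instead of repeating the multiplication on every operation.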
* | net/smc: rename connection index to RMBE index    Hans Wippel    2018-05-18    5    -6/+6
| |     The connection index is actually a RMBE index. So, this patch changes the
| |     name accordingly.
| |
| |     Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
| |     Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
* | net/smc: move link group list to smc_core    Hans Wippel    2018-05-18    4    -35/+42
| |     This patch moves the global link group list to smc_core where the link
| |     group functions are. To make this work, it moves code in af_smc and
| |     smc_ib that operates on the link group list to smc_core as well.
| |
| |     While at it, the link group counter is integrated into the list
| |     structure and initialized to zero.
| |
| |     Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
| |     Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
* | net/smc: add common buffer size in send and receive buffer descriptors    Hans Wippel    2018-05-18    9    -35/+31
| |     In addition to the buffer references, SMC currently stores the sizes of
| |     the receive and send buffers in each connection as separate variables.
| |     This patch introduces a buffer length variable in the common buffer
| |     descriptor and uses this length instead.
| |
| |     Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
| |     Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
* | net/smc: init conn.tx_work & conn.send_lock sooner    Eric Dumazet    2018-05-17    3    -3/+4
| |     syzkaller found that the following program crashes the host:
| |
| |     {
| |       int fd = socket(AF_SMC, SOCK_STREAM, 0);
| |       int val = 1;
| |
| |       listen(fd, 0);
| |       shutdown(fd, SHUT_RDWR);
| |       setsockopt(fd, 6, TCP_NODELAY, &val, 4);
| |     }
| |
| |     Simply initialize conn.tx_work & conn.send_lock at socket creation,
| |     rather than deeper in the stack.
| |
| |     ODEBUG: assert_init not available (active state 0) object type: timer_list hint: (null)
| |     WARNING: CPU: 1 PID: 13988 at lib/debugobjects.c:329 debug_print_object+0x16a/0x210 lib/debugobjects.c:326
| |     Kernel panic - not syncing: panic_on_warn set ...
| |
| |     CPU: 1 PID: 13988 Comm: syz-executor0 Not tainted 4.17.0-rc4+ #46
| |     Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
| |     Call Trace:
| |      __dump_stack lib/dump_stack.c:77 [inline]
| |      dump_stack+0x1b9/0x294 lib/dump_stack.c:113
| |      panic+0x22f/0x4de kernel/panic.c:184
| |      __warn.cold.8+0x163/0x1b3 kernel/panic.c:536
| |      report_bug+0x252/0x2d0 lib/bug.c:186
| |      fixup_bug arch/x86/kernel/traps.c:178 [inline]
| |      do_error_trap+0x1de/0x490 arch/x86/kernel/traps.c:296
| |      do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315
| |      invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:992
| |     RIP: 0010:debug_print_object+0x16a/0x210 lib/debugobjects.c:326
| |     RSP: 0018:ffff880197a37880 EFLAGS: 00010086
| |     RAX: 0000000000000061 RBX: 0000000000000005 RCX: ffffc90001ed0000
| |     RDX: 0000000000004aaf RSI: ffffffff8160f6f1 RDI: 0000000000000001
| |     RBP: ffff880197a378c0 R08: ffff8801aa7a0080 R09: ffffed003b5e3eb2
| |     R10: ffffed003b5e3eb2 R11: ffff8801daf1f597 R12: 0000000000000001
| |     R13: ffffffff88d96980 R14: ffffffff87fa19a0 R15: ffffffff81666ec0
| |      debug_object_assert_init+0x309/0x500 lib/debugobjects.c:692
| |      debug_timer_assert_init kernel/time/timer.c:724 [inline]
| |      debug_assert_init kernel/time/timer.c:776 [inline]
| |      del_timer+0x74/0x140 kernel/time/timer.c:1198
| |      try_to_grab_pending+0x439/0x9a0 kernel/workqueue.c:1223
| |      mod_delayed_work_on+0x91/0x250 kernel/workqueue.c:1592
| |      mod_delayed_work include/linux/workqueue.h:541 [inline]
| |      smc_setsockopt+0x387/0x6d0 net/smc/af_smc.c:1367
| |      __sys_setsockopt+0x1bd/0x390 net/socket.c:1903
| |      __do_sys_setsockopt net/socket.c:1914 [inline]
| |      __se_sys_setsockopt net/socket.c:1911 [inline]
| |      __x64_sys_setsockopt+0xbe/0x150 net/socket.c:1911
| |      do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
| |      entry_SYSCALL_64_after_hwframe+0x49/0xbe
| |
| |     Fixes: 01d2f7e2cdd3 ("net/smc: sockopts TCP_NODELAY and TCP_CORK")
| |     Signed-off-by: Eric Dumazet <edumazet@google.com>
| |     Cc: Ursula Braun <ubraun@linux.ibm.com>
| |     Cc: linux-s390@vger.kernel.org
| |     Reported-by: syzbot <syzkaller@googlegroups.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
* | net/smc: check for pending termination    Karsten Graul    2018-05-16    3    -3/+7
| |     To avoid running the processing in smc_lgr_terminate() more than once,
| |     remember when the link group termination has been triggered.
| |
| |     Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
| |     Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
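A run-once guard of this kind might look like the sketch below. The structure and function names are hypothetical, and the sketch leaves out the locking a real implementation would need around the flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a link group; not the kernel's struct. */
struct demo_lgr {
	bool terminating;	/* set once termination has been triggered */
	int  cleanup_runs;	/* counts how often the teardown body ran */
};

/* Run the teardown at most once, however often the caller asks for it. */
void demo_lgr_terminate(struct demo_lgr *lgr)
{
	assert(lgr);
	if (lgr->terminating)
		return;			/* termination is already pending */
	lgr->terminating = true;
	lgr->cleanup_runs++;		/* stands in for the real teardown work */
}
```

Calling `demo_lgr_terminate()` twice on the same object leaves `cleanup_runs` at 1.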
* | net/smc: drop messages when link state is inactive    Karsten Graul    2018-05-16    1    -0/+2
| |     Drop incoming messages when the link is flagged as inactive.
| |
| |     Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
| |     Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
* | net/smc: set link inactive before calling smc_lgr_free()    Karsten Graul    2018-05-16    2    -1/+5
| |     Before smc_lgr_free() is called, the link must be set inactive by calling
| |     smc_llc_link_inactive().
| |
| |     Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
| |     Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
* | net/smc: handle all error codes from smc_conn_create()    Karsten Graul    2018-05-16    1    -0/+2
| |     Always set a reason_code when smc_conn_create() returns an error code.
| |
| |     Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
| |     Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
* | net/smc: use a workqueue to defer llc send    Karsten Graul    2018-05-16    4    -43/+104
| |     SMC handles deferred work in tasklets. As tasklets cannot sleep, this can
| |     result in rare EBUSY conditions, so defer this work in a work queue. The
| |     high level API functions do not defer work because they can sleep until
| |     the llc send is actually completed.
| |
| |     Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
| |     Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
* | net/smc: move link llc initialization to llc layer    Karsten Graul    2018-05-16    3    -6/+12
| |     Move the llc layer specific initialization and cleanup out of smc_core.c
| |     into smc_llc.c (smc_llc_link_init and smc_llc_link_clear). Move all
| |     initialization of a link into the new init function.
| |
| |     Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
| |     Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
* | net/smc: simplify test_link function usage    Karsten Graul    2018-05-16    2    -9/+5
| |     Make smc_llc_send_test_link() static and remove it from the header file.
| |     To send a test_link response, set the response flag and send the message
| |     back as-is, without using smc_llc_send_test_link(). Because
| |     smc_llc_send_test_link() no longer has to send responses, remove the
| |     response flag handling from the function.
| |
| |     Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
| |     Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
* | net/smc: remove unnecessary cast    Karsten Graul    2018-05-16    1    -3/+3
| |     Remove an unneeded (void *) cast from the calls to
| |     smc_llc_send_message(). No functional changes.
| |
| |     Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
| |     Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
* | net/smc: register new rmbs with the peer    Karsten Graul    2018-05-16    5    -8/+64
| |     Register new rmb buffers with the remote peer by exchanging a
| |     confirm_rkey llc message.
| |
| |     Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
| |     Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
* | net/smc: no tx work trigger for fallback sockets    Ursula Braun    2018-05-16    1    -2/+2
| |     If TCP_NODELAY is set or TCP_CORK is reset, setsockopt triggers the tx
| |     worker. This does not make sense if the SMC socket switched to the TCP
| |     fallback when the connection was created. This patch adds the additional
| |     check for the fallback case.
| |
| |     Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
* | smc: add support for splice()    Stefan Raspl    2018-05-04    4    -25/+185
| |     Provide an implementation for splice() when we are using SMC. See
| |     smc_splice_read() for further details.
| |
| |     Signed-off-by: Stefan Raspl <raspl@linux.ibm.com>
| |     Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
* | smc: allocate RMBs as compound pages    Stefan Raspl    2018-05-04    2    -9/+10
| |     Preparatory work for splice() support.
| |
| |     Signed-off-by: Stefan Raspl <raspl@linux.ibm.com>
| |     Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
* | smc: make smc_rx_wait_data() generic    Stefan Raspl    2018-05-04    3    -12/+19
| |     Turn smc_rx_wait_data into a generic function that can be used at
| |     various instances to wait on traffic to complete with varying criteria.
| |
| |     Signed-off-by: Stefan Raspl <raspl@linux.ibm.com>
| |     Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
* | smc: simplify abort logic    Stefan Raspl    2018-05-04    1    -10/+6
| |     Some of the conditions to exit recv() are common to two paths. Clean up
| |     the code by moving the check up so that it appears only once.
| |
| |     Signed-off-by: Stefan Raspl <raspl@linux.ibm.com>
| |     Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net    David S. Miller    2018-05-04    3    -32/+50
|\|     Overlapping changes in selftests Makefile.
| |
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
| * smc: fix sendpage() call    Stefan Raspl    2018-05-03    1    -2/+4
| |     The sendpage() call grabs the sock lock before calling the default
| |     implementation, which then tries to grab it once again.
| |
| |     Signed-off-by: Stefan Raspl <raspl@linux.ibm.com>
| |     Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/smc: handle unregistered buffers    Karsten Graul    2018-05-03    3    -5/+24
| |     When smc_wr_reg_send() fails, tag the affected buffer (regerr) and free
| |     it in smc_buf_unuse().
| |
| |     Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
| |     Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/smc: call consolidation    Karsten Graul    2018-05-03    1    -20/+15
| |     Consolidate the call to smc_wr_reg_send() in a new function.
| |     No functional changes.
| |
| |     Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
| |     Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/smc: restrict non-blocking connect finish    Ursula Braun    2018-05-02    1    -6/+8
| |     The smc_poll code tries to finish connect() if the socket is in state
| |     SMC_INIT and polling of the internal CLC-socket returns with EPOLLOUT.
| |     This makes sense for a select/poll call following a connect call, but
| |     not without a preceding connect().
| |     With this patch, smc_poll starts the connect logic only if the CLC-socket
| |     is no longer in its initial state TCP_CLOSE.
| |
| |     In addition, a poll error on the internal CLC-socket is always
| |     propagated to the SMC socket.
| |
| |     With this patch the code path mentioned by syzbot
| |     https://syzkaller.appspot.com/bug?extid=03faa2dc16b8b64be396
| |     is no longer possible.
| |
| |     Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
| |     Reported-by: syzbot+03faa2dc16b8b64be396@syzkaller.appspotmail.com
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
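For context, the user-space sequence this commit has in mind (a non-blocking connect() followed by poll()) is commonly completed as in the sketch below. It is a generic POSIX pattern rather than anything SMC-specific, and `demo_finish_connect` is a hypothetical helper name. The caller is assumed to have issued connect() on a non-blocking socket first.

```c
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>

/*
 * Wait for a previously started non-blocking connect() on fd to finish.
 * Returns 0 on success, -1 on timeout or error (errno is set).
 */
int demo_finish_connect(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLOUT };
	socklen_t len;
	int err = 0;
	int rc;

	rc = poll(&pfd, 1, timeout_ms);
	if (rc < 0)
		return -1;
	if (rc == 0) {
		errno = ETIMEDOUT;	/* nothing reported within the timeout */
		return -1;
	}

	/* The socket became writable or reported an error: fetch the result. */
	len = sizeof(err);
	if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
		return -1;
	if (err) {
		errno = err;
		return -1;
	}
	return 0;
}
```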
* | net/smc: determine vlan_id of stacked net_device    Ursula Braun    2018-05-02    1    -3/+23
| |     An SMC link group is bound to a specific vlan_id. Its link uses
| |     the RoCE-GIDs established for the specific vlan_id. This patch makes
| |     sure the appropriate vlan_id is determined for stacked scenarios, for
| |     instance a master bonding device with vlan devices enslaved.
| |
| |     Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
* | net/smc: handle ioctls SIOCINQ, SIOCOUTQ, and SIOCOUTQNSD    Ursula Braun    2018-05-02    1    -3/+30
| |     SIOCINQ returns the amount of unread data in the RMB.
| |     SIOCOUTQ returns the amount of unsent or unacked sent data in the send
| |     buffer.
| |     SIOCOUTQNSD returns the amount of data prepared for sending, but
| |     not yet sent.
| |
| |     Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
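From user space, the three ioctls can be queried as in this sketch. The request codes come from `<linux/sockios.h>`; the struct and the helper name are hypothetical and exist only for the example.

```c
#include <linux/sockios.h>
#include <sys/ioctl.h>

/* Hypothetical result container for the three queue counters. */
struct demo_queue_info {
	int unread;		/* SIOCINQ: bytes received but not yet read */
	int unsent_or_unacked;	/* SIOCOUTQ: bytes unsent or sent but unacked */
	int not_sent;		/* SIOCOUTQNSD: bytes prepared but not yet sent */
};

/* Fill qi for socket fd; returns 0 on success, -1 on error (errno set). */
int demo_query_queues(int fd, struct demo_queue_info *qi)
{
	if (ioctl(fd, SIOCINQ, &qi->unread) < 0)
		return -1;
	if (ioctl(fd, SIOCOUTQ, &qi->unsent_or_unacked) < 0)
		return -1;
	if (ioctl(fd, SIOCOUTQNSD, &qi->not_sent) < 0)
		return -1;
	return 0;
}
```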
* | net/smc: ipv6 support for smc_diag.c    Karsten Graul    2018-05-02    1    -9/+30
| |     Update smc_diag.c to support ipv6 addresses on the diagnosis interface.
| |
| |     Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
| |     Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
* | net/smc: periodic testlink support    Karsten Graul    2018-05-02    6    -3/+75
| |     Add periodic LLC testlink support to ensure the link is still active.
| |     The interval time is initialized using the value of
| |     sysctl_tcp_keepalive_time.
| |
| |     Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
| |     Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
* | net/smc: handle sockopt TCP_DEFER_ACCEPT    Ursula Braun    2018-04-27    4    -2/+31
| |     If sockopt TCP_DEFER_ACCEPT is set, the accept is delayed till
| |     data is available.
| |
| |     Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
* | net/smc: sockopts TCP_NODELAY and TCP_CORK    Ursula Braun    2018-04-27    2    -4/+40
| |     Setting sockopt TCP_NODELAY or resetting sockopt TCP_CORK triggers
| |     data transfer.
| |
| |     For a corked SMC socket, RDMA writes are deferred if there is
| |     still sufficient send buffer space available.
| |
| |     Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
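An application would set these options with the usual setsockopt() calls at level IPPROTO_TCP. The following is a minimal sketch with hypothetical helper names; the fd is assumed to be a socket created elsewhere.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Set or clear a boolean TCP-level option; returns 0 or -1 (errno set). */
int demo_set_tcp_flag(int fd, int optname, int on)
{
	return setsockopt(fd, IPPROTO_TCP, optname, &on, sizeof(on));
}

/* Cork: hold back transmission while the caller queues several writes. */
int demo_cork(int fd)
{
	return demo_set_tcp_flag(fd, TCP_CORK, 1);
}

/* Uncork: resetting TCP_CORK asks the stack to push the queued data out. */
int demo_uncork(int fd)
{
	return demo_set_tcp_flag(fd, TCP_CORK, 0);
}

/* Disable coalescing of small writes. */
int demo_nodelay(int fd)
{
	return demo_set_tcp_flag(fd, TCP_NODELAY, 1);
}
```

A typical use is `demo_cork(fd)`, several `send()` calls, then `demo_uncork(fd)`.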
* | net/smc: handle sockopts forcing fallback    Ursula Braun    2018-04-27    1    -4/+50
| |     Several TCP sockopts do not work for SMC. One example is the
| |     TCP_FASTOPEN family of sockopts, since SMC-connection setup is based on
| |     the TCP three-way-handshake.
| |     If the SMC socket is still in state SMC_INIT, such sockopts trigger
| |     fallback to TCP. Otherwise an error is returned.
| |
| |     Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
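The decision described here can be sketched as a small state check. Everything in the block is hypothetical: the state enum, the struct, the set of options treated as unsupported, and the choice of -EINVAL as the error (the commit message does not say which error is returned).

```c
#include <errno.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stdbool.h>

/* Hypothetical socket states; only "initial" versus "later" matters here. */
enum demo_state { DEMO_INIT, DEMO_ACTIVE };

struct demo_sock {
	enum demo_state state;
	bool use_fallback;	/* true once the socket fell back to plain TCP */
};

/* Options this sketch treats as unsupported without fallback (assumed list). */
bool demo_opt_needs_tcp(int optname)
{
	return optname == TCP_FASTOPEN;
}

/* Returns 0 if the option may proceed, or a negative errno value. */
int demo_setsockopt_check(struct demo_sock *s, int optname)
{
	if (!demo_opt_needs_tcp(optname) || s->use_fallback)
		return 0;
	if (s->state == DEMO_INIT) {
		s->use_fallback = true;	/* still early enough to switch */
		return 0;
	}
	return -EINVAL;			/* assumed error code, too late to switch */
}
```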
* | net/smc: fix structure size    Karsten Graul    2018-04-27    2    -2/+2
|/      The struct smc_cdc_msg must be defined as packed so that its size is
|       44 bytes. Also change the structure size check so that sizeof is
|       checked.
|
|       Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
|       Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
|       Signed-off-by: David S. Miller <davem@davemloft.net>
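The effect of packing on sizeof can be shown with a made-up layout. The fields below are hypothetical and are not the real smc_cdc_msg; they are chosen only so that the members add up to 44 bytes. `__attribute__((packed))` is a GCC/Clang extension.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field layout; only the 44-byte total mirrors the text. */
struct demo_msg_padded {
	uint8_t  type;
	uint8_t  len;
	uint16_t seqno;
	uint32_t token;
	uint64_t prod;
	uint64_t cons;
	uint8_t  flags[2];
	uint8_t  reserved[18];
};	/* on common 64-bit ABIs the compiler pads this to 48 bytes */

struct demo_msg_packed {
	uint8_t  type;
	uint8_t  len;
	uint16_t seqno;
	uint32_t token;
	uint64_t prod;
	uint64_t cons;
	uint8_t  flags[2];
	uint8_t  reserved[18];
} __attribute__((packed));	/* no tail padding: exactly 44 bytes */

/* Check sizeof directly so that any padding is caught at compile time. */
_Static_assert(sizeof(struct demo_msg_packed) == 44,
	       "wire format must be exactly 44 bytes");
```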
* net/smc: keep clcsock reference in smc_tcp_listen_work()    Ursula Braun    2018-04-25    1    -4/+0
|     The internal CLC socket should exist till the SMC-socket is released.
|     Function tcp_listen_worker() releases the internal CLC socket of a
|     listen socket, if an smc_close_active() is called. This function
|     is called for the final release(), but it is called for shutdown
|     SHUT_RDWR as well. This opens a door for protection faults, if
|     socket calls using the internal CLC socket are called for a
|     shutdown listen socket.
|
|     With the changes of
|     commit 3d502067599f ("net/smc: simplify wait when closing listen socket")
|     there is no need anymore to release the internal CLC socket in
|     function tcp_listen_worker(). It is sufficient to release it in
|     smc_release().
|
|     Fixes: 127f49705823 ("net/smc: release clcsock from tcp_listen_worker")
|     Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
|     Reported-by: syzbot+9045fc589fcd196ef522@syzkaller.appspotmail.com
|     Reported-by: syzbot+28a2c86cf19c81d871fa@syzkaller.appspotmail.com
|     Reported-by: syzbot+9605e6cace1b5efd4a0a@syzkaller.appspotmail.com
|     Reported-by: syzbot+cf9012c597c8379d535c@syzkaller.appspotmail.com
|     Signed-off-by: David S. Miller <davem@davemloft.net>
* net/smc: fix shutdown in state SMC_LISTEN    Ursula Braun    2018-04-19    1    -6/+4
|     Calling shutdown with SHUT_RD and SHUT_RDWR for a listening SMC socket
|     crashes, because
|        commit 127f49705823 ("net/smc: release clcsock from tcp_listen_worker")
|     releases the internal clcsock in smc_close_active() and sets smc->clcsock
|     to NULL.
|     For SHUT_RD the smc_close_active() call is removed.
|     For SHUT_RDWR the kernel_sock_shutdown() call is omitted, since the
|     clcsock is already released.
|
|     Fixes: 127f49705823 ("net/smc: release clcsock from tcp_listen_worker")
|     Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
|     Reported-by: Stephen Hemminger <stephen@networkplumber.org>
|     Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net    David S. Miller    2018-04-01    1    -1/+1
|\      Minor conflicts in drivers/net/ethernet/mellanox/mlx5/core/en_rep.c,
| |     we had some overlapping changes:
| |
| |     1) In 'net' MLX5E_PARAMS_LOG_{SQ,RQ}_SIZE -->
| |        MLX5E_REP_PARAMS_LOG_{SQ,RQ}_SIZE
| |
| |     2) In 'net-next' params->log_rq_size is renamed to be
| |        params->log_rq_mtu_frames.
| |
| |     3) In 'net-next' params->hard_mtu is added.
| |
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/smc: use announced length in sock_recvmsg()    Ursula Braun    2018-03-27    1    -1/+1
| |     Not every CLC proposal message needs the maximum buffer length.
| |     Due to the MSG_WAITALL flag, it is important to use the peeked
| |     real length when receiving the message.
| |
| |     Fixes: d63d271ce2b5ce ("smc: switch to sock_recvmsg()")
| |     Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
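The peek-then-receive pattern can be illustrated in user space. With MSG_WAITALL, asking for the maximum buffer length would block until that many bytes arrive, so the sketch peeks at a header first and then requests exactly the announced length. The header layout and the function name are hypothetical and do not reflect the CLC wire format.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical wire header: only the length field matters for this sketch. */
struct demo_hdr {
	uint8_t  type;
	uint8_t  pad;
	uint16_t length;	/* total message length, network byte order */
};

/*
 * Receive one complete message into buf.
 * Returns the number of bytes received, or -1 on error.
 */
ssize_t demo_recv_announced(int fd, void *buf, size_t buflen)
{
	struct demo_hdr hdr;
	size_t want;
	ssize_t n;

	/* Peek at the header without consuming it. */
	n = recv(fd, &hdr, sizeof(hdr), MSG_PEEK | MSG_WAITALL);
	if (n != (ssize_t)sizeof(hdr))
		return -1;

	want = ntohs(hdr.length);
	if (want < sizeof(hdr) || want > buflen)
		return -1;

	/* Ask for exactly the announced length, not the buffer maximum. */
	return recv(fd, buf, want, MSG_WAITALL);
}
```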
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net    David S. Miller    2018-03-23    2    -26/+3
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fun set of conflict resolutions here... For the mac80211 stuff, these were fortunately just parallel adds. Trivially resolved. In drivers/net/phy/phy.c we had a bug fix in 'net' that moved the function phy_disable_interrupts() earlier in the file, whilst in 'net-next' the phy_error() call from this function was removed. In net/ipv4/xfrm4_policy.c, David Ahern's changes to remove the 'rt_table_id' member of rtable collided with a bug fix in 'net' that added a new struct member "rt_mtu_locked" which needs to be copied over here. The mlxsw driver conflict consisted of net-next separating the span code and definitions into separate files, whilst a 'net' bug fix made some changes to that moved code. The mlx5 infiniband conflict resolution was quite non-trivial, the RDMA tree's merge commit was used as a guide here, and here are their notes: ==================== Due to bug fixes found by the syzkaller bot and taken into the for-rc branch after development for the 4.17 merge window had already started being taken into the for-next branch, there were fairly non-trivial merge issues that would need to be resolved between the for-rc branch and the for-next branch. This merge resolves those conflicts and provides a unified base upon which ongoing development for 4.17 can be based. Conflicts: drivers/infiniband/hw/mlx5/main.c - Commit 42cea83f9524 (IB/mlx5: Fix cleanup order on unload) added to for-rc and commit b5ca15ad7e61 (IB/mlx5: Add proper representors support) add as part of the devel cycle both needed to modify the init/de-init functions used by mlx5. 
To support the new representors, the new functions added by the cleanup patch needed to be made non-static, and the init/de-init list added by the representors patch needed to be modified to match the init/de-init list changes made by the cleanup patch. Updates: drivers/infiniband/hw/mlx5/mlx5_ib.h - Update function prototypes added by representors patch to reflect new function names as changed by cleanup patch drivers/infiniband/hw/mlx5/ib_rep.c - Update init/de-init stage list to match new order from cleanup patch ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/smc: simplify wait when closing listen socket    Ursula Braun    2018-03-15    2    -26/+3
| |     Closing of a listen socket wakes up kernel_accept() of
| |     smc_tcp_listen_worker(), and then has to wait till smc_tcp_listen_worker()
| |     gives up the internal clcsock. The wait logic introduced with
| |     commit 127f49705823 ("net/smc: release clcsock from tcp_listen_worker")
| |     might wait longer than necessary. This patch implements the wait with
| |     just flush_work(), and gets rid of the extra
| |     smc_close_wait_listen_clcsock() function.
| |
| |     Fixes: 127f49705823 ("net/smc: release clcsock from tcp_listen_worker")
| |     Reported-by: Hans Wippel <hwippel@linux.vnet.ibm.com>
| |     Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
* | net/smc: enable ipv6 support for smc    Karsten Graul    2018-03-16    2    -17/+51
| |     Add ipv6 support to the smc socket layer functions. Make use of the
| |     updated clc layer functions to retrieve and match ipv6 information.
| |
| |     The indicator for ipv4 or ipv6 is the protocol constant that is provided
| |     in the socket() call with address family AF_SMC.
| |
| |     Based-on-patch-by: Takanori Ueda <tkueda@jp.ibm.com>
| |     Signed-off-by: Karsten Graul <kgraul@linux.vnet.ibm.com>
| |     Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
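Selecting IPv4 or IPv6 through the protocol argument could look like the sketch below. The constants are assumptions for this example: AF_SMC is taken to be 43 where the libc headers do not define it, and the protocol values 0 (IPv4) and 1 (IPv6) are defined locally under `DEMO_` names instead of being taken from a system header.

```c
#include <netinet/in.h>
#include <sys/socket.h>

#ifndef AF_SMC
#define AF_SMC 43			/* assumed value if libc lacks it */
#endif

/* Assumed protocol constants for the socket() call (local demo names). */
#define DEMO_SMCPROTO_SMC	0	/* SMC over IPv4 */
#define DEMO_SMCPROTO_SMC6	1	/* SMC over IPv6 */

/* Map an IP address family to the protocol constant, or -1 if unsupported. */
int demo_smc_proto(int family)
{
	switch (family) {
	case AF_INET:
		return DEMO_SMCPROTO_SMC;
	case AF_INET6:
		return DEMO_SMCPROTO_SMC6;
	default:
		return -1;
	}
}

/* Open an SMC stream socket for the given IP family; returns fd or -1. */
int demo_open_smc(int family)
{
	int proto = demo_smc_proto(family);

	if (proto < 0)
		return -1;
	return socket(AF_SMC, SOCK_STREAM, proto);
}
```

On a system without SMC support, `demo_open_smc()` simply returns -1 with errno set by socket().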
* | net/smc: add ipv6 support to CLC layer    Karsten Graul    2018-03-16    2    -17/+105
| |     The CLC layer is updated to support ipv6 proposal messages from peers and
| |     to match incoming proposal messages against the ipv6 addresses of the net
| |     device. struct smc_clc_ipv6_prefix is updated to provide the space for an
| |     ipv6 address (struct was not used before). SMC_CLC_MAX_LEN is updated to
| |     include the size of the proposal prefix. Existing code in net is not
| |     affected, the previous SMC_CLC_MAX_LEN value is large enough to hold ipv4
| |     proposal messages.
| |
| |     Signed-off-by: Karsten Graul <kgraul@linux.vnet.ibm.com>
| |     Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
* | net/smc: restructure netinfo for CLC proposal msgs    Karsten Graul    2018-03-16    3    -36/+82
| |     Introduce functions smc_clc_prfx_set to retrieve IP information for the
| |     CLC proposal msg and smc_clc_prfx_match to match the contents of a
| |     proposal message against the IP addresses of the net device. The new
| |     functions replace the functionality provided by smc_clc_netinfo_by_tcpsk,
| |     which is removed by this patch. The match functionality is extended to
| |     scan all ipv4 addresses of the net device for a match against the
| |     ipv4 subnet from the proposal msg.
| |
| |     Signed-off-by: Karsten Graul <kgraul@linux.vnet.ibm.com>
| |     Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
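Scanning a device's addresses for a subnet match comes down to a mask-and-compare per address. The sketch below uses hypothetical names and host byte order for readability; it is not the kernel's smc_clc_prfx_match.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Build an IPv4 netmask (host byte order) from a prefix length 0..32. */
uint32_t demo_mask(unsigned int prefix_len)
{
	assert(prefix_len <= 32);
	return prefix_len ? ~UINT32_C(0) << (32 - prefix_len) : 0;
}

/*
 * Return true if any of the n device addresses lies in subnet/prefix_len.
 * All addresses are in host byte order.
 */
bool demo_prefix_match(const uint32_t *addrs, size_t n,
		       uint32_t subnet, unsigned int prefix_len)
{
	uint32_t mask = demo_mask(prefix_len);
	size_t i;

	for (i = 0; i < n; i++) {
		if ((addrs[i] & mask) == (subnet & mask))
			return true;
	}
	return false;
}
```

For example, a device with 10.0.0.1 and 192.168.1.5 matches the proposed subnet 192.168.1.0/24 through its second address.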
* | net/smc: schedule free_work when link group is terminated    Karsten Graul    2018-03-14    1    -7/+13
| |     The free_work worker must be scheduled when the link group is
| |     abnormally terminated.
| |
| |     Signed-off-by: Karsten Graul <kgraul@linux.vnet.ibm.com>
| |     Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
* | net/smc: free link group without pending free_work only    Ursula Braun    2018-03-14    2    -1/+3
| |     Make sure there is no pending or running free_work worker for the link
| |     group when freeing the link group.
| |
| |     Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
* | net/smc: pay attention to MAX_ORDER for CQ entries    Ursula Braun    2018-03-14    2    -2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | smc allocates a certain number of CQ entries for used RoCE devices. For mlx5 devices the chosen constant number results in a large allocation causing this warning: [13355.124656] WARNING: CPU: 3 PID: 16535 at mm/page_alloc.c:3883 __alloc_pages_nodemask+0x2be/0x10c0 [13355.124657] Modules linked in: smc_diag(O) smc(O) xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ip6table_filter ip6_tables iptable_filter mlx5_ib ib_core sunrpc mlx5_core s390_trng rng_core ghash_s390 prng aes_s390 des_s390 des_generic sha512_s390 sha256_s390 sha1_s390 sha_common ptp pps_core eadm_sch dm_multipath dm_mod vhost_net tun vhost tap sch_fq_codel kvm ip_tables x_tables autofs4 [last unloaded: smc] [13355.124672] CPU: 3 PID: 16535 Comm: kworker/3:0 Tainted: G O 4.14.0uschi #1 [13355.124673] Hardware name: IBM 3906 M04 704 (LPAR) [13355.124675] Workqueue: events smc_listen_work [smc] [13355.124677] task: 00000000e2f22100 task.stack: 0000000084720000 [13355.124678] Krnl PSW : 0704c00180000000 000000000029da76 (__alloc_pages_nodemask+0x2be/0x10c0) [13355.124681] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3 [13355.124682] Krnl GPRS: 0000000000000000 00550e00014080c0 0000000000000000 0000000000000001 [13355.124684] 000000000029d8b6 00000000f3bfd710 0000000000000000 00000000014080c0 [13355.124685] 0000000000000009 00000000ec277a00 0000000000200000 0000000000000000 [13355.124686] 0000000000000000 00000000000001ff 000000000029d8b6 0000000084723720 [13355.124708] Krnl Code: 000000000029da6a: a7110200 tmll %r1,512 000000000029da6e: a774ff29 brc 7,29d8c0 #000000000029da72: a7f40001 brc 15,29da74 >000000000029da76: 
a7f4ff25 brc 15,29d8c0 000000000029da7a: a7380000 lhi %r3,0 000000000029da7e: a7f4fef1 brc 15,29d860 000000000029da82: 5820f0c4 l %r2,196(%r15) 000000000029da86: a53e0048 llilh %r3,72 [13355.124720] Call Trace: [13355.124722] ([<000000000029d8b6>] __alloc_pages_nodemask+0xfe/0x10c0) [13355.124724] [<000000000013bd1e>] s390_dma_alloc+0x6e/0x148 [13355.124733] [<000003ff802eeba6>] mlx5_dma_zalloc_coherent_node+0x8e/0xe0 [mlx5_core] [13355.124740] [<000003ff802eee18>] mlx5_buf_alloc_node+0x70/0x108 [mlx5_core] [13355.124744] [<000003ff804eb410>] mlx5_ib_create_cq+0x558/0x898 [mlx5_ib] [13355.124749] [<000003ff80407d40>] ib_create_cq+0x48/0x88 [ib_core] [13355.124751] [<000003ff80109fba>] smc_ib_setup_per_ibdev+0x52/0x118 [smc] [13355.124753] [<000003ff8010bcb6>] smc_conn_create+0x65e/0x728 [smc] [13355.124755] [<000003ff801081a2>] smc_listen_work+0x2d2/0x540 [smc] [13355.124756] [<0000000000162c66>] process_one_work+0x1be/0x440 [13355.124758] [<0000000000162f40>] worker_thread+0x58/0x458 [13355.124759] [<0000000000169e7e>] kthread+0x14e/0x168 [13355.124760] [<00000000009ce8be>] kernel_thread_starter+0x6/0xc [13355.124762] [<00000000009ce8b8>] kernel_thread_starter+0x0/0xc [13355.124762] Last Breaking-Event-Address: [13355.124764] [<000000000029da72>] __alloc_pages_nodemask+0x2ba/0x10c0 [13355.124764] ---[ end trace 34be38b581c0b585 ]--- This patch reduces the smc constant for the maximum number of allocated completion queue entries SMC_MAX_CQE by 2 to avoid high round up values in the mlx5 code, and reduces the number of allocated completion queue entries even more, if the final allocation for an mlx5 device hits the MAX_ORDER limit. Reported-by: Ihnken Menssen <menssen@de.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
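Why a count just below a power of two matters can be seen with a small arithmetic sketch. It assumes, purely for illustration, that a driver rounds the requested entry count plus one up to the next power of two; the entry counts, the entry size, the byte limit, and the halving strategy are all made up for this example and are not taken from the patch.

```c
#include <stdint.h>

/* Smallest power of two that is >= n (valid for 1 <= n <= 2^31). */
uint32_t demo_roundup_pow2(uint32_t n)
{
	uint32_t p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/*
 * Halve the requested entry count until the rounded-up buffer fits into
 * max_bytes. The "+ 1" models the assumed driver-side rounding.
 */
uint32_t demo_clamp_entries(uint32_t want, uint32_t entry_size,
			    uint64_t max_bytes)
{
	while (want > 1 &&
	       (uint64_t)demo_roundup_pow2(want + 1) * entry_size > max_bytes)
		want /= 2;
	return want;
}
```

Under that assumption, a request for 32768 entries rounds up to 65536, while a request for 32766 rounds up to only 32768, which halves the allocation.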
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net    David S. Miller    2018-03-06    4    -4/+7
|\|     All of the conflicts were cases of overlapping changes.
| |
| |     In net/core/devlink.c, we have to take care that the
| |     resource size_params have become a struct member rather
| |     than a pointer to such an object.
| |
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/smc: fix NULL pointer dereference on sock_create_kern() error path    Davide Caratti    2018-02-28    1    -1/+3
| |     When sock_create_kern(..., a) returns an error, 'a' might not be a valid
| |     pointer, so it shouldn't be dereferenced to read a->sk->sk_sndbuf and
| |     a->sk->sk_rcvbuf; not doing that caused the following crash:
| |
| |     general protection fault: 0000 [#1] SMP KASAN
| |     Dumping ftrace buffer:
| |        (ftrace buffer empty)
| |     Modules linked in:
| |     CPU: 0 PID: 4254 Comm: syzkaller919713 Not tainted 4.16.0-rc1+ #18
| |     Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
| |     RIP: 0010:smc_create+0x14e/0x300 net/smc/af_smc.c:1410
| |     RSP: 0018:ffff8801b06afbc8 EFLAGS: 00010202
| |     RAX: dffffc0000000000 RBX: ffff8801b63457c0 RCX: ffffffff85a3e746
| |     RDX: 0000000000000004 RSI: 00000000ffffffff RDI: 0000000000000020
| |     RBP: ffff8801b06afbf0 R08: 00000000000007c0 R09: 0000000000000000
| |     R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
| |     R13: ffff8801b6345c08 R14: 00000000ffffffe9 R15: ffffffff8695ced0
| |     FS:  0000000001afb880(0000) GS:ffff8801db200000(0000) knlGS:0000000000000000
| |     CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
| |     CR2: 0000000020000040 CR3: 00000001b0721004 CR4: 00000000001606f0
| |     DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
| |     DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
| |     Call Trace:
| |      __sock_create+0x4d4/0x850 net/socket.c:1285
| |      sock_create net/socket.c:1325 [inline]
| |      SYSC_socketpair net/socket.c:1409 [inline]
| |      SyS_socketpair+0x1c0/0x6f0 net/socket.c:1366
| |      do_syscall_64+0x282/0x940 arch/x86/entry/common.c:287
| |      entry_SYSCALL_64_after_hwframe+0x26/0x9b
| |     RIP: 0033:0x4404b9
| |     RSP: 002b:00007fff44ab6908 EFLAGS: 00000246 ORIG_RAX: 0000000000000035
| |     RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00000000004404b9
| |     RDX: 0000000000000000 RSI: 0000000000000001 RDI: 000000000000002b
| |     RBP: 00007fff44ab6910 R08: 0000000000000002 R09: 00007fff44003031
| |     R10: 0000000020000040 R11: 0000000000000246 R12: ffffffffffffffff
| |     R13: 0000000000000006 R14: 0000000000000000 R15: 0000000000000000
| |     Code: 48 c1 ea 03 80 3c 02 00 0f 85 b3 01 00 00 4c 8b a3 48 04 00 00 48 b8
| |     00 00 00 00 00 fc ff df 49 8d 7c 24 20 48 89 fa 48 c1 ea 03 <80> 3c 02 00
| |     0f 85 82 01 00 00 4d 8b 7c 24 20 48 b8 00 00 00 00
| |     RIP: smc_create+0x14e/0x300 net/smc/af_smc.c:1410 RSP: ffff8801b06afbc8
| |
| |     Fixes: cd6851f30386 smc: remote memory buffers (RMBs)
| |     Reported-and-tested-by: syzbot+aa0227369be2dcc26ebe@syzkaller.appspotmail.com
| |     Signed-off-by: Davide Caratti <dcaratti@redhat.com>
| |     Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
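The general rule behind this fix, checking the return code before touching an output pointer, can be sketched in plain C. All names and values below are hypothetical; `demo_create` merely stands in for a creator function that leaves its output pointer untouched on failure.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical object with the two buffer sizes the caller wants to read. */
struct demo_inner {
	int sndbuf;
	int rcvbuf;
};

/* Hypothetical creator: on failure, *out is NOT written. */
int demo_create(int fail, struct demo_inner *storage, struct demo_inner **out)
{
	if (fail)
		return -EAFNOSUPPORT;
	storage->sndbuf = 65536;	/* illustrative values */
	storage->rcvbuf = 65536;
	*out = storage;
	return 0;
}

/* Copy the buffer sizes only after the creator reported success. */
int demo_setup(int fail, int *sndbuf, int *rcvbuf)
{
	struct demo_inner storage;
	struct demo_inner *inner = NULL;
	int rc;

	rc = demo_create(fail, &storage, &inner);
	if (rc)
		return rc;		/* bail out before touching inner */

	*sndbuf = inner->sndbuf;
	*rcvbuf = inner->rcvbuf;
	return 0;
}
```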
| * net/smc: use link_id of server in confirm link reply    Karsten Graul    2018-02-28    2    -1/+2
| |     The CONFIRM LINK reply message must contain the link_id sent
| |     by the server. Also set the link_id explicitly when
| |     initializing the link.
| |
| |     Signed-off-by: Karsten Graul <kgraul@linux.vnet.ibm.com>
| |     Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>