summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* fib_trie: coding style: Use pointer after checkFiro Yang2015-06-071-8/+13
| | | | | | | | | | | | | | | | | | | | As Alexander Duyck pointed out that: struct tnode { ... struct key_vector kv[1]; } The kv[1] member of struct tnode is an arry that refernced by a null pointer will not crash the system, like this: struct tnode *p = NULL; struct key_vector *kv = p->kv; As such p->kv doesn't actually dereference anything, it is simply a means for getting the offset to the array from the pointer p. This patch make the code more regular to avoid making people feel odd when they look at the code. Signed-off-by: Firo Yang <firogm@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* wan: dscc4: use msecs_to_jiffies for conversionsNicholas Mc Guire2015-06-071-3/+3
| | | | | | | | | | | | | | | | | | API compliance scanning with coccinelle flagged: ./drivers/net/wan/dscc4.c:1036:1-33: WARNING: timeout (10) seems HZ dependent ./drivers/net/wan/dscc4.c:554:2-34: WARNING: timeout (10) seems HZ dependent ./drivers/net/wan/dscc4.c:599:2-34: WARNING: timeout (10) seems HZ dependent Numeric constants passed to schedule_timeout_*() make the effective timeout HZ dependent which does not seem to be the intent here. Fixed up by converting the constant to jiffies with msecs_to_jiffies(), passing 100ms (assuming HZ==100 in the original code). Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* cosa: use msecs_to_jiffies for conversionsNicholas Mc Guire2015-06-071-1/+1
| | | | | | | | | | | | | API compliance scanning with coccinelle flagged: ./drivers/net/wan/cosa.c:520:2-18: WARNING: timeout (30) seems HZ dependent Numeric constants passed to schedule_timeout() make the effective timeout HZ dependent which makes little sense in a device probe. Fixed up by converting the constant to jiffies with msecs_to_jiffies() Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/mlx5_core: Fix static checker warnings around system guid query flowMajd Dibbiny2015-06-073-3/+4
| | | | | | | | | Fix static checker warnings in the flow of system guid query. Fixes: 707c4602cda6 ('net/mlx5_core: Add new query HCA vport commands') Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* isdn/hisax: Convert use of __constant_cpu_to_le16 to cpu_to_le16Vaishali Thakkar2015-06-071-2/+2
| | | | | | | | | | | | | | | | | | | In big endian cases, macro cpu_to_le16 unfolds to __swab16 which provides special case for constants. In little endian cases, __constant_cpu_to_le16 and cpu_to_le16 expand directly to the same expression. So, replace __constant_cpu_to_le16 with cpu_to_le16 with the goal of getting rid of the definition of __constant_cpu_to_le16 completely. The semantic patch that performs this transformation is as follows: @@expression x;@@ - __constant_cpu_to_le16(x) + cpu_to_le16(x) Signed-off-by: Vaishali Thakkar <vthakkar1994@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ethernet: micrel: use time_is_before_eq_jiffiesAntonio Murdaca2015-06-071-1/+1
| | | | | | | use time_is_before_eq_jiffies macro for time comparison Signed-off-by: Antonio Murdaca <antonio.murdaca@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: fec: ptp: correct the ENET_ATCOR valueFugang Duan2015-06-071-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | The current driver adjust freq formula is: fe * diff = ppb * pc Note: fe: ENET ref clock frequency in Hz diff = inc_corr - inc: difference between default increment and correction increment ppb: parts per billion adjustment from base pc: correction period (in number of fe clock cycles) The correction increment will be used after N cycles of regular increments, not every N cycles (with N being the correction period). For example, set ENET_ATCOR=4, INC=8, INC_CORR=9, there will be 4 increments of 8 (ENET_ATINC[INC]) , followed by 1 increment of 9 (ENET_ATINC[INC_CORR]). So, the correct formula is: fe * diff = ppb * (pc + 1) For ENET_ATCOR, a value 0 disables the correction counter and no corrections occur. So base on the origin formula, set pc = pc > 1 ? pc - 1 : pc. Signed-off-by: Fugang Duan <B38611@freescale.com> Signed-off-by: Frank Li <Frank.Li@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: ll_temac: Remove sparse warningsMichal Simek2015-06-071-6/+6
| | | | | | | | | | | | | | | | | | | Remove sparse warnings: drivers/net/ethernet/xilinx/ll_temac_main.c:65:16: warning: cast removes address space of expression drivers/net/ethernet/xilinx/ll_temac_main.c:70:9: warning: cast removes address space of expression drivers/net/ethernet/xilinx/ll_temac_main.c:127:16: warning: cast removes address space of expression drivers/net/ethernet/xilinx/ll_temac_main.c:137:9: warning: cast removes address space of expression drivers/net/ethernet/xilinx/ll_temac_main.c:409:3: warning: symbol 'temac_options' was not declared. Should it be static? drivers/net/ethernet/xilinx/ll_temac_main.c:590:6: warning: symbol 'temac_adjust_link' was not declared. Should it be static? Signed-off-by: Michal Simek <michal.simek@xilinx.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: get_cookie_sock() consolidationEric Dumazet2015-06-073-23/+9
| | | | | | | | | | IPv4 and IPv6 share same implementation of get_cookie_sock(), and there is no point inlining it. We add tcp_ prefix to the common helper name and export it. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: bcmgenet: improve TX timeoutFlorian Fainelli2015-06-071-0/+67
| | | | | | | | | | | | | Dump useful ring statistics along with interrupt status, software maintained pointers and hardware registers to help troubleshoot TX queue stalls. When a timeout occurs, disable TX NAPI for the rings, dump their states while interrupts are disabled, re-enable interrupts, NAPI and queue flow control to help with the recovery. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: dsa: mv88e6xxx: Fix deadlock by double lockAndrew Lunn2015-06-071-2/+2
| | | | | | | | | | ethtool -S on a DSA interface can deadlock for some switches because the same lock is taken twice. Use the register read function which expects the lock to be already held. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Fixes: 31888234b736 ("net: dsa: mv88e6xxx: Replace stats mutex with SMI mutex") Signed-off-by: David S. Miller <davem@davemloft.net>
* bpf: allow programs to write to certain skb fieldsAlexei Starovoitov2015-06-076-48/+207
| | | | | | | | | | | | | | | | | | | allow programs read/write skb->mark, tc_index fields and ((struct qdisc_skb_cb *)cb)->data. mark and tc_index are generically useful in TC. cb[0]-cb[4] are primarily used to pass arguments from one program to another called via bpf_tail_call() which can be seen in sockex3_kern.c example. All fields of 'struct __sk_buff' are readable to socket and tc_cls_act progs. mark, tc_index are writeable from tc_cls_act only. cb[0]-cb[4] are writeable by both sockets and tc_cls_act. Add verifier tests and improve sample code. Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bpf: make programs see skb->data == L2 for ingress and egressAlexei Starovoitov2015-06-074-29/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | eBPF programs attached to ingress and egress qdiscs see inconsistent skb->data. For ingress L2 header is already pulled, whereas for egress it's present. This is known to program writers which are currently forced to use BPF_LL_OFF workaround. Since programs don't change skb internal pointers it is safe to do pull/push right around invocation of the program and earlier taps and later pt->func() will not be affected. Multiple taps via packet_rcv(), tpacket_rcv() are doing the same trick around run_filter/BPF_PROG_RUN even if skb_shared. This fix finally allows programs to use optimized LD_ABS/IND instructions without BPF_LL_OFF for higher performance. tc ingress + cls_bpf + samples/bpf/tcbpf1_kern.o w/o JIT w/JIT before 20.5 23.6 Mpps after 21.8 26.6 Mpps Old programs with BPF_LL_OFF will still work as-is. We can now undo most of the earlier workaround commit: a166151cbe33 ("bpf: fix bpf helpers to use skb->mac_header relative offsets") Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: remove redundant checks IIEric Dumazet2015-06-072-8/+0
| | | | | | | | For same reasons than in commit 12e25e1041d0 ("tcp: remove redundant checks"), we can remove redundant checks done for timewait sockets. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* fddi: print an address with %p format specifier rather than %xColin Ian King2015-06-071-1/+1
| | | | | | | | The debug is printing the struct smt_header * address using the %x format specifier. Fix it to use %p instead. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* wan: dscc4: fix build warning Wunused-but-set-variableNicholas Mc Guire2015-06-071-3/+0
| | | | | | | | | | | | | Fix: drivers/net/wan/dscc4.c: In function 'dscc4_open': drivers/net/wan/dscc4.c:1049:25: warning: variable 'ppriv' set but not used [-Wunused-but-set-variable] This has been in there unused since 1da177e4c3f (Linux-2.6.12-rc2) simply remove it. Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* inet: add IP_BIND_ADDRESS_NO_PORT to overcome bind(0) limitationsEric Dumazet2015-06-065-2/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When an application needs to force a source IP on an active TCP socket it has to use bind(IP, port=x). As most applications do not want to deal with already used ports, x is often set to 0, meaning the kernel is in charge to find an available port. But kernel does not know yet if this socket is going to be a listener or be connected. It has very limited choices (no full knowledge of final 4-tuple for a connect()) With limited ephemeral port range (about 32K ports), it is very easy to fill the space. This patch adds a new SOL_IP socket option, asking kernel to ignore the 0 port provided by application in bind(IP, port=0) and only remember the given IP address. The port will be automatically chosen at connect() time, in a way that allows sharing a source port as long as the 4-tuples are unique. This new feature is available for both IPv4 and IPv6 (Thanks Neal) Tested: Wrote a test program and checked its behavior on IPv4 and IPv6. strace(1) shows sequences of bind(IP=127.0.0.2, port=0) followed by connect(). Also getsockname() show that the port is still 0 right after bind() but properly allocated after connect(). socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 5 setsockopt(5, SOL_IP, IP_BIND_ADDRESS_NO_PORT, [1], 4) = 0 bind(5, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("127.0.0.2")}, 16) = 0 getsockname(5, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("127.0.0.2")}, [16]) = 0 connect(5, {sa_family=AF_INET, sin_port=htons(53174), sin_addr=inet_addr("127.0.0.3")}, 16) = 0 getsockname(5, {sa_family=AF_INET, sin_port=htons(38050), sin_addr=inet_addr("127.0.0.2")}, [16]) = 0 IPv6 test : socket(PF_INET6, SOCK_STREAM, IPPROTO_IP) = 7 setsockopt(7, SOL_IP, IP_BIND_ADDRESS_NO_PORT, [1], 4) = 0 bind(7, {sa_family=AF_INET6, sin6_port=htons(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 0 getsockname(7, {sa_family=AF_INET6, sin6_port=htons(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 0 connect(7, {sa_family=AF_INET6, sin6_port=htons(57300), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 0 getsockname(7, {sa_family=AF_INET6, sin6_port=htons(60964), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 0 I was able to bind()/connect() a million concurrent IPv4 sockets, instead of ~32000 before patch. lpaa23:~# ulimit -n 1000010 lpaa23:~# ./bind --connect --num-flows=1000000 & 1000000 sockets lpaa23:~# grep TCP /proc/net/sockstat TCP: inuse 2000063 orphan 0 tw 47 alloc 2000157 mem 66 Check that a given source port is indeed used by many different connections : lpaa23:~# ss -t src :40000 | head -10 State Recv-Q Send-Q Local Address:Port Peer Address:Port ESTAB 0 0 127.0.0.2:40000 127.0.202.33:44983 ESTAB 0 0 127.0.0.2:40000 127.2.27.240:44983 ESTAB 0 0 127.0.0.2:40000 127.2.98.5:44983 ESTAB 0 0 127.0.0.2:40000 127.0.124.196:44983 ESTAB 0 0 127.0.0.2:40000 127.2.139.38:44983 ESTAB 0 0 127.0.0.2:40000 127.1.59.80:44983 ESTAB 0 0 127.0.0.2:40000 127.3.6.228:44983 ESTAB 0 0 127.0.0.2:40000 127.0.38.53:44983 ESTAB 0 0 127.0.0.2:40000 127.1.197.10:44983 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* cxgb4: Fix static checker warningHariprasad Shenai2015-06-051-1/+1
| | | | | | | | | | | | | | | The patch e85c9a7abfa4: ("cxgb4/cxgb4vf: Add code to calculate T5 BAR2 Offsets for SGE Queue Registers") from Dec 3, 2014, leads to the following static checker warning: drivers/net/ethernet/chelsio/cxgb4/t4_hw.c:5358 t4_bar2_sge_qregs() warn: should '(qid >> qpp_shift) << page_shift' be a 64 bit type? This patch fixes it Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'cxgb4-next'David S. Miller2015-06-055-51/+257
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Hariprasad Shenai says: ==================== Free VI, flush sge ec and some other misc. fixes This patch series adds the following. Free VI interface during remove, flush SGE ec routine, rename t4_link_start to t4_link_l1cfg since it only does l1 configuration, set mac addr from when we can't contact firmware for debug purpose, set pcie completion timeout and use fw interface to access TP_PIO_XXX registers This patch series has been created against net-next tree and includes patches on cxgb4 driver. We have included all the maintainers of respective drivers. Kindly review the change and let us know in case of any review comments. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * cxgb4: Use FW LDST cmd to access TP_PIO_{ADDR, DATA} register firstHariprasad Shenai2015-06-052-21/+83
| | | | | | | | | | | | | | | | | | | | | | The TP_PIO_{ADDR,DATA} registers are are in conflict with the firmware's use of these registers. Added a routine to access it through FW LDST cmd. Access all TP_PIO_{ADDR,DATA} register access through new routine if FW is alive. If firmware is dead, than fall back to indirect access. Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * cxgb4: program pci completion timeoutHariprasad Shenai2015-06-051-0/+19
| | | | | | | | | | | | | | Set pci completion timeout to 0xd. Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * cxgb4: Set mac addr from vpd, when we can't contact firmwareHariprasad Shenai2015-06-053-19/+81
| | | | | | | | | | | | | | | | Grab the Adapter MAC Address out of the VPD and use it for the "debug" network interface when either we can't contact the firmware Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * cxgb4: Rename t4_link_start() to t4_link_l1cfgHariprasad Shenai2015-06-054-6/+6
| | | | | | | | | | | | | | | | | | | | | | t4_link_start() was completely misnamed. It does _not_ start up the link. It merely does the L1 Configuration for the link. The Link Up process is started automatically by the firmware when the number of enabled Virtual Interfaces on a port goes from 0 to 1. So renaming this routine to t4_link_l1cfg() for better documentation. Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * cxgb4: Add sge ec context flush serviceHariprasad Shenai2015-06-054-5/+33
| | | | | | | | | | | | | | | | Add function to flush the sge ec context cache, and utilize this new function in the driver Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * cxgb4: Free Virtual Interfaces in remove routineHariprasad Shenai2015-06-053-0/+35
|/ | | | | | | | Free VI interfaces in remove routine. If we don't do this then the firmware will never drop the physical link to the peer. Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'mlx5-next'David S. Miller2015-06-0424-337/+1800
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Or Gerlitz says: ==================== mlx5: Add Interface Step Sequence ID support ISSI (Interface Step Sequence ID) defines the step sequence ID of the interface between the driver to the firmware and is incremented by steps of one. ISSI is used to enable deprecating/modifying features, command interfaces and such, while maintaining compatibility. As the driver serves both ConnectIB (CIB) and ConnectX4, we carefully made sure that the IB functionality keeps running also on older CIB firmware releases that don't support ISSI. The Ethernet functionailty is available only on ConnectX4 where all firmware releases support the feature since the very basic ISSI level. So at this point no need for compatility code there. As done prior to this series, when the Ethernet functionlity is enabled, during the initialization flow, the core driver performs a query of the supported ISSIs using the QUERY_ISSI command, and then, if ISSI is supported, sets the actual issi value informing the firmware on which ISSI level to run, using SET_ISSI command. Previously, the IB driver wasn't ready to work on that mode, and hence building both the IB driver and the Ethernet functionality in the core driver were disallowed by Kconfigs, with this series, we allow users to enable them both. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * mlx5: Enable mutual support for IB and EthernetHaggai Abramonvsky2015-06-045-2/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ethernet functionality is only available when working in ISSI > 0 mode. Previously, the IB driver wasn't ready to work on that mode, and hence building both the IB driver and the Ethernet functionality in the core driver were disallowed by Kconfigs. Now, once we have all the pre-steps in place, we can remove this limitation. The last steps in the IB driver for getting that setup to work are: create dummy SRQ for the driver's use (until now we could use XRC_SRQ as SRQ and XRC_SRQ, after moving to ISSI > 0, we separate XRC SRQs from basic SRQs) and adapt the create QP function to be compatible with ISSI > 0. Signed-off-by: Haggai Abramovsky <hagaya@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * IB/mlx5: Don't create IB instance over Ethernet portsMajd Dibbiny2015-06-041-0/+4
| | | | | | | | | | | | | | | | | | Since we still don't have RoCE support in mlx5, avoid creating IB driver instance over Ethernet ports. Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * IB/mlx5: Avoid using the MAD_IFC command under ISSI > 0 modeMajd Dibbiny2015-06-043-173/+657
| | | | | | | | | | | | | | | | | | | | In ISSI > 0 mode, most of the MAD_IFC command features are deprecated, and can't be used. Therefore, when in that mode, we replace all of them with other commands that provide the required functionality. Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/mlx5_core: Add more query port helpersMajd Dibbiny2015-06-042-0/+75
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Add the following helpers: 1. mlx5_query_port_proto_oper -- queries the port speed port mask 2. mlx5_query_port_link_width_oper - queries the port link with bitmask 3. mlx5_query_port_vl_hw_cap - queries the Virtual Lanes supported on this port These helpers will be used from the IB driver when working in ISSI > 0 mode. Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/mlx5_core: Use port number when querying port ptysMajd Dibbiny2015-06-043-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | Until now, mlx5_query_port_ptys always queried port number one. Added new argument in the function's prototype so we can also query the second port. This will be needed when thr helper will be invoked from the IB driver on non FPP (Function-Per-Port) devices. Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/mlx5_core: Use port number in the query port mtu helpersMajd Dibbiny2015-06-043-10/+15
| | | | | | | | | | | | | | | | | | | | | | | | Extend the function prototypes for max and operational mtu to take the local port number. In the Ethernet driver is this hard coded to one, since ConnectX4 Ethernet devices are always function-per-port. The IB driver also serves older devices (ConnectIB) which isn't such, and hence the part can vary. Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/mlx5_core: Get vendor-id using the query adapter commandMajd Dibbiny2015-06-045-20/+53
| | | | | | | | | | | | | | | | | | | | | | | | | | Add two wrapper functions to the query adapter command: 1. mlx5_query_board_id -- replaces the old mlx5_cmd_query_adapter. 2. mlx5_core_query_vendor_id -- retrieves the vendor_id from the query_adapter command. Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/mlx5_core: Add new query HCA vport commandsMajd Dibbiny2015-06-047-19/+349
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Added the implementation for the following commands: 1. QUERY_HCA_VPORT_GID 2. QUERY_HCA_VPORT_PKEY 3. QUERY_HCA_VPORT_CONTEXT They will be needed when we move to work with ISSI > 0 in the IB driver too. Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/mlx5_core: Make the vport helpers available for the IB driver tooMajd Dibbiny2015-06-044-5/+7
| | | | | | | | | | | | | | | | | | | | | | | | Move the vport header file to be under include/linux/mlx5, such that the mlx5 IB can use it as well. Also add nic_ prefix to the vport NIC commands to differeniate between HCA vport commands and NIC vport commands. Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/mlx5_core: Check the return bitmask when querying ISSIHaggai Abramonvsky2015-06-041-1/+1
| | | | | | | | | | | | | | | | | | | | | | The determination of the supported ISSI versions should be conditioned on the returned mask, and not only on the return status of the query ISSI command, fix that. Signed-off-by: Haggai Abramovsky <hagaya@mellanox.com> Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/mlx5_core: Enable XRCs and SRQs when using ISSI > 0Haggai Abramonvsky2015-06-047-73/+564
| | | | | | | | | | | | | | | | | | | | | | | | When working in ISSI > 0 mode, the model exposed by the device for XRCs and SRQs is different. XRCs use XRC SRQs and plain SRQs are based on RPM (Receive Memory Pool). Add helper functions to create, modify, query, and arm XRC SRQs and RMPs. Signed-off-by: Haggai Abramovsky <hagaya@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/mlx5_core: Apply proper name convention to helpersHaggai Abramonvsky2015-06-043-30/+36
| | | | | | | | | | | | | | | | | | Some core helper functions were named with mlx5_ only prefix, fix that to mlx5_core_ so we're aligned with the overall scheme used for core services. Signed-off-by: Haggai Abramovsky <hagaya@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/mlx5_en: Add missing check for memory allocation failureAmir Vadai2015-06-041-0/+2
|/ | | | | | | | | | | | | | The patch afb736e9330a: "net/mlx5: Ethernet resource handling files" from May 28, 2015, leads to the following static checker warning: drivers/net/ethernet/mellanox/mlx5/core/en_flow_table.c:726 mlx5e_create_main_flow_table() error: potential null dereference 'g'. (kcalloc returns null) Fixes: afb736e9330a ("net/mlx5: Ethernet resource handling files") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'flow_key_hashing'David S. Miller2015-06-0412-135/+387
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Tom Herbert says: ==================== net: Increase inputs to flow_keys hashing This patch set adds new fields to the flow_keys structure and hashes over these fields to get a better flow hash. In particular, these patches now include hashing over the full IPv6 addresses in order to defend against address spoofing that always results in the same hash. The new input also includes the Ethertype, L4 protocol, VLAN, flow label, GRE keyid, and MPLS entropy label. In order to increase hash inputs, we switch to using jhash2 which operates an an array of u32's. jhash2 operates on multiples of three words. The data in the hash is constructed for that, and there are are two variants for IPv4 and Ipv6 addressing. For IPv4 addresses, jhash is performed over six u32's and for IPv6 it is done over twelve. flow_keys can store either IPv4 or IPv6 addresses (addr_proto field is a selector). ipv6_addr_hash is no longer used to convert addresses for setting in flow table. For legacy uses of flow keys outside of flow_dissector the flow_get_u32_src and flow_get_u32_dst functions have been added to get u32 representation representations of addresses in flow_keys. For flow lables we also eliminate the short circuit in flow_dissector for non-zero flow label. The flow label is now considered additional input to ports. Testing: Ran netperf TCP_RR for 200 flows using IPv4 and IPv6 comparing before the patches and with the patches. Did not detect any performance degradation. v2: - Took out MPLS entropy label. Will add this later. v3: - Ensure hash start offset is a four byte boundary. Add BUG_BUILD_ON to check for this. - Fixes sparse error in GRE to get entropy from keyid. v4: - Rebase to Jiri changes to generalize flow dissection - Support TIPC as its own address - Bring back MPLS entropy label dissection - Remove FLOW_DISSECTOR_KEY_IPV6_HASH_ADDRS v5: - Minor fixes from feedback v6: - Cleanup and sparse issue with flow label - Change keyid to returned by flow_dissector to be __be32 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * mpls: Add MPLS entropy label in flow_keysTom Herbert2015-06-042-0/+36
| | | | | | | | | | | | | | | | | | In flow dissector if an MPLS header contains an entropy label this is saved in the new keyid field of flow_keys. The entropy label is then represented in the flow hash function input. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: Add GRE keyid in flow_keysTom Herbert2015-06-042-1/+29
| | | | | | | | | | | | | | | | | | In flow dissector if a GRE header contains a keyid this is saved in the new keyid field of flow_keys. The GRE keyid is then represented in the flow hash function input. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: Add IPv6 flow label to flow_keysTom Herbert2015-06-042-20/+13
| | | | | | | | | | | | | | | | | | | | In flow_dissector set the flow label in flow_keys for IPv6. This also removes the shortcircuiting of flow dissection when a non-zero label is present, the flow label can be considered to provide additional entropy for a hash. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: Add VLAN ID to flow_keysTom Herbert2015-06-042-0/+20
| | | | | | | | | | | | | | In flow_dissector set vlan_id in flow_keys when VLAN is found. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: Get rid of IPv6 hash addresses flow keysTom Herbert2015-06-042-18/+0
| | | | | | | | | | | | | | | | | | | | | | We don't need to return the IPv6 address hash as part of flow keys. In general, using the IPv6 address hash is risky in a hash value since the underlying use of xor provides no entropy. If someone really needs the hash value they can get it from the full IPv6 addresses in flow keys (e.g. from flow_get_u32_src). Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: Add keys for TIPC addressTom Herbert2015-06-042-5/+23
| | | | | | | | | | | | | | Add a new flow key for TIPC addresses. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: Add full IPv6 addresses to flow_keysTom Herbert2015-06-0410-63/+193
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds full IPv6 addresses into flow_keys and uses them as input to the flow hash function. The implementation supports either IPv4 or IPv6 addresses in a union, and selector is used to determine how may words to input to jhash2. We also add flow_get_u32_dst and flow_get_u32_src functions which are used to get a u32 representation of the source and destination addresses. For IPv6, ipv6_addr_hash is called. These functions retain getting the legacy values of src and dst in flow_keys. With this patch, Ethertype and IP protocol are now included in the flow hash input. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: Get skb hash over flow_keys structureTom Herbert2015-06-046-17/+66
| | | | | | | | | | | | | | | | | | | | | | | | This patch changes flow hashing to use jhash2 over the flow_keys structure instead just doing jhash_3words over src, dst, and ports. This method will allow us take more input into the hashing function so that we can include full IPv6 addresses, VLAN, flow labels etc. without needing to resort to xor'ing which makes for a poor hash. Acked-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: Remove superfluous setting of key_basicTom Herbert2015-06-041-6/+0
| | | | | | | | | | | | | | | | | | key_basic is set twice in __skb_flow_dissect which seems unnecessary. Remove second one. Acked-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * mpls: Add definition for IPPROTO_MPLSTom Herbert2015-06-041-0/+2
| | | | | | | | | | | | | | | | Add uapi define for MPLS over IP. Acked-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
OpenPOWER on IntegriCloud