summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* ipv6: Compare addresses only bits up to the prefix length (RFC6724).YOSHIFUJI Hideaki / 吉藤英明2012-09-131-2/+4
| | | | | | | | Compare bits up to the source address's prefix length only to allows DNS load balancing to continue to be used as a tie breaker. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv6: Add labels for site-local and 6bone testing addresses (RFC6724)YOSHIFUJI Hideaki / 吉藤英明2012-09-131-1/+13
| | | | | | | | | | | | Added labels for site-local addresses (fec0::/10) and 6bone testing addresses (3ffe::/16) in order to depreference them. Note that the RFC introduced new rows for Teredo, ULA and 6to4 addresses in the default policy table. Some of them have different labels from ours. For backward compatibility, we do not change the "default" labels. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* scsi_netlink: Remove dead and buggy codeEric W. Biederman2012-09-132-564/+15
| | | | | | | | | | | | | | | | | | | | | | The scsi netlink code confuses the netlink port id with a process id, going so far as to read NETLINK_CREDS(skb)->pid instead of the correct NETLINK_CB(skb).pid. Fortunately it does not matter because nothing registers to respond to scsi netlink requests. The only interesting use of the scsi_netlink interface is fc_host_post_vendor_event which sends a netlink multicast message. Since nothing registers to handle scsi netlink messages kill all of the registration logic, while retaining the same error handling behavior preserving the userspace visible behavior and removing all of the confused code that thought a netlink port id was a process id. This was tested with a kernel allyesconfig build which had no problems. Cc: James Bottomley <James.Bottomley@parallels.com> Cc: James Smart <James.Smart@Emulex.Com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netprio_cgroup: Use memcpy instead of the for-loop to copy priomapSrivatsa S. Bhat2012-09-131-5/+4
| | | | | | | Replace the current (inefficient) for-loop with memcpy, to copy priomap. Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netprio_cgroup: Remove update_netdev_tables() since it is unnecessarySrivatsa S. Bhat2012-09-131-32/+0
| | | | | | | | | The update_netdev_tables() function appears to be unnecessary, since the write_update_netdev_table() function will adjust the priomaps as and when required anyway. So drop the usage of update_netdev_tables() entirely. Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bnx2x: Utilize Link Flap AvoidanceYuval Mintz2012-09-136-26/+51
| | | | | | | | | | | | | Change various flows in the bnx2x driver which up until now flapped the link - these flows now benefit from the link flap avoidance mechanism. This includes the removal of the link reset made upon nic init, as it is possible the link is already active at that time. Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Yaniv Rosner <yaniv.rosner@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bnx2x: Link Flap AvoidanceYaniv Rosner2012-09-133-48/+437
| | | | | | | | | | | | | | | | | | Various flows in the bnx2x driver cause a link-flap - if the link is up, it would be toggled down (after a mac/phy reset) and then taken back up. In many of these cases, there is no need to do cause such a flap, as the associated flows should not actually affect the link. This patch adds the 'Link Flap Avoidance' mechanism, which allows the driver to better determine if a given flow requires a link change, and thus minimize the number of link flaps caused by the driver. Signed-off-by: Yaniv Rosner <yaniv.rosner@broadcom.com> Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bnx2x: link code refactoringYaniv Rosner2012-09-132-79/+114
| | | | | | | | | | | | Separate the interrupt setting part of each external PHY to a specific function. This allows calling the interrupt setting in case of link-flap avoidance, since some link owners may not enable the interrupt on their own. Signed-off-by: Yaniv Rosner <yaniv.rosner@broadcom.com> Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of git://1984.lsi.us.es/nf-nextDavid S. Miller2012-09-135-9/+4
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pablo Neira Ayuso says: ==================== The following patchset contains four Netfilter updates, mostly targeting to fix issues added with IPv6 NAT, and one little IPVS update for net-next: * Remove unneeded conditional free of skb in nfnetlink_queue, from Wei Yongjun. * One semantic path from coccinelle detected the use of list_del + INIT_LIST_HEAD, instead of list_del_init, again from Wei Yongjun. * Fix out-of-bound memory access in the NAT address selection, from Florian Westphal. This was introduced with the IPv6 NAT patches. * Two fixes for crashes that were introduced in the recently merged IPv6 NAT support, from myself. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * netfilter: ctnetlink: fix module auto-load in ctnetlink_parse_natPablo Neira Ayuso2012-09-121-3/+0
| | | | | | | | | | | | | | | | | | | | (c7232c9 netfilter: add protocol independent NAT core) added incorrect locking for the module auto-load case in ctnetlink_parse_nat. That function is always called from ctnetlink_create_conntrack which requires no locking. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * ipvs: use list_del_init instead of list_del/INIT_LIST_HEADWei Yongjun2012-09-101-2/+1
| | | | | | | | | | | | | | | | | | | | | | Using list_del_init() instead of list_del() + INIT_LIST_HEAD(). spatch with a semantic match is used to found this problem. (http://coccinelle.lip6.fr/) Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * netfilter: nfnetlink_queue: remove pointless conditional before kfree_skb()Wei Yongjun2012-09-091-2/+1
| | | | | | | | | | | | | | Remove pointless conditional before kfree_skb(). Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * netfilter: nf_nat: fix out-of-bounds access in address selectionFlorian Westphal2012-09-091-1/+1
| | | | | | | | | | | | | | | | include/linux/jhash.h:138:16: warning: array subscript is above array bounds [jhash2() expects the number of u32 in the key] Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * netfilter: fix crash during boot if NAT has been compiled built-inPablo Neira Ayuso2012-09-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (c7232c9 netfilter: add protocol independent NAT core) introduced a problem that leads to crashing during boot due to NULL pointer dereference. It seems that xt_nat calls xt_register_target() before xt_init(): net/netfilter/x_tables.c:static struct xt_af *xt; is NULL and we crash on xt_register_target(struct xt_target *target) { u_int8_t af = target->family; int ret; ret = mutex_lock_interruptible(&xt[af].mutex); ... Fix this by changing the linking order, to make sure that x_tables comes before xt_nat. Reported-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* | x86 bpf_jit: support MOD operationEric Dumazet2012-09-101-0/+25
| | | | | | | | | | | | | | | | | | | | | | | | commit b6069a9570 (filter: add MOD operation) added generic support for modulus operation in BPF. This patch brings JIT support for x86_64 Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: George Bakos <gbakos@alpinista.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bnx2x: use native EEE instead of auto-greeenYuval Mintz2012-09-103-46/+73
| | | | | | | | | | | | | | | | | | This patch enables boards with 54618SE phys and a sufficiently new firmware to use native EEE instead of auto-greeen. Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bnx2x: correct & clean 10G EEE requirementsYuval Mintz2012-09-103-6/+5
| | | | | | | | | | | | Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bnx2x: EEE code refactoringYuval Mintz2012-09-101-192/+258
| | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to lay the foundation for 1G EEE support, several segments of code which are common to both 1G and 10G EEE configurations were extracted from the 10G EEE configuration flow to their own functions. E.g., bnx2x_eee_initial_config, bnx2x_eee_advertise, bnx2x_eee_disable, etc. The rest of the EEE functions were relocated and placed in a single, continuous section of the file. Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bnx2x: add EEE support for 4-port devicesYuval Mintz2012-09-101-2/+0
| | | | | | | | | | | | | | | | Prevent functions from disabling EEE to other functions using other ports. Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bnx2x: EEE status is read locallyYuval Mintz2012-09-103-7/+36
| | | | | | | | | | | | | | | | | | | | This patch aligns the EEE status with that of all other link properties, by changing the way its accessed - instead of a direct read to the shared memory, each function maintain its own copy locally. Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | ipv6: remove some useless RCU read lockAmerigo Wang2012-09-102-24/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After this commit: commit 97cac0821af4474ec4ba3a9e7a36b98ed9b6db88 Author: David S. Miller <davem@davemloft.net> Date: Mon Jul 2 22:43:47 2012 -0700 ipv6: Store route neighbour in rt6_info struct. we no longer use RCU to protect route neighbour. Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | etherdevice: introduce help function eth_zero_addr()Duan Jiong2012-09-101-0/+11
| | | | | | | | | | | | | | | | | | | | | | a lot of code has either the memset or an inefficient copy from a static array that contains the all-zeros Ethernet address. Introduce help function eth_zero_addr() to fill an address with all zeros, making the code clearer and allowing us to get rid of some constant arrays. Signed-off-by: Duan Jiong <djduanjiong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | cnic: Allocate UIO resources only on devices that support iSCSI.Michael Chan2012-09-102-3/+6
| | | | | | | | | | | | | | | | | | Update version to 2.5.13. Reviewed-by: Eddie Wai <eddie.wai@broadcom.com> Reviewed-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com> Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | cnic: Allocate kcq resource only on devices that support FCoE.Michael Chan2012-09-102-4/+7
| | | | | | | | | | | | | | | | | | | | To save memory and to exit IRQ loop quicker on devices that don't support FCoE. Reviewed-by: Eddie Wai <eddie.wai@broadcom.com> Reviewed-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com> Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | cnic: Add function pointers to arm IRQ for different devices.Michael Chan2012-09-102-4/+23
| | | | | | | | | | | | | | | | | | | | This will make it easier to exit IRQ loop and re-arm IRQ on devices that don't support FCoE. Reviewed-by: Eddie Wai <eddie.wai@broadcom.com> Reviewed-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com> Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | cnic: Free UIO rings when the device is closed.Michael Chan2012-09-101-0/+7
| | | | | | | | | | | | | | | | | | This will free up unneeded memory. Reviewed-by: Eddie Wai <eddie.wai@broadcom.com> Reviewed-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com> Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | cnic: Add functions to allocate and free UIO ringsMichael Chan2012-09-101-19/+40
| | | | | | | | | | | | | | | | | | | | These functions are needed to free up memory when the rings are no longer needed. Reviewed-by: Eddie Wai <eddie.wai@broadcom.com> Reviewed-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com> Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | filter: add MOD operationEric Dumazet2012-09-102-0/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a new ALU opcode, to compute a modulus. Commit ffe06c17afbbb used an ancillary to implement XOR_X, but here we reserve one of the available ALU opcode to implement both MOD_X and MOD_K Signed-off-by: Eric Dumazet <edumazet@google.com> Suggested-by: George Bakos <gbakos@alpinista.org> Cc: Jay Schulist <jschlst@samba.org> Cc: Jiri Pirko <jpirko@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | xfrm: Report user triggered expirations against the users socketEric W. Biederman2012-09-101-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a policy expiration is triggered from user space the request travels through km_policy_expired and ultimately into xfrm_exp_policy_notify which calls build_polexpire. build_polexpire uses the netlink port passed to km_policy_expired as the source port for the netlink message it builds. When a state expiration is triggered from user space the request travles through km_state_expired and ultimately into xfrm_exp_state_notify which calls build_expire. build_expire uses the netlink port passed to km_state_expired as the source port for the netlink message it builds. Pass nlh->nlmsg_pid from the user generated netlink message that requested the expiration to km_policy_expired and km_state_expired instead of current->pid which is not a netlink port number. Cc: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | netlink: Rename pid to portid to avoid confusionEric W. Biederman2012-09-1075-728/+728
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is a frequent mistake to confuse the netlink port identifier with a process identifier. Try to reduce this confusion by renaming fields that hold port identifiers portid instead of pid. I have carefully avoided changing the structures exported to userspace to avoid changing the userspace API. I have successfully built an allyesconfig kernel with this change. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | netlink: hide struct module parameter in netlink_kernel_createPablo Neira Ayuso2012-09-0820-37/+31
| | | | | | | | | | | | | | | | | | | | | | This patch defines netlink_kernel_create as a wrapper function of __netlink_kernel_create to hide the struct module *me parameter (which seems to be THIS_MODULE in all existing netlink subsystems). Suggested by David S. Miller. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | netlink: kill netlink_set_nonrootPablo Neira Ayuso2012-09-086-25/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace netlink_set_nonroot by one new field `flags' in struct netlink_kernel_cfg that is passed to netlink_kernel_create. This patch also renames NL_NONROOT_* to NL_CFG_F_NONROOT_* since now the flags field in nl_table is generic (so we can add more flags if needed in the future). Also adjust all callers in the net-next tree to use these flags instead of netlink_set_nonroot. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | netdev/phy: mdio-mux-mmioreg.c should include of_address.hTimur Tabi2012-09-071-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | mdio-mux-mmioreg.c uses function of_address_to_resource(), which is defined in linux/of_address.h. This fixes a compilation error: drivers/net/phy/mdio-mux-mmioreg.c: In function 'mdio_mux_mmioreg_probe': drivers/net/phy/mdio-mux-mmioreg.c:83:2: error: implicit declaration of function 'of_address_to_resource' Reported-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Timur Tabi <timur@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: rt_cache_flush() cleanupEric Dumazet2012-09-071-17/+2
| | | | | | | | | | | | | | | | | | We dont use jhash anymore since route cache removal, so we can get rid of get_random_bytes() calls for rt_genid changes. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: qmi_wwan: use a single bind function for all device typesBjørn Mork2012-09-071-29/+16
| | | | | | | | | | | | | | | | | | | | | | Refactoring the bind code lets us use a common driver_info struct for all supported devices, simplifying the code a bit. The real advantage is that devices using the CDC ECM interface layout now also can be added dynamically using the new_id sysfs interface. This simplifies testing of new devices. Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: qmi_wwan: increase max QMI message size to 4096Bjørn Mork2012-09-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | QMI requests exceeding 1500 bytes are possible and device firmware does not handle fragmented messages very well. It is therefore necessary to increase the maximum message size from the current 512 bytes. The protocol message size limit is not documented in any publicly known source, but the out of tree driver from CodeAurora use 4 kB. This is therefore chosen as the new arbitrary default until the real limit is known. This should allow any QMI message to be transmitted without fragmentation, fixing known issues with GPS assistance data upload. Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
* | ipv4/route: arg delay is useless in rt_cache_flush()Nicolas Dichtel2012-09-077-35/+22
| | | | | | | | | | | | | | | | | | Since route cache deletion (89aef8921bfbac22f), delay is no more used. Remove it. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | scm: Don't use struct ucred in NETLINK_CB and struct scm_cookie.Eric W. Biederman2012-09-074-12/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Passing uids and gids on NETLINK_CB from a process in one user namespace to a process in another user namespace can result in the wrong uid or gid being presented to userspace. Avoid that problem by passing kuids and kgids instead. - define struct scm_creds for use in scm_cookie and netlink_skb_parms that holds uid and gid information in kuid_t and kgid_t. - Modify scm_set_cred to fill out scm_creds by heand instead of using cred_to_ucred to fill out struct ucred. This conversion ensures userspace does not get incorrect uid or gid values to look at. - Modify scm_recv to convert from struct scm_creds to struct ucred before copying credential values to userspace. - Modify __scm_send to populate struct scm_creds on in the scm_cookie, instead of just copying struct ucred from userspace. - Modify netlink_sendmsg to copy scm_creds instead of struct ucred into the NETLINK_CB. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | igmp: avoid drop_monitor false positivesEric Dumazet2012-09-071-11/+19
| | | | | | | | | | | | | | | | | | igmp should call consume_skb() for all correctly processed packets, to avoid false dropwatch/drop_monitor false positives. Reported-by: Shawn Bohrer <sbohrer@rgmadvisors.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | drivers/net/usb/sierra_net.c: removes unnecessary semicolonPeter Senna Tschudin2012-09-071-1/+1
| | | | | | | | | | | | | | | | | | removes unnecessary semicolon Found by Coccinelle: http://coccinelle.lip6.fr/ Signed-off-by: Peter Senna Tschudin <peter.senna@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | ipv6: fix handling of throw routesNicolas Dichtel2012-09-071-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It's the same problem that previous fix about blackhole and prohibit routes. When adding a throw route, it was handled like a classic route. Moreover, it was only possible to add this kind of routes by specifying an interface. Before the patch: $ ip route add throw 2001::2/128 RTNETLINK answers: No such device $ ip route add throw 2001::2/128 dev eth0 $ ip -6 route | grep 2001::2 2001::2 dev eth0 metric 1024 After: $ ip route add throw 2001::2/128 $ ip -6 route | grep 2001::2 throw 2001::2 dev lo metric 1024 error -11 Reported-by: Markus Stenberg <markus.stenberg@iki.fi> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | tcp: fix TFO regressionEric Dumazet2012-09-062-7/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fengguang Wu reported various panics and bisected to commit 8336886f786fdac (tcp: TCP Fast Open Server - support TFO listeners) Fix this by making sure socket is a TCP socket before accessing TFO data structures. [ 233.046014] kfree_debugcheck: out of range ptr ea6000000bb8h. [ 233.047399] ------------[ cut here ]------------ [ 233.048393] kernel BUG at /c/kernel-tests/src/stable/mm/slab.c:3074! [ 233.048393] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC [ 233.048393] Modules linked in: [ 233.048393] CPU 0 [ 233.048393] Pid: 3929, comm: trinity-watchdo Not tainted 3.6.0-rc3+ #4192 Bochs Bochs [ 233.048393] RIP: 0010:[<ffffffff81169653>] [<ffffffff81169653>] kfree_debugcheck+0x27/0x2d [ 233.048393] RSP: 0018:ffff88000facbca8 EFLAGS: 00010092 [ 233.048393] RAX: 0000000000000031 RBX: 0000ea6000000bb8 RCX: 00000000a189a188 [ 233.048393] RDX: 000000000000a189 RSI: ffffffff8108ad32 RDI: ffffffff810d30f9 [ 233.048393] RBP: ffff88000facbcb8 R08: 0000000000000002 R09: ffffffff843846f0 [ 233.048393] R10: ffffffff810ae37c R11: 0000000000000908 R12: 0000000000000202 [ 233.048393] R13: ffffffff823dbd5a R14: ffff88000ec5bea8 R15: ffffffff8363c780 [ 233.048393] FS: 00007faa6899c700(0000) GS:ffff88001f200000(0000) knlGS:0000000000000000 [ 233.048393] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 233.048393] CR2: 00007faa6841019c CR3: 0000000012c82000 CR4: 00000000000006f0 [ 233.048393] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 233.048393] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 233.048393] Process trinity-watchdo (pid: 3929, threadinfo ffff88000faca000, task ffff88000faec600) [ 233.048393] Stack: [ 233.048393] 0000000000000000 0000ea6000000bb8 ffff88000facbce8 ffffffff8116ad81 [ 233.048393] ffff88000ff588a0 ffff88000ff58850 ffff88000ff588a0 0000000000000000 [ 233.048393] ffff88000facbd08 ffffffff823dbd5a ffffffff823dbcb0 ffff88000ff58850 [ 233.048393] Call Trace: [ 233.048393] [<ffffffff8116ad81>] kfree+0x5f/0xca [ 233.048393] [<ffffffff823dbd5a>] inet_sock_destruct+0xaa/0x13c [ 233.048393] [<ffffffff823dbcb0>] ? inet_sk_rebuild_header +0x319/0x319 [ 233.048393] [<ffffffff8231c307>] __sk_free+0x21/0x14b [ 233.048393] [<ffffffff8231c4bd>] sk_free+0x26/0x2a [ 233.048393] [<ffffffff825372db>] sctp_close+0x215/0x224 [ 233.048393] [<ffffffff810d6835>] ? lock_release+0x16f/0x1b9 [ 233.048393] [<ffffffff823daf12>] inet_release+0x7e/0x85 [ 233.048393] [<ffffffff82317d15>] sock_release+0x1f/0x77 [ 233.048393] [<ffffffff82317d94>] sock_close+0x27/0x2b [ 233.048393] [<ffffffff81173bbe>] __fput+0x101/0x20a [ 233.048393] [<ffffffff81173cd5>] ____fput+0xe/0x10 [ 233.048393] [<ffffffff810a3794>] task_work_run+0x5d/0x75 [ 233.048393] [<ffffffff8108da70>] do_exit+0x290/0x7f5 [ 233.048393] [<ffffffff82707415>] ? retint_swapgs+0x13/0x1b [ 233.048393] [<ffffffff8108e23f>] do_group_exit+0x7b/0xba [ 233.048393] [<ffffffff8108e295>] sys_exit_group+0x17/0x17 [ 233.048393] [<ffffffff8270de10>] tracesys+0xdd/0xe2 [ 233.048393] Code: 59 01 5d c3 55 48 89 e5 53 41 50 0f 1f 44 00 00 48 89 fb e8 d4 b0 f0 ff 84 c0 75 11 48 89 de 48 c7 c7 fc fa f7 82 e8 0d 0f 57 01 <0f> 0b 5f 5b 5d c3 55 48 89 e5 0f 1f 44 00 00 48 63 87 d8 00 00 [ 233.048393] RIP [<ffffffff81169653>] kfree_debugcheck+0x27/0x2d [ 233.048393] RSP <ffff88000facbca8> Reported-by: Fengguang Wu <wfg@linux.intel.com> Tested-by: Fengguang Wu <wfg@linux.intel.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: "H.K. Jerry Chu" <hkchu@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: H.K. Jerry Chu <hkchu@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | ipv6: fix handling of blackhole and prohibit routesNicolas Dichtel2012-09-052-4/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When adding a blackhole or a prohibit route, they were handling like classic routes. Moreover, it was only possible to add this kind of routes by specifying an interface. Bug already reported here: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=498498 Before the patch: $ ip route add blackhole 2001::1/128 RTNETLINK answers: No such device $ ip route add blackhole 2001::1/128 dev eth0 $ ip -6 route | grep 2001 2001::1 dev eth0 metric 1024 After: $ ip route add blackhole 2001::1/128 $ ip -6 route | grep 2001 blackhole 2001::1 dev lo metric 1024 error -22 v2: wrong patch v3: add a field fc_type in struct fib6_config to store RTN_* type Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | be2net: use PCIe AER capabilitySathya Perla2012-09-051-0/+8
| | | | | | | | | | | | | | | | | | | | This patch allows code to handle the PCIe AER capability. The PCI callbacks for error handling/reset/recovery already exist in be2net and have been tested with EEH/ppc. This patch has been tested using the aer-inject tool. Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: qdisc busylock needs lockdep annotationsEric Dumazet2012-09-053-2/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It seems we need to provide ability for stacked devices to use specific lock_class_key for sch->busylock We could instead default l2tpeth tx_queue_len to 0 (no qdisc), but a user might use a qdisc anyway. (So same fixes are probably needed on non LLTX stacked drivers) Noticed while stressing L2TPV3 setup : ====================================================== [ INFO: possible circular locking dependency detected ] 3.6.0-rc3+ #788 Not tainted ------------------------------------------------------- netperf/4660 is trying to acquire lock: (l2tpsock){+.-...}, at: [<ffffffffa0208db2>] l2tp_xmit_skb+0x172/0xa50 [l2tp_core] but task is already holding lock: (&(&sch->busylock)->rlock){+.-...}, at: [<ffffffff81596595>] dev_queue_xmit+0xd75/0xe00 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&(&sch->busylock)->rlock){+.-...}: [<ffffffff810a5df0>] lock_acquire+0x90/0x200 [<ffffffff817499fc>] _raw_spin_lock_irqsave+0x4c/0x60 [<ffffffff81074872>] __wake_up+0x32/0x70 [<ffffffff8136d39e>] tty_wakeup+0x3e/0x80 [<ffffffff81378fb3>] pty_write+0x73/0x80 [<ffffffff8136cb4c>] tty_put_char+0x3c/0x40 [<ffffffff813722b2>] process_echoes+0x142/0x330 [<ffffffff813742ab>] n_tty_receive_buf+0x8fb/0x1230 [<ffffffff813777b2>] flush_to_ldisc+0x142/0x1c0 [<ffffffff81062818>] process_one_work+0x198/0x760 [<ffffffff81063236>] worker_thread+0x186/0x4b0 [<ffffffff810694d3>] kthread+0x93/0xa0 [<ffffffff81753e24>] kernel_thread_helper+0x4/0x10 -> #0 (l2tpsock){+.-...}: [<ffffffff810a5288>] __lock_acquire+0x1628/0x1b10 [<ffffffff810a5df0>] lock_acquire+0x90/0x200 [<ffffffff817498c1>] _raw_spin_lock+0x41/0x50 [<ffffffffa0208db2>] l2tp_xmit_skb+0x172/0xa50 [l2tp_core] [<ffffffffa021a802>] l2tp_eth_dev_xmit+0x32/0x60 [l2tp_eth] [<ffffffff815952b2>] dev_hard_start_xmit+0x502/0xa70 [<ffffffff815b63ce>] sch_direct_xmit+0xfe/0x290 [<ffffffff81595a05>] dev_queue_xmit+0x1e5/0xe00 [<ffffffff815d9d60>] ip_finish_output+0x3d0/0x890 [<ffffffff815db019>] ip_output+0x59/0xf0 [<ffffffff815da36d>] ip_local_out+0x2d/0xa0 [<ffffffff815da5a3>] ip_queue_xmit+0x1c3/0x680 [<ffffffff815f4192>] tcp_transmit_skb+0x402/0xa60 [<ffffffff815f4a94>] tcp_write_xmit+0x1f4/0xa30 [<ffffffff815f5300>] tcp_push_one+0x30/0x40 [<ffffffff815e6672>] tcp_sendmsg+0xe82/0x1040 [<ffffffff81614495>] inet_sendmsg+0x125/0x230 [<ffffffff81576cdc>] sock_sendmsg+0xdc/0xf0 [<ffffffff81579ece>] sys_sendto+0xfe/0x130 [<ffffffff81752c92>] system_call_fastpath+0x16/0x1b Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&(&sch->busylock)->rlock); lock(l2tpsock); lock(&(&sch->busylock)->rlock); lock(l2tpsock); *** DEADLOCK *** 5 locks held by netperf/4660: #0: (sk_lock-AF_INET){+.+.+.}, at: [<ffffffff815e581c>] tcp_sendmsg+0x2c/0x1040 #1: (rcu_read_lock){.+.+..}, at: [<ffffffff815da3e0>] ip_queue_xmit+0x0/0x680 #2: (rcu_read_lock_bh){.+....}, at: [<ffffffff815d9ac5>] ip_finish_output+0x135/0x890 #3: (rcu_read_lock_bh){.+....}, at: [<ffffffff81595820>] dev_queue_xmit+0x0/0xe00 #4: (&(&sch->busylock)->rlock){+.-...}, at: [<ffffffff81596595>] dev_queue_xmit+0xd75/0xe00 stack backtrace: Pid: 4660, comm: netperf Not tainted 3.6.0-rc3+ #788 Call Trace: [<ffffffff8173dbf8>] print_circular_bug+0x1fb/0x20c [<ffffffff810a5288>] __lock_acquire+0x1628/0x1b10 [<ffffffff810a334b>] ? check_usage+0x9b/0x4d0 [<ffffffff810a3f44>] ? __lock_acquire+0x2e4/0x1b10 [<ffffffff810a5df0>] lock_acquire+0x90/0x200 [<ffffffffa0208db2>] ? l2tp_xmit_skb+0x172/0xa50 [l2tp_core] [<ffffffff817498c1>] _raw_spin_lock+0x41/0x50 [<ffffffffa0208db2>] ? l2tp_xmit_skb+0x172/0xa50 [l2tp_core] [<ffffffffa0208db2>] l2tp_xmit_skb+0x172/0xa50 [l2tp_core] [<ffffffffa021a802>] l2tp_eth_dev_xmit+0x32/0x60 [l2tp_eth] [<ffffffff815952b2>] dev_hard_start_xmit+0x502/0xa70 [<ffffffff81594e0e>] ? dev_hard_start_xmit+0x5e/0xa70 [<ffffffff81595961>] ? dev_queue_xmit+0x141/0xe00 [<ffffffff815b63ce>] sch_direct_xmit+0xfe/0x290 [<ffffffff81595a05>] dev_queue_xmit+0x1e5/0xe00 [<ffffffff81595820>] ? dev_hard_start_xmit+0xa70/0xa70 [<ffffffff815d9d60>] ip_finish_output+0x3d0/0x890 [<ffffffff815d9ac5>] ? ip_finish_output+0x135/0x890 [<ffffffff815db019>] ip_output+0x59/0xf0 [<ffffffff815da36d>] ip_local_out+0x2d/0xa0 [<ffffffff815da5a3>] ip_queue_xmit+0x1c3/0x680 [<ffffffff815da3e0>] ? ip_local_out+0xa0/0xa0 [<ffffffff815f4192>] tcp_transmit_skb+0x402/0xa60 [<ffffffff815fa25e>] ? tcp_md5_do_lookup+0x18e/0x1a0 [<ffffffff815f4a94>] tcp_write_xmit+0x1f4/0xa30 [<ffffffff815f5300>] tcp_push_one+0x30/0x40 [<ffffffff815e6672>] tcp_sendmsg+0xe82/0x1040 [<ffffffff81614495>] inet_sendmsg+0x125/0x230 [<ffffffff81614370>] ? inet_create+0x6b0/0x6b0 [<ffffffff8157e6e2>] ? sock_update_classid+0xc2/0x3b0 [<ffffffff8157e750>] ? sock_update_classid+0x130/0x3b0 [<ffffffff81576cdc>] sock_sendmsg+0xdc/0xf0 [<ffffffff81162579>] ? fget_light+0x3f9/0x4f0 [<ffffffff81579ece>] sys_sendto+0xfe/0x130 [<ffffffff810a69ad>] ? trace_hardirqs_on+0xd/0x10 [<ffffffff8174a0b0>] ? _raw_spin_unlock_irq+0x30/0x50 [<ffffffff810757e3>] ? finish_task_switch+0x83/0xf0 [<ffffffff810757a6>] ? finish_task_switch+0x46/0xf0 [<ffffffff81752cb7>] ? sysret_check+0x1b/0x56 [<ffffffff81752c92>] system_call_fastpath+0x16/0x1b Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bnx2x: use list_move_tail instead of list_del/list_add_tailWei Yongjun2012-09-051-2/+1
| | | | | | | | | | | | | | | | | | | | Using list_move_tail() instead of list_del() + list_add_tail(). spatch with a semantic match is used to found this problem. (http://coccinelle.lip6.fr/) Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
* | netfilter: ipv6: using csum_ipv6_magic requires net/ip6_checksum.hStephen Rothwell2012-09-051-0/+1
| | | | | | | | | | | | | | | | | | | | Fixes this build error: net/ipv6/netfilter/nf_nat_l3proto_ipv6.c: In function 'nf_nat_ipv6_csum_recalc': net/ipv6/netfilter/nf_nat_l3proto_ipv6.c:144:4: error: implicit declaration of function 'csum_ipv6_magic' [-Werror=implicit-function-declaration] Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* | cxgb4: Remove duplicate register definitionsVipul Pandya2012-09-054-58/+30
| | | | | | | | | | | | | | | | | | | | | | Removed duplicate definition for SGE_PF_KDOORBELL, SGE_INT_ENABLE3, PCIE_MEM_ACCESS_OFFSET registers. Moved the register field definitions around the register definition. Signed-off-by: Santosh Rastapur <santosh@chelsio.com> Signed-off-by: Vipul Pandya <vipul@chelsio.com> Reviewed-by: Sivakumar Subramani <sivasu@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | RDMA/cxgb4: Update RDMA/cxgb4 due to macro definition removal in cxgb4 driverVipul Pandya2012-09-051-1/+1
| | | | | | | | | | | | | | | | | | | | cxgb4 driver has duplicate definitions of registers which will be removed. This patch updates the RDMA/cxgb4 driver accordingly. Signed-off-by: Santosh Rastapur <santosh@chelsio.com> Signed-off-by: Vipul Pandya <vipul@chelsio.com> Reviewed-by: Sivakumar Subramani <sivasu@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: add unknown state to sysfs NIC duplex exportNikolay Aleksandrov2012-09-051-3/+15
| | | | | | | | | | | | | | | | | | | | Currently when the NIC duplex state is DUPLEX_UNKNOWN it is exported as full through sysfs, this patch adds support for DUPLEX_UNKNOWN. It is handled the same way as in ethtool. Signed-off-by: Nikolay Aleksandrov <naleksan@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
OpenPOWER on IntegriCloud