summaryrefslogtreecommitdiffstats
path: root/drivers/net/bonding
Commit message (Collapse)AuthorAgeFilesLines
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2013-07-031-1/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/ethernet/freescale/fec_main.c drivers/net/ethernet/renesas/sh_eth.c net/ipv4/gre.c The GRE conflict is between a bug fix (kfree_skb --> kfree_skb_list) and the splitting of the gre.c code into seperate files. The FEC conflict was two sets of changes adding ethtool support code in an "!CONFIG_M5272" CPP protected block. Finally the sh_eth.c conflict was between one commit add bits set in the .eesr_err_check mask whilst another commit removed the .tx_error_check member and assignments. Signed-off-by: David S. Miller <davem@davemloft.net>
| * bonding: fix slave speed reporting in bond_miimon_commitNikolay Aleksandrov2013-06-241-1/+2
| | | | | | | | | | | | | | | | | | | | | | When we have BOND_LINK_UP the speed is reported unconditionally with %u format although it can be SPEED_UNKNOWN (-1). After this patch it returns 0 in that case in an attempt to keep the existing scripts happy. One line is intenionally left 81 chars because it gets ugly if broken. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Acked-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: combine pr_debugs in bond_set_dev_addr into oneNikolay Aleksandrov2013-06-291-3/+2
| | | | | | | | | | | | | | Combine the multiple pr_debugs in bond_set_dev_addr into one pr_debug. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: when cloning a MAC use NET_ADDR_STOLENnikolay@redhat.com2013-06-271-14/+19
| | | | | | | | | | | | | | | | | | | | | | A simple semantic change, when a slave's MAC is cloned by the bond master then set addr_assign_type to NET_ADDR_STOLEN instead of NET_ADDR_SET. Also use bond_set_dev_addr() in BOND_FOM_ACTIVE mode to change the bond's MAC address because the assign_type has to be set properly. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: remove unnecessary dev_addr_from_first membernikolay@redhat.com2013-06-272-6/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In struct bonding there's a member called dev_addr_from_first which is used to denote when the bond dev should clone the first slave's MAC address but since we have netdev's addr_assign_type variable that is not necessary. We clone the first slave's MAC each time we have a random MAC set to the bond device. This has the nice side-effect of also fixing an inconsistency - when the MAC address of the bond dev is set after its creation, but prior to having slaves, it's not kept and the first slave's MAC is cloned. The only way to keep the MAC was to create the bond device with the MAC address set (e.g. through ip link). In all cases if the bond device is left without any slaves - its MAC gets reset to a random one as before. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: remove unnecessary setup_by_slave membernikolay@redhat.com2013-06-272-5/+1
| | | | | | | | | | | | | | | | | | | | We have a member called setup_by_slave in struct bonding to denote if the bond dev has different type than ARPHRD_ETHER, but that is already denoted in bond's netdev type variable if it was setup by the slave, so use that instead of the member. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: add an option to fail when any of arp_ip_target is inaccessibleVeaceslav Falico2013-06-253-14/+128
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, we fail only when all of the ips in arp_ip_target are gone. However, in some situations we might need to fail if even one host from arp_ip_target becomes unavailable. All situations, obviously, rely on the idea that we need *completely* functional network, with all interfaces/addresses working correctly. One real world example might be: vlans on top on bond (hybrid port). If bond and vlans have ips assigned and we have their peers monitored via arp_ip_target - in case of switch misconfiguration (trunk/access port), slave driver malfunction or tagged/untagged traffic dropped on the way - we will be able to switch to another slave. Though any other configuration needs that if we need to have access to all arp_ip_targets. This patch adds this possibility by adding a new parameter - arp_all_targets (both as a module parameter and as a sysfs knob). It can be set to: 0 or any (the default) - which works exactly as it's working now - the slave is up if any of the arp_ip_targets are up. 1 or all - the slave is up if all of the arp_ip_targets are up. This parameter can be changed on the fly (via sysfs), and requires the mode to be active-backup and arp_validate to be enabled (it obeys the arp_validate config on which slaves to validate). Internally it's done through: 1) Add target_last_arp_rx[BOND_MAX_ARP_TARGETS] array to slave struct. It's an array of jiffies, meaning that slave->target_last_arp_rx[i] is the last time we've received arp from bond->params.arp_targets[i] on this slave. 2) If we successfully validate an arp from bond->params.arp_targets[i] in bond_validate_arp() - update the slave->target_last_arp_rx[i] with the current jiffies value. 3) When getting slave's last_rx via slave_last_rx(), we return the oldest time when we've received an arp from any address in bond->params.arp_targets[]. If the value of arp_all_targets == 0 - we still work the same way as before. Also, update the documentation to reflect the new parameter. v3->v4: Kill the forgotten rtnl_unlock(), rephrase the documentation part to be more clear, don't fail setting arp_all_targets if arp_validate is not set - it has no effect anyway but can be easier to set up. Also, print a warning if the last arp_ip_target is removed while the arp_interval is on, but not the arp_validate. v2->v3: Use _bh spinlock, remove useless rtnl_lock() and use jiffies for new arp_ip_target last arp, instead of slave_last_rx(). On bond_enslave(), use the same initialization value for target_last_arp_rx[] as is used for the default last_arp_rx, to avoid useless interface flaps. Also, instead of failing to remove the last arp_ip_target just print a warning - otherwise it might break existing scripts. v1->v2: Correctly handle adding/removing hosts in arp_ip_target - we need to shift/initialize all slave's target_last_arp_rx. Also, don't fail module loading on arp_all_targets misconfiguration, just disable it, and some minor style fixes. Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: don't trust arp requests unless active slave really worksVeaceslav Falico2013-06-251-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, if we receive any arp packet on a backup slave in active-backup mode and arp_validate enabled, we suppose that it's an arp request, swap source/target ip and try to validate it. This optimization gives us virtually no downtime in the most common situation (active and backup slaves are in the same broadcast domain and the active slave failed). However, if we can't reach the arp_ip_target(s), we end up in an endless loop of reselecting slaves, because we receive our arp requests, sent by the active slave, and think that backup slaves are up, thus selecting them as active and, again, sending arp requests, which fool our backup slaves. Fix this by not validating the swapped arp packets if the current active slave didn't receive any arp reply after it was selected as active. This way we will only accept arp requests if we know that the current active slave can actually reach arp_ip_target. v3->v4: Obey 80 lines and make checkpatch.pl happy, per Sergei's suggestion. v1->v3: No change. Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: don't validate arp if we don't have toVeaceslav Falico2013-06-251-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | Currently, we validate all the incoming arps if arp_validate not 0. However, we don't have to validate backup slaves if arp_validate == active and vice versa, so return early in bond_arp_rcv() in these cases. It works correctly now because we verify arp_validate in slave_last_rx(), however we're just doing useless work in bond_arp_rcv(). Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: don't add duplicate targets to arp_ip_targetVeaceslav Falico2013-06-251-1/+5
| | | | | | | | | | | | | | Print a warning and skip them. Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: add helper function bond_get_targets_ip(targets, ip)Veaceslav Falico2013-06-253-45/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add function bond_get_targets_ip(targets, ip) which searches through targets array of ips (arp_targets) and returns the position of first match. If ip == 0, returns the first free slot. On failure to find the ip or free slot, return -1. Use it to verify if the arp we've received is valid and in sysfs. v1->v2: Fix "[2/6] bonding: add helper function bond_get_targets_ip(targets, ip)", per Nikolay's advice, to verify if source ip != 0.0.0.0, otherwise we might update 'null' arp_ip_targets' last_rx. Also, address style. Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: trivial: make alb use bond_slave_has_mac()Veaceslav Falico2013-06-191-42/+11
| | | | | | | | | | | | | | Also, cleanup bond_alb_handle_active_change() from 2 identical ifs. Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2013-06-192-5/+16
|\ \ | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/wireless/ath/ath9k/Kconfig drivers/net/xen-netback/netback.c net/batman-adv/bat_iv_ogm.c net/wireless/nl80211.c The ath9k Kconfig conflict was a change of a Kconfig option name right next to the deletion of another option. The xen-netback conflict was overlapping changes involving the handling of the notify list in xen_netbk_rx_action(). Batman conflict resolution provided by Antonio Quartulli, basically keep everything in both conflict hunks. The nl80211 conflict is a little more involved. In 'net' we added a dynamic memory allocation to nl80211_dump_wiphy() to fix a race that Linus reported. Meanwhile in 'net-next' the handlers were converted to use pre and post doit handlers which use a flag to determine whether to hold the RTNL mutex around the operation. However, the dump handlers to not use this logic. Instead they have to explicitly do the locking. There were apparent bugs in the conversion of nl80211_dump_wiphy() in that we were not dropping the RTNL mutex in all the return paths, and it seems we very much should be doing so. So I fixed that whilst handling the overlapping changes. To simplify the initial returns, I take the RTNL mutex after we try to allocate 'tb'. Signed-off-by: David S. Miller <davem@davemloft.net>
| * bonding: fix igmp_retrans type and two related racesNikolay Aleksandrov2013-06-132-5/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | First the type of igmp_retrans (which is the actual counter of igmp_resend parameter) is changed to u8 to be able to store values up to 255 (as per documentation). There are two races that were hidden there and which are easy to trigger after the previous fix, the first is between bond_resend_igmp_join_requests and bond_change_active_slave where igmp_retrans is set and can be altered by the periodic. The second race condition is between multiple running instances of the periodic (upon execution it can be scheduled again for immediate execution which can cause the counter to go < 0 which in the unsigned case leads to unnecessary igmp retransmissions). Since in bond_change_active_slave bond->lock is held for reading and curr_slave_lock for writing, we use curr_slave_lock for mutual exclusion. We can't drop them as there're cases where RTNL is not held when bond_change_active_slave is called. RCU is unlocked in bond_resend_igmp_join_requests before getting curr_slave_lock since we don't need it there and it's pointless to delay. The decrement is moved inside the "if" block because if we decrement unconditionally there's still a possibility for a race condition although it is much more difficult to hit (many changes have to happen in a very short period in order to trigger) which in the case of 3 parallel running instances of this function and igmp_retrans == 1 (with check bond->igmp_retrans-- > 1) is: f1 passes, doesn't re-schedule, but decrements - igmp_retrans = 0 f2 then passes, doesn't re-schedule, but decrements - igmp_retrans = 255 f3 does the unnecessary retransmissions. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * bonding: reset master mac on first enslave failureNikolay Aleksandrov2013-06-131-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | If the bond device is supposed to get the first slave's MAC address and the first enslavement fails then we need to reset the master's MAC otherwise it will stay the same as the failed slave device. We do it after err_undo_flags since that is the first place where the MAC can be changed and we check if it should've been the first slave and if the bond's MAC was set to it because that err place is used by multiple locations prior to changing the master's MAC address. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: don't call alb_set_slave_mac_addr() while atomicVeaceslav Falico2013-06-171-35/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | alb_set_slave_mac_addr() sets the mac address in alb mode via dev_set_mac_address(), which might sleep. It's called from alb_handle_addr_collision_on_attach() in atomic context (under read_lock(bond->lock)), thus triggering a bug. Fix this by moving the lock inside alb_handle_addr_collision_on_attach(). v1->v2: As Nikolay Aleksandrov noticed, we can drop the bond->lock completely. Also, use bond_slave_has_mac(), when possible. Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: disallow change of MAC if fail_over_mac enabledJay Vosburgh2013-06-071-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, if fail_over_mac is set to active, then attempts to change the MAC of the bond itself silently fail. However, if fail_over_mac is set to follow, changes are permitted. Permitting the bond's MAC to change with fail_over_mac=follow will disrupt the follow functionality, which normally controls the assignment of MAC address to the bond and its slaves, and can cause multiple ports to be assigned the same MAC address. which will interfere with the functioning of the device (where the device here is a virtualization-aware card for s390, qeth). Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: Convert hw addr handling to sync/unsync, support ucast addressesJay Vosburgh2013-06-072-126/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch converts bonding to use the dev_uc/mc_sync and dev_uc/mc_sync_multiple functions for updating the hardware addresses of bonding slaves. The existing functions to add or remove addresses are removed, and their functionality is replaced with calls to dev_mc_sync or dev_mc_sync_multiple, depending upon the bonding mode. Calls to dev_uc_sync and dev_uc_sync_multiple are also added, so that unicast addresses added to a bond will be properly synced with its slaves. Various functions are renamed to better reflect the new situation, and relevant comments are updated. Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Cc: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: trivial: update the comments to reflect the realityVeaceslav Falico2013-05-281-3/+1
| | | | | | | | | | Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: trivial: remove unused parameter from alb_swap_mac_addr()Veaceslav Falico2013-05-281-4/+4
| | | | | | | | | | | | | | | | After b924551 ("bonding: fix enslaving in alb mode when link down") we don't need the bond parameter in alb_swap_mac_addr(), so remove it. Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: pass info struct via netdevice notifierJiri Pirko2013-05-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | So far, only net_device * could be passed along with netdevice notifier event. This patch provides a possibility to pass custom structure able to provide info that event listener needs to know. Signed-off-by: Jiri Pirko <jiri@resnulli.us> v2->v3: fix typo on simeth shortened dev_getter shortened notifier_info struct name v1->v2: fix notifier_call parameter in call_netdevice_notifier() Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: allow xmit hash policy change while bond dev is upNikolay Aleksandrov2013-05-271-9/+1
|/ | | | | | | | | | Since the xmit_hash_policy pointer is always valid and not dependent on anything, we can change it while the bond device is up and running. The only downside would be the out of order packets but that is a small price to pay. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bonding: fix multiple 3ad mode sysfs race conditionsnikolay@redhat.com2013-05-194-10/+24
| | | | | | | | | | | | When bond_3ad_get_active_agg_info() is used in all show_ad_ functions it is not protected against slave manipulation and since it walks over the slaves and uses them, this can easily result in NULL pointer dereference or use of freed memory. Both the new wrapper and the internal function are exported to the bonding as they're needed in different places. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bonding: arp_ip_count and arp_targets can be wrongnikolay@redhat.com2013-05-191-11/+8
| | | | | | | | | | | | | | | | | When getting arp_ip_targets if we encounter a bad IP, arp_ip_count still gets increased and all the targets after the wrong one will not be probed if arp_interval is enabled after that (unless a new IP target is added through sysfs) because of the zero entry, in this case reading arp_ip_target through sysfs will show valid targets even if there's a zero entry. Example: 1.2.3.4,4.5.6.7,blah,5.6.7.8 When retrieving the list from arp_ip_target the output would be: 1.2.3.4,4.5.6.7,5.6.7.8 but there will be a 0 entry between 4.5.6.7 and 5.6.7.8. If arp_interval is enabled after that 5.6.7.8 will never be checked because of that. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bonding: replace %x with %pI4 for IPv4 addressesnikolay@redhat.com2013-05-191-3/+3
| | | | | | | | There're few pr_debug() places that can provide the IPv4 address in dotted decimal format instead which is more helpful. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bonding: fix set mode race conditionsnikolay@redhat.com2013-05-191-0/+4
| | | | | | | | | | Changing the mode without any locking can result in multiple races (e.g. upping a bond, enslaving/releasing). Depending on which race is hit the impact can vary from incosistent bond state to kernel crash. Use RTNL to synchronize the mode setting with the dangerous races. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bonding: allow TSO being set on bonding masterEric Dumazet2013-05-161-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | In some situations, we need to disable TSO on bonding slaves. bonding device automatically unset TSO in bond_fix_features(), and performance is not good because : 1) We consume more cpu cycles. 2) GSO segmentation has some bugs leading to out of order TCP packets if this segmentation is done before virtual device. This particular problem will be addressed in a separate patch. This patch allows TSO being set/unset on the bonding master, so that GSO segmentation is done after bonding layer. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Michał Mirosław <mirqus@gmail.com> Cc: Jay Vosburgh <fubar@us.ibm.com> Cc: Andy Gospodarek <andy@greyhouse.net> Cc: Maciej Żenczykowski <maze@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'for-linus' of ↵Linus Torvalds2013-05-011-3/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull VFS updates from Al Viro, Misc cleanups all over the place, mainly wrt /proc interfaces (switch create_proc_entry to proc_create(), get rid of the deprecated create_proc_read_entry() in favor of using proc_create_data() and seq_file etc). 7kloc removed. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (204 commits) don't bother with deferred freeing of fdtables proc: Move non-public stuff from linux/proc_fs.h to fs/proc/internal.h proc: Make the PROC_I() and PDE() macros internal to procfs proc: Supply a function to remove a proc entry by PDE take cgroup_open() and cpuset_open() to fs/proc/base.c ppc: Clean up scanlog ppc: Clean up rtas_flash driver somewhat hostap: proc: Use remove_proc_subtree() drm: proc: Use remove_proc_subtree() drm: proc: Use minor->index to label things, not PDE->name drm: Constify drm_proc_list[] zoran: Don't print proc_dir_entry data in debug reiserfs: Don't access the proc_dir_entry in r_open(), r_start() r_show() proc: Supply an accessor for getting the data from a PDE's parent airo: Use remove_proc_subtree() rtl8192u: Don't need to save device proc dir PDE rtl8187se: Use a dir under /proc/net/r8180/ proc: Add proc_mkdir_data() proc: Move some bits from linux/proc_fs.h to linux/{of.h,signal.h,tty.h} proc: Move PDE_NET() to fs/proc/proc_net.c ...
| * procfs: new helper - PDE_DATA(inode)Al Viro2013-04-091-3/+1
| | | | | | | | | | | | | | | | | | | | The only part of proc_dir_entry the code outside of fs/proc really cares about is PDE(inode)->data. Provide a helper for that; static inline for now, eventually will be moved to fs/proc, along with the knowledge of struct proc_dir_entry layout. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2013-04-301-2/+4
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c drivers/net/ethernet/emulex/benet/be.h include/net/tcp.h net/mac802154/mac802154.h Most conflicts were minor overlapping stuff. The be2net driver brought in some fixes that added __vlan_put_tag calls, which in net-next take an additional argument. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | bonding: fix locking in enslave failure pathnikolay@redhat.com2013-04-251-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In commit 3c5913b53fefc9d9e15a2d0f93042766658d9f3f ("bonding: primary_slave & curr_active_slave are not cleaned on enslave failure") I didn't account for the use of curr_active_slave without curr_slave_lock and since there are such users, we should hold bond->lock for writing while setting it to NULL (in the NULL case we don't need the curr_slave_lock). Keeping the bond lock as to avoid the extra release/acquire cycle. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2013-04-221-28/+73
|\ \ \ | |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/ethernet/emulex/benet/be_main.c drivers/net/ethernet/intel/igb/igb_main.c drivers/net/wireless/brcm80211/brcmsmac/mac80211_if.c include/net/scm.h net/batman-adv/routing.c net/ipv4/tcp_input.c The e{uid,gid} --> {uid,gid} credentials fix conflicted with the cleanup in net-next to now pass cred structs around. The be2net driver had a bug fix in 'net' that overlapped with the VLAN interface changes by Patrick McHardy in net-next. An IGB conflict existed because in 'net' the build_skb() support was reverted, and in 'net-next' there was a comment style fix within that code. Several batman-adv conflicts were resolved by making sure that all calls to batadv_is_my_mac() are changed to have a new bat_priv first argument. Eric Dumazet's TS ECR fix in TCP in 'net' conflicted with the F-RTO rewrite in 'net-next', mostly overlapping changes. Thanks to Stephen Rothwell and Antonio Quartulli for help with several of these merge resolutions. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | bonding: in bond_mc_swap() bond's mc addr list is walked without locknikolay@redhat.com2013-04-191-0/+4
| | | | | | | | | | | | | | | | | | | | | Use netif_addr_lock_bh() to acquire the appropriate lock before walking. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | bonding: disable netpoll on enslave failurenikolay@redhat.com2013-04-191-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | slave_disable_netpoll() is not called upon enslave failure which would lead to a memory leak. Call slave_disable_netpoll() after err_detach as that's the first error path after enabling netpoll on that slave. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | bonding: primary_slave & curr_active_slave are not cleaned on enslave failurenikolay@redhat.com2013-04-191-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On enslave failure primary_slave can point to new_slave which is to be freed, and the same applies to curr_active_slave. So check if this is the case and clean up properly after err_detach because that's the first error code path after they're set. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | bonding: vlans don't get deleted on enslave failurenikolay@redhat.com2013-04-191-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | The main problem is with vid refcount which only gets bumped up. Delete the vlans after err_detach as that's the first error path after the vlans are added. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | bonding: mc addresses don't get deleted on enslave failurenikolay@redhat.com2013-04-191-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add bond_mc_list_flush() after err_detach as that's the first error path after the addresses are added. The main issue is the mc addresses' refcount which only gets bumped up. v2: update log message and don't move code unnecessarily Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | bonding: fix l23 and l34 load balancing in forwarding pathEric Dumazet2013-04-181-25/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since commit 6b923cb7188d46 (bonding: support for IPv6 transmit hashing) bonding doesn't properly hash traffic in forwarding setups. Vitaly V. Bursov diagnosed that skb_network_header_len() returned 0 in this case. More generally, the transport header might not be in the skb head. Use pskb_may_pull() & skb_header_pointer() to get it right, and use proto_ports_offset() in bond_xmit_hash_policy_l34() to get support for more protocols than TCP and UDP. Reported-by: Vitaly V. Bursov <vitalyb@telenet.dn.ua> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jay Vosburgh <fubar@us.ibm.com> Cc: Andy Gospodarek <andy@greyhouse.net> Cc: John Eaglesham <linux@8192.net> Tested-by: Vitaly V. Bursov <vitalyb@telenet.dn.ua> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | bonding: IFF_BONDING is not stripped on enslave failurenikolay@redhat.com2013-04-111-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While enslaving a new device and after IFF_BONDING flag is set, in case of failure it is not stripped from the device's priv_flags while cleaning up, which could lead to other problems. Cleaning at err_close because the flag is set after dev_open(). v2: no change Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | bonding: fix netdev event NULL pointer dereferencenikolay@redhat.com2013-04-111-2/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In commit 471cb5a33dcbd7c529684a2ac7ba4451414ee4a7 ("bonding: remove usage of dev->master") a bug was introduced which causes a NULL pointer dereference. If a bond device is in mode 6 (ALB) and a slave is added it will dereference a NULL pointer in bond_slave_netdev_event(). This is because in bond_enslave we have bond_alb_init_slave() which changes the MAC address of the slave and causes a NETDEV_CHANGEADDR. Then we have in bond_slave_netdev_event(): struct slave *slave = bond_slave_get_rtnl(slave_dev); struct bonding *bond = slave->bond; bond_slave_get_rtnl() dereferences slave_dev->rx_handler_data which at that time is NULL since netdev_rx_handler_register() is called later. This is fixed by checking if slave is NULL before dereferencing it. v2: Comment style changed. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | bonding: fix bonding_masters race condition in bond unloadingnikolay@redhat.com2013-04-081-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While the bonding module is unloading, it is considered that after rtnl_link_unregister all bond devices are destroyed but since no synchronization mechanism exists, a new bond device can be created via bonding_masters before unregister_pernet_subsys which would lead to multiple problems (e.g. NULL pointer dereference, wrong RIP, list corruption). This patch fixes the issue by removing any bond devices left in the netns after bonding_masters is removed from sysfs. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Acked-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | Revert "bonding: remove sysfs before removing devices"nikolay@redhat.com2013-04-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 4de79c737b200492195ebc54a887075327e1ec1d. This patch introduces a new bug which causes access to freed memory. In bond_uninit: list_del(&bond->bond_list); bond_list is linked in bond_net's dev_list which is freed by unregister_pernet_subsys. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | bond: add support to read speed and duplex via ethtoolAndy Gospodarek2013-04-191-0/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds support for the get_settings ethtool op to the bonding driver. This was motivated by users who wanted to get the speed of the bond and compare that against throughput to understand utilization. The behavior before this patch was added was problematic when computing line utilization after trying to get link-speed and throughput via SNMP. Output from ethtool looks like this for a round-robin bond: Settings for bond0: Supported ports: [ ] Supported link modes: Not reported Supported pause frame use: No Supports auto-negotiation: No Advertised link modes: Not reported Advertised pause frame use: No Advertised auto-negotiation: No Speed: 11000Mb/s Duplex: Full Port: Other PHYAD: 0 Transceiver: internal Auto-negotiation: off MDI-X: Unknown Link detected: yes I tested this and verified it works as expected. A test was also done on a version backported to an older kernel and it worked well there. v2: Switch to using ethtool_cmd_speed_set to set speed, added check to SLAVE_IS_OK for each slave in bond, dropped mode-specific calculations as they were not needed, and set port type to 'Other.' v3: Fix useless assignment and checkpatch warning. Signed-off-by: Andy Gospodarek <andy@greyhouse.net> Reviewed-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | net: vlan: add protocol argument to packet tagging functionsPatrick McHardy2013-04-191-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a protocol argument to the VLAN packet tagging functions. In case of HW tagging, we need that protocol available in the ndo_start_xmit functions, so it is stored in a new field in the skb. The new field fits into a hole (on 64 bit) and doesn't increase the sks's size. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | net: vlan: prepare for 802.1ad supportPatrick McHardy2013-04-191-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | Make the encapsulation protocol value a property of VLAN devices and change the device lookup functions to take the protocol value into account. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | net: vlan: prepare for 802.1ad VLAN filtering offloadPatrick McHardy2013-04-191-7/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | Change the rx_{add,kill}_vid callbacks to take a protocol argument in preparation of 802.1ad support. The protocol argument used so far is always htons(ETH_P_8021Q). Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | net: vlan: rename NETIF_F_HW_VLAN_* feature flags to NETIF_F_HW_VLAN_CTAG_*Patrick McHardy2013-04-191-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rename the hardware VLAN acceleration features to include "CTAG" to indicate that they only support CTAGs. Follow up patches will introduce 802.1ad server provider tagging (STAGs) and require the distinction for hardware not supporting acclerating both. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2013-04-071-1/+1
|\ \ \ | |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: drivers/nfc/microread/mei.c net/netfilter/nfnetlink_queue_core.c Pull in 'net' to get Eric Biederman's AF_UNIX fix, upon which some cleanups are going to go on-top. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | bonding: remove sysfs before removing devicesVeaceslav Falico2013-04-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have a race condition if we try to rmmod bonding and simultaneously add a bond master through sysfs. In bonding_exit() we first remove the devices (through rtnl_link_unregister() ) and only after that we remove the sysfs. If we manage to add a device through sysfs after that the devices were removed - we'll end up with that device/sysfs structure and with the module unloaded. Fix this by first removing the sysfs and only after that calling rtnl_link_unregister(). Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2013-04-031-2/+1
|\ \ \ | |/ / | | | | | | | | | | | | | | | Pull net into net-next to get the synchronize_net() bug fix in bonding. Signed-off-by: David S. Miller <davem@davemloft.net>
OpenPOWER on IntegriCloud