talos-op-linux - Talos™ II Linux sources for OpenPOWER

	Commit message (Collapse)	Author	Age	Files	Lines
...
* \| \| \|	ixgbe: process the Rx ipsec offload	Shannon Nelson	2018-01-23	3	-2/+115
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If the chip sees and decrypts an ipsec offload, set up the skb sp pointer with the ralated SA info. Since the chip is rude enough to keep to itself the table index it used for the decryption, we have to do our own table lookup, using the hash for speed. Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* \| \| \|	ixgbe: restore offloaded SAs after a reset	Shannon Nelson	2018-01-23	3	-0/+44
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On a chip reset most of the table contents are lost, so must be restored. This scans the driver's ipsec tables and restores both the filled and empty table slots to their pre-reset values. Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* \| \| \|	ixgbe: add ipsec offload add and remove SA	Shannon Nelson	2018-01-23	3	-1/+399
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add the functions for setting up and removing offloaded SAs (Security Associations) with the x540 hardware. We set up the callback structure but we don't yet set the hardware feature bit to be sure the XFRM service won't actually try to use us for an offload yet. The software tables are made up to mimic the hardware tables to make it easier to track what's in the hardware, and the SA table index is used for the XFRM offload handle. However, there is a hashing field in the Rx SA tracking that will be used to facilitate faster table searches in the Rx fast path. Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* \| \| \|	ixgbe: add ipsec data structures	Shannon Nelson	2018-01-23	2	-0/+45
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Set up the data structures to be used by the ipsec offload. Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* \| \| \|	ixgbe: add ipsec engine start and stop routines	Shannon Nelson	2018-01-23	1	-0/+142
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add in the code for running and stopping the hardware ipsec encryption/decryption engine. It is good to keep the engine off when not in use in order to save on the power draw. Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* \| \| \|	ixgbe: add ipsec register access routines	Shannon Nelson	2018-01-23	5	-0/+222
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add a few routines to make access to the ipsec registers just a little easier, and throw in the beginnings of an initialization. Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* \| \| \|	ixgbe: clean up ipsec defines	Shannon Nelson	2018-01-23	1	-13/+7
\|/ / / \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Clean up the ipsec/macsec descriptor bit definitions to match the rest of the defines and file organization. Also recognise the bit-definition overlap in the error mask macro. Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* \| \|	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	David S. Miller	2018-01-19	1	-7/+2
\|\\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The BPF verifier conflict was some minor contextual issue. The TUN conflict was less trivial. Cong Wang fixed a memory leak of tfile->tx_array in 'net'. This is an skb_array. But meanwhile in net-next tun changed tfile->tx_arry into tfile->tx_ring which is a ptr_ring. Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	fm10k: mark PM functions as __maybe_unused	Arnd Bergmann	2018-01-18	1	-7/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A cleanup of the PM code left an incorrect #ifdef in place, leading to a harmless build warning: drivers/net/ethernet/intel/fm10k/fm10k_pci.c:2502:12: error: 'fm10k_suspend' defined but not used [-Werror=unused-function] drivers/net/ethernet/intel/fm10k/fm10k_pci.c:2475:12: error: 'fm10k_resume' defined but not used [-Werror=unused-function] It's easier to use __maybe_unused attributes here, since you can't pick the wrong one. Fixes: 8249c47c6ba4 ("fm10k: use generic PM hooks instead of legacy PCIe power hooks") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	ixgbevf: Fix kernel-doc format warnings	Tony Nguyen	2018-01-12	2	-5/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Recent checks added for formatting kernel-doc comments are causing warnings if W= is run with a non-zero value. This patch fixes function comments to resolve warnings when W=1 is used. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* \| \|	ixgbe: Fix kernel-doc format warnings	Tony Nguyen	2018-01-12	12	-39/+99
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Recent checks added for formatting kernel-doc comments are causing warnings if W= is run with a non-zero value. This patch fixes function comments to resolve warnings when W=1 is used. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* \| \|	ixgbe: Fix handling of macvlan Tx offload	Alexander Duyck	2018-01-12	2	-10/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This update makes it so that we report the actual number of Tx queues via real_num_tx_queues but are still restricted to RSS on only the first pool by setting num_tc equal to 1. Doing this locks us into only having the ability to setup XPS on the queues in that pool, and only those queues should be used for transmitting anything other than macvlan traffic. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* \| \|	ixgbe: avoid bringing rings up/down as macvlans are added/removed	Alexander Duyck	2018-01-12	2	-58/+72
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change makes it so that instead of bringing rings up/down for various we just update the netdev pointer for the Rx ring and set or clear the MAC filter for the interface. By doing it this way we can avoid a number of races and issues in the code as things were getting messy with the macvlan clean-up racing with the interface clean-up to bring the rings down on shutdown. With this change we opt to leave the rings owned by the PF interface for both Tx and Rx and just direct the packets once they are received to the macvlan netdev. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* \| \|	ixgbe: Do not manipulate macvlan Tx queues when performing macvlan offload	Alexander Duyck	2018-01-12	2	-108/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We should not be stopping/starting the upper devices Tx queues when handling a macvlan offload. Instead we should be stopping and starting traffic on our own queues. In order to prevent us from doing this I am updating the code so that we no longer change the queue configuration on the upper device, nor do we update the queue_index on our own device. Instead we can just use the queue index for our local device and not update the netdev in the case of the transmit rings. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* \| \|	ixgbe/fm10k: Record macvlan stats instead of Rx queue for macvlan offloaded ↵	Alexander Duyck	2018-01-12	2	-10/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	rings We shouldn't be recording the Rx queue on macvlan offloaded frames since the macvlan is normally brought up as a single queue device, and it will trigger warnings for RPS if we have recorded queue IDs larger than the "real_num_rx_queues" value recorded for the device. Instead we should be recording the macvlan statistics since we are bypassing the normal macvlan statistics that would have been generated by the receive path. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* \| \|	ixgbe: Don't assume dev->num_tc is equal to hardware TC config	Alexander Duyck	2018-01-12	6	-27/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The code throughout ixgbe was assuming that dev->num_tc was populated and configured with the driver, when in fact this can be configured via mqprio without any hardware coordination other than restricting us to the real number of Tx queues we advertise. Instead of handling things this way we need to keep a local copy of the number of TCs in use so that we don't accidentally pull in the TC configuration from mqprio when it is configured in software mode. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* \| \|	ixgbe: Default to 1 pool always being allocated	Alexander Duyck	2018-01-12	2	-5/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We might as well configure the limit to default to 1 pool always for the interface. This accounts for the fact that the PF counts as 1 pool if SR-IOV is enabled, and in general we are always running in 1 pool mode when RSS or DCB is enabled as well, though we don't need to actually evaluate any of the VMDq features in those cases. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* \| \|	ixgbe: Assume provided MAC filter has been verified by macvlan	Alexander Duyck	2018-01-12	1	-4/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The macvlan driver itself will validate the MAC address that is configured for a given interface. There is no need for us to verify it again. Instead we should be checking to verify that we actually allocate the filter and have not run out of resources to configure a MAC rule in our filter table. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* \| \|	i40e: track id can be 0	Jingjing Wu	2018-01-10	2	-10/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	track_id == 0 is valid for “read only” profiles when profile does not have any “write” commands. Signed-off-by: Jingjing Wu <jingjing.wu@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* \| \|	i40e: change ppp name to ddp	Jingjing Wu	2018-01-10	8	-69/+73
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	PPP name was going to be confusing since PPP already means point to point protocol. It is decided to change pipeline personalization profile(ppp) to dynamic device personalization(ddp). Signed-off-by: Jingjing Wu <jingjing.wu@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* \| \|	i40evf: Drop i40evf_fire_sw_int as it is prone to races	Alexander Duyck	2018-01-10	1	-34/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Having the interrupts firing while we are polling causes extra overhead and isn't needed for most systems out there. If an interrupt is lost us experiencing a 2s latency spike before recovering is still not acceptable and masks the issue. We are better off just identifying systems that lose interrupts and instead enable workarounds for those systems. To that end I am dropping the code that was strobing the interrupts as there is a narrow window where having them enabled can actually cause race issues anyway where a few stray packets might get misses if the interrupt is re-enabled and fires before we call napi_complete. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* \| \|	i40evf: Clean-up flags for promisc mode to avoid high polling rate	Alexander Duyck	2018-01-10	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If you enabled and disabled promiscuous mode on a VF you could easily put it into a state where it would start firing interrupts on all queues at a rate of 50+ interrupts per second even though there was no traffic present. The issue seems to have been a stray admin queue feature flag set that was leaving us in a high polling rate for the adminq task. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* \| \|	i40evf: Do not clear MSI-X PBA manually	Alexander Duyck	2018-01-10	1	-13/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We should not be clearing the pending bit array for each vector manually. The documentation for the hardware states that when in MSI-X mode the pending bit array will be cleared automatically. Us clearing it ourselves just results in multiple opportunities for us to drop an interrupt. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* \| \|	i40e: remove redundant initialization of read_size	Colin Ian King	2018-01-10	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Variable read_size is initialized and this value is never read, it is instead set inside the do-loop, hence the initialization is redundant and can be removed. Cleans up clang warning: drivers/net/ethernet/intel/i40e/i40e_nvm.c:390:6: warning: Value stored to 'read_size' during its initialization is never read Signed-off-by: Colin Ian King <colin.king@canonical.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* \| \|	i40e/i40evf: Bump driver versions	Alice Michael	2018-01-10	2	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Bump the i40e driver from 2.1.14 to 2.3.2. Bump the i40evf driver from 3.0.1 to 3.2.2 Signed-off-by: Alice Michael <alice.michael@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* \| \|	i40e: add helper conversion function for link_speed	Jacob Keller	2018-01-10	2	-2/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We introduced the virtchnl interface in order to have an interface for talking to a virtual device driver which was host-driver agnostic. This interface has its own definitions, including one for link speed. The host driver has to talk to the virtchnl interface using these new definitions in order to remain compatible. Today, the i40e link_speed enumerations are value-exact matches for the virtchnl interface, so it was originally decided to simply use a typecast. However, this is unsafe, and makes it easier for future drivers to continue this unsafe practice. There is nothing guaranteeing these values are exact, and the type-cast would hide any compiler warning which indicates the problem. Rather than rely on this type cast, introduce a helper function which can convert the AdminQ link speed definition into a virtchnl definition. This can then be used by host driver implementations in order to safely convert to the interface recognized by the virtual functions. If the link speed is not able to be represented by the virtchnl definitions we'll report UNKNOWN which is the safest result. This will ensure that should the driver specific link_speeds actual bit definitions change, we do not report them incorrectly according to the VF. Additionally, this provides a better pattern for future drivers to copy, as it is more likely a future device may not use the exact same bit-wise definition as the current virtchnl interface. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* \| \|	i40e: update VFs of link state after GET_VF_RESOURCES	Jacob Keller	2018-01-10	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We currently notify a VF of the link state after ENABLE_QUEUES, which is the last thing a VF does after being configured. Guests may not actually ENABLE_QUEUES until they get configured, and thus between driver load and device configuration the VF may show inaccurate link status. Fix this by also sending the link state after GET_VF_RESOURCES. Although we could remove the message following ENABLE_QUEUES, it's not that significant of a loss, so this patch just keeps both to ensure maximum compatibility with guests on various OSes. Specifically, without this patch guests running FreeBSD will display inaccurate link state until the device is brought up. This is mostly a cosmetic issue but can be confusing to system administrators. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* \| \|	i40evf: hold the critical task bit lock while opening	Jacob Keller	2018-01-10	1	-9/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If i40evf_open() is called quickly at the same time as a reset occurs (such as via ethtool) it is possible for the device to attempt to open while a reset is in progress. This occurs because the driver was not holding the critical task bit lock during i40evf_open, nor was it holding it around the call to i40evf_up_complete() in i40evf_reset_task(). We didn't hold the lock previously because calls to i40evf_down() would take the bit lock directly, and this would have caused a deadlock. To avoid this, we'll move the bit lock handling out of i40evf_down() and into the callers of this function. Additionally, we'll now hold the bit lock over the entire set of steps when going up or down, to ensure that we remain consistent. Ultimately this causes us to serialize the transitions between down and up properly, and avoid changing status while we're resetting. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* \| \|	i40evf: release bit locks in reverse order	Jacob Keller	2018-01-10	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Although not strictly necessary, it is customary to reverse the order in which we release locks that we acquire. This helps preserve lock ordering during future refactors, which can help avoid potential deadlock situations. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* \| \|	i40evf: use spinlock to protect (mac\|vlan)_filter_list	Jacob Keller	2018-01-10	3	-49/+81
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Stop overloading the __I40EVF_IN_CRITICAL_TASK bit lock to protect the mac_filter_list and vlan_filter_list. Instead, implement a spinlock to protect these two lists, similar to how we protect the hash in the i40e PF code. Ensure that every place where we access the list uses the spinlock to ensure consistency, and stop holding the critical section around blocks of code which only need access to the macvlan filter lists. This refactor helps simplify the locking behavior, and is necessary as a future refactor to the __I40EVF_IN_CRITICAL_TASK would cause a deadlock otherwise. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* \| \|	i40evf: don't rely on netif_running() outside rtnl_lock()	Jacob Keller	2018-01-10	1	-3/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In i40evf_reset_task we use netif_running() to determine whether or not the device is currently up. This allows us to properly free queue memory and shut down things before we request the hardware reset. It turns out that we cannot be guaranteed of netif_running() returning false until the device is fully up, as the kernel core code sets __LINK_STATE_START prior to calling .ndo_open. Since we're not holding the rtnl_lock(), it's possible that the driver's i40evf_open handler function is currently being called while we're resetting. We can't simply hold the rtnl_lock() while checking netif_running() as this could cause a deadlock with the i40evf_open() function. Additionally, we can't avoid the deadlock by holding the rtnl_lock() over the whole reset path, as this essentially serializes all resets, and can cause massive delays if we have multiple VFs on a system. Instead, lets just check our own internal state __I40EVF_RUNNING state field. This allows us to ensure that the state is correct and is only set after we've finished bringing the device up. Without this change we might free data structures about device queues and other memory before they've been fully allocated. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* \| \|	i40e: display priority_xon and priority_xoff stats	Alice Michael	2018-01-10	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Display some more stats that were already being counted, to help users understand when priority xon/xoff packets are being sent/received Signed-off-by: Alice Michael <alice.michael@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* \| \|	Merge branch '10GbE' of ↵	David S. Miller	2018-01-10	10	-177/+298
\|\ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 10GbE Intel Wired LAN Driver Updates 2018-01-09 This series contains updates to ixgbe and ixgbevf only. Emil fixes an issue with "wake on LAN"(WoL) where we need to ensure we enable the reception of multicast packets so that WoL works for IPv6 magic packets. Cleaned up code no longer needed with the update to adaptive ITR. Paul update the driver to advertise the highest capable link speed when a module gets inserted. Also extended the displaying of firmware version to include the iSCSI and OEM block in the EEPROM to better identify firmware versions/images. Tonghao Zhang cleans up a code comment that no longer applies since InterruptThrottleRate has been removed from the driver. Alex fixes SR-IOV and MACVLAN offload interaction, where the MACVLAN offload was incorrectly configuring several filters with the wrong pool value which resulted in MACLVAN interfaces not being able to receive traffic that had to pass over the physical interface. Fixed transmit hangs and dropped receive frames when the number of VFs changed. Added support for RSS on MACVLAN pools for X550 devices. Fixed up the MACVLAN limitations so we can now support 63 offloaded devices. Cleaned up MACVLAN code that is no longer needed with the recent changes and fixes. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \| \|	ixgbe: Drop l2_accel_priv data pointer from ring struct	Alexander Duyck	2018-01-09	2	-11/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The l2 acceleration private pointer isn't needed in the ring struct. It isn't really used anywhere other than to test and see if we are supporting an offloaded macvlan netdev, and it is much easier to test netdev for not being ixgbe based to verify that. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| * \| \|	ixgbe: Use ring values to test for Tx pending	Alexander Duyck	2018-01-09	1	-16/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch simplifies the check for Tx pending traffic and makes it more holistic as there being any difference between next_to_use and next_to_clean is much more informative than if head and tail are equal, as it is possible for us to either not update tail, or not be notified of completed work in which case next_to_clean would not be equal to head. In addition the simplification makes it so that we don't have to read hardware which allows us to drop a number of variables that were previously being used in the call. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| * \| \|	ixgbe: Fix limitations on macvlan so we can support up to 63 offloaded devices	Alexander Duyck	2018-01-09	4	-43/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change is a fix of the macvlan offload so that we correctly handle macvlan offloaded devices. Specifically we were configuring our limits based on the assumption that we were going to max out the RSS indices for every mode. As a result when we went to 15 or more macvlan interfaces we were forced into the 2 queue RSS mode on VFs even though they could have still supported 4. This change splits the logic up so that we limit either the total number of macvlan instances if DCB is enabled, or limit the number of RSS queues used per macvlan (instead of per pool) if SR-IOV is enabled. By doing this we can make best use of the part. In addition I have increased the maximum number of supported interfaces to 63 with one queue per offloaded interface as this more closely reflects the actual values supported by the interface. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| * \| \|	ixgbe: There is no need to update num_rx_pools in L2 fwd offload	Alexander Duyck	2018-01-09	2	-4/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The num_rx_pools value is overwritten when we reinitialize the queue configuration. In reality we shouldn't need to be updating the value since it is redone every time we call into ixgbe_setup_tc so for now just drop the spots where we were incrementing or decrementing the value. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| * \| \|	ixgbe: Add support for macvlan offload RSS on X550 and clean-up pool handling	Alexander Duyck	2018-01-09	1	-37/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In order for RSS to work on the macvlan pools of the X550 we need to populate the MRQC, RETA, and RSS key values for each pool. This patch makes it so that we now take care of that. In addition I have dropped the macvlan specific configuration of psrtype since it is redundant with the code that already exists for configuring this value. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| * \| \|	ixgbe: Perform reinit any time number of VFs change	Alexander Duyck	2018-01-09	1	-16/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If the number of VFs are changed we need to reinitialize the part since the offset for the device and the number of pools will be incorrect. Without this change we can end up seeing Tx hangs and dropped Rx frames for incoming traffic. In addition we should drop the code that is arbitrarily changing the default pool and queue configuration. Instead we should wait until the port is reset and reconfigured via ixgbe_sriov_reinit. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| * \| \|	ixgbe: Fix interaction between SR-IOV and macvlan offload	Alexander Duyck	2018-01-09	1	-7/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When SR-IOV was enabled the macvlan offload was configuring several filters with the wrong pool value. This would result in the macvlan interfaces not being able to receive traffic that had to pass over the physical interface. To fix it wrap the pool argument in the VMDQ_P macro which will add the necessary offset to get to the actual VMDq pool Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| * \| \|	ixgbevf: remove redundant setting of xcast_mode	Emil Tantilov	2018-01-09	1	-4/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Removed leftover assignment of xcast_mode. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| * \| \|	ixgbe: Remove an obsolete comment about ITR	Tonghao Zhang	2018-01-09	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The InterruptThrottleRate has been removed from ixgbe. Then Update the comment. Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| * \| \|	ixgbe: extend firmware version support	Paul Greenwalt	2018-01-09	7	-14/+198
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Extend FW version reporting by displaying information from the iSCSI or OEM block in the EEPROM. This will allow us to more accurately identify the FW. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| * \| \|	ixgbe: advertise highest capable link speed	Paul Greenwalt	2018-01-09	1	-9/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On module insert advertise highest capable link speed. If module is capable of 10G, then advertise 10G, else advertise modules capable link speeds. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| * \| \|	ixgbe: remove unused enum latency_range	Emil Tantilov	2018-01-09	1	-7/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This enum is no longer needed after commit: b4ded8327fe ("ixgbe: Update adaptive ITR algorithm") Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| * \| \|	ixgbe: enable multicast on shutdown for WOL	Emil Tantilov	2018-01-09	1	-7/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously we only enabled the reception of multicast packets when wake on multicast is set, but we also need this to allow waking with IPv6 magic packets. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* \| \| \|	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	David S. Miller	2018-01-09	6	-26/+104
\|\ \ \ \ \| \|/ / / \|/\| / / \| \|/ /
\| * \|	Merge branch '40GbE' of ↵	David S. Miller	2018-01-03	3	-17/+72
\| \|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2018-01-03 This series contains fixes for i40e and i40evf. Amritha removes the UDP support for big buffer cloud filters since it is not supported and having UDP enabled is a bug. Alex fixes a bug in the __i40e_chk_linearize() which did not take into account large (16K or larger) fragments that are split over 2 descriptors, which could result in a transmit hang. Jake fixes an issue where a devices own MAC address could be removed from the unicast address list, so force a check on every address sync to ensure removal does not happen. Jiri Pirko fixes the return value when a filter configuration is not supported, do not return "invalid" but return "not supported" so that the core can react correctly. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| * \|	i40e: flower: Fix return value for unsupported offload	Jiri Pirko	2018-01-03	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When filter configuration is not supported, drivers should return -EOPNOTSUPP so the core can react correctly. Fixes: 2f4b411a3d67 ("i40e: Enable cloud filters via tc-flower") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| \| * \|	i40e: don't remove netdev->dev_addr when syncing uc list	Jacob Keller	2018-01-03	1	-1/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In some circumstances, such as with bridging, it is possible that the stack will add a devices own MAC address to its unicast address list. If, later, the stack deletes this address, then the i40e driver will receive a request to remove this address. The driver stores its current MAC address as part of the MAC/VLAN hash array, since it is convenient and matches exactly how the hardware expects to be told which traffic to receive. This causes a problem, since for more devices, the MAC address is stored separately, and requests to delete a unicast address should not have the ability to remove the filter for the MAC address. Fix this by forcing a check on every address sync to ensure we do not remove the device address. There is a very narrow possibility of a race between .set_mac and .set_rx_mode, if we don't change netdev->dev_addr before updating our internal MAC list in .set_mac. This might be possible if .set_rx_mode is going to remove MAC "XYZ" from the list, at the same time as .set_mac changes our dev_addr to MAC "XYZ", we might possibly queue a delete, then an add in .set_mac, then queue a delete in .set_rx_mode's dev_uc_sync and then update netdev->dev_addr. We can avoid this by moving the copy into dev_addr prior to the changes to the MAC filter list. A similar race on the other side does not cause problems, as if we're changing our MAC form A to B, and we race with .set_rx_mode, it could queue a delete from A, we'd update our address, and allow the delete. This seems like a race, but in reality we're about to queue a delete of A anyways, so it would not cause any issues. A race in the initialization code is unlikely because the netdevice has not yet been fully initialized and the stack should not be adding or removing addresses yet. Note that we don't (yet) need similar code for the VF driver because it does not make use of __dev_uc_sync and __dev_mc_sync, but instead roles its own method for handling updates to the MAC/VLAN list, which already has code to protect against removal of the hardware address. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>