talos-op-linux - Talos™ II Linux sources for OpenPOWER

	Commit message (Collapse)	Author	Age	Files	Lines
...
* \|	fm10k: implement prepare_suspend and handle_resume	Jacob Keller	2016-07-20	1	-0/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement fm10k_prepare_suspend and fm10k_handle_resume functions which abstract around the now existing fm10k_prepare_for_reset and fm10k_handle_reset. The new functions also handle stopping the service task, which is something that the original re-init flow does not need. Every other location that does a suspend/resume type flow is expected to use these functions, because otherwise they may have conflicts with the running watchdog routines. This also has the effect of preventing possible surprise remove events during handling of FLR events and PCIe errors. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* \|	fm10k: split fm10k_reinit into two functions	Jacob Keller	2016-07-20	1	-5/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There are several flows in the driver which perform the similar function of tearing down software and restoring software to recover from certain errors or PCIe events, including: * fm10k_reinit * fm10k_suspend/resume * fm10k_io_error_detected/fm10k_io_resume In addition, we want to implement a .reset_notify() handler as well which will also perform similar function. Rework how the driver codes reset and resume flows by separating out the reinit logic into two functions "fm10k_prepare_for_reset" and "fm10k_handle_reset". This first step will allow us to re-use this functionality in the similar blocks of code instead of re-coding the same sequence of events slightly different. The end result should be more maintainable and correct, fixing several inconsistencies with the work flow. The new functions expect to take the rtnl_lock() themselves, and it does have the unfortunate side effect of having the reinit flow take then release then take the rtnl_lock. However, this minor downside is out weighted by the benefits of code reduction and reducing needless difference between these flows. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* \|	fm10k: wait for queues to drain if stop_hw() fails once	Jacob Keller	2016-07-20	1	-6/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It turns out that sometimes during a reset the Tx queues will be temporarily stuck longer than .stop_hw() expects. Work around this issue by attempting to .stop_hw() first. If it tails, wait a number of attempts until the Tx queues appear to be drained. After this, attempt stop_hw() again. This ensures that we avoid waiting if we don't need to, such as during the first initialization of a VF, and give the proper amount of time necessary to recover from most situations. It is possible that the hardware is actually stuck. For PFs, this is usually fixed by a datapath reset. Unfortunately the VF cannot request a similar reset for itself. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* \|	fm10k: only warn when stop_hw fails with FM10K_ERR_REQUESTS_PENDING	Jacob Keller	2016-07-20	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When stop_hw() routine fails with FM10K_ERR_REQUESTS_PENDING, this indicates that the Tx or Rx queues did not shutdown within the time limit. Print a more suitable message at the dev_info level instead of dev_err. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* \|	fm10k: prevent multiple threads updating statistics	Jacob Keller	2016-07-20	1	-0/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Also prevent updating stats while the interface is down. If we're already updating stats, just return doing nothing. When we take the device down, block stat updates until we come back up. This ensures that we avoid tearing down rings when we're updating statistics, and prevents updating statistics until we're up. We can't re-use the __FM10K_DOWN for this because it wouldn't prevent multiple threads from accessing statistics. Neither does it prevent the case where we start updating stats and then start going down in another thread. The fm10k_get_stats64 is except from this, because it has a completely different flow which does not suffer from the same issues as fm10k_update_stats might. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* \|	fm10k: avoid possible null pointer dereference in fm10k_update_stats	Jacob Keller	2016-07-20	1	-2/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It's currently possible for fm10k_update_stats to be called during the window when we go down and the rings are removed. This can result in a null pointer dereference. In fm10k_get_stats64 we work around this by using ACCESS_ONCE and a null pointer check inside the loop. Use this same flow in the fm10k_update_stats to avoid the potential null pointer. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* \|	fm10k: no need to continue in fm10k_down if __FM10K_DOWN already set	Jacob Keller	2016-07-20	1	-1/+2
\|/ \| \| \| \| \| \| \| \| \|	Return early from fm10k_down() when we are already down, since that means another thread is either already finished or has started going down, so shouldn't conflict with them. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	treewide: replace dev->trans_start update with helper	Florian Westphal	2016-05-04	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Replace all trans_start updates with netif_trans_update helper. change was done via spatch: struct net_device *d; @@ - d->trans_start = jiffies + netif_trans_update(d) Compile tested only. Cc: user-mode-linux-devel@lists.sourceforge.net Cc: linux-xtensa@linux-xtensa.org Cc: linux1394-devel@lists.sourceforge.net Cc: linux-rdma@vger.kernel.org Cc: netdev@vger.kernel.org Cc: MPT-FusionLinux.pdl@broadcom.com Cc: linux-scsi@vger.kernel.org Cc: linux-can@vger.kernel.org Cc: linux-parisc@vger.kernel.org Cc: linux-omap@vger.kernel.org Cc: linux-hams@vger.kernel.org Cc: linux-usb@vger.kernel.org Cc: linux-wireless@vger.kernel.org Cc: linux-s390@vger.kernel.org Cc: devel@driverdev.osuosl.org Cc: b.a.t.m.a.n@lists.open-mesh.org Cc: linux-bluetooth@vger.kernel.org Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com> Acked-by: Mugunthan V N <mugunthanvnm@ti.com> Acked-by: Antonio Quartulli <a@unstable.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
*	fm10k: protect fm10k_open in fm10k_io_resume with rtnl_lock	Hannes Frederic Sowa	2016-04-21	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	fm10k_open requires rtnl_lock to be held. Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: Shannon Nelson <shannon.nelson@intel.com> Cc: Carolyn Wyborny <carolyn.wyborny@intel.com> Cc: Don Skidmore <donald.c.skidmore@intel.com> Cc: Bruce Allan <bruce.w.allan@intel.com> Cc: John Ronciak <john.ronciak@intel.com> Cc: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	fm10k: consistently use Intel(R) for driver names	Jacob Keller	2016-04-20	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	Update every header file and other locations to consistently use Intel(R) instead of just Intel. Also update copyright year of files which we modified. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	fm10k: do not disable PCI device in fm10k_io_error_detected	Jacob Keller	2016-04-20	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	fm10k_io_error_detected() does not need to call pci_disable_device(). In the cases where the reset needs to occur, the stack flow will result in calling fm10k_remove() which already disables the PCI device. If we leave the pci_disable_device(), we result in a warning about disabling an already disabled device. Many PCI drivers do call pci_disable_device() in their .error_detected() routines, but it does not appear to be required. In addition, these drivers have a check "is_pci_enabled()" call in their remove routines, which is how they chose to handle the duplicate device disable. This seems incorrect, since the PCI device structure is reference counted. It is very possible that the reference count for the PCI device could be greater than 1. In this case, you would remove the PCI device within the error_detected routine, reducing count to 1, then remove it again in the remove function, reducing it to zero. This would result in yet another disable somewhere else failing. Thus, we shouldn't be using is_pci_enabled() to check for this issue. Instead, just remove the extraneous pci_device_disable() found within the error_detected routine. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	fm10k: correctly handle LPORT_MAP error	Jacob Keller	2016-04-20	1	-1/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, any error responses from the switch manager after an LPORT_MAP request are silently ignored. At most the mailbox message will be reported as an error. This can result in unexpected behavior when the switch manager has configured a port with zero bandwidth. Add support for reading the fm10k_swapi_error structure from LPORT_MAP responses. If the message contains the necessary TLV and has a non-zero error code, report link down, clear the dglort_map, and delay the next get_host_state call by a reasonable delay. Also log an error message indicating that the LPORT_MAP request failed. The delay ensures preventing an interrupt storm on the switch manager, and reduces the number of mailbox messages we send in this scenario drastically. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	fm10k: drop 1588 support	Jacob Keller	2016-04-20	1	-109/+1
\| \| \| \| \| \| \| \| \| \| \|	The 1588 support within fm10k does not work correctly with the current version of the switch management software, and likely never worked correctly to begin with. Remove support for PTP/1588. Update copyright year for all these files while we're touching them. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	fm10k: prevent RCU issues during AER events	Jacob Keller	2016-04-20	1	-1/+10
\| \| \| \| \| \| \| \| \| \| \| \| \|	During an AER action response, we were calling fm10k_close without holding the rtnl_lock() which could lead to possible RCU warnings being produced due to 64bit stat updates among other causes. Similarly, we need rtnl_lock() around fm10k_open during fm10k_io_resume. Follow the same pattern elsewhere in the driver and protect the entire open/close sequence. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	fm10k: fix a minor typo in some comments	Jacob Keller	2016-04-05	1	-3/+3
\| \| \| \| \| \| \| \| \|	s/funciton/function to resolve a typo, and cleanup grammar on a few comments regarding processing the VF mailboxes. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	fm10k: free MBX IRQ before clearing interrupt scheme	Jacob Keller	2016-04-05	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	During fm10k_io_error_detected we were clearing the interrupt scheme before we freed the MBX IRQ. This causes a kernel panic because the MBX IRQ are assigned after MSI-X initialization. Clearing the interrupt scheme results in removing the MSI-X entry table. Fix this by freeing the MBX IRQ before we clear the interrupt scheme, as we do elsewhere in the driver. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	fm10k: print error message when stop_hw fails	Jacob Keller	2016-04-05	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \|	fm10k_stop_hw_generic calls fm10k_disable_queues_generic, which may return an error code indicating that the queues were not stopped within the time limit. Notify the user by displaying a message in the kernel message ring, in a similar way to how we notify the user when reset_hw fails. There isn't much we can do to recover from this error, so currently nothing else is done. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	fm10k: don't initialize service task until later in probe	Jacob Keller	2016-04-05	1	-9/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Delay initialization of the service timer and service task until late probe. If we don't wait, failures in probe do not properly cleanup the service timer or service task items, which results in the kernel panic below, potentially freezing the whole system. In addition, ensure that the SERVICE_DISABLE bit is set before we request the MBX IRQ since the MBX interrupt attempts to schedule the service task otherwise. This prevents a similar trace from occurring after this change. We didn't notice this issue before because probe almost always completes successfully. I discovered it due to a mis-ordered mailbox handler array, which resulted in the following failure when requesting mailbox interrupt. [ 555.325619] ------------[ cut here ]------------ [ 555.325628] WARNING: CPU: 0 PID: 4941 at lib/list_debug.c:33 __list_add+0xa0/0xd0() [ 555.325631] list_add corruption. prev->next should be next (ffffffff81f46648), but was (null). (prev=ffff8807fad5d0e8). <snip> [ 555.325722] CPU: 0 PID: 4941 Comm: insmod Tainted: G OE 4.0.4-303.fc22.x86_64 #1 [ 555.325725] Hardware name: Intel Corporation S2600CO/S2600CO, BIOS SE5C600.86B.02.03.8x23.060520140825 06/05/2014 [ 555.325727] 0000000000000000 00000000b4f161b3 ffff88081a21f8e8 ffffffff81783124 [ 555.325734] 0000000000000000 ffff88081a21f940 ffff88081a21f928 ffffffff8109c66a [ 555.325740] 0000000064000000 ffff8807fad5d0e8 ffff8807fad5d0e8 ffffffff81f46648 [ 555.325746] Call Trace: [ 555.325752] [<ffffffff81783124>] dump_stack+0x45/0x57 [ 555.325757] [<ffffffff8109c66a>] warn_slowpath_common+0x8a/0xc0 [ 555.325759] [<ffffffff8109c6f5>] warn_slowpath_fmt+0x55/0x70 [ 555.325763] [<ffffffff813ba270>] __list_add+0xa0/0xd0 [ 555.325768] [<ffffffff81102d1d>] __internal_add_timer+0x9d/0x110 [ 555.325771] [<ffffffff81102dbf>] internal_add_timer+0x2f/0xc0 [ 555.325774] [<ffffffff81104e5a>] mod_timer+0x12a/0x230 [ 555.325782] [<ffffffffa03d54ca>] fm10k_probe+0x69a/0xc80 [fm10k] [ 555.325787] [<ffffffff813e8355>] local_pci_probe+0x45/0xa0 [ 555.325791] [<ffffffff8129cf42>] ? sysfs_do_create_link_sd.isra.2+0x72/0xc0 [ 555.325794] [<ffffffff813e96b9>] pci_device_probe+0xf9/0x150 [ 555.325799] [<ffffffff814d7e73>] driver_probe_device+0xa3/0x400 [ 555.325802] [<ffffffff814d82ab>] __driver_attach+0x9b/0xa0 [ 555.325805] [<ffffffff814d8210>] ? __device_attach+0x40/0x40 [ 555.325808] [<ffffffff814d5bd3>] bus_for_each_dev+0x73/0xc0 [ 555.325811] [<ffffffff814d78ce>] driver_attach+0x1e/0x20 [ 555.325815] [<ffffffff814d7480>] bus_add_driver+0x180/0x250 [ 555.325819] [<ffffffffa03b2000>] ? 0xffffffffa03b2000 [ 555.325823] [<ffffffff814d8aa4>] driver_register+0x64/0xf0 [ 555.325826] [<ffffffff813e7bec>] __pci_register_driver+0x4c/0x50 [ 555.325832] [<ffffffffa03d6ca3>] fm10k_register_pci_driver+0x23/0x30 [fm10k] [ 555.325838] [<ffffffffa03b2080>] fm10k_init_module+0x80/0x1000 [fm10k] [ 555.325843] [<ffffffff81002128>] do_one_initcall+0xb8/0x200 [ 555.325848] [<ffffffff811e10d2>] ? __vunmap+0xa2/0x100 [ 555.325852] [<ffffffff811fe239>] ? kmem_cache_alloc_trace+0x1b9/0x240 [ 555.325855] [<ffffffff8178230e>] ? do_init_module+0x28/0x1cb [ 555.325858] [<ffffffff81782346>] do_init_module+0x60/0x1cb [ 555.325862] [<ffffffff8112168e>] load_module+0x205e/0x26b0 [ 555.325866] [<ffffffff8111d110>] ? store_uevent+0x70/0x70 [ 555.325870] [<ffffffff812234b0>] ? kernel_read+0x50/0x80 [ 555.325873] [<ffffffff81121f3e>] SyS_finit_module+0xbe/0xf0 [ 555.325878] [<ffffffff81789749>] system_call_fastpath+0x12/0x17 [ 555.325880] ---[ end trace 9e0f58d071eafd2a ]--- Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	fm10k: prevent null pointer dereference of msix_entries table	Jacob Keller	2016-04-05	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \|	According to the C standard dereferencing a variable before it is checked invokes undefined behavior, and thus compilers are free to assume the check for NULL isn't necessary. Prevent this by re-ordering the NULL check of msix_entries in fm10k_free_mbx_irq. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	fm10k: use ether_addr_copy to copy MAC address	Bruce Allan	2016-04-05	1	-2/+2
\| \| \| \| \| \| \| \| \|	Cleanup the remaining instances of using memcpy() instead of the preferred ether_addr_copy(). Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	fm10k: demote BUG_ON() to WARN_ON() where appropriate	Bruce Allan	2016-04-05	1	-1/+1
\| \| \| \| \| \| \| \| \|	We don't need to crash the kernel in this instance so just warn about the condition and play on. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	fm10k: cleanup remaining right-bit-shifted 1	Bruce Allan	2016-04-05	1	-4/+4
\| \| \| \| \| \| \| \|	Use BIT() macro instead. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	fm10k: Move constants to the right of binary operators	Bruce Allan	2016-04-05	1	-8/+8
\| \| \| \| \| \| \| \| \| \| \| \|	The semantic patch that makes this change is available in scripts/coccinelle/misc/compare_const_fl.cocci. More information about semantic patching is available at http://coccinelle.lip6.fr/ Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	fm10k: use true/false for boolean get_host_state	Bruce Allan	2015-12-22	1	-3/+3
\| \| \| \| \| \|	Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	fm10k: use ether_addr_equal instead of memcmp	Jacob Keller	2015-12-22	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	When comparing MAC addresses, use ether_addr_equal instead of memcmp to ETH_ALEN length. Found and replaced using the following sed: sed -e 's/memcmp\x28\(.*\), ETH_ALEN\x29/!ether_addr_equal\x28\1\x29/' Reported-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	fm10k: Cleanup exception handling for changing queues	Alexander Duyck	2015-12-22	1	-12/+41
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch is meant to cleanup the exception handling for the paths where we reset the interrupts and then reconfigure them. In all of these paths we had very different levels of exception handling. I have updated the driver so that all of the paths should result in a similar state if we fail. Specifically the driver will now unload the mailbox interrupt, free the queue vectors and MSI-X, and then detach the interface. In addition for any of the PCIe related resets I have added a check with the hw_ready function to just make sure the registers are in a readable state prior to reopening the interface. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	fm10k: initialize xps at driver load	Jacob Keller	2015-12-13	1	-2/+16
\| \| \| \| \| \| \| \| \| \|	Similar to ixgbe and i40e, initialize XPS on driver load so that we can take advantage of this kernel feature. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	fm10k: cleanup overly long lines	Bruce Allan	2015-12-13	1	-13/+16
\| \| \| \| \| \|	Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	fm10k: whitespace cleanups	Bruce Allan	2015-12-13	1	-3/+2
\| \| \| \| \| \|	Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	fm10k: Cleanup exception handling for mailbox interrupt	Alexander Duyck	2015-12-13	1	-2/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch addresses two issues. First is the fact that the fm10k_mbx_free_irq was assuming msix_entries was valid and that will not always be the case. As such we need to add a check for if it is NULL. Second is the fact that we weren't freeing the IRQ if the mailbox API returned an error on trying to connect. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	fm10k: consistently refer to VLANs and VLAN IDs	Jacob Keller	2015-12-13	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	Instead of using lowercase vlan, vid, or VID, always use VLAN or VLAN ID in comments when referring to VLANs. The original driver code was consistent, but recent patches have not been as consistent with this naming scheme. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	fm10k: do not use CamelCase	Jacob Keller	2015-12-13	1	-6/+6
\| \| \| \| \| \| \| \| \| \|	Avoid the use of CamelCase for some variable names that previously slipped through review. Reported-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	fm10k: TRIVIAL fix typo of hardware	Jacob Keller	2015-12-05	1	-2/+2
\| \| \| \| \| \|	Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	fm10k: use macro for default Tx and Rx ITR values	Jacob Keller	2015-12-05	1	-2/+2
\| \| \| \| \| \| \|	Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	fm10k: Update adaptive ITR algorithm	Jacob Keller	2015-12-05	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The existing adaptive ITR algorithm is overly restrictive. It throttles incorrectly for various traffic rates, and does not produce good performance. The algorithm now allows for more interrupts per second, and does some calculation to help improve for smaller packet loads. In addition, take into account the new itr_scale from the hardware which indicates how much to scale due to PCIe link speed. Reported-by: Matthew Vick <matthew.vick@intel.com> Reported-by: Alex Duyck <alexander.duyck@gmail.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	fm10k: reinitialize queuing scheme after calling init_hw	Jacob Keller	2015-12-05	1	-0/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The init_hw function may fail, and in the case of VFs, it might change the number of maximum queues available. Thus, for every flow which checks init_hw, we need to ensure that we clear the queue scheme before, and initialize it after. The fm10k_io_slot_reset path will end up triggering a reset so fm10k_reinit needs this change. The fm10k_io_error_detected and fm10k_io_resume also need to properly clear and reinitialize the queue scheme. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	fm10k: always check init_hw for errors	Jacob Keller	2015-12-05	1	-5/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A recent change modified init_hw in some flows the function may fail on VF devices. For example, if a VF doesn't yet own its own queues. However, many callers of init_hw didn't bother to check the error code. Other callers checked but only displayed diagnostic messages without actually handling the consequences. Fix this by (a) always returning and preventing the netdevice from going up, and (b) printing the diagnostic in every flow for consistency. This should resolve an issue where VF drivers would attempt to come up before the PF has finished assigning queues. In addition, change the dmesg output to explicitly show the actual function that failed, instead of combining reset_hw and init_hw into a single check, to help for future debugging. Fixes: 1d568b0f6424 ("fm10k: do not assume VF always has 1 queue") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	fm10k: set netdev features in one location	Jacob Keller	2015-12-05	1	-8/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Don't change netdev hw_features later in fm10k_probe, instead set all values inside fm10k_alloc_netdev. To do so, we need to know the MAC type (whether it is PF or VF) in order to determine what to do. This helps ensure that all logic regarding features is co-located. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	fm10k: use napi_schedule_irqoff()	Alexander Duyck	2015-11-25	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	The fm10k_msix_clean_rings function runs from hard interrupt context or with interrupts already disabled in netpoll. It can use napi_schedule_irqoff() instead of napi_schedule() Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	fm10k: add support for extra debug statistics	Jacob Keller	2015-09-22	1	-0/+18
\| \| \| \| \| \| \| \| \| \| \|	Add a private ethtool flag to enable display of these statistics, which are generally less useful. However, sometimes it can be useful for debugging purposes. The most useful portion is the ability to see what the PF thinks the VF mailboxes look like. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	fm10k: remove comment about rtnl_lock around mbx operations	Jacob Keller	2015-09-22	1	-3/+1
\| \| \| \| \| \| \| \| \| \| \|	This comment is no longer true due to a couple of mailbox locking refactors, and we now don't actually do any rtnl protected operations directly in the mailbox path. Remove this comment as it is factually incorrect and confusing. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	fm10k: re-enable VF after a full reset on detection of a Malicious event	Jacob Keller	2015-09-15	1	-2/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Modify behavior of Malicious Driver Detection events. Presently, the hardware disables the VF queues and re-assigns them to the PF. This causes the VF in question to continuously Tx hang, because it assumes that it can transmit over the queues in question. For transient events, this results in continuous logging of malicious events. New behavior is to reset the LPORT and VF state, so that the VF will have to reset and re-enable itself. This does mean that malicious VFs will possibly be able to continue and attempt malicious events again. However, it is expected that system administrators will step in and manually remove or disable the VF in question. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	fm10k: send traffic on default VID to VLAN device if we have one	Jacob Keller	2015-09-15	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch ensures that VLAN traffic on the default VID will go to the corresponding VLAN device if it exists. To do this, mask the rx_ring VID if we have an active VLAN on that VID. For this to work correctly, we need to update fm10k_process_skb_fields to correctly mask off the VLAN_PRIO_MASK bits and compare them separately, otherwise we incorrectly compare the priority bits with the cleared flag. This also happens to fix a related bug where having priority bits set causes us to incorrectly classify traffic. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	fm10k: Report MAC address on driver load	Alexander Duyck	2015-09-15	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \|	This change adds the MAC address to the list of values recorded on driver load. The MAC address represents the serial number of the unit and allows us to track the value should a card be replaced in a system. The log message should now be similar in output to that of ixgbe. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	fm10k: update netdev perm_addr during reinit, instead of at up	Jacob Keller	2015-09-15	1	-0/+15
\| \| \| \| \| \| \| \| \| \| \|	Update the netdev permanent address during fm10k_reinit enables the user to immediately see the new MAC address on the VF even if the device isn't up. The previous code required that the device by opened before changes would appear. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	fm10k: update fm10k_slot_warn to use pcie_get_minimum link	Jacob Keller	2015-09-15	1	-29/+76
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is useful in cases where we connect to a slot at Gen3, but the slot is behind a bus which only connected at Gen2. This generally only happens when a PCIe switch is in the sequence of devices, and can be very confusing when you see slow performance with no obvious cause. I am aware this patch has a few lines that break 80 characters, but there does not seem to be a readable way to format them to less than 80 characters. Suggestions welcome. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	fm10k: disable service task during suspend	Jacob Keller	2015-09-15	1	-0/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The service task reads some registers as part of its normal routine, even while the interface is down. Normally this is ok. However, during suspend we have disabled the PCI device. Due to this, registers will read in the same way as a surprise-remove event. Disable the service task while we suspend, and re-enable it after we resume. If we don't do this, the device could be UP when you suspend and come back from resume as closed (since fm10k closes the device when it gets a surprise remove). Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	fm10k: use dma_set_mask_and_coherent in fm10k_probe	Jacob Keller	2015-06-17	1	-18/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch cleans up the use of dma_get_required_mask and uses the simpler dma_set_mask_and_coherent function instead of doing these as separate steps. I removed the dma_get_required_mask call because based on some minimal testing it appears that either (a) we're not doing the right thing with the call or (b) we don't need it anyways. If the value returned is <48bits, we'll end up trying with 48 bits anyways. If it's over 48bits, fm10k can't support that anyways, and we should try 48bits. If 48bits fails, we'll fallback to 32bits. This cleans up some very funky code. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	fm10k: trivial fixup message style to include a colon	Jacob Keller	2015-06-17	1	-1/+1
\| \| \| \| \| \| \| \|	Also use %d for error values, since printing in hexadecimal is probably not helpful. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	fm10k: add call to fm10k_clean_all_rx_rings in fm10k_down	Jacob Keller	2015-06-17	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This prevents a memory leak in fm10k_set_ringparams. The leak occurs because we go down, change ring parameters, and then come up. However, fm10k_down on its own is not clearing the Rx rings. Since fm10k_up assumes the rings are clean we basically drop the buffers and leak a bunch of memory. Eventually we hit dirty page faults and reboot the system. This issue does not occur elsewhere because other flows that involve fm10k_down go through fm10k_close which immediately called fm10k_free_all_rx_resources which properly cleans the rings. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>