#
c23ae509 |
|
25-Apr-2023 |
John Hickey <jjh@daedalian.us> |
ixgbe: Fix panic during XDP_TX with > 64 CPUs Commit 4fe815850bdc ("ixgbe: let the xdpdrv work with more than 64 cpus") adds support to allow XDP programs to run on systems with more than 64 CPUs by locking the XDP TX rings and indexing them using cpu % 64 (IXGBE_MAX_XDP_QS). Upon trying this out patch on a system with more than 64 cores, the kernel paniced with an array-index-out-of-bounds at the return in ixgbe_determine_xdp_ring in ixgbe.h, which means ixgbe_determine_xdp_q_idx was just returning the cpu instead of cpu % IXGBE_MAX_XDP_QS. An example splat: ========================================================================== UBSAN: array-index-out-of-bounds in /var/lib/dkms/ixgbe/5.18.6+focal-1/build/src/ixgbe.h:1147:26 index 65 is out of range for type 'ixgbe_ring *[64]' ========================================================================== BUG: kernel NULL pointer dereference, address: 0000000000000058 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] SMP NOPTI CPU: 65 PID: 408 Comm: ksoftirqd/65 Tainted: G IOE 5.15.0-48-generic #54~20.04.1-Ubuntu Hardware name: Dell Inc. PowerEdge R640/0W23H8, BIOS 2.5.4 01/13/2020 RIP: 0010:ixgbe_xmit_xdp_ring+0x1b/0x1c0 [ixgbe] Code: 3b 52 d4 cf e9 42 f2 ff ff 66 0f 1f 44 00 00 0f 1f 44 00 00 55 b9 00 00 00 00 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 ec 08 <44> 0f b7 47 58 0f b7 47 5a 0f b7 57 54 44 0f b7 76 08 66 41 39 c0 RSP: 0018:ffffbc3fcd88fcb0 EFLAGS: 00010282 RAX: ffff92a253260980 RBX: ffffbc3fe68b00a0 RCX: 0000000000000000 RDX: ffff928b5f659000 RSI: ffff928b5f659000 RDI: 0000000000000000 RBP: ffffbc3fcd88fce0 R08: ffff92b9dfc20580 R09: 0000000000000001 R10: 3d3d3d3d3d3d3d3d R11: 3d3d3d3d3d3d3d3d R12: 0000000000000000 R13: ffff928b2f0fa8c0 R14: ffff928b9be20050 R15: 000000000000003c FS: 0000000000000000(0000) GS:ffff92b9dfc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000058 CR3: 000000011dd6a002 CR4: 00000000007706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> ixgbe_poll+0x103e/0x1280 [ixgbe] ? sched_clock_cpu+0x12/0xe0 __napi_poll+0x30/0x160 net_rx_action+0x11c/0x270 __do_softirq+0xda/0x2ee run_ksoftirqd+0x2f/0x50 smpboot_thread_fn+0xb7/0x150 ? sort_range+0x30/0x30 kthread+0x127/0x150 ? set_kthread_struct+0x50/0x50 ret_from_fork+0x1f/0x30 </TASK> I think this is how it happens: Upon loading the first XDP program on a system with more than 64 CPUs, ixgbe_xdp_locking_key is incremented in ixgbe_xdp_setup. However, immediately after this, the rings are reconfigured by ixgbe_setup_tc. ixgbe_setup_tc calls ixgbe_clear_interrupt_scheme which calls ixgbe_free_q_vectors which calls ixgbe_free_q_vector in a loop. ixgbe_free_q_vector decrements ixgbe_xdp_locking_key once per call if it is non-zero. Commenting out the decrement in ixgbe_free_q_vector stopped my system from panicing. I suspect to make the original patch work, I would need to load an XDP program and then replace it in order to get ixgbe_xdp_locking_key back above 0 since ixgbe_setup_tc is only called when transitioning between XDP and non-XDP ring configurations, while ixgbe_xdp_locking_key is incremented every time ixgbe_xdp_setup is called. Also, ixgbe_setup_tc can be called via ethtool --set-channels, so this becomes another path to decrement ixgbe_xdp_locking_key to 0 on systems with more than 64 CPUs. Since ixgbe_xdp_locking_key only protects the XDP_TX path and is tied to the number of CPUs present, there is no reason to disable it upon unloading an XDP program. To avoid confusion, I have moved enabling ixgbe_xdp_locking_key into ixgbe_sw_init, which is part of the probe path. Fixes: 4fe815850bdc ("ixgbe: let the xdpdrv work with more than 64 cpus") Signed-off-by: John Hickey <jjh@daedalian.us> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20230425170308.2522429-1-anthony.l.nguyen@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
#
b48b89f9 |
|
27-Sep-2022 |
Jakub Kicinski <kuba@kernel.org> |
net: drop the weight argument from netif_napi_add We tell driver developers to always pass NAPI_POLL_WEIGHT as the weight to netif_napi_add(). This may be confusing to newcomers, drop the weight argument, those who really need to tweak the weight can use netif_napi_add_weight(). Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # for CAN Link: https://lore.kernel.org/r/20220927132753.750069-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
4fe81585 |
|
29-Sep-2021 |
Jason Xing <xingwanli@kuaishou.com> |
ixgbe: let the xdpdrv work with more than 64 cpus Originally, ixgbe driver doesn't allow the mounting of xdpdrv if the server is equipped with more than 64 cpus online. So it turns out that the loading of xdpdrv causes the "NOMEM" failure. Actually, we can adjust the algorithm and then make it work through mapping the current cpu to some xdp ring with the protect of @tx_lock. Here are some numbers before/after applying this patch with xdp-example loaded on the eth0X: As client (tx path): Before After TCP_STREAM send-64 734.14 714.20 TCP_STREAM send-128 1401.91 1395.05 TCP_STREAM send-512 5311.67 5292.84 TCP_STREAM send-1k 9277.40 9356.22 (not stable) TCP_RR send-1 22559.75 21844.22 TCP_RR send-128 23169.54 22725.13 TCP_RR send-512 21670.91 21412.56 As server (rx path): Before After TCP_STREAM send-64 1416.49 1383.12 TCP_STREAM send-128 3141.49 3055.50 TCP_STREAM send-512 9488.73 9487.44 TCP_STREAM send-1k 9491.17 9356.22 (not stable) TCP_RR send-1 23617.74 23601.60 ... Notice: the TCP_RR mode is unstable as the official document explains. I tested many times with different parameters combined through netperf. Though the result is not that accurate, I cannot see much influence on this patch. The static key is places on the hot path, but it actually shouldn't cause a huge regression theoretically. Co-developed-by: Shujin Li <lishujin@kuaishou.com> Signed-off-by: Shujin Li <lishujin@kuaishou.com> Signed-off-by: Jason Xing <xingwanli@kuaishou.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
27e40255 |
|
05-Mar-2021 |
Gustavo A. R. Silva <gustavoars@kernel.org> |
ixgbe: Fix fall-through warnings for Clang In preparation to enable -Wimplicit-fallthrough for Clang, fix multiple warnings by explicitly adding multiple break statements instead of just letting the code fall through to the next case. Link: https://github.com/KSPP/linux/issues/115 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
#
5198d545 |
|
09-Sep-2020 |
Jakub Kicinski <kuba@kernel.org> |
net: remove napi_hash_del() from driver-facing API We allow drivers to call napi_hash_del() before calling netif_napi_del() to batch RCU grace periods. This makes the API asymmetric and leaks internal implementation details. Soon we will want the grace period to protect more than just the NAPI hash table. Restructure the API and have drivers call a new function - __netif_napi_del() if they want to take care of RCU waits. Note that only core was checking the return status from napi_hash_del() so the new helper does not report if the NAPI was actually deleted. Some notes on driver oddness: - veth observed the grace period before calling netif_napi_del() but that should not matter - myri10ge observed normal RCU flavor - bnx2x and enic did not actually observe the grace period (unless they did so implicitly) - virtio_net and enic only unhashed Rx NAPIs The last two points seem to indicate that the calls to napi_hash_del() were a left over rather than an optimization. Regardless, it's easy enough to correct them. This patch may introduce extra synchronize_net() calls for interfaces which set NAPI_STATE_NO_BUSY_POLL and depend on free_netdev() to call netif_napi_del(). This seems inevitable since we want to use RCU for netpoll dev->napi_list traversal, and almost no drivers set IFF_DISABLE_NETPOLL. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
f140ad9f |
|
09-Jun-2020 |
Ciara Loftus <ciara.loftus@intel.com> |
ixgbe: protect ring accesses with READ- and WRITE_ONCE READ_ONCE should be used when reading rings prior to accessing the statistics pointer. Introduce this as well as the corresponding WRITE_ONCE usage when allocating and freeing the rings, to ensure protected access. Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
780e354d |
|
20-Sep-2019 |
Alexander Duyck <alexander.h.duyck@linux.intel.com> |
ixgbe: Make use of cpumask_local_spread to improve RSS locality This patch is meant to address locality issues present in the ixgbe driver when it is loaded on a system supporting multiple NUMA nodes and more CPUs then the device can map in a 1:1 fashion. Instead of just arbitrarily mapping itself to CPUs 0-62 it would make much more sense to map itself to the local CPUs first, and then map itself to any remaining CPUs that might be used. The first effect of this is that queue 0 should always be allocated on the local CPU/NUMA node. This is important as it is the default destination if a packet doesn't match any existing flow director filter or RSS rule and as such having it local should help to reduce QPI cross-talk in the event of an unrecognized traffic type. In addition this should increase the likelihood of the RSS queues being allocated and used on CPUs local to the device while the ATR/Flow Director queues would be able to route traffic directly to the CPU that is likely to be processing it. Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
439bb9ed |
|
07-Feb-2019 |
Gustavo A. R. Silva <gustavo@embeddedor.com> |
ixgbe: Use struct_size() helper One of the more common cases of allocation size calculations is finding the size of a structure that has a zero-sized array at the end, along with memory for some number of elements for that array. For example: struct foo { int stuff; struct boo entry[]; }; size = sizeof(struct foo) + count * sizeof(struct boo); instance = kzalloc(size, GFP_KERNEL); Instead of leaving these open-coded and prone to type mistakes, we can now use the new struct_size() helper: instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL); Notice that, in this case, variable size is not necessary, hence it is removed. This code was detected with the help of Coccinelle. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
d0bcacd0 |
|
02-Oct-2018 |
Björn Töpel <bjorn@kernel.org> |
ixgbe: add AF_XDP zero-copy Rx support This patch adds zero-copy Rx support for AF_XDP sockets. Instead of allocating buffers of type MEM_TYPE_PAGE_SHARED, the Rx frames are allocated as MEM_TYPE_ZERO_COPY when AF_XDP is enabled for a certain queue. All AF_XDP specific functions are added to a new file, ixgbe_xsk.c. Note that when AF_XDP zero-copy is enabled, the XDP action XDP_PASS will allocate a new buffer and copy the zero-copy frame prior passing it to the kernel stack. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Tested-by: William Tu <u9012063@gmail.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
646bb57c |
|
04-Jun-2018 |
Alexander Duyck <alexander.h.duyck@intel.com> |
ixgbe: Fix setting of TC configuration for macvlan case When we were enabling macvlan interfaces we weren't correctly configuring things until ixgbe_setup_tc was called a second time either by tweaking the number of queues or increasing the macvlan count past 15. The issue came down to the fact that num_rx_pools is not populated until after the queues and interrupts are reinitialized. Instead of trying to set it sooner we can just move the call to setup at least 1 traffic class to the SR-IOV/VMDq setup function so that we just set it for this one case. We already had a spot that was configuring the queues for TC 0 in the code here anyway so it makes sense to also set the number of TCs here as well. Fixes: 49cfbeb7a95c ("ixgbe: Fix handling of macvlan Tx offload") Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
51dce24b |
|
26-Apr-2018 |
Jeff Kirsher <jeffrey.t.kirsher@intel.com> |
net: intel: Cleanup the copyright/license headers After many years of having a ~30 line copyright and license header to our source files, we are finally able to reduce that to one line with the advent of the SPDX identifier. Also caught a few files missing the SPDX license identifier, so fixed them up. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Acked-by: Shannon Nelson <shannon.nelson@oracle.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
8f611fb0 |
|
15-Jan-2018 |
Colin Ian King <colin.king@canonical.com> |
ixgbe: remove redundant initialization of 'pool' Variable pool is being assigned zero and then in the following for-loop is it being set to zero again. Remove the redundant first assignment. Cleans up clang warning: drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c:61:2: warning: Value stored to 'pool' is never read Signed-off-by: Colin Ian King <colin.king@canonical.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
59259470 |
|
19-Dec-2017 |
Shannon Nelson <shannon.nelson@oracle.com> |
ixgbe: process the Tx ipsec offload If the skb has a security association referenced in the skb, then set up the Tx descriptor with the ipsec offload bits. While we're here, we fix an oddly named field in the context descriptor struct. Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
49cfbeb7 |
|
22-Nov-2017 |
Alexander Duyck <alexander.h.duyck@intel.com> |
ixgbe: Fix handling of macvlan Tx offload This update makes it so that we report the actual number of Tx queues via real_num_tx_queues but are still restricted to RSS on only the first pool by setting num_tc equal to 1. Doing this locks us into only having the ability to setup XPS on the queues in that pool, and only those queues should be used for transmitting anything other than macvlan traffic. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
b5f69ccf |
|
22-Nov-2017 |
Alexander Duyck <alexander.h.duyck@intel.com> |
ixgbe: avoid bringing rings up/down as macvlans are added/removed This change makes it so that instead of bringing rings up/down for various we just update the netdev pointer for the Rx ring and set or clear the MAC filter for the interface. By doing it this way we can avoid a number of races and issues in the code as things were getting messy with the macvlan clean-up racing with the interface clean-up to bring the rings down on shutdown. With this change we opt to leave the rings owned by the PF interface for both Tx and Rx and just direct the packets once they are received to the macvlan netdev. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
16be45bc |
|
22-Nov-2017 |
Alexander Duyck <alexander.h.duyck@intel.com> |
ixgbe: Do not manipulate macvlan Tx queues when performing macvlan offload We should not be stopping/starting the upper devices Tx queues when handling a macvlan offload. Instead we should be stopping and starting traffic on our own queues. In order to prevent us from doing this I am updating the code so that we no longer change the queue configuration on the upper device, nor do we update the queue_index on our own device. Instead we can just use the queue index for our local device and not update the netdev in the case of the transmit rings. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
0efbf12b |
|
22-Nov-2017 |
Alexander Duyck <alexander.h.duyck@intel.com> |
ixgbe: Don't assume dev->num_tc is equal to hardware TC config The code throughout ixgbe was assuming that dev->num_tc was populated and configured with the driver, when in fact this can be configured via mqprio without any hardware coordination other than restricting us to the real number of Tx queues we advertise. Instead of handling things this way we need to keep a local copy of the number of TCs in use so that we don't accidentally pull in the TC configuration from mqprio when it is configured in software mode. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
4e039c16 |
|
22-Nov-2017 |
Alexander Duyck <alexander.h.duyck@intel.com> |
ixgbe: Fix limitations on macvlan so we can support up to 63 offloaded devices This change is a fix of the macvlan offload so that we correctly handle macvlan offloaded devices. Specifically we were configuring our limits based on the assumption that we were going to max out the RSS indices for every mode. As a result when we went to 15 or more macvlan interfaces we were forced into the 2 queue RSS mode on VFs even though they could have still supported 4. This change splits the logic up so that we limit either the total number of macvlan instances if DCB is enabled, or limit the number of RSS queues used per macvlan (instead of per pool) if SR-IOV is enabled. By doing this we can make best use of the part. In addition I have increased the maximum number of supported interfaces to 63 with one queue per offloaded interface as this more closely reflects the actual values supported by the interface. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
ff815fb2 |
|
22-Nov-2017 |
Alexander Duyck <alexander.h.duyck@intel.com> |
ixgbe: There is no need to update num_rx_pools in L2 fwd offload The num_rx_pools value is overwritten when we reinitialize the queue configuration. In reality we shouldn't need to be updating the value since it is redone every time we call into ixgbe_setup_tc so for now just drop the spots where we were incrementing or decrementing the value. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
b4ded832 |
|
25-Sep-2017 |
Alexander Duyck <alexander.h.duyck@intel.com> |
ixgbe: Update adaptive ITR algorithm The following change is meant to update the adaptive ITR algorithm to better support the needs of the network. Specifically with this change what I have done is make it so that our ITR algorithm will try to prevent either starving a socket buffer for memory in the case of Tx, or overrunning an Rx socket buffer on receive. In addition a side effect of the calculations used is that we should function better with new features such as XDP which can handle small packets at high rates without needing to lock us into NAPI polling mode. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
90382dca |
|
17-Jul-2017 |
John Fastabend <john.fastabend@gmail.com> |
ixgbe: NULL xdp_tx rings on resource cleanup tx_rings and rx_rings are cleaned up on close paths in ixgbe driver however, xdp_rings are not. Set the xdp_rings to NULL here so that we can use the pointer to indicate if the XDP rings are initialized. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
33fdc82f |
|
24-Apr-2017 |
John Fastabend <john.r.fastabend@intel.com> |
ixgbe: add support for XDP_TX action A couple design choices were made here. First I use a new ring pointer structure xdp_ring[] in the adapter struct instead of pushing the newly allocated XDP TX rings into the tx_ring[] structure. This means we have to duplicate loops around rings in places we want to initialize both TX rings and XDP rings. But by making it explicit it is obvious when we are using XDP rings and when we are using TX rings. Further we don't have to do ring arithmatic which is error prone. As a proof point for doing this my first patches used only a single ring structure and introduced bugs in FCoE code and macvlan code paths. Second I am aware this is not the most optimized version of this code possible. I want to get baseline support in using the most readable format possible and then once this series is included I will optimize the TX path in another series of patches. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
3ffc1af5 |
|
02-Feb-2017 |
Eric Dumazet <edumazet@google.com> |
ixgbe: get rid of custom busy polling code In linux-4.5, busy polling was implemented in core NAPI stack, meaning that all custom implementation can be removed from drivers. Not only we remove lot's of code, we also remove one lock operation in fast path, and allow GRO to do its job. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
2bf1a87b |
|
04-Nov-2016 |
Emil Tantilov <emil.s.tantilov@intel.com> |
ixgbe: add mask for 64 RSS queues The indirection table was reported incorrectly for X550 and newer where we can support up to 64 RSS queues. Reported-by Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
e24fcf28 |
|
07-Sep-2016 |
Alexander Duyck <alexander.h.duyck@intel.com> |
ixgbe: Support 4 queue RSS on VFs with 1 or 2 queue RSS on PF Instead of limiting the VFs if we don't use 4 queues for RSS in the PF we can instead just limit the RSS queues used to a power of 2. By doing this we can support use cases where VFs are using more queues than the PF is currently using and can support RSS if so desired. The only limitation on this is that we cannot support 3 queues of RSS in the PF or VF. In either of these cases we should fall back to 2 queues in order to be able to use the power of 2 masking provided by the psrtype register. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
49425dfc |
|
01-Apr-2016 |
Mark Rustad <mark.d.rustad@intel.com> |
ixgbe: Add support for x550em_a 10G MAC type Add support for x550em_a 10G MAC type to the ixgbe driver. The new MAC includes new firmware commands that need to be used to control PHY and IOSF access, so that support is also added. The interface supported is a native SFP+ interface. Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
93d05d4a |
|
18-Nov-2015 |
Eric Dumazet <edumazet@google.com> |
net: provide generic busy polling to all NAPI drivers NAPI drivers no longer need to observe a particular protocol to benefit from busy polling (CONFIG_NET_RX_BUSY_POLL=y) napi_hash_add() and napi_hash_del() are automatically called from core networking stack, respectively from netif_napi_add() and netif_napi_del() This patch depends on free_netdev() and netif_napi_del() being called from process context, which seems to be the norm. Drivers might still prefer to call napi_hash_del() on their own, since they might combine all the rcu grace periods into a single one, knowing their NAPI structures lifetime, while core networking stack has no idea of a possible combining. Once this patch proves to not bring serious regressions, we will cleanup drivers to either remove napi_hash_del() or provide appropriate rcu grace periods combining. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
8ac34f10 |
|
30-Jul-2015 |
Alexander Duyck <alexander.h.duyck@redhat.com> |
ixgbe: Limit lowest interrupt rate for adaptive interrupt moderation to 12K This patch updates the lowest limit for adaptive interrupt interrupt moderation to roughly 12K interrupts per second. The way I came about reaching 12K as the desired interrupt rate is by testing with UDP flows. Specifically I had a simple test that ran a netperf UDP_STREAM test at varying sizes. What I found was as the packet sizes increased the performance fell steadily behind until we were only able to receive at ~4Gb/s with a message size of 65507. A bit of digging found that we were dropping packets for the socket in the network stack, and looking at things further what I found was I could solve it by increasing the interrupt rate, or increasing the rmem_default/rmem_max. What I found was that when the interrupt coalescing resulted in more data being processed per interrupt than could be stored in the socket buffer we started losing packets and the performance dropped. So I reached 12K based on the following math. rmem_default = 212992 skb->truesize = 2994 212992 / 2994 = 71.14 packets to fill the buffer packet rate at 1514 packet size is 812744pps 71.14 / 812744 = 87.9us to fill socket buffer From there it was just a matter of choosing the interrupt rate and providing a bit of wiggle room which is why I decided to go with 12K interrupts per second as that uses a value of 84us. The data below is based on VM to VM over a direct assigned ixgbe interface. The test run was: netperf -H <ip> -t UDP_STREAM" Socket Message Elapsed Messages CPU Service Size Size Time Okay Errors Throughput Util Demand bytes bytes secs # # 10^6bits/sec % SS us/KB Before: 212992 65507 60.00 1100662 0 9613.4 10.89 0.557 212992 60.00 473474 4135.4 11.27 0.576 After: 212992 65507 60.00 1100413 0 9611.2 10.73 0.549 212992 60.00 974132 8508.3 11.69 0.598 Using bare metal the data is similar but not as dramatic as the throughput increases from about 8.5Gb/s to 9.5Gb/s. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
9a75a1ac |
|
06-Nov-2014 |
Don Skidmore <donald.c.skidmore@intel.com> |
ixgbe: Add new support for X550 MAC's This patch will add in the new MAC defines and fit it into the switch cases throughout the driver. New functionality and enablement support will be added in following patches. Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
d786cf7b |
|
03-Sep-2014 |
Jacob Keller <jacob.e.keller@intel.com> |
ixgbe: add warnings for other disabled features without MSI-X support When we can't get MSI-X vectors, we disable a few features which require MSI-X vectors. Print warnings just like we do when disabling DCB. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
5d31b48a |
|
03-Sep-2014 |
Jacob Keller <jacob.e.keller@intel.com> |
ixgbe: use e_dev_warn instead of netif_printk Again, we should not be directly using netif_printk, as we have our own error print routines that we generate. In addition, instead of using an early return we can just use the else block of this one line if statement. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
c1c55f63 |
|
03-Sep-2014 |
Jacob Keller <jacob.e.keller@intel.com> |
ixgbe: use e_dev_warn instead of e_err for displaying warning In this case, disabling DCB is not an error. We can still function, but we just have to let the user know. In addition, since we call this during probe before allocating our netdevice structure, we should use e_dev_warn instead of e_warn. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
3bcf3446 |
|
03-Sep-2014 |
Jacob Keller <jacob.e.keller@intel.com> |
ixgbe: determine vector count inside ixgbe_acquire_msix_vectors Our calculated v_budget doesn't matter except if we allocate MSI-X vectors. We shouldn't need to calculate this outside of the function, so don't. Instead, only calculate it once we attempt to acquire MSI-X vectors. This helps collocate all of the MSI-X vector code together. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
027bb561 |
|
03-Sep-2014 |
Jacob Keller <jacob.e.keller@intel.com> |
ixgbe: move msix_entries allocation into ixgbe_acquire_msix_vectors We already have to kfree this value if we fail, and this is only part of MSI-X mode, so we should simply allocate the value where we need it. This is cleaner, and makes it a lot more obvious why we are freeing it inside of ixgbe_acquire_msix_vectors. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
d7de3c6e |
|
03-Sep-2014 |
Jacob Keller <jacob.e.keller@intel.com> |
ixgbe: return integer from ixgbe_acquire_msix_vectors Similar to how ixgbevf handles acquiring MSI-X vectors, we can return an error code instead of relying on the flag being set. This makes it more clear that we have failed to setup MSI-X mode, and also will make it easier to consolidate MSI-X related code all into the single function. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
493043e5 |
|
03-Sep-2014 |
Jacob Keller <jacob.e.keller@intel.com> |
ixgbe: use e_dev_warn instead of netif_printk The netif_printk relies on our netdevice structure to be registered already. We may call ixgbe_acquire_msix_vectors prior to registering our netdevice, so we should not use the netdevice specific printk. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
adc81090 |
|
25-Jul-2014 |
Alexander Duyck <alexander.h.duyck@intel.com> |
ixgbe: Refactor busy poll socket code to address multiple issues This change addresses several issues in the current ixgbe implementation of busy poll sockets. First was the fact that it was possible for frames to be delivered out of order if they were held in GRO. This is addressed by flushing the GRO buffers before releasing the q_vector back to the idle state. The other issue was the fact that we were having to take a spinlock on changing the state to and from idle. To resolve this I have replaced the state value with an atomic and use atomic_cmpxchg to change the value from idle, and a simple atomic set to restore it back to idle after we have acquired it. This allows us to only use a locked operation on acquiring the vector without a need for a locked operation to release it. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
eec66731 |
|
21-Aug-2014 |
Jacob Keller <jacob.e.keller@intel.com> |
ixgbe: add comment noting recalculation of queues Since we previously called ixgbe_set_num_queues just prior to attempting to set our interrupt scheme, it may be non obvious why we have to call it again inside the function. Add a comment which helps make it more obvious that we are resetting features based on the fact that we do not have MSI-X enabled, and cannot use the previous settings. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
6ec1b71f |
|
09-Apr-2014 |
Jacob Keller <jacob.e.keller@intel.com> |
ixgbe: fix several concatenated strings to single line This patch fixes various log strings that are split over multiple lines in the ixgbe driver. This cleans up checkpatch.pl warnings, and makes it easier to search the code for warning strings displayed to the kernel log. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
b89aae71 |
|
21-Feb-2014 |
Jacob Keller <jacob.e.keller@intel.com> |
ixgbe: add Linux NICS mailing list to contact info This patch updates the contact information on the ixgbe driver files so that every file includes the Linux NICS address, as it is still used, but only a few of the files mentioned it. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
b45e620c |
|
18-Feb-2014 |
Alexander Gordeev <agordeev@redhat.com> |
ixgbe: Use pci_enable_msix_range() instead of pci_enable_msix() As result of deprecation of MSI-X/MSI enablement functions pci_enable_msix() and pci_enable_msi_block() all drivers using these two interfaces need to be updated to use the new pci_enable_msi_range() and pci_enable_msix_range() interfaces. Signed-off-by: Alexander Gordeev <agordeev@redhat.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: Bruce Allan <bruce.w.allan@intel.com> Cc: e1000-devel@lists.sourceforge.net Cc: netdev@vger.kernel.org Cc: linux-pci@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
2a47fa45 |
|
06-Nov-2013 |
John Fastabend <john.r.fastabend@intel.com> |
ixgbe: enable l2 forwarding acceleration for macvlans Now that l2 acceleration ops are in place from the prior patch, enable ixgbe to take advantage of these operations. Allow it to allocate queues for a macvlan so that when we transmit a frame, we can do the switching in hardware inside the ixgbe card, rather than in software. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: "David S. Miller" <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
5a85e737 |
|
10-Jun-2013 |
Eliezer Tamir <eliezer.tamir@linux.intel.com> |
ixgbe: add support for ndo_ll_poll Add the ixgbe driver code implementing ndo_ll_poll. Adds ndo_ll_poll method and locking between it and the napi poll. When receiving a packet we use skb_mark_ll to record the napi it came from. Add each napi to the napi_hash right after netif_napi_add(). Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
fd786b7b |
|
11-Jan-2013 |
Alexander Duyck <alexander.h.duyck@intel.com> |
ixgbe: Add function for setting XPS queue mapping This change adds support for ixgbe to configure the XPS queue mapping on load. The result of this change is that on open we will now be resetting the number of Tx queues, and then setting the default configuration for XPS based on if ATR is enabled or disabled. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Reviewed-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
d3cb9869 |
|
15-Jan-2013 |
Alexander Duyck <alexander.h.duyck@intel.com> |
ixgbe: Define FCoE and Flow director limits much sooner to allow for changes Instead of adjusting the FCoE and Flow director limits based on the number of CPUs we can define them much sooner. This allows the user to come through later and adjust them once we have updated the code to support the set_channels ethtool operation. I am still allowing for FCoE and RSS queues to be separated if the number queues is less than the number of CPUs. This essentially treats the two groupings like they are two separate traffic classes. In addition I am changing the initialization to use the MAX_TX/RX_QUEUES defines instead of trying to compute the value as it will be possible in upcoming patches for the user to request the maximum number of queues. I have also updated things so that the upper limit on queues is exactly 63 instead of allowing it to go up to 64. The reason for this change is to address the fact thqt the driver only supports up to 63 queue vectors since the hardware supports 64 MSI-X vectors, but one must be reserved for "other" causes. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
434c5e39 |
|
07-Jan-2013 |
Don Skidmore <donald.c.skidmore@intel.com> |
ixgbe: update date to 2013 Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
3af3361e |
|
24-Oct-2012 |
Emil Tantilov <emil.s.tantilov@intel.com> |
ixgbe: fix default setting of TXDCTL.WTHRESH The q_vector->itr check in ixgbe_configure_tx_ring() was done prior to it being set, which resulted in TXDCTL.WTHRESH always being set to 1 on driver load, while consequent resets would set it to 8. This patch moves the setting of q_vector->itr in ixgbe_alloc_q_vector() to make sure that TXDCTL.WTHRESH is set to 8 by default. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
245f292d |
|
27-Jul-2012 |
Alexander Duyck <alexander.h.duyck@intel.com> |
ixgbe: Initialize q_vector cpu and affinity masks correctly When enabling DCB the rings belonging to a q_vector on CPU 0 were not reinitializing their DCA registers. Upon closer inspection the issue was that the q_vector CPU variable was left at 0 resulting in the driver not updating the DCA registers. In order to guarantee the DCA registers will be updated I am adding a couple line change so that we initialize the CPU variable to -1 which will force a DCA update the first time an interrupt fires on that q_vector. In addition we were setting the CPU affinity hint to all CPUs when we were not specifying a CPU. Instead we should leave it as all zeros to avoid any possible confusion about the fact that we shouldn't be giving a hint. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
b724e9f2 |
|
16-Jul-2012 |
Alexander Duyck <alexander.h.duyck@intel.com> |
ixgbe: Use 1TC DCB instead of disabling DCB for MSI and legacy interrupts This change makes it so that we can use 1TC DCB in the case of MSI and legacy interrupts. The advantage to this is that it allows us to fully support FCoE w/ DCB instead of having to drop to link flow control only when using these interrupt modes. Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
39cb681b |
|
05-Jun-2012 |
Alexander Duyck <alexander.h.duyck@intel.com> |
ixgbe: Fix handling of FDIR_HASH flag This change makes it so that we can use the atr_sample_rate to determine if we are capable of supporting ATR. The advantage to this approach is that it allows us to now determine the setting of the IXGBE_FLAG_FDIR_HASH_CAPABLE based on the queueing scheme, instead of the queueing scheme being based on the flag. Using this approach there are essentially 5 conditions that must be checked prior to trying to enable ATR: 1. Is SR-IOV disabled? 2. Are the number of TCs <= 1? 3. Is RSS queueing limit greater than 1? 4. Is atr_sample_rate set? 5. Is Flow Director perfect filtering disabled? If any of these conditions are enabled they should disable ATR filtering. Note that in the case of conditions 1 through 4 being met we will set things up for ATR queueing, however if test 5 fails we will still leave the queues allocated for use by perfect filters. The reason for this is to allow for us to switch back and forth between ntuple and ATR without needing to reallocate the descriptor rings. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
99d74487 |
|
09-May-2012 |
Alexander Duyck <alexander.h.duyck@intel.com> |
ixgbe: Drop probe_vf and merge functionality into ixgbe_enable_sriov This is meant to fix a bug in which we were not checking for pre-existing VFs if we were not setting the max_vfs value at driver load. What happens now is that we always call ixgbe_enable_sriov and this checks for pre-existing VFs ore requested VFs prior to deciding on no SR-IOV. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Tested-by: Sibai Li <sibai.li@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
fbe7ca7f |
|
13-Jul-2012 |
Alexander Duyck <alexander.h.duyck@intel.com> |
ixgbe: Retire RSS enabled and capable flags All of our hardware supports RSS even if it is only for a single queue. So instead of toting around the RSS enable flag I am updating the code so that all devices are enabled and if we want to disable RSS it is indicated via the RSS mask. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
73079ea0 |
|
14-Jul-2012 |
Alexander Duyck <alexander.h.duyck@intel.com> |
ixgbe: Add support for SR-IOV w/ DCB or RSS This change essentially makes it so that we can enable almost all of the features all at once. This patch allows for the combination of SR-IOV, DCB, and FCoE in the case of the x540. It also beefs up the SR-IOV by adding support for RSS to the PF. The testing matrix gets to be very complex for this patch as there are a number of different features and subsets for queueing options. I tried to narrow these down a bit by restricting the PF to only supporting 4TC DCB when it is enabled in addition to SR-IOV. Cc: Greg Rose <gregory.v.rose@intel.com> Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
4ae63730 |
|
22-Jun-2012 |
Alexander Duyck <alexander.h.duyck@intel.com> |
ixgbe: Update the logic for ixgbe_cache_ring_dcb and DCB RSS configuration This change cleans up some of the logic in an attempt to try and simplify things for how we are configuring DCB w/ RSS. In this patch I basically did 3 things. I updated the logic for getting the first register index. I applied the fact that all TCs get the same number of queues to simplify the looping logic in caching the DCB ring register. Finally I updated how we configure the RQTC register to match the fact that all TCs are assigned the same number of queues. Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
ac802f5d |
|
11-Jul-2012 |
Alexander Duyck <alexander.h.duyck@intel.com> |
ixgbe: Move configuration of set_real_num_rx/tx_queues into open It makes much more sense for us to configure the real number of Tx and Rx queues in the ixgbe_open call than it does in ixgbe_set_num_queues. By setting the number in ixgbe_open we can avoid a number of unecessary updates and only have to make the calls once. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
d411a936 |
|
29-Jun-2012 |
Alexander Duyck <alexander.h.duyck@intel.com> |
ixgbe: Merge FCoE set_num and cache_ring calls into RSS/DCB config This change merges the ixgbe_cache_ring_fcoe and ixgbe_set_fcoe_queues logic into the DCB and RSS initialization calls. Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
800bd607 |
|
01-Jun-2012 |
Alexander Duyck <alexander.h.duyck@intel.com> |
ixgbe: Add function for obtaining FCoE TC based on FCoE user priority In upcoming patches it will become increasingly common to need to determine the FCoE traffic class in order to determine the correct queues for FCoE. In order to make this easier I am adding a function for obtaining the FCoE traffic class based on the user priority. Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
0b7f5d0b |
|
04-May-2012 |
Alexander Duyck <alexander.h.duyck@intel.com> |
ixgbe: Merge RSS and flow director ring register caching and configuration There are really only 3 modes that can control the number of queues. Those are RSS, DCB, and VMDq/SR-IOV. Currently we have things much more broken up than they need to be for how we are configuring the rings. In order to try and straiten some of this out I am going to start merging similar functionality into single functions. To start with I am merging the Flow Director ring configuration into the RSS ring configuration since Flow Director cannot function with DCB or SR-IOV. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
e4b317e9 |
|
04-May-2012 |
Alexander Duyck <alexander.h.duyck@intel.com> |
ixgbe: Add feature offset value to ring features The mask value for ring features was overloaded for FCoE which can lead to some confusion. In order to avoid any confusion I am splitting the mask value and adding an offset value. This can be used for the start of the FCoE rings, and in the future I hope to use it to store the start of the registers for SR-IOV. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
c087663e |
|
09-May-2012 |
Alexander Duyck <alexander.h.duyck@intel.com> |
ixgbe: Add upper limit to ring features We are currently using indices to indicate the upper limit on a ring feature. However since we can switch back and forth on features such as DCB and that has effects on other features such as RSS it is preferable to instead store the upper limit separate from the current value for the number of rings related to the feature. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
49c7ffbe |
|
04-May-2012 |
Alexander Duyck <alexander.h.duyck@intel.com> |
ixgbe: count q_vectors instead of MSI-X vectors It makes much more sense for us to count q_vectors instead of MSI-X vectors. We were using num_msix_vectors to find the number of q_vectors in multiple places. This was wasteful since we only had one place that actually needs the number of MSI-X vectors and that is in slow path. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
49ce9c2c |
|
10-Jul-2012 |
Ben Hutchings <bhutchings@solarflare.com> |
drivers/net/ethernet: Fix (nearly-)kernel-doc comments for various functions Fix incorrect start markers, wrapped summary lines, missing section breaks, incorrect separators, and some name mismatches. Delete a few that are content-free. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
57efd44c |
|
25-Jun-2012 |
Alexander Duyck <alexander.h.duyck@intel.com> |
ixgbe: Do not pad FCoE frames as this can cause issues with FCoE DDP FCoE target mode was experiencing issues due to the fact that we were sending up data frames that were padded to 60 bytes after the DDP logic had already stripped the frame down to 52 or 56 depending on the use of VLANs. This was resulting in the FCoE DDP logic having issues since it thought the frame still had data in it due to the padding. To resolve this, adding code so that we do not pad FCoE frames prior to handling them to the stack. CC: <stable@vger.kernel.org> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
d0bfcdfd |
|
28-Mar-2012 |
Alexander Duyck <alexander.h.duyck@intel.com> |
ixgbe: Reorder the ring to q_vector mapping to improve performance This change reorders the mapping of rings to q_vectors in the case that the number of rings exceeds the number of q_vectors. Previously we would allocate the first R/N queues to the first q_vector where R is the number of rings and N is the number of q_vectors. Instead of doing this we can do a better job of interleaving the rings to the CPUs by assigning every Nth ring to the q_vector. The below tables illustrate this change for the R = 16 N = 4 case. Before patch After patch q_vector: 0 1 2 3 0 1 2 3 Rings: 0 4 8 12 0 1 2 3 1 5 9 13 4 5 6 7 3 6 10 14 8 9 10 11 4 7 11 15 12 13 14 15 This should improve the performance for both DCB or ATR when the number of rings exceeds the number of q_vectors allocated by the adapter. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
b2db497e |
|
06-Apr-2012 |
Alexander Duyck <alexander.h.duyck@intel.com> |
ixgbe: Identify FCoE rings earlier to resolve memory corruption w/ FCoE This patch makes it so that we identify FCoE rings earlier than ixgbe_set_rx_buffer_len. Instead we identify the Rx FCoE rings at allocation time in ixgbe_alloc_q_vector. The motivation behind this change is to avoid memory corruption when FCoE is enabled. Without this change we were initializing the rings at 0, and 2K on systems with 4K pages, then when we bumped the buffer size to 4K with order 1 pages we were accessing offsets 2K and 6K instead of 0 and 4K. This was resulting in memory corruptions. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: Yi Zou <yi.zou@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
8af3c33f |
|
18-Feb-2012 |
Jeff Kirsher <jeffrey.t.kirsher@intel.com> |
ixgbe: fix namespace issues when FCoE/DCB is not enabled Resolve namespace issues when FCoE or DCB is not enabled. The issue is with certain configurations we end up with namespace problems. A simple example: ixgbe_main.c - defines func A() - uses func A() ixgbe_fcoe.c - uses func A() ixgbe.h - has prototype for func A() For default (FCoE included) all is good. But when it isn't the namespace checker complains about how func A() could be static. To resolve this, created a ixgbe_lib file to contain functions used by DCB/FCoE and their helper functions so that they are always in namespace whether or not DCB/FCoE is enabled. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
|