#
ed34a6b6 |
|
17-Jan-2023 |
Eric Joyner <erj@FreeBSD.org> |
iflib: Add subinterface interrupt allocation function The ice(4) driver will add the ability to create extra interfaces that hang off of the base interface; to do that the driver requires a method for the subinterface to request hardware interrupt resources from the base interface. Signed-off-by: Eric Joyner <erj@FreeBSD.org> MFC after: 3 days Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D39930
|
#
3c7da27a |
|
22-Mar-2023 |
Eric Joyner <erj@FreeBSD.org> |
iflib: Add sysctl to request extra MSIX vectors on driver load Intended to be used with upcoming feature to add sub-interfaces, since those new interfaces will be dynamically created and will need to have spare MSI-X interrupts already allocated for them on driver load. This sysctl is marked as a tunable since it will need to be set before the driver is loaded since MSI-X interrupt allocation and setup is done during the attach process. Signed-off-by: Eric Joyner <erj@FreeBSD.org> MFC after: 3 days Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D41326
|
#
95ee2897 |
|
16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove $FreeBSD$: two-line .h pattern Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/
|
#
7f527d48 |
|
04-Aug-2023 |
Eric Joyner <erj@FreeBSD.org> |
iflib: Fix white space and reduce some line lengths This helps align some of the code with the rest of the style used in iflib, but as marius@ points out, this is not style(9). Signed-off-by: Eric Joyner <erj@FreeBSD.org> Reviewed by: kbowling@ MFC after: 3 days Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D41324
|
#
7ff9ae90 |
|
03-Aug-2023 |
Marius Strobl <marius@FreeBSD.org> |
iflib(9): Remove support for cloning pseudo interfaces This code was used by the first incarnation of wg(4) and is dead ever since f187d6dfbf633665ba6740fe22742aec60ce02a2 has removed the latter again. Moreover, this code matched iflib(4) like a square peg fits in a round hole, was incomplete and despite some hacks still tailored to VPC and wg(4) but not generic. In effect, this reverts the following: 09f6ff4f1a47c3009dc16fdc609a44f2341bc7ac (w/ its "ancillary changes") 9aeca21324f481f57f2ecb7009f461f4f51b62b3 1f93e931d9f0c688f43f98ef777e04636a325526 0f9544d03e89d180f94a7a84b110ec7d2b6c625a 0dd691b41276ce13d25ffb1443af27f85038aa3f Reviewed by: erj, kbowling Differential Revision: <https://reviews.freebsd.org/D41196>
|
#
9c950139 |
|
17-Oct-2022 |
Eric Joyner <erj@FreeBSD.org> |
iflib: Introduce v2 of TX Queue Select Functionality For v2, iflib will parse packet headers before queueing a packet. This commit also adds a new field in the structure that holds parsed header information from packets; it stores the IP ToS/traffic class field found in the IPv4/IPv6 header. To help, it will only partially parse header packets before queueing them by using a new header parsing function that does less than the current parsing header function; for our purposes we only need up to the minimal IP header in order to get the IP ToS infromation and don't need to pull up more data. For now, v1 and v2 co-exist in this patch; v1 still offers a less-invasive method where none of the packet is parsed in iflib before queueing. This also bumps the sys/param.h version. Signed-off-by: Eric Joyner <erj@FreeBSD.org> Tested by: IntelNetworking MFC after: 3 days Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D34742
|
#
213e9139 |
|
29-Jul-2021 |
Eric Joyner <erj@FreeBSD.org> |
iflib: Allow drivers to determine which queue to TX on Adds a new function pointer to struct if_txrx in order to allow drivers to set their own function that will determine which queue a packet should be sent on. Since this includes a kernel ABI change, bump the __FreeBSD_version as well. (This motivation behind this is to allow the driver to examine the UP in the VLAN tag and determine which queue to TX on based on that, in support of HW TX traffic shaping.) Signed-off-by: Eric Joyner <erj@FreeBSD.org> Reviewed by: kbowling@, stallamr@netapp.com Tested by: jeffrey.e.pieper@intel.com Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D31485
|
#
58632fa7 |
|
19-May-2021 |
Marcin Wojtas <mw@FreeBSD.org> |
iflib: Add a new quirk ENETC NIC found in LS1028A has a bug where clearing TX pidx/cidx causes the ring to hang after being re-enabled. Add a new flag, if set iflib will preserve the indices during restart. Submitted by: Kornel Duleba <mindal@semihalf.com> Reviewed by: gallatin, erj Obtained from: Semihalf Sponsored by: Alstom Group Differential Revision: https://reviews.freebsd.org/D30728
|
#
ffe3def9 |
|
07-Mar-2021 |
Mark Johnston <markj@FreeBSD.org> |
iflib: Make if_shared_ctx_t a pointer to const This structure is shared among multiple instances of a driver, so we should ensure that it doesn't somehow get treated as if there's a separate instance per interface. This is especially important for software-only drivers like wg. DEVICE_REGISTER() still returns a void * and so the per-driver sctx structures are not yet defined with the const qualifier. Reviewed by: gallatin, erj MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D29102
|
#
09c3f04f |
|
02-Mar-2021 |
Marcin Wojtas <mw@FreeBSD.org> |
iflib: add support for admin completion queues For interfaces with admin completion queues, introduce a new devmethod IFDI_ADMIN_COMPLETION_HANDLE and a corresponding flag IFLIB_HAS_ADMINCQ. This provides an option for handling any admin cq logic, which cannot be run from an interrupt context. Said method is called from within iflib's admin task, making it safe to sleep. Reviewed by: mmacy Submitted by: Artur Rojek <ar@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon, Inc. Differential Revision: https://reviews.freebsd.org/D28708
|
#
6dd69f00 |
|
24-Feb-2021 |
Marcin Wojtas <mw@FreeBSD.org> |
iflib: introduce isc_dma_width Some DMA controllers are unable to address the full host memory space and are instead limited to a subset of address range (e.g. 48-bit). Allow the driver to specify the maximum allowed DMA addressing width (in bits) for the NIC hardware, by introducing a new field in if_softc_ctx. If said field is omitted (set to 0), the lowaddr of DMA window bounds defaults to BUS_SPACE_MAXADDR. Submitted by: Artur Rojek <ar@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon, Inc. Differential Revision: https://reviews.freebsd.org/D28706
|
#
81be6552 |
|
18-Dec-2020 |
Matt Macy <mmacy@FreeBSD.org> |
iflib: ensure that tx interrupts enabled and cleanups Doing a 'dd' over iscsi will reliably cause stalls. Tx cleaning _should_ reliably happen as data is sent. However, currently if the transmit queue fills it will wait until the iflib timer (hz/2) runs. This change causes the the tx taskq thread to be run if there are completed descriptors. While here: - make timer interrupt delay a sysctl - simplify txd_db_check handling - comment on INTR types Background on the change: Initially doorbell updates were minimized by only writing to the register on every fourth packet. If txq_drain would return without writing to the doorbell it scheduled a callout on the next tick to do the doorbell write to ensure that the write otherwise happened "soon". At that time a sysctl was added for users to avoid the potential added latency by simply writing to the doorbell register on every packet. This worked perfectly well for e1000 and ixgbe ... and appeared to work well on ixl. However, as it turned out there was a race to this approach that would lockup the ixl MAC. It was possible for a lower producer index to be written after a higher one. On e1000 and ixgbe this was harmless - on ixl it was fatal. My initial response was to add a lock around doorbell writes - fixing the problem but adding an unacceptable amount of lock contention. The next iteration was to use transmit interrupts to drive delayed doorbell writes. If there were no packets in the queue all doorbell writes would be immediate as the queue started to fill up we could delay doorbell writes further and further. At the start of drain if we've cleaned any packets we know we've moved the state machine along and we write the doorbell (an obvious missing optimization was to skip that doorbell write if db_pending is zero). This change required that tx interrupts be scheduled periodically as opposed to just when the hardware txq was full. However, that just leads to our next problem. Initially dedicated msix vectors were used for both tx and rx. However, it was often possible to use up all available vectors before we set up all the queues we wanted. By having rx and tx share a vector for a given queue we could halve the number of vectors used by a given configuration. The problem here is that with this change only e1000 passed the necessary value to have the fast interrupt drive tx when appropriate. Reported by: mav@ Tested by: mav@ Reviewed by: gallatin@ MFC after: 1 month Sponsored by: iXsystems Differential Revision: https://reviews.freebsd.org/D27683
|
#
662c1305 |
|
01-Sep-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
net: clean up empty lines in .c and .h files
|
#
6d84e76a |
|
12-Aug-2020 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
iflib: netmap: improve rxsync to support IFLIB_HAS_RXCQ For drivers with IFLIB_HAS_RXCQ set, there is a separate completion queue. In this case, the netmap rxsync routine needs to update rxq->ifr_cq_cidx in the same way it is updated by iflib_rxeof(). This improves the situation for vmx(4) and bnxt(4) drivers, which use iflib and have the IFLIB_HAS_RXCQ bit set. PR: 248494 MFC after: 3 weeks
|
#
b256d25c |
|
06-Jul-2020 |
Mark Johnston <markj@FreeBSD.org> |
iflib: Fix some nits in the rx refill code. - Get rid of the ifl_vm_addrs array. It is not used by any existing consumer, so we are just dirtying a couple of cache lines for no reason. - Use uma_zalloc(fl->ifl_zone) instead of m_cljget(). Otherwise m_cljget() is doing unnecessary work to look up the correct zone, when iflib already knows what that zone is. - ifl_gen is only used when INVARIANTS is on, so make that more clear. - Fix some style nits and inconsistencies. Reviewed by: gallatin Tested by: pho MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D25490
|
#
9aeca213 |
|
21-Jun-2020 |
Matt Macy <mmacy@FreeBSD.org> |
iflib: fix cloneattach fail and generalize pseudo device handling - a cloneattach failure will not currently be handled correctly, jump to the right target - pseudo devices are all treat as if they're ethernet devices - this often doesn't make sense MFC after: 1 week Sponsored by: Netgate, Inc. Differential Revision: https://reviews.freebsd.org/D25083
|
#
45818bf1 |
|
27-Apr-2020 |
Eric Joyner <erj@FreeBSD.org> |
iflib: Stop interface before (un)registering VLAN This patch is intended to solve a specific problem that iavf(4) encounters, but what it does can be extended to solve other issues. To summarize the iavf(4) issue, if the PF driver configures VLAN anti-spoof, then the VF driver needs to make sure no untagged traffic is sent if a VLAN is configured, and vice-versa. This can be an issue when a VLAN is being registered or unregistered, e.g. when a packet may be on the ring with a VLAN in it, but the VLANs are being unregistered. This can cause that tagged packet to go out and cause an MDD event. To fix this, include a new interface-dependent function that drivers can implement named IFDI_NEEDS_RESTART(). Right now, this function is called in iflib_vlan_unregister/register() to determine whether the interface needs to be stopped and started when a VLAN is registered or unregistered. The default return value of IFDI_NEEDS_RESTART() is true, so this fixes the MDD problem that iavf(4) encounters, since the interface rings are flushed during a stop/init. A future change to iavf(4) will implement that function just in case the default value changes, and to make it explicit that this interface reset is required when a VLAN is added or removed. Reviewed by: gallatin@ MFC after: 1 week Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D22086
|
#
704101dd |
|
24-Mar-2020 |
Conrad Meyer <cem@FreeBSD.org> |
Fix PNP matching for iflib NIC drivers The previous descriptor string specified that all fields were significant for match. However, the only significant fields for in-tree drivers are vendor:devid, and the fictitious zero values constructed by PVID() did not match real subvendor, subdevice, revision, and/or class values, resulting in no automatic probe. If a future iflib driver needs to match on other criteria, the descriptor string can be updated accordingly. (E.g., "V32" and ~0 for unspecified values in PVID().) Reported by: mav Sponsored by: Dell EMC Isilon
|
#
b3813609 |
|
14-Mar-2020 |
Patrick Kelsey <pkelsey@FreeBSD.org> |
Allow iflib drivers to specify the buffer size used for each receive queue Reviewed by: erj, gallatin MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D23947
|
#
41669133 |
|
30-Sep-2019 |
Mark Johnston <markj@FreeBSD.org> |
Add IFLIB_SINGLE_IRQ_RX_ONLY. As of r347221 the iflib legacy interrupt mode setup assumes that drivers perform both receive and transmit processing from the interrupt handler. This assumption is invalid in the vmxnet3 driver, so introduce the IFLIB_SINGLE_IRQ_RX_ONLY flag to make iflib avoid tx processing in the interrupt handler. PR: 239118 Reported and tested by: Juraj Lutter <otis@sk.freebsd.org> Obtained from: marius Reviewed by: gallatin MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D21831
|
#
d49e83ea |
|
15-Jun-2019 |
Marius Strobl <marius@FreeBSD.org> |
- Replace unused and only ever written to members of public iflib(9) structs with placeholders (in the latter case, IFLIB_MAX_TX_BYTES etc. are also only ever used for these write-only members if at all, so both these macros and members can just go). Using these spares may render it possible to merge certain iflib(9) fixes to stable/12. Otherwise, changes extending struct if_irq or struct if_shared_ctx in any way would break KBI as instances of these are allocated by the driver front-ends (by contrast, struct if_pkt_info as well as struct if_softc_ctx instances are provided by iflib(9) and, thus, may grow at least at the end without breaking KBI). - Make the pvi_name in struct pci_vendor_info const char * as device identifiers in hardware lookup tables aren't to be expected to ever change at runtime. - Similarly, make the pci_vendor_info_t of struct if_shared_ctx which is used to point to the struct pci_vendor_info arrays provided by the driver front-ends const. - Remove the ETH_ADDR_LEN macro from iflib.h; this was duplicating ETHER_ADDR_LEN of <net/ethernet.h> with iflib(9) actually only consuming the latter macro. - Make the name argument of iflib_io_tqg_attach(9) const, matching the taskqgroup_attach_cpu(9) this function wraps as well as e. g. iflib_config_gtask_init(9). - Remove the orphaned iflib_qset_lock_get() prototype. - Remove some extraneous empty lines.
|
#
668d6dbb |
|
29-May-2019 |
Eric Joyner <erj@FreeBSD.org> |
iflib: provide probe wrapper for vendor drivers From Jake: Vendor drivers that exist out-of-tree generally should return BUS_PROBE_VENDOR from their device probe functions. This helps ensure that a vendor replacement driver will supersede the in-kernel driver for a given device. Currently, if a vendor wants to implement a driver based on iflib, it will always report BUS_PROBE_DEFAULT. Add a wrapper function, iflib_device_probe_vendor() which can be used in place of iflib_device_probe(). This function will just return BUS_PROBE_VENDOR whenever iflib_device_probe() would return BUS_PROBE_DEFAULT. While vendor drivers can already implement such a wrapper themselves, providing it in the iflib.h header makes it easier for the vendor driver to do the right thing. Submitted by: Jacob Keller <jacob.e.keller@intel.com> Reviewed by: erj@, gallatin@, marius@ MFC after: 1 week Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D20221
|
#
1722eeac |
|
06-May-2019 |
Marius Strobl <marius@FreeBSD.org> |
- Remove the unused ifc_link_irq and ifc_mtx_name members of struct iflib_ctx. - Remove the only ever written to ift_db_mtx_name member of struct iflib_txq. - Remove the unused or only ever written to ifr_size, ifr_cq_pidx, ifr_cq_gen and ifr_lro_enabled members of struct iflib_rxq. - Consistently spell DMA, RX and TX uppercase in comments, messages etc. instead of mixing with some lowercase variants. - Consistently use if_t instead of a mix of if_t and struct ifnet pointers. - Bring the function comments of _iflib_fl_refill(), iflib_rx_sds_free() and iflib_fl_setup() in line with reality. - Judging problem reports, people are wondering what on earth messages like: "TX(0) desc avail = 1024, pidx = 0" are trying to indicate. Thus, extend this string to be more like that of non-iflib(4) Ethernet MAC drivers, notifying about a watchdog timeout due to which the interface will be reset. - Take advantage of the M_HAS_VLANTAG macro. - Use false/true rather than FALSE/TRUE for variables of type bool. - Use FALLTHROUGH as advocated by style(9).
|
#
e2621d96 |
|
03-May-2019 |
Matt Macy <mmacy@FreeBSD.org> |
Allow iflib drivers to pass a pointer to their own ifmedia structure. Tested by: emaste@ Differential Revision: https://reviews.freebsd.org/D19946
|
#
10a1e981 |
|
19-Mar-2019 |
Eric Joyner <erj@FreeBSD.org> |
iflib: mark isc_driver_version as constant From Jake: The iflib core never modifies the isc_driver_version string. Allow drivers to safely assign pointers to constant buffers by marking this parameter const. Submitted by: Jacob Keller <jacob.e.keller@intel.com> Reviewed by: erj@, gallatin@, jhb@ MFC after: 1 week Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D19577
|
#
1b9d9394 |
|
19-Mar-2019 |
Eric Joyner <erj@FreeBSD.org> |
iflib: expose the Rx mbuf buffer size to drivers From Jake: iflib_fl_setup calculates a suitable buffer size for the Rx mbufs based on the isc_max_frame_size value that drivers setup. This calculation is repeated by drivers when programming their hardware with the size of each Rx buffer. This can lead to a mismatch where the iflib mbuf size is different from the expected size of the buffer as programmed by the hardware. This can lead to unexpected results. If iflib ever wants to support mbuf sizes larger than one page, every driver must be updated to account for the new possible buffer sizes. Fix this by calculating the mbuf size prior to calling IFDI_INIT, and adding the iflib_get_rx_mbuf_sz function which will expose this value to drivers, so that they do not repeat the same calculation. Submitted by: Jacob Keller <jacob.e.keller@intel.com> Reviewed by: shurd@, erj@ MFC after: 1 week Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D19489
|
#
8f82136a |
|
21-Jan-2019 |
Patrick Kelsey <pkelsey@FreeBSD.org> |
onvert vmx(4) to being an iflib driver. Also, expose IFLIB_MAX_RX_SEGS to iflib drivers and add iflib_dma_alloc_align() to the iflib API. Performance is generally better with the tunable/sysctl dev.vmx.<index>.iflib.tx_abdicate=1. Reviewed by: shurd MFC after: 1 week Relnotes: yes Sponsored by: RG Nets Differential Revision: https://reviews.freebsd.org/D18761
|
#
77c1fcec |
|
12-Oct-2018 |
Eric Joyner <erj@FreeBSD.org> |
ixl/iavf(4): Change ixlv to iavf and update it to use iflib(9) Finishes the conversion of the 40Gb Intel Ethernet drivers to iflib(9) for FreeBSD 12.0, and fixes numerous bugs in both ixl(4) and iavf(4). This commit also re-adds the VF driver to GENERIC since it now compiles and functions. The VF driver name was changed from ixlv(4) to iavf(4) because the VF driver is now intended to be used with future products, not just with Fortville/Fort Park VFs. A man page update that documents these drivers is forthcoming in a separate commit. Reviewed by: sbruno@, kbowling@ Tested by: jeffrey.e.pieper@intel.com Approved by: re (gjb@) Relnotes: yes Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D16429
|
#
329e817f |
|
26-Sep-2018 |
Warner Losh <imp@FreeBSD.org> |
Reapply, with minor tweaks, r338025, from the original commit: Remove unused and easy to misuse PNP macro parameter Inspired by r338025, just remove the element size parameter to the MODULE_PNP_INFO macro entirely. The 'table' parameter is now required to have correct pointer (or array) type. Since all invocations of the macro already had this property and the emitted PNP data continues to include the element size, there is no functional change. Mostly done with the coccinelle 'spatch' tool: $ cat modpnpsize0.cocci @normaltables@ identifier b,c; expression a,d,e; declarer MODULE_PNP_INFO; @@ MODULE_PNP_INFO(a,b,c,d, -sizeof(d[0]), e); @singletons@ identifier b,c,d; expression a; declarer MODULE_PNP_INFO; @@ MODULE_PNP_INFO(a,b,c,&d, -sizeof(d), 1); $ rg -l MODULE_PNP_INFO -- sys | \ xargs spatch --in-place --sp-file modpnpsize0.cocci (Note that coccinelle invokes diff(1) via a PATH search and expects diff to tolerate the -B flag, which BSD diff does not. So I had to link gdiff into PATH as diff to use spatch.) Tinderbox'd (-DMAKE_JUST_KERNELS). Approved by: re (glen)
|
#
b8e771e9 |
|
18-Aug-2018 |
Conrad Meyer <cem@FreeBSD.org> |
Back out r338035 until Warner is finished churning GSoC PNP patches I was not aware Warner was making or planning to make forward progress in this area and have since been informed of that. It's easy to apply/reapply when churn dies down.
|
#
faa31943 |
|
18-Aug-2018 |
Conrad Meyer <cem@FreeBSD.org> |
Remove unused and easy to misuse PNP macro parameter Inspired by r338025, just remove the element size parameter to the MODULE_PNP_INFO macro entirely. The 'table' parameter is now required to have correct pointer (or array) type. Since all invocations of the macro already had this property and the emitted PNP data continues to include the element size, there is no functional change. Mostly done with the coccinelle 'spatch' tool: $ cat modpnpsize0.cocci @normaltables@ identifier b,c; expression a,d,e; declarer MODULE_PNP_INFO; @@ MODULE_PNP_INFO(a,b,c,d, -sizeof(d[0]), e); @singletons@ identifier b,c,d; expression a; declarer MODULE_PNP_INFO; @@ MODULE_PNP_INFO(a,b,c,&d, -sizeof(d), 1); $ rg -l MODULE_PNP_INFO -- sys | \ xargs spatch --in-place --sp-file modpnpsize0.cocci (Note that coccinelle invokes diff(1) via a PATH search and expects diff to tolerate the -B flag, which BSD diff does not. So I had to link gdiff into PATH as diff to use spatch.) Tinderbox'd (-DMAKE_JUST_KERNELS).
|
#
7f87c040 |
|
15-Jul-2018 |
Marius Strobl <marius@FreeBSD.org> |
Assorted TSO fixes for em(4)/iflib(9) and dead code removal: - Ever since the workaround for the silicon bug of TSO4 causing MAC hangs was committed in r295133, CSUM_TSO always got disabled unconditionally by em(4) on the first invocation of em_init_locked(). However, even with that problem fixed, it turned out that for at least e. g. 82579 not all necessary TSO workarounds are in place, still causing MAC hangs even at Gigabit speed. Thus, for stable/11, TSO usage was deliberately disabled in r323292 (r323293 for stable/10) for the EM-class by default, allowing users to turn it on if it happens to work with their particular EM MAC in a Gigabit-only environment. In head, the TSO workaround for speeds other than Gigabit was lost with the conversion to iflib(9) in r311849 (possibly along with another one or two TSO workarounds). Yet at the same time, for EM-class MACs TSO4 got enabled by default again, causing device hangs. Therefore, change the default for this hardware class back to have TSO4 off, allowing users to turn it on manually if it happens to work in their environment as we do in stable/{10,11}. An alternative would be to add a whitelist of EM-class devices where TSO4 actually is reliable with the workarounds in place, but given that the advantage of TSO at Gigabit speed is rather limited - especially with the overhead of these workarounds -, that's really not worth it. [1] This change includes the addition of an isc_capabilities to struct if_softc_ctx so iflib(9) can also handle interface capabilities that shouldn't be enabled by default which is used to handle the default-off capabilities of e1000 as suggested by shurd@ and moving their handling from em_setup_interface() to em_if_attach_pre() accordingly. - Although 82543 support TSO4 in theory, the former lem(4) didn't have support for TSO4, presumably because TSO4 is even more broken in the LEM-class of MACs than the later EM ones. Still, TSO4 for LEM-class devices was enabled as part of the conversion to iflib(9) in r311849, causing device hangs. So revert back to the pre-r311849 behavior of not supporting TSO4 for LEM-class at all, which includes not creating a TSO DMA tag in iflib(9) for devices not having IFCAP_TSO4 set. [2] - In fact, the FreeBSD TCP stack can handle a TSO size of IP_MAXPACKET (65535) rather than FREEBSD_TSO_SIZE_MAX (65518). However, the TSO DMA must have a maxsize of the maximum TSO size plus the size of a VLAN header for software VLAN tagging. The iflib(9) converted em(4), thus, first correctly sets scctx->isc_tx_tso_size_max to EM_TSO_SIZE in em_if_attach_pre(), but later on overrides it with IP_MAXPACKET in em_setup_interface() (apparently, left-over from pre-iflib(9) times). So remove the later and correct iflib(9) to correctly cap the maximum TSO size reported to the stack at IP_MAXPACKET. While at it, let iflib(9) use if_sethwtsomax*(). This change includes the addition of isc_tso_max{seg,}size DMA engine constraints for the TSO DMA tag to struct if_shared_ctx and letting iflib_txsd_alloc() automatically adjust the maxsize of that tag in case IFCAP_VLAN_MTU is supported as requested by shurd@. - Move the if_setifheaderlen(9) call for adjusting the maximum Ethernet header length from {ixgbe,ixl,ixlv,ixv,em}_setup_interface() to iflib(9) so adjustment is automatically done in case IFCAP_VLAN_MTU is supported. As a consequence, this adjustment now is also done in case of bnxt(4) which missed it previously. - Move the reduction of the maximum TSO segment count reported to the stack by the number of m_pullup(9) calls (which in the worst case, can add another mbuf and, thus, the requirement for another DMA segment each) in the transmit path for performance reasons from em_setup_interface() to iflib_txsd_alloc() as these pull-ups are now done in iflib_parse_header() rather than in the no longer existing em_xmit(). Moreover, this optimization applies to all drivers using iflib(9) and not just em(4); all in-tree iflib(9) consumers still have enough room to handle full size TSO packets. Also, reduce the adjustment to the maximum number of m_pullup(9)'s now performed in iflib_parse_header(). - Prior to the conversion of em(4)/igb(4)/lem(4) and ixl(4) to iflib(9) in r311849 and r335338 respectively, these drivers didn't enable IFCAP_VLAN_HWFILTER by default due to VLAN events not being passed through by lagg(4). With iflib(9), IFCAP_VLAN_HWFILTER was turned on by default but also lagg(4) was fixed in that regard in r203548. So just remove the now redundant and defunct IFCAP_VLAN_HWFILTER handling in {em,ixl,ixlv}_setup_interface(). - Nuke other redundant IFCAP_* setting in {em,ixl,ixlv}_setup_interface() which is (more completely) already done in {em,ixl,ixlv}_if_attach_pre() now. - Remove some redundant/dead setting of scctx->isc_tx_csum_flags in em_if_attach_pre(). - Remove some IFCAP_* duplicated either directly or indirectly (e. g. via IFCAP_HWCSUM) in {EM,IGB,IXL}_CAPS. - Don't bother to fiddle with IFCAP_HWSTATS in ixgbe(4)/ixgbev(4) as iflib(9) adds that capability unconditionally. - Remove some unused macros from em(4). - Bump __FreeBSD_version as some of the above changes require the modules of drivers using iflib(9) to be recompiled. Okayed by: sbruno@ at 201806 DevSummit Transport Working Group [1] Reviewed by: sbruno (earlier version), erj PR: 219428 (part of; comment #10) [1], 220997 (part of; comment #3) [2] Differential Revision: https://reviews.freebsd.org/D15720
|
#
3e0e6330 |
|
29-May-2018 |
Stephen Hurd <shurd@FreeBSD.org> |
iflib: mark irq allocation name parameter as constant The *name parameter passed to iflib_irq_alloc_generic and iflib_softirq_alloc_generic is never modified. Many places in code pass string literals and thus should not be modified. Mark the *name parameter as a const char * instead, so that we enforce that the name is not modified before passing to bus_describe_intr() Submitted by: Jacob Keller <jacob.e.keller@intel.com> Reviewed by: kmacy Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D15343
|
#
1d7ef186 |
|
25-May-2018 |
Eric Joyner <erj@FreeBSD.org> |
iflib: Add new shared flag: IFLIB_ADMIN_ALWAYS_RUN ixl(4)'s nvmupdate utility expects the nvmupdate process to run while the interface is down; these nvm update commands use the admin queue, so the admin queue needs to be able to generate interrupts and be processed while the interface is down. So add a flag that ixl(4) sets that lets the entire admin task run even when the interface is marked down/IFF_DRV_RUNNING isn't set. With this change, nvmupdate should function like it did pre-iflib. Reviewed by: gallatin@, sbruno@ MFC after: 1 week Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D15575
|
#
09f6ff4f |
|
11-May-2018 |
Matt Macy <mmacy@FreeBSD.org> |
iflib(9): Add support for cloning pseudo interfaces Part 3 of many ... The VPC framework relies heavily on cloning pseudo interfaces (vmnics, vpc switch, vcpswitch port, hostif, vxlan if, etc). This pulls in that piece. Some ancillary changes get pulled in as a side effect. Reviewed by: shurd@ Approved by: sbruno@ Sponsored by: Joyent, Inc. Differential Revision: https://reviews.freebsd.org/D15347
|
#
aa8a24d3 |
|
03-May-2018 |
Stephen Hurd <shurd@FreeBSD.org> |
Allow iflib NIC drivers to sleep rather than busy wait Since the move to SMP NIC driver locking has had to go through serious contortions using mtx around long running hardware operations. This moves iflib past that. Individual drivers may now sleep when appropriate. Submitted by: Matthew Macy <mmacy@mattmacy.io> Reviewed by: shurd Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D14983
|
#
81ad57b1 |
|
21-Feb-2018 |
Stephen Hurd <shurd@FreeBSD.org> |
IFLIB: Make isc_magic unsigned The IFLIB_MAGIC macro is > INT_MAX, so isc_magic should be able to contain it. Reported by: jeb Sponsored by: Limelight Networks
|
#
79554b40 |
|
22-Dec-2017 |
Warner Losh <imp@FreeBSD.org> |
The device tables end with a sentinel in iflib. Don't include the sentinel in the output.
|
#
d2064cf0 |
|
22-Dec-2017 |
Warner Losh <imp@FreeBSD.org> |
Use '#' rather than some made up name for fields we want to ignore.
|
#
96fc97c8 |
|
19-Dec-2017 |
Stephen Hurd <shurd@FreeBSD.org> |
Update Matthew Macy contact info Email address has changed, uses consistent name (Matthew, not Matt) Reported by: Matthew Macy <mmacy@mattmacy.io> Differential Revision: https://reviews.freebsd.org/D13537
|
#
d14c853b |
|
05-Dec-2017 |
Stephen Hurd <shurd@FreeBSD.org> |
iflib: Support to padding Ethernet frames to a min size Some bnxt devices do not correctly send frames smaller than 52 bytes (without CRC), so add a quirk that will pad frames to an arbitrary size before passing off to the encap routine. Reported by: Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com> Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D13269
|
#
1c0054d2 |
|
05-Oct-2017 |
Stephen Hurd <shurd@FreeBSD.org> |
Fix "taskqgroup_attach: setaffinity failed: 3" with iflib drivers Improved logging added in r323879 exposed an error during attach. We need the irq, not the rid to work correctly. em uses shared irqs, so it will use the same irq for TX as RX. bnxt does not use shared irqs, or TX irqs at all, so there's no need to set the TX irq affinity. Reviewed by: sbruno Approved by: sbruno (mentor) Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D12496
|
#
916616c4 |
|
26-Sep-2017 |
Conrad Meyer <cem@FreeBSD.org> |
Add PNP metadata to more drivers GPUs: radeonkms, i915kms NICs: if_em, if_igb, if_bnxt This metadata isn't used yet, but it will be handy to have later to implement automatic module loading. Reviewed by: imp, mmacy Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D12488
|
#
c5cf2172 |
|
22-Sep-2017 |
Stephen Hurd <shurd@FreeBSD.org> |
Some small packet performance improvements If the packet is smaller than MTU, disable the TSO flags. Move TCP header parsing inside the IS_TSO?() test. Add a new IFLIB_NEED_ZERO_CSUM flag to indicate the checksums need to be zeroed before TX. Reviewed by: sbruno Approved by: sbruno (mentor) Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D12442
|
#
ab2e3f79 |
|
15-Sep-2017 |
Stephen Hurd <shurd@FreeBSD.org> |
Revert r323516 (iflib rollup) This was really too big of a commit even if everything worked, but there are multiple new issues introduced in the one huge commit, so it's not worth keeping this until it's fixed. I'll work on splitting this up into logical chunks and introduce them one at a time over the next week or two. Approved by: sbruno (mentor) Sponsored by: Limelight Networks
|
#
d300df01 |
|
12-Sep-2017 |
Stephen Hurd <shurd@FreeBSD.org> |
Roll up iflib commits from github. This pulls in most of the work done by Matt Macy as well as other changes which he has accepted via pull request to his github repo at https://github.com/mattmacy/networking/ This should bring -CURRENT and the github repo into close enough sync to allow small feature branches rather than a large chain of interdependant patches being developed out of tree. The reset of the synchronization should be able to be completed on github by splitting the remaining changes that are not yet ready into short feature branches for later review as smaller commits. Here is a summary of changes included in this patch: 1) More checks when INVARIANTS are enabled for eariler problem detection 2) Group Task Queue cleanups - Fix use of duplicate shortdesc for gtaskqueue malloc type. Some interfaces such as memguard(9) use the short description to identify malloc types, so duplicates should be avoided. 3) Allow gtaskqueues to use ithreads in addition to taskqueues - In some cases, this can improve performance 4) Better logging when taskqgroup_attach*() fails to set interrupt affinity. 5) Do not start gtaskqueues until they're needed 6) Have mp_ring enqueue function enter the ABDICATED rather than BUSY state. This moves the TX to the gtaskq and allows processing to continue faster as well as make TX batching more likely. 7) Add an ift_txd_errata function to struct if_txrx. This allows drivers to inspect/modify mbufs before transmission. 8) Add a new IFLIB_NEED_ZERO_CSUM for drivers to indicate they need checksums zeroed for checksum offload to work. This avoids modifying packet data in the TX path when possible. 9) Use ithreads for iflib I/O instead of taskqueues 10) Clean up ioctl and support async ioctl functions 11) Prefetch two cachlines from each mbuf instead of one up to 128B. We often need to parse packet header info beyond 64B. 12) Fix potential memory corruption due to fence post error in bit_nclear() usage. 13) Improved hang detection and handling 14) If the packet is smaller than MTU, disable the TSO flags. This avoids extra packet parsing when not needed. 15) Move TCP header parsing inside the IS_TSO?() test. This avoids extra packet parsing when not needed. 16) Pass chains of mbufs that are not consumed by lro to if_input() rather call if_input() for each mbuf. 17) Re-arrange packet header loads to get as much work as possible done before a cache stall. 18) Lock the context when calling IFDI_ATTACH_PRE()/IFDI_ATTACH_POST()/ IFDI_DETACH(); 19) Attempt to distribute RX/TX tasks across cores more sensibly, especially when RX and TX share an interrupt. RX will attempt to take the first threads on a core, and TX will attempt to take successive threads. 20) Allow iflib_softirq_alloc_generic() to request affinity to the same cpus an interrupt has affinity with. This allows TX queues to ensure they are serviced by the socket the device is on. 21) Add new iflib sysctls to net.iflib: - timer_int - interval at which to run per-queue timers in ticks - force_busdma 22) Add new per-device iflib sysctls to dev.X.Y.iflib - rx_budget allows tuning the batch size on the RX path - watchdog_events Count of watchdog events seen since load 23) Fix error where netmap_rxq_init() could get called before IFDI_INIT() 24) e1000: Fixed version of r323008: post-cold sleep instead of DELAY when waiting for firmware - After interrupts are enabled, convert all waits to sleeps - Eliminates e1000 software/firmware synchronization busy waits after startup 25) e1000: Remove special case for budget=1 in em_txrx.c - Premature optimization which may actually be incorrect with multi-segment packets 26) e1000: Split out TX interrupt rather than share an interrupt for RX and TX. - Allows better performance by keeping RX and TX paths separate 27) e1000: Separate igb from em code where suitable Much easier to understand separate functions and "if (is_igb)" than previous tests like "if (reg_icr & (E1000_ICR_RXSEQ | E1000_ICR_LSC))" #blamebruno Reviewed by: sbruno Approved by: sbruno (mentor) Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D12235
|
#
a9693502 |
|
30-Aug-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
Revert r323008 and its conversion of e1000/iflib to using SX locks. This seems to be missing something on the 82574L causing NFS root mounts to hang. Reported by: kib
|
#
e17e5b41 |
|
29-Aug-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
Continuation of lock cleanup in e1000. Post-cold sleep instead of DELAY when waiting for firmware. Convert softc mutex to an SX lock. Change all waits to sleeps once interrupts are enabled (and it is safe to sleep). Submitted by: Matt Macy <matt@mattmacy.io> Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D12101
|
#
9bc7588c |
|
27-Jul-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
Deprecate unused int isc_max_txqsets and int isc_max_rxqsets as they were redundant and not being used to set anything up. Submitted by: Matt Macy <mmacy@mattmacy.io> Reported by: Jeb Cramer <cramerj@intel.com> Sponsored by: Limelight Networks
|
#
eb36b1d0 |
|
30-Jun-2017 |
Jason A. Harmening <jah@FreeBSD.org> |
Clean up MD pollution of bus_dma.h: --Remove special-case handling of sparc64 bus_dmamap* functions. Replace with a more generic mechanism that allows MD busdma implementations to generate inline mapping functions by defining WANT_INLINE_DMAMAP in <machine/bus_dma.h>. This is currently useful for sparc64, x86, and arm64, which all implement non-load dmamap operations as simple wrappers around map objects which may be bus- or device-specific. --Remove NULL-checked bus_dmamap macros. Implement the equivalent NULL checks in the inlined x86 implementation. For non-x86 platforms, these checks are a minor pessimization as those platforms do not currently allow NULL maps. NULL maps were originally allowed on arm64, which appears to have been the motivation behind adding arm[64]-specific barriers to bus_dma.h, but that support was removed in r299463. --Simplify the internal interface used by the bus_dmamap_load* variants and move it to bus_dma_internal.h --Fix some drivers that directly include sys/bus_dma.h despite the recommendations of bus_dma(9) Reviewed by: kib (previous revision), marius Differential Revision: https://reviews.freebsd.org/D10729
|
#
60596476 |
|
06-Apr-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
Move pause frame counter out of struct if_ctx and into struct if_softc_ctx_t so that we can use it in iflib to detect pause frames. The igb(4) driver definitely used to use this in its old timer function and I see no reason to restrict it to that driver only. Sponsored by: Limelight Networks
|
#
ea351d3f |
|
04-Apr-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
Allow MSIX to be turned off by tuneable per interface, per driver. Sponsored by: Limelight Networks
|
#
95246abb |
|
13-Mar-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
IFLIB updates - unconditionally enable BUS_DMA on non-x86 architectures - speed up rxd zeroing via customized function - support out of order updates to rxd's - add prefetching to hardware descriptor rings - only prefetch on 10G or faster hardware - add seperate tx queue intr function - preliminary rework of NETMAP interfaces, WIP Submitted by: Matt Macy <mmacy@nextbsd.org> Sponsored by: Limelight Networks
|
#
e035717e |
|
27-Jan-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
IFLIB updates: We found routing performance dropped significantly when configuring FreeBSD as a router, we are applying the following changes in order to resolve those issues and hopefully perform better. - don't prefetch the flags array, we usually don't need it - prefetch the next cache line of each of the software descriptor arrays as well as the first cache line of each of the next four packets' mbufs and clusters - reduce max copy size to 63 bytes - convert rx soft descriptors from array of structures to a structure of arrays - update copyrights Submitted by: Matt Macy <mmacy@nextbsd.org>
|
#
1248952a |
|
01-Jan-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
2017 IFLIB updates in preparation for commits to e1000 and ixgbe. - iflib - add checksum in place support (mmacy) - iflib - initialize IP for TSO (going to be needed for e1000) (mmacy) - iflib - move isc_txrx from shared context to softc context (mmacy) - iflib - Normalize checks in TXQ drainage. (shurd) - iflib - Fix queue capping checks (mmacy) - iflib - Fix invalid assert, em can need 2 sentinels (mmacy) - iflib - let the driver determine what capabilities are set and what tx csum flags are used (mmacy) - add INVARIANTS debugging hooks to gtaskqueue enqueue (mmacy) - update bnxt(4) to support the changes to iflib (shurd) Some other various, sundry updates. Slightly more verbose changelog: Submitted by: mmacy@nextbsd.org Reviewed by: shurd mFC after: Sponsored by: LimeLight Networks and Dell EMC Isilon
|
#
23ac9029 |
|
12-Aug-2016 |
Stephen Hurd <shurd@FreeBSD.org> |
Update iflib to support more NIC designs - Move group task queue into kern/subr_gtaskqueue.c - Change intr_enable to return an int so it can be detected if it's not implemented - Allow different TX/RX queues per set to be different sizes - Don't split up TX mbufs before transmit - Allow a completion queue for TX as well as RX - Pass the RX budget to isc_rxd_available() to allow an earlier return and avoid multiple calls Submitted by: shurd Reviewed by: gallatin Approved by: scottl Differential Revision: https://reviews.freebsd.org/D7393
|
#
4c7070db |
|
17-May-2016 |
Scott Long <scottl@FreeBSD.org> |
Import the 'iflib' API library for network drivers. From the author: "iflib is a library to eliminate the need for frequently duplicated device independent logic propagated (poorly) across many network drivers." Participation is purely optional. The IFLIB kernel config option is provided for drivers that want to transition between legacy and iflib modes of operation. ixl and ixgbe driver conversions will be committed shortly. We hope to see participation from the Broadcom and maybe Chelsio drivers in the near future. Submitted by: mmacy@nextbsd.org Reviewed by: gallatin Differential Revision: D5211
|