History log of /openbsd-current/sys/dev/pci/if_em.h
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# 1.83 16-Feb-2024 mglocker

Re-introduce TSO support after we've implemented fixes for the two reported
issues:

1. Unaligned memory access panic on sparc64 -> Made ether_extract_headers()
memory alignment safe.
2. em(4) watchdog timeouts in conjunction with ix(4)/vlan(4) -> Fixed
RX/LRO packet size calculation used for TSO tagging in ix(4).

Extensive testing done by bluhm@ on amd64 and sparc64 based on different
chips.
Testing done on Hrvoje Popovskis ix(4)/em(4)/vlan(4) setup from where the
issue 2 was reported.

OK bluhm@


# 1.82 28-Jan-2024 mglocker

Back out the TSO support diff, since we got issues reported for which
no solution could be found. Known issues at this point:

1. sparc64 panics, probably because of an alignment issue in struct
tcphdr { th_off }. A diff for potentially fixing the alignment issue
exists, but testing is pending.
2. Watchdogs reported on the I350 chip, which can't be reproduced on own
hardware.


# 1.81 31-Dec-2023 mglocker

Add TCP Segmentation Offload (TSO) support for em(4). Following chip-sets
are currently known to support TSO; 82575, 82576, 82580, I350, and I210.

Suggested by claudio@. Feedback and testing from many on tech@.

OK bluhm@


Revision tags: OPENBSD_7_1_BASE OPENBSD_7_2_BASE OPENBSD_7_3_BASE OPENBSD_7_4_BASE
# 1.80 09-Jan-2022 jsg

spelling
feedback and ok tb@ jmc@ ok ratchov@


# 1.79 14-Dec-2021 patrick

Implement support for selecting SGMII or SerDes mode depending on the
plugged-in SFP transceiver and for reading out transceiver information
via ifconfig(8). To read from the SFP, we need to let the card issue
I2C transfers. Additionally we need I2C to read/write to the PHY when
MDIO is not available. Depending on the SFP's supported media types
we can decide which mode to use.

This fixes hardware-initialization and link-up problems with some em(4)
Fiber NIC and SFP combinations.

Tested by dlg@ and been in snaps for quite a while
ok dlg@ jmatthew@


Revision tags: OPENBSD_6_8_BASE OPENBSD_6_9_BASE OPENBSD_7_0_BASE
# 1.78 13-Jul-2020 dlg

add kstat support for reading hardware counters.

this replaces the existing counters implementation, which just
collected the stats in the softc, but didn't really provide a way
for a person to read them.

em counters get cleared on read. a lot of them are 32bit, so to
avoid overflow the counters are polled and the newly accumulated
values are added to some 64 bit counters in software.

tested by hrvoje popovski and SAITOH Masanobu
ok mpi@

i missed these files when i committed src/sys/dev/pci/if_em.c r1.356.
thanks to jsg for pointing this out.


Revision tags: OPENBSD_6_7_BASE
# 1.77 22-Apr-2020 mpi

Use FOREACH_QUEUE() where nothing else is required to support multi-queues.

Tested by Hrvoje Popovski and jmatthew@, ok jmatthew@


# 1.76 23-Mar-2020 mpi

Make it possible to use em(4) with MSI-X, currently disabled by default.

The current implementation still uses a single queue but already establishes
a different handler for link interrupts. This is done in preparation for
multi-queues support.

Based on a bigger diff from haesbaert@ and on the FreeBSD code.

Tested by Hrvoje Popovski and jmatthew@, ok jmatthew@


# 1.75 20-Feb-2020 mpi

Introduce the concept of queue to prepare supporting multiple of them.

Move the tx/rx descriptors to dedicated structures similar to what already
exist in ix(4).

Only one queue is currently used, no real architectural change introduced
in this diff.

Extracted from a big diff from haesbaert@ via patrick@.

Tested by Hrvoje Popovski and jmatthew@, ok jmatthew@


Revision tags: OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.74 01-Mar-2019 dlg

use a timeout to refill the rx ring when it's empty.

em had rxr, but didn't use a timeout cos it claimed to generate an
RX overflow interrupt when packets fell off slots in the ring. turns
out that's a lie on at least one chip, so add the timeout like other
drivers.

this was hit by mlarkin@, who had nfs and bufs steal all the packets
and memory for packets from em, which didn't recover after the
memory had been released back to the system.


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE OPENBSD_6_3_BASE OPENBSD_6_4_BASE
# 1.73 27-Oct-2016 dlg

tell ix and em to use 2k+ETHER_ALIGN clusters for rx on all archs.

this means that the ethernet header and therefore its payload will
be aligned correctly for the stack. without this em and ix are
sufferring a 30 to 40 percent hit in forwarding performance because
the ethernet stack expects to be able to prepend 8 bytes for an
ethernet header so it can gaurantee its alignment. because em and
ix only had 6 bytes where the ethernet header was, it always prepends
an mbuf which turns out to be expensive. this way the prepend will
be cheap because the 8 byte space will exist.

2k+ETHER_ALIGN clusters will end up using the newly created mcl2k2
pool.

the regression was isolated and the fix tested by hrvoje popovski.
ok mikeb@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.72 18-Feb-2016 bluhm

Add support for the Intel i219 network chip to the em(4) driver.
from Christian Ehrhardt; input jsg@; OK deraadt@ sthen@ mpi@ jsg@
tested by sthen@ jca@ benno@ bluhm@


# 1.71 11-Jan-2016 dlg

do further work on the em transmit path to simplify the code.

noone could understand how em_txeof worked, so i rewrote it.

this also gets rid of the sc_tx_desc_free var that needed atomic
ops. space to use in em_start and space to free in em_txeof is now
calculated from the producer and consumer.

testers have reported better responsiveness with this. somehow.
if em issues persist after this, im rolling back to pre-mpsafe changes.


# 1.70 07-Jan-2016 dlg

rename em_buffers to em_packets.

shorten a bunch of variable names while here.


# 1.69 07-Jan-2016 dlg

rename the rx and tx ring softc vars.


# 1.68 07-Jan-2016 dlg

prefix the rx and tx ring softc members with sc_


# 1.67 07-Jan-2016 dlg

dma_paddr in struct em_dma_alloc is unused, so gc it.


# 1.66 07-Jan-2016 dlg

unify the dma tag into sc_dmat in em_softc.


# 1.65 07-Jan-2016 dlg

sprinkle DEVNAME


# 1.64 07-Jan-2016 dlg

rename the struct arpcom interface_data in em_softc to sc_ac.

makes it more consistent with the rest of the tree.


# 1.63 07-Jan-2016 dlg

rename em_softc sc_dv to sc_dev. like ALL OUR OTHER DRIVERS.


# 1.62 07-Jan-2016 dlg

tweak em to make it mpsafe, both for interrupts and if_start.

this is mostly work by kettenis and claudio, with further work from
me to make the transmit side from the stack mpsafe.

there's a watchdog issue that will be worked on in tree after this
change.

tested by hrvoje popovski and gregor best
ok mpi@ claudio@ deraadt@ jmatthew@


# 1.61 24-Nov-2015 mpi

You only need <net/if_dl.h> if you're using LLADDR() or a sockaddr_dl.


# 1.60 20-Nov-2015 mpi

Missed in previous, pointed by benoit@


# 1.59 14-Nov-2015 mpi

Do not include <net/if_vlan_var.h> when it's not necessary.

Because of the VLAN hacks in mpw(4) this file still contains the definition
of "struct ifvlan" which depends on <sys/refcnt.h> which in turns pull
<sys/atomic.h>...


# 1.58 30-Sep-2015 kettenis

Run the tx completion path without the kernel held. This makes the
"fast path" through the interrupt handler not grab the kernel lock anymore.
This removes the code that attempts to reclaim tx descriptors from em_start().
Keeping that code would have complicated the locking. The need to reclaim
tx descriptors that way should have largely disappeared now that the interrupt
handler doesn't have to wait on the kernel lock.

ok mpi@
tested by many


# 1.57 19-Sep-2015 kettenis

Avoid using a mutex in the rx completion path. Instead rely on
intr_barrier(9) to avoid having the interrupt handler touch the rx data
structures while we're brining down the interface. This actually reverts
many of the changes in rev. 1.300.

ok mikeb@


# 1.56 26-Aug-2015 kettenis

Get rid if em_align. This approach used to make sense, but now that the
hardware rx mtu always gets set to the maximum supported value we will hit
it for every received packet. Instead, use a larger mbuf cluster size on
strict alignment architectures such that we can always m_adj to make sure the
packets are properly aligned. This wastes some memory but simplifies things
considerably. Hopefully we can reduce the spillage in the near future by
taking advantage of recent improvements in the pool code.

ok mpi@, mikeb@, dlg@


# 1.55 21-Aug-2015 kettenis

Run the part of the interrupt handler that does rx completion without holding
the kernel lock.

ok mpi@, dlg@


Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
# 1.54 26-Dec-2014 tedu

unifdef INET. missed a few headers in previous rounds


Revision tags: OPENBSD_5_6_BASE
# 1.53 22-Jul-2014 mpi

Fewer <netinet/in_systm.h>


# 1.52 10-Jul-2014 deraadt

remove most of the boolean_t infection outside uvm/ddb/pmap; ok jsg


# 1.51 08-Jul-2014 dlg

cut things that relied on mclgeti for rx ring accounting/restriction over
to using if_rxr.

cut the reporting systat did over to the rxr ioctl.

tested as much as i can on alpha, amd64, and sparc64.
mpi@ has run it on macppc.
ok mpi@


Revision tags: OPENBSD_5_5_BASE
# 1.50 07-Aug-2013 bluhm

Most network drivers include netinet/in_var.h, but apparently they
don't have to. Just remove these include lines.
Compiled on amd64 i386 sparc64; OK henning@ mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.49 16-Apr-2013 deraadt

spelling errors; Diego Casati


Revision tags: OPENBSD_4_9_BASE OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.48 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


Revision tags: OPENBSD_4_8_BASE
# 1.47 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


Revision tags: OPENBSD_4_7_BASE
# 1.46 25-Nov-2009 dms

Add support for em(4) interfaces found on intel EP80579 SoC. The MAC part is
basicly 82545, but the PHY's are separated form the chip and they are accessed
through a special PCI device called GCU which has the MDIO interface. Since
there is no direct relationship between MAC and PHY, so for the moment they
are assigned to each other the way its done on Axiomtek NA-200, that was
danted to us by them.

This also adds a device driver for the GCU.

tested by me on Axiomtek board
reviewed by claudio@, kettenis@, deraadt@
'commit that as is' deraadt@


# 1.45 10-Aug-2009 deraadt

A few more simple cases of shutdown hooks which only call xxstop, when
we now know the interface has already been stopped


Revision tags: OPENBSD_4_6_BASE
# 1.44 05-Jun-2009 naddy

tidy up promiscuous mode and multicast handling; from Brad; ok sthen@


Revision tags: OPENBSD_4_5_BASE
# 1.43 15-Dec-2008 brad

revert 1.20 now that the new allocator is used to control the number of
RX buffers allocated.

ok dlg@


# 1.42 05-Dec-2008 brad

Garbage collect now unused field in the softc struct again.


# 1.41 03-Dec-2008 dlg

recommit the use of the new mbuf cluster allocator.

this starts em up with 4 mbufs on the rx ring, which will then grow as
usage demands. this also allows em to take advantage of the new livelock
mitigation code as well as freeing up a boatload of kernel memory.

this version of the diff makes sure we only ever post the last descriptor
we filled to the hardware, rather than the whole ring when bringing the
interface up. it has been tested by users who got panics with the previous
diff without trouble.


# 1.40 29-Nov-2008 sthen

revert 1.197 if_em.c, 1.38/1.39 if_em.h, requested by dlg, until a bug
reported on misc@ can be tracked down.

identical diff from jsing.


# 1.39 28-Nov-2008 brad

Garbage collect now unused field in the softc struct.


# 1.38 26-Nov-2008 dlg

rework the filling of the rx ring. this switches us to having the cluster
allocation limited by per ifp statistics, ie, we're not guaranteed to have
mbufs in every slot on the rx ring.

instead of filling the ring with 256 mbufs all the time (about 512KB of
kva) when em is brought up, we give it 4. as demand grows we increase the
number of mbufs allowed on the ring. i will bet most users wont go above
50ish mbufs, so we're saving them 400KB of kva.

tested by many, including one on sparc64
ok claudio@ deraadt@ henning@ krw@


Revision tags: OPENBSD_4_4_BASE
# 1.37 22-Jul-2008 martynas

more negotation -> negotiation; ok sthen@


Revision tags: OPENBSD_4_3_BASE
# 1.36 21-Oct-2007 brad

Allow for the adjustment of the number of RX descriptors
for the newer generations of em(4) chipsets independently
from the first two generations (82542/82543). The first
two generations have hardware errata limiting the upper
maximum to 256 descriptors. The number of RX descriptors
has not been adjusted yet.

ok beck@ henning@ dlg@


Revision tags: OPENBSD_4_2_BASE
# 1.35 30-May-2007 ckuethe

Move the knob for the interrupt throttling register next to the knobs for
the other interrupt moderation schemes.
ok beck drahn


Revision tags: OPENBSD_4_1_BASE
# 1.34 18-Nov-2006 brad

fix comments


# 1.33 17-Nov-2006 brad

Add a lower TX threshold value and use this when checking the number of
available TX descriptors in the case that em_encap() has tried to reclaim
descriptors.

From Jack Vogel@Intel

Tested by brad@, mk@, Gabriel Kihlman <gk at stacken dot kth dot se>,
Johan Mson Lindman <tybollt at solace dot mh dot se>
Tested on amd64/i386/sparc64


# 1.32 14-Nov-2006 brad

Rework the transmit register handling. In em_encap() store the index of
the EOP descriptor in the first descriptor of the packet. In em_txeof()
search for the DD bit set only in the EOP descriptors, embedding the
cleanup of all packet's descriptors into the inner loop.

This change is important for future chips, where the DD bit is going
to be set only in the EOP descriptors.

From Jack Vogel@Intel

Tested by brad@, mk@, reyk@, Gabriel Kihlman <gk at stacken dot kth dot se>,
Johan Mson Lindman <tybollt at solace dot mh dot se>, Jason Dixon and a few
others.
Tested on i386/amd64/sparc64.


# 1.31 10-Nov-2006 brad

Pre-allocate the TX DMA maps intead of creating and destroying a DMA map
per packet sent.

Tested by brad@, ckuethe@, Gabriel Kihlman <gk at stacken dot kth dot se>
and Tim Wiess <tim at nop dot cx>.
Tested with amd64/i386/sparc64.

ok damien@


# 1.30 06-Nov-2006 brad

Sync up to Intel's latest FreeBSD em driver (6.2.9). Adds support
for a few newer Intel PCIe boards, some code removal and cleaning
and a few bug fixes.

From: Jack Vogel@Intel

Tested by mk@ wilfried@ brad@ dlg@, Marc Winiger, Gabriel Kihlman,
Jason Dixon, Johan Mson Lindman, and a few other end users.

Tested with 82543, 82544, 82540, 82545, 82541, 82547, 82546 and 82573.


# 1.29 17-Sep-2006 brad

Overhaul RX path to recover from mbuf cluster allocation failure.
- Create a spare DMA map for RX handler to recover from
bus_dmamap_load() failure.
- Make sure to update status bit in RX descriptors even if we failed
to allocate a new buffer.
- Don't blindly unload DMA map. Reuse loaded DMA map if received
packet has errors.

From yongari@FreeBSD
Tested by myself and a number of end-users on i386/amd64/sparc64


# 1.28 17-Sep-2006 brad

revert revision 1.131, the code in question was later found to not ensure
the proper alignment requirement for the VLAN layer on strict alignment
architectures. This would result in Jumbo's working fine as long as VLANs
were not in use. If VLANs were in use and a packet comes in with a size
of 2046 bytes or larger, it would be corrupted as it came up through the
VLAN layer. Also check the hw max frame size, instead of the MTU, so the
alignment fixup is done as appropriate.

Fixes PR 5185.
Tested by Rui DeSousa with macppc and myself with alpha/sparc64.


Revision tags: OPENBSD_4_0_BASE
# 1.27 04-Aug-2006 brad

branches: 1.27.2;
- merge em/ixgb_disable_promisc() into em/ixgb_set_promisc().
- rearrange interface flags ioctl handler.


# 1.26 07-Jul-2006 brad

Sync up to Intel's latest FreeBSD em driver (6.0.5). Adds support
for new chipset revisions embedded in the ESB2 and ICH8 core logic
chipsets.

The previous attempt at commiting this included an unrelated change
to how the I/O base address was being set and this was the cause of
the breakage.

From: Intel's web-site


# 1.25 05-Jul-2006 brad

revert back to the older driver as this causes some breakage.


# 1.24 03-Jul-2006 brad

Sync up to Intel's latest FreeBSD em driver (6.0.5). Adds support
for new chipset revisions embedded in the ESB2 and ICH8 core logic
chipsets.

From: Intel's web-site


# 1.23 05-Mar-2006 brad

Sprinkle some tabs and a little cleaning.


Revision tags: OPENBSD_3_9_BASE
# 1.22 22-Feb-2006 brad

For 82544 and newer chips increase the number of TX descriptors to 512.


# 1.21 10-Dec-2005 brad

add a shutdown function and register it with shutdownhook_establish().


# 1.20 10-Dec-2005 brad

remove a bit of unused code.

Pointed out by Andrey Matveev <evol at online dot ptt dot ru> through noticing
a missing splx which pointed out the fact that code is unused to me.


# 1.19 18-Nov-2005 brad

PCIX -> PCI-X in a few comments


# 1.18 13-Nov-2005 brad

- Introduce two more stat counters, counting number of RX
overruns and number of watchdog timeouts.
- Do not increase if->if_oerrors in em_watchdog(), since
this leads to counter slipping back, when if->if_oerrors
is recalculated in em_update_stats_counters(). Instead
increase watchdog counter in em_watchdog() and take it
into account in em_update_stats_counters().

From glebius FreeBSD

ok dlg@


# 1.17 24-Oct-2005 brad

Revamp interrupt handling in em(4) driver:

o Do not mask the RX overrun interrupt.

o Rewrite em_intr():
- Axe EM_MAX_INTR.
- Cycle acknowledging interrupts and processing
packets until zero interrupt cause register is
read.
- If RX overrun comes in log this fact.

From glebius FreeBSD

ok krw@ beck@


# 1.16 21-Oct-2005 brad

Remove unused global adapter linked list.

From FreeBSD


# 1.15 15-Oct-2005 brad

- put spl's right in the code and remove the macros
- remove splassert()'s
- remove empty bus_dma_tag_destroy macro from code and header


Revision tags: OPENBSD_3_8_BASE
# 1.14 16-Jul-2005 brad

move headers and remove some FreeBSD specific stuff.


# 1.13 16-Jul-2005 brad

fix support for interrupt mitigation.

ok nate@


# 1.12 02-Jul-2005 deraadt

sync


# 1.11 04-May-2005 brad

remove #ifdef __OpenBSD__


# 1.10 27-Mar-2005 brad

remove FreeBSD ifdef bloat.

ok krw@


Revision tags: OPENBSD_3_7_BASE
# 1.9 08-Dec-2004 markus

powerhook: em_init on resume


# 1.8 16-Nov-2004 brad

- Added fix for 82547 which corrects an issue with Jumbo frames larger than 10k.
- Corrected TBI workaround.
- Corrected incorrect LED operation issues.

From FreeBSD

ok deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.7 18-Jun-2004 mcbride

On architectures which have strict alignment, shift the entire mbuf chain by
ETHER_ALIGN bytes when jumbo packets are enabled (mtu > ETHERMTU).

ok henric@ (slightly different diff)


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.6 19-May-2004 brad

remove duplication, use ETHER_ALIGN from if_ether.h


# 1.5 26-Apr-2004 deraadt

oh we need to model check and not crank > 256 for older cards... do that later


# 1.4 26-Apr-2004 deraadt

this driver had 256 clusters for receive buffers. move to 512, to increase
performance, if the interface is up. at boot time, allocate only 12 though
... though we note that em_stop() frees them all. perhaps some are used to
talk to other parts of the engine though at runtime... tested by mcbride and
beck


# 1.3 18-Apr-2004 henric

Sync with FreeBSD's "em".


Revision tags: OPENBSD_3_4_BASE OPENBSD_3_5_BASE
# 1.2 13-Jun-2003 henric

Sync with FreeBSD's "em".

ok deraadt@


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_A UBC_SYNC_B
# 1.1 24-Sep-2002 nate

branches: 1.1.4; 1.1.8;
Driver for Intel PRO/1000 gigabit ethernet adapters.
This driver should work with all current models of gigabit ethernet adapters.

Driver written by Intel
Ported from FreeBSD by Henric Jungheim <henric@attbi.com>
bus_dma and endian support by me.


# 1.82 28-Jan-2024 mglocker

Back out the TSO support diff, since we got issues reported for which
no solution could be found. Known issues at this point:

1. sparc64 panics, probably because of an alignment issue in struct
tcphdr { th_off }. A diff for potentially fixing the alignment issue
exists, but testing is pending.
2. Watchdogs reported on the I350 chip, which can't be reproduced on own
hardware.


# 1.81 31-Dec-2023 mglocker

Add TCP Segmentation Offload (TSO) support for em(4). Following chip-sets
are currently known to support TSO; 82575, 82576, 82580, I350, and I210.

Suggested by claudio@. Feedback and testing from many on tech@.

OK bluhm@


Revision tags: OPENBSD_7_1_BASE OPENBSD_7_2_BASE OPENBSD_7_3_BASE OPENBSD_7_4_BASE
# 1.80 09-Jan-2022 jsg

spelling
feedback and ok tb@ jmc@ ok ratchov@


# 1.79 14-Dec-2021 patrick

Implement support for selecting SGMII or SerDes mode depending on the
plugged-in SFP transceiver and for reading out transceiver information
via ifconfig(8). To read from the SFP, we need to let the card issue
I2C transfers. Additionally we need I2C to read/write to the PHY when
MDIO is not available. Depending on the SFP's supported media types
we can decide which mode to use.

This fixes hardware-initialization and link-up problems with some em(4)
Fiber NIC and SFP combinations.

Tested by dlg@ and been in snaps for quite a while
ok dlg@ jmatthew@


Revision tags: OPENBSD_6_8_BASE OPENBSD_6_9_BASE OPENBSD_7_0_BASE
# 1.78 13-Jul-2020 dlg

add kstat support for reading hardware counters.

this replaces the existing counters implementation, which just
collected the stats in the softc, but didn't really provide a way
for a person to read them.

em counters get cleared on read. a lot of them are 32bit, so to
avoid overflow the counters are polled and the newly accumulated
values are added to some 64 bit counters in software.

tested by hrvoje popovski and SAITOH Masanobu
ok mpi@

i missed these files when i committed src/sys/dev/pci/if_em.c r1.356.
thanks to jsg for pointing this out.


Revision tags: OPENBSD_6_7_BASE
# 1.77 22-Apr-2020 mpi

Use FOREACH_QUEUE() where nothing else is required to support multi-queues.

Tested by Hrvoje Popovski and jmatthew@, ok jmatthew@


# 1.76 23-Mar-2020 mpi

Make it possible to use em(4) with MSI-X, currently disabled by default.

The current implementation still uses a single queue but already establishes
a different handler for link interrupts. This is done in preparation for
multi-queues support.

Based on a bigger diff from haesbaert@ and on the FreeBSD code.

Tested by Hrvoje Popovski and jmatthew@, ok jmatthew@


# 1.75 20-Feb-2020 mpi

Introduce the concept of queue to prepare supporting multiple of them.

Move the tx/rx descriptors to dedicated structures similar to what already
exist in ix(4).

Only one queue is currently used, no real architectural change introduced
in this diff.

Extracted from a big diff from haesbaert@ via patrick@.

Tested by Hrvoje Popovski and jmatthew@, ok jmatthew@


Revision tags: OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.74 01-Mar-2019 dlg

use a timeout to refill the rx ring when it's empty.

em had rxr, but didn't use a timeout cos it claimed to generate an
RX overflow interrupt when packets fell off slots in the ring. turns
out that's a lie on at least one chip, so add the timeout like other
drivers.

this was hit by mlarkin@, who had nfs and bufs steal all the packets
and memory for packets from em, which didn't recover after the
memory had been released back to the system.


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE OPENBSD_6_3_BASE OPENBSD_6_4_BASE
# 1.73 27-Oct-2016 dlg

tell ix and em to use 2k+ETHER_ALIGN clusters for rx on all archs.

this means that the ethernet header and therefore its payload will
be aligned correctly for the stack. without this em and ix are
sufferring a 30 to 40 percent hit in forwarding performance because
the ethernet stack expects to be able to prepend 8 bytes for an
ethernet header so it can gaurantee its alignment. because em and
ix only had 6 bytes where the ethernet header was, it always prepends
an mbuf which turns out to be expensive. this way the prepend will
be cheap because the 8 byte space will exist.

2k+ETHER_ALIGN clusters will end up using the newly created mcl2k2
pool.

the regression was isolated and the fix tested by hrvoje popovski.
ok mikeb@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.72 18-Feb-2016 bluhm

Add support for the Intel i219 network chip to the em(4) driver.
from Christian Ehrhardt; input jsg@; OK deraadt@ sthen@ mpi@ jsg@
tested by sthen@ jca@ benno@ bluhm@


# 1.71 11-Jan-2016 dlg

do further work on the em transmit path to simplify the code.

noone could understand how em_txeof worked, so i rewrote it.

this also gets rid of the sc_tx_desc_free var that needed atomic
ops. space to use in em_start and space to free in em_txeof is now
calculated from the producer and consumer.

testers have reported better responsiveness with this. somehow.
if em issues persist after this, im rolling back to pre-mpsafe changes.


# 1.70 07-Jan-2016 dlg

rename em_buffers to em_packets.

shorten a bunch of variable names while here.


# 1.69 07-Jan-2016 dlg

rename the rx and tx ring softc vars.


# 1.68 07-Jan-2016 dlg

prefix the rx and tx ring softc members with sc_


# 1.67 07-Jan-2016 dlg

dma_paddr in struct em_dma_alloc is unused, so gc it.


# 1.66 07-Jan-2016 dlg

unify the dma tag into sc_dmat in em_softc.


# 1.65 07-Jan-2016 dlg

sprinkle DEVNAME


# 1.64 07-Jan-2016 dlg

rename the struct arpcom interface_data in em_softc to sc_ac.

makes it more consistent with the rest of the tree.


# 1.63 07-Jan-2016 dlg

rename em_softc sc_dv to sc_dev. like ALL OUR OTHER DRIVERS.


# 1.62 07-Jan-2016 dlg

tweak em to make it mpsafe, both for interrupts and if_start.

this is mostly work by kettenis and claudio, with further work from
me to make the transmit side from the stack mpsafe.

there's a watchdog issue that will be worked on in tree after this
change.

tested by hrvoje popovski and gregor best
ok mpi@ claudio@ deraadt@ jmatthew@


# 1.61 24-Nov-2015 mpi

You only need <net/if_dl.h> if you're using LLADDR() or a sockaddr_dl.


# 1.60 20-Nov-2015 mpi

Missed in previous, pointed by benoit@


# 1.59 14-Nov-2015 mpi

Do not include <net/if_vlan_var.h> when it's not necessary.

Because of the VLAN hacks in mpw(4) this file still contains the definition
of "struct ifvlan" which depends on <sys/refcnt.h> which in turns pull
<sys/atomic.h>...


# 1.58 30-Sep-2015 kettenis

Run the tx completion path without the kernel held. This makes the
"fast path" through the interrupt handler not grab the kernel lock anymore.
This removes the code that attempts to reclaim tx descriptors from em_start().
Keeping that code would have complicated the locking. The need to reclaim
tx descriptors that way should have largely disappeared now that the interrupt
handler doesn't have to wait on the kernel lock.

ok mpi@
tested by many


# 1.57 19-Sep-2015 kettenis

Avoid using a mutex in the rx completion path. Instead rely on
intr_barrier(9) to avoid having the interrupt handler touch the rx data
structures while we're brining down the interface. This actually reverts
many of the changes in rev. 1.300.

ok mikeb@


# 1.56 26-Aug-2015 kettenis

Get rid if em_align. This approach used to make sense, but now that the
hardware rx mtu always gets set to the maximum supported value we will hit
it for every received packet. Instead, use a larger mbuf cluster size on
strict alignment architectures such that we can always m_adj to make sure the
packets are properly aligned. This wastes some memory but simplifies things
considerably. Hopefully we can reduce the spillage in the near future by
taking advantage of recent improvements in the pool code.

ok mpi@, mikeb@, dlg@


# 1.55 21-Aug-2015 kettenis

Run the part of the interrupt handler that does rx completion without holding
the kernel lock.

ok mpi@, dlg@


Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
# 1.54 26-Dec-2014 tedu

unifdef INET. missed a few headers in previous rounds


Revision tags: OPENBSD_5_6_BASE
# 1.53 22-Jul-2014 mpi

Fewer <netinet/in_systm.h>


# 1.52 10-Jul-2014 deraadt

remove most of the boolean_t infection outside uvm/ddb/pmap; ok jsg


# 1.51 08-Jul-2014 dlg

cut things that relied on mclgeti for rx ring accounting/restriction over
to using if_rxr.

cut the reporting systat did over to the rxr ioctl.

tested as much as i can on alpha, amd64, and sparc64.
mpi@ has run it on macppc.
ok mpi@


Revision tags: OPENBSD_5_5_BASE
# 1.50 07-Aug-2013 bluhm

Most network drivers include netinet/in_var.h, but apparently they
don't have to. Just remove these include lines.
Compiled on amd64 i386 sparc64; OK henning@ mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.49 16-Apr-2013 deraadt

spelling errors; Diego Casati


Revision tags: OPENBSD_4_9_BASE OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.48 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


Revision tags: OPENBSD_4_8_BASE
# 1.47 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


Revision tags: OPENBSD_4_7_BASE
# 1.46 25-Nov-2009 dms

Add support for em(4) interfaces found on intel EP80579 SoC. The MAC part is
basicly 82545, but the PHY's are separated form the chip and they are accessed
through a special PCI device called GCU which has the MDIO interface. Since
there is no direct relationship between MAC and PHY, so for the moment they
are assigned to each other the way its done on Axiomtek NA-200, that was
danted to us by them.

This also adds a device driver for the GCU.

tested by me on Axiomtek board
reviewed by claudio@, kettenis@, deraadt@
'commit that as is' deraadt@


# 1.45 10-Aug-2009 deraadt

A few more simple cases of shutdown hooks which only call xxstop, when
we now know the interface has already been stopped


Revision tags: OPENBSD_4_6_BASE
# 1.44 05-Jun-2009 naddy

tidy up promiscuous mode and multicast handling; from Brad; ok sthen@


Revision tags: OPENBSD_4_5_BASE
# 1.43 15-Dec-2008 brad

revert 1.20 now that the new allocator is used to control the number of
RX buffers allocated.

ok dlg@


# 1.42 05-Dec-2008 brad

Garbage collect now unused field in the softc struct again.


# 1.41 03-Dec-2008 dlg

recommit the use of the new mbuf cluster allocator.

this starts em up with 4 mbufs on the rx ring, which will then grow as
usage demands. this also allows em to take advantage of the new livelock
mitigation code as well as freeing up a boatload of kernel memory.

this version of the diff makes sure we only ever post the last descriptor
we filled to the hardware, rather than the whole ring when bringing the
interface up. it has been tested by users who got panics with the previous
diff without trouble.


# 1.40 29-Nov-2008 sthen

revert 1.197 if_em.c, 1.38/1.39 if_em.h, requested by dlg, until a bug
reported on misc@ can be tracked down.

identical diff from jsing.


# 1.39 28-Nov-2008 brad

Garbage collect now unused field in the softc struct.


# 1.38 26-Nov-2008 dlg

rework the filling of the rx ring. this switches us to having the cluster
allocation limited by per ifp statistics, ie, we're not guaranteed to have
mbufs in every slot on the rx ring.

instead of filling the ring with 256 mbufs all the time (about 512KB of
kva) when em is brought up, we give it 4. as demand grows we increase the
number of mbufs allowed on the ring. i will bet most users wont go above
50ish mbufs, so we're saving them 400KB of kva.

tested by many, including one on sparc64
ok claudio@ deraadt@ henning@ krw@


Revision tags: OPENBSD_4_4_BASE
# 1.37 22-Jul-2008 martynas

more negotation -> negotiation; ok sthen@


Revision tags: OPENBSD_4_3_BASE
# 1.36 21-Oct-2007 brad

Allow for the adjustment of the number of RX descriptors
for the newer generations of em(4) chipsets independently
from the first two generations (82542/82543). The first
two generations have hardware errata limiting the upper
maximum to 256 descriptors. The number of RX descriptors
has not been adjusted yet.

ok beck@ henning@ dlg@


Revision tags: OPENBSD_4_2_BASE
# 1.35 30-May-2007 ckuethe

Move the knob for the interrupt throttling register next to the knobs for
the other interrupt moderation schemes.
ok beck drahn


Revision tags: OPENBSD_4_1_BASE
# 1.34 18-Nov-2006 brad

fix comments


# 1.33 17-Nov-2006 brad

Add a lower TX threshold value and use this when checking the number of
available TX descriptors in the case that em_encap() has tried to reclaim
descriptors.

From Jack Vogel@Intel

Tested by brad@, mk@, Gabriel Kihlman <gk at stacken dot kth dot se>,
Johan Mson Lindman <tybollt at solace dot mh dot se>
Tested on amd64/i386/sparc64


# 1.32 14-Nov-2006 brad

Rework the transmit register handling. In em_encap() store the index of
the EOP descriptor in the first descriptor of the packet. In em_txeof()
search for the DD bit set only in the EOP descriptors, embedding the
cleanup of all packet's descriptors into the inner loop.

This change is important for future chips, where the DD bit is going
to be set only in the EOP descriptors.

From Jack Vogel@Intel

Tested by brad@, mk@, reyk@, Gabriel Kihlman <gk at stacken dot kth dot se>,
Johan Mson Lindman <tybollt at solace dot mh dot se>, Jason Dixon and a few
others.
Tested on i386/amd64/sparc64.


# 1.31 10-Nov-2006 brad

Pre-allocate the TX DMA maps intead of creating and destroying a DMA map
per packet sent.

Tested by brad@, ckuethe@, Gabriel Kihlman <gk at stacken dot kth dot se>
and Tim Wiess <tim at nop dot cx>.
Tested with amd64/i386/sparc64.

ok damien@


# 1.30 06-Nov-2006 brad

Sync up to Intel's latest FreeBSD em driver (6.2.9). Adds support
for a few newer Intel PCIe boards, some code removal and cleaning
and a few bug fixes.

From: Jack Vogel@Intel

Tested by mk@ wilfried@ brad@ dlg@, Marc Winiger, Gabriel Kihlman,
Jason Dixon, Johan Mson Lindman, and a few other end users.

Tested with 82543, 82544, 82540, 82545, 82541, 82547, 82546 and 82573.


# 1.29 17-Sep-2006 brad

Overhaul RX path to recover from mbuf cluster allocation failure.
- Create a spare DMA map for RX handler to recover from
bus_dmamap_load() failure.
- Make sure to update status bit in RX descriptors even if we failed
to allocate a new buffer.
- Don't blindly unload DMA map. Reuse loaded DMA map if received
packet has errors.

From yongari@FreeBSD
Tested by myself and a number of end-users on i386/amd64/sparc64


# 1.28 17-Sep-2006 brad

revert revision 1.131, the code in question was later found to not ensure
the proper alignment requirement for the VLAN layer on strict alignment
architectures. This would result in Jumbo's working fine as long as VLANs
were not in use. If VLANs were in use and a packet comes in with a size
of 2046 bytes or larger, it would be corrupted as it came up through the
VLAN layer. Also check the hw max frame size, instead of the MTU, so the
alignment fixup is done as appropriate.

Fixes PR 5185.
Tested by Rui DeSousa with macppc and myself with alpha/sparc64.


Revision tags: OPENBSD_4_0_BASE
# 1.27 04-Aug-2006 brad

branches: 1.27.2;
- merge em/ixgb_disable_promisc() into em/ixgb_set_promisc().
- rearrange interface flags ioctl handler.


# 1.26 07-Jul-2006 brad

Sync up to Intel's latest FreeBSD em driver (6.0.5). Adds support
for new chipset revisions embedded in the ESB2 and ICH8 core logic
chipsets.

The previous attempt at commiting this included an unrelated change
to how the I/O base address was being set and this was the cause of
the breakage.

From: Intel's web-site


# 1.25 05-Jul-2006 brad

revert back to the older driver as this causes some breakage.


# 1.24 03-Jul-2006 brad

Sync up to Intel's latest FreeBSD em driver (6.0.5). Adds support
for new chipset revisions embedded in the ESB2 and ICH8 core logic
chipsets.

From: Intel's web-site


# 1.23 05-Mar-2006 brad

Sprinkle some tabs and a little cleaning.


Revision tags: OPENBSD_3_9_BASE
# 1.22 22-Feb-2006 brad

For 82544 and newer chips increase the number of TX descriptors to 512.


# 1.21 10-Dec-2005 brad

add a shutdown function and register it with shutdownhook_establish().


# 1.20 10-Dec-2005 brad

remove a bit of unused code.

Pointed out by Andrey Matveev <evol at online dot ptt dot ru> through noticing
a missing splx which pointed out the fact that code is unused to me.


# 1.19 18-Nov-2005 brad

PCIX -> PCI-X in a few comments


# 1.18 13-Nov-2005 brad

- Introduce two more stat counters, counting number of RX
overruns and number of watchdog timeouts.
- Do not increase if->if_oerrors in em_watchdog(), since
this leads to counter slipping back, when if->if_oerrors
is recalculated in em_update_stats_counters(). Instead
increase watchdog counter in em_watchdog() and take it
into account in em_update_stats_counters().

From glebius FreeBSD

ok dlg@


# 1.17 24-Oct-2005 brad

Revamp interrupt handling in em(4) driver:

o Do not mask the RX overrun interrupt.

o Rewrite em_intr():
- Axe EM_MAX_INTR.
- Cycle acknowledging interrupts and processing
packets until zero interrupt cause register is
read.
- If RX overrun comes in log this fact.

From glebius FreeBSD

ok krw@ beck@


# 1.16 21-Oct-2005 brad

Remove unused global adapter linked list.

From FreeBSD


# 1.15 15-Oct-2005 brad

- put spl's right in the code and remove the macros
- remove splassert()'s
- remove empty bus_dma_tag_destroy macro from code and header


Revision tags: OPENBSD_3_8_BASE
# 1.14 16-Jul-2005 brad

move headers and remove some FreeBSD specific stuff.


# 1.13 16-Jul-2005 brad

fix support for interrupt mitigation.

ok nate@


# 1.12 02-Jul-2005 deraadt

sync


# 1.11 04-May-2005 brad

remove #ifdef __OpenBSD__


# 1.10 27-Mar-2005 brad

remove FreeBSD ifdef bloat.

ok krw@


Revision tags: OPENBSD_3_7_BASE
# 1.9 08-Dec-2004 markus

powerhook: em_init on resume


# 1.8 16-Nov-2004 brad

- Added fix for 82547 which corrects an issue with Jumbo frames larger than 10k.
- Corrected TBI workaround.
- Corrected incorrect LED operation issues.

From FreeBSD

ok deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.7 18-Jun-2004 mcbride

On architectures which have strict alignment, shift the entire mbuf chain by
ETHER_ALIGN bytes when jumbo packets are enabled (mtu > ETHERMTU).

ok henric@ (slightly different diff)


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.6 19-May-2004 brad

remove duplication, use ETHER_ALIGN from if_ether.h


# 1.5 26-Apr-2004 deraadt

oh we need to model check and not crank > 256 for older cards... do that later


# 1.4 26-Apr-2004 deraadt

this driver had 256 clusters for receive buffers. move to 512, to increase
performance, if the interface is up. at boot time, allocate only 12 though
... though we note that em_stop() frees them all. perhaps some are used to
talk to other parts of the engine though at runtime... tested by mcbride and
beck


# 1.3 18-Apr-2004 henric

Sync with FreeBSD's "em".


Revision tags: OPENBSD_3_4_BASE OPENBSD_3_5_BASE
# 1.2 13-Jun-2003 henric

Sync with FreeBSD's "em".

ok deraadt@


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_A UBC_SYNC_B
# 1.1 24-Sep-2002 nate

branches: 1.1.4; 1.1.8;
Driver for Intel PRO/1000 gigabit ethernet adapters.
This driver should work with all current models of gigabit ethernet adapters.

Driver written by Intel
Ported from FreeBSD by Henric Jungheim <henric@attbi.com>
bus_dma and endian support by me.


# 1.81 31-Dec-2023 mglocker

Add TCP Segmentation Offload (TSO) support for em(4). Following chip-sets
are currently known to support TSO; 82575, 82576, 82580, I350, and I210.

Suggested by claudio@. Feedback and testing from many on tech@.

OK bluhm@


Revision tags: OPENBSD_7_1_BASE OPENBSD_7_2_BASE OPENBSD_7_3_BASE OPENBSD_7_4_BASE
# 1.80 09-Jan-2022 jsg

spelling
feedback and ok tb@ jmc@ ok ratchov@


# 1.79 14-Dec-2021 patrick

Implement support for selecting SGMII or SerDes mode depending on the
plugged-in SFP transceiver and for reading out transceiver information
via ifconfig(8). To read from the SFP, we need to let the card issue
I2C transfers. Additionally we need I2C to read/write to the PHY when
MDIO is not available. Depending on the SFP's supported media types
we can decide which mode to use.

This fixes hardware-initialization and link-up problems with some em(4)
Fiber NIC and SFP combinations.

Tested by dlg@ and been in snaps for quite a while
ok dlg@ jmatthew@


Revision tags: OPENBSD_6_8_BASE OPENBSD_6_9_BASE OPENBSD_7_0_BASE
# 1.78 13-Jul-2020 dlg

add kstat support for reading hardware counters.

this replaces the existing counters implementation, which just
collected the stats in the softc, but didn't really provide a way
for a person to read them.

em counters get cleared on read. a lot of them are 32bit, so to
avoid overflow the counters are polled and the newly accumulated
values are added to some 64 bit counters in software.

tested by hrvoje popovski and SAITOH Masanobu
ok mpi@

i missed these files when i committed src/sys/dev/pci/if_em.c r1.356.
thanks to jsg for pointing this out.


Revision tags: OPENBSD_6_7_BASE
# 1.77 22-Apr-2020 mpi

Use FOREACH_QUEUE() where nothing else is required to support multi-queues.

Tested by Hrvoje Popovski and jmatthew@, ok jmatthew@


# 1.76 23-Mar-2020 mpi

Make it possible to use em(4) with MSI-X, currently disabled by default.

The current implementation still uses a single queue but already establishes
a different handler for link interrupts. This is done in preparation for
multi-queues support.

Based on a bigger diff from haesbaert@ and on the FreeBSD code.

Tested by Hrvoje Popovski and jmatthew@, ok jmatthew@


# 1.75 20-Feb-2020 mpi

Introduce the concept of queue to prepare supporting multiple of them.

Move the tx/rx descriptors to dedicated structures similar to what already
exist in ix(4).

Only one queue is currently used, no real architectural change introduced
in this diff.

Extracted from a big diff from haesbaert@ via patrick@.

Tested by Hrvoje Popovski and jmatthew@, ok jmatthew@


Revision tags: OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.74 01-Mar-2019 dlg

use a timeout to refill the rx ring when it's empty.

em had rxr, but didn't use a timeout cos it claimed to generate an
RX overflow interrupt when packets fell off slots in the ring. turns
out that's a lie on at least one chip, so add the timeout like other
drivers.

this was hit by mlarkin@, who had nfs and bufs steal all the packets
and memory for packets from em, which didn't recover after the
memory had been released back to the system.


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE OPENBSD_6_3_BASE OPENBSD_6_4_BASE
# 1.73 27-Oct-2016 dlg

tell ix and em to use 2k+ETHER_ALIGN clusters for rx on all archs.

this means that the ethernet header and therefore its payload will
be aligned correctly for the stack. without this em and ix are
sufferring a 30 to 40 percent hit in forwarding performance because
the ethernet stack expects to be able to prepend 8 bytes for an
ethernet header so it can gaurantee its alignment. because em and
ix only had 6 bytes where the ethernet header was, it always prepends
an mbuf which turns out to be expensive. this way the prepend will
be cheap because the 8 byte space will exist.

2k+ETHER_ALIGN clusters will end up using the newly created mcl2k2
pool.

the regression was isolated and the fix tested by hrvoje popovski.
ok mikeb@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.72 18-Feb-2016 bluhm

Add support for the Intel i219 network chip to the em(4) driver.
from Christian Ehrhardt; input jsg@; OK deraadt@ sthen@ mpi@ jsg@
tested by sthen@ jca@ benno@ bluhm@


# 1.71 11-Jan-2016 dlg

do further work on the em transmit path to simplify the code.

noone could understand how em_txeof worked, so i rewrote it.

this also gets rid of the sc_tx_desc_free var that needed atomic
ops. space to use in em_start and space to free in em_txeof is now
calculated from the producer and consumer.

testers have reported better responsiveness with this. somehow.
if em issues persist after this, im rolling back to pre-mpsafe changes.


# 1.70 07-Jan-2016 dlg

rename em_buffers to em_packets.

shorten a bunch of variable names while here.


# 1.69 07-Jan-2016 dlg

rename the rx and tx ring softc vars.


# 1.68 07-Jan-2016 dlg

prefix the rx and tx ring softc members with sc_


# 1.67 07-Jan-2016 dlg

dma_paddr in struct em_dma_alloc is unused, so gc it.


# 1.66 07-Jan-2016 dlg

unify the dma tag into sc_dmat in em_softc.


# 1.65 07-Jan-2016 dlg

sprinkle DEVNAME


# 1.64 07-Jan-2016 dlg

rename the struct arpcom interface_data in em_softc to sc_ac.

makes it more consistent with the rest of the tree.


# 1.63 07-Jan-2016 dlg

rename em_softc sc_dv to sc_dev. like ALL OUR OTHER DRIVERS.


# 1.62 07-Jan-2016 dlg

tweak em to make it mpsafe, both for interrupts and if_start.

this is mostly work by kettenis and claudio, with further work from
me to make the transmit side from the stack mpsafe.

there's a watchdog issue that will be worked on in tree after this
change.

tested by hrvoje popovski and gregor best
ok mpi@ claudio@ deraadt@ jmatthew@


# 1.61 24-Nov-2015 mpi

You only need <net/if_dl.h> if you're using LLADDR() or a sockaddr_dl.


# 1.60 20-Nov-2015 mpi

Missed in previous, pointed by benoit@


# 1.59 14-Nov-2015 mpi

Do not include <net/if_vlan_var.h> when it's not necessary.

Because of the VLAN hacks in mpw(4) this file still contains the definition
of "struct ifvlan" which depends on <sys/refcnt.h> which in turns pull
<sys/atomic.h>...


# 1.58 30-Sep-2015 kettenis

Run the tx completion path without the kernel held. This makes the
"fast path" through the interrupt handler not grab the kernel lock anymore.
This removes the code that attempts to reclaim tx descriptors from em_start().
Keeping that code would have complicated the locking. The need to reclaim
tx descriptors that way should have largely disappeared now that the interrupt
handler doesn't have to wait on the kernel lock.

ok mpi@
tested by many


# 1.57 19-Sep-2015 kettenis

Avoid using a mutex in the rx completion path. Instead rely on
intr_barrier(9) to avoid having the interrupt handler touch the rx data
structures while we're brining down the interface. This actually reverts
many of the changes in rev. 1.300.

ok mikeb@


# 1.56 26-Aug-2015 kettenis

Get rid if em_align. This approach used to make sense, but now that the
hardware rx mtu always gets set to the maximum supported value we will hit
it for every received packet. Instead, use a larger mbuf cluster size on
strict alignment architectures such that we can always m_adj to make sure the
packets are properly aligned. This wastes some memory but simplifies things
considerably. Hopefully we can reduce the spillage in the near future by
taking advantage of recent improvements in the pool code.

ok mpi@, mikeb@, dlg@


# 1.55 21-Aug-2015 kettenis

Run the part of the interrupt handler that does rx completion without holding
the kernel lock.

ok mpi@, dlg@


Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
# 1.54 26-Dec-2014 tedu

unifdef INET. missed a few headers in previous rounds


Revision tags: OPENBSD_5_6_BASE
# 1.53 22-Jul-2014 mpi

Fewer <netinet/in_systm.h>


# 1.52 10-Jul-2014 deraadt

remove most of the boolean_t infection outside uvm/ddb/pmap; ok jsg


# 1.51 08-Jul-2014 dlg

cut things that relied on mclgeti for rx ring accounting/restriction over
to using if_rxr.

cut the reporting systat did over to the rxr ioctl.

tested as much as i can on alpha, amd64, and sparc64.
mpi@ has run it on macppc.
ok mpi@


Revision tags: OPENBSD_5_5_BASE
# 1.50 07-Aug-2013 bluhm

Most network drivers include netinet/in_var.h, but apparently they
don't have to. Just remove these include lines.
Compiled on amd64 i386 sparc64; OK henning@ mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.49 16-Apr-2013 deraadt

spelling errors; Diego Casati


Revision tags: OPENBSD_4_9_BASE OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.48 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


Revision tags: OPENBSD_4_8_BASE
# 1.47 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


Revision tags: OPENBSD_4_7_BASE
# 1.46 25-Nov-2009 dms

Add support for em(4) interfaces found on intel EP80579 SoC. The MAC part is
basicly 82545, but the PHY's are separated form the chip and they are accessed
through a special PCI device called GCU which has the MDIO interface. Since
there is no direct relationship between MAC and PHY, so for the moment they
are assigned to each other the way its done on Axiomtek NA-200, that was
danted to us by them.

This also adds a device driver for the GCU.

tested by me on Axiomtek board
reviewed by claudio@, kettenis@, deraadt@
'commit that as is' deraadt@


# 1.45 10-Aug-2009 deraadt

A few more simple cases of shutdown hooks which only call xxstop, when
we now know the interface has already been stopped


Revision tags: OPENBSD_4_6_BASE
# 1.44 05-Jun-2009 naddy

tidy up promiscuous mode and multicast handling; from Brad; ok sthen@


Revision tags: OPENBSD_4_5_BASE
# 1.43 15-Dec-2008 brad

revert 1.20 now that the new allocator is used to control the number of
RX buffers allocated.

ok dlg@


# 1.42 05-Dec-2008 brad

Garbage collect now unused field in the softc struct again.


# 1.41 03-Dec-2008 dlg

recommit the use of the new mbuf cluster allocator.

this starts em up with 4 mbufs on the rx ring, which will then grow as
usage demands. this also allows em to take advantage of the new livelock
mitigation code as well as freeing up a boatload of kernel memory.

this version of the diff makes sure we only ever post the last descriptor
we filled to the hardware, rather than the whole ring when bringing the
interface up. it has been tested by users who got panics with the previous
diff without trouble.


# 1.40 29-Nov-2008 sthen

revert 1.197 if_em.c, 1.38/1.39 if_em.h, requested by dlg, until a bug
reported on misc@ can be tracked down.

identical diff from jsing.


# 1.39 28-Nov-2008 brad

Garbage collect now unused field in the softc struct.


# 1.38 26-Nov-2008 dlg

rework the filling of the rx ring. this switches us to having the cluster
allocation limited by per ifp statistics, ie, we're not guaranteed to have
mbufs in every slot on the rx ring.

instead of filling the ring with 256 mbufs all the time (about 512KB of
kva) when em is brought up, we give it 4. as demand grows we increase the
number of mbufs allowed on the ring. i will bet most users wont go above
50ish mbufs, so we're saving them 400KB of kva.

tested by many, including one on sparc64
ok claudio@ deraadt@ henning@ krw@


Revision tags: OPENBSD_4_4_BASE
# 1.37 22-Jul-2008 martynas

more negotation -> negotiation; ok sthen@


Revision tags: OPENBSD_4_3_BASE
# 1.36 21-Oct-2007 brad

Allow for the adjustment of the number of RX descriptors
for the newer generations of em(4) chipsets independently
from the first two generations (82542/82543). The first
two generations have hardware errata limiting the upper
maximum to 256 descriptors. The number of RX descriptors
has not been adjusted yet.

ok beck@ henning@ dlg@


Revision tags: OPENBSD_4_2_BASE
# 1.35 30-May-2007 ckuethe

Move the knob for the interrupt throttling register next to the knobs for
the other interrupt moderation schemes.
ok beck drahn


Revision tags: OPENBSD_4_1_BASE
# 1.34 18-Nov-2006 brad

fix comments


# 1.33 17-Nov-2006 brad

Add a lower TX threshold value and use this when checking the number of
available TX descriptors in the case that em_encap() has tried to reclaim
descriptors.

From Jack Vogel@Intel

Tested by brad@, mk@, Gabriel Kihlman <gk at stacken dot kth dot se>,
Johan Mson Lindman <tybollt at solace dot mh dot se>
Tested on amd64/i386/sparc64


# 1.32 14-Nov-2006 brad

Rework the transmit register handling. In em_encap() store the index of
the EOP descriptor in the first descriptor of the packet. In em_txeof()
search for the DD bit set only in the EOP descriptors, embedding the
cleanup of all packet's descriptors into the inner loop.

This change is important for future chips, where the DD bit is going
to be set only in the EOP descriptors.

From Jack Vogel@Intel

Tested by brad@, mk@, reyk@, Gabriel Kihlman <gk at stacken dot kth dot se>,
Johan Mson Lindman <tybollt at solace dot mh dot se>, Jason Dixon and a few
others.
Tested on i386/amd64/sparc64.


# 1.31 10-Nov-2006 brad

Pre-allocate the TX DMA maps intead of creating and destroying a DMA map
per packet sent.

Tested by brad@, ckuethe@, Gabriel Kihlman <gk at stacken dot kth dot se>
and Tim Wiess <tim at nop dot cx>.
Tested with amd64/i386/sparc64.

ok damien@


# 1.30 06-Nov-2006 brad

Sync up to Intel's latest FreeBSD em driver (6.2.9). Adds support
for a few newer Intel PCIe boards, some code removal and cleaning
and a few bug fixes.

From: Jack Vogel@Intel

Tested by mk@ wilfried@ brad@ dlg@, Marc Winiger, Gabriel Kihlman,
Jason Dixon, Johan Mson Lindman, and a few other end users.

Tested with 82543, 82544, 82540, 82545, 82541, 82547, 82546 and 82573.


# 1.29 17-Sep-2006 brad

Overhaul RX path to recover from mbuf cluster allocation failure.
- Create a spare DMA map for RX handler to recover from
bus_dmamap_load() failure.
- Make sure to update status bit in RX descriptors even if we failed
to allocate a new buffer.
- Don't blindly unload DMA map. Reuse loaded DMA map if received
packet has errors.

From yongari@FreeBSD
Tested by myself and a number of end-users on i386/amd64/sparc64


# 1.28 17-Sep-2006 brad

revert revision 1.131, the code in question was later found to not ensure
the proper alignment requirement for the VLAN layer on strict alignment
architectures. This would result in Jumbo's working fine as long as VLANs
were not in use. If VLANs were in use and a packet comes in with a size
of 2046 bytes or larger, it would be corrupted as it came up through the
VLAN layer. Also check the hw max frame size, instead of the MTU, so the
alignment fixup is done as appropriate.

Fixes PR 5185.
Tested by Rui DeSousa with macppc and myself with alpha/sparc64.


Revision tags: OPENBSD_4_0_BASE
# 1.27 04-Aug-2006 brad

branches: 1.27.2;
- merge em/ixgb_disable_promisc() into em/ixgb_set_promisc().
- rearrange interface flags ioctl handler.


# 1.26 07-Jul-2006 brad

Sync up to Intel's latest FreeBSD em driver (6.0.5). Adds support
for new chipset revisions embedded in the ESB2 and ICH8 core logic
chipsets.

The previous attempt at commiting this included an unrelated change
to how the I/O base address was being set and this was the cause of
the breakage.

From: Intel's web-site


# 1.25 05-Jul-2006 brad

revert back to the older driver as this causes some breakage.


# 1.24 03-Jul-2006 brad

Sync up to Intel's latest FreeBSD em driver (6.0.5). Adds support
for new chipset revisions embedded in the ESB2 and ICH8 core logic
chipsets.

From: Intel's web-site


# 1.23 05-Mar-2006 brad

Sprinkle some tabs and a little cleaning.


Revision tags: OPENBSD_3_9_BASE
# 1.22 22-Feb-2006 brad

For 82544 and newer chips increase the number of TX descriptors to 512.


# 1.21 10-Dec-2005 brad

add a shutdown function and register it with shutdownhook_establish().


# 1.20 10-Dec-2005 brad

remove a bit of unused code.

Pointed out by Andrey Matveev <evol at online dot ptt dot ru> through noticing
a missing splx which pointed out the fact that code is unused to me.


# 1.19 18-Nov-2005 brad

PCIX -> PCI-X in a few comments


# 1.18 13-Nov-2005 brad

- Introduce two more stat counters, counting number of RX
overruns and number of watchdog timeouts.
- Do not increase if->if_oerrors in em_watchdog(), since
this leads to counter slipping back, when if->if_oerrors
is recalculated in em_update_stats_counters(). Instead
increase watchdog counter in em_watchdog() and take it
into account in em_update_stats_counters().

From glebius FreeBSD

ok dlg@


# 1.17 24-Oct-2005 brad

Revamp interrupt handling in em(4) driver:

o Do not mask the RX overrun interrupt.

o Rewrite em_intr():
- Axe EM_MAX_INTR.
- Cycle acknowledging interrupts and processing
packets until zero interrupt cause register is
read.
- If RX overrun comes in log this fact.

From glebius FreeBSD

ok krw@ beck@


# 1.16 21-Oct-2005 brad

Remove unused global adapter linked list.

From FreeBSD


# 1.15 15-Oct-2005 brad

- put spl's right in the code and remove the macros
- remove splassert()'s
- remove empty bus_dma_tag_destroy macro from code and header


Revision tags: OPENBSD_3_8_BASE
# 1.14 16-Jul-2005 brad

move headers and remove some FreeBSD specific stuff.


# 1.13 16-Jul-2005 brad

fix support for interrupt mitigation.

ok nate@


# 1.12 02-Jul-2005 deraadt

sync


# 1.11 04-May-2005 brad

remove #ifdef __OpenBSD__


# 1.10 27-Mar-2005 brad

remove FreeBSD ifdef bloat.

ok krw@


Revision tags: OPENBSD_3_7_BASE
# 1.9 08-Dec-2004 markus

powerhook: em_init on resume


# 1.8 16-Nov-2004 brad

- Added fix for 82547 which corrects an issue with Jumbo frames larger than 10k.
- Corrected TBI workaround.
- Corrected incorrect LED operation issues.

From FreeBSD

ok deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.7 18-Jun-2004 mcbride

On architectures which have strict alignment, shift the entire mbuf chain by
ETHER_ALIGN bytes when jumbo packets are enabled (mtu > ETHERMTU).

ok henric@ (slightly different diff)


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.6 19-May-2004 brad

remove duplication, use ETHER_ALIGN from if_ether.h


# 1.5 26-Apr-2004 deraadt

oh we need to model check and not crank > 256 for older cards... do that later


# 1.4 26-Apr-2004 deraadt

this driver had 256 clusters for receive buffers. move to 512, to increase
performance, if the interface is up. at boot time, allocate only 12 though
... though we note that em_stop() frees them all. perhaps some are used to
talk to other parts of the engine though at runtime... tested by mcbride and
beck


# 1.3 18-Apr-2004 henric

Sync with FreeBSD's "em".


Revision tags: OPENBSD_3_4_BASE OPENBSD_3_5_BASE
# 1.2 13-Jun-2003 henric

Sync with FreeBSD's "em".

ok deraadt@


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_A UBC_SYNC_B
# 1.1 24-Sep-2002 nate

branches: 1.1.4; 1.1.8;
Driver for Intel PRO/1000 gigabit ethernet adapters.
This driver should work with all current models of gigabit ethernet adapters.

Driver written by Intel
Ported from FreeBSD by Henric Jungheim <henric@attbi.com>
bus_dma and endian support by me.


# 1.80 09-Jan-2022 jsg

spelling
feedback and ok tb@ jmc@ ok ratchov@


# 1.79 14-Dec-2021 patrick

Implement support for selecting SGMII or SerDes mode depending on the
plugged-in SFP transceiver and for reading out transceiver information
via ifconfig(8). To read from the SFP, we need to let the card issue
I2C transfers. Additionally we need I2C to read/write to the PHY when
MDIO is not available. Depending on the SFP's supported media types
we can decide which mode to use.

This fixes hardware-initialization and link-up problems with some em(4)
Fiber NIC and SFP combinations.

Tested by dlg@ and been in snaps for quite a while
ok dlg@ jmatthew@


Revision tags: OPENBSD_6_8_BASE OPENBSD_6_9_BASE OPENBSD_7_0_BASE
# 1.78 13-Jul-2020 dlg

add kstat support for reading hardware counters.

this replaces the existing counters implementation, which just
collected the stats in the softc, but didn't really provide a way
for a person to read them.

em counters get cleared on read. a lot of them are 32bit, so to
avoid overflow the counters are polled and the newly accumulated
values are added to some 64 bit counters in software.

tested by hrvoje popovski and SAITOH Masanobu
ok mpi@

i missed these files when i committed src/sys/dev/pci/if_em.c r1.356.
thanks to jsg for pointing this out.


Revision tags: OPENBSD_6_7_BASE
# 1.77 22-Apr-2020 mpi

Use FOREACH_QUEUE() where nothing else is required to support multi-queues.

Tested by Hrvoje Popovski and jmatthew@, ok jmatthew@


# 1.76 23-Mar-2020 mpi

Make it possible to use em(4) with MSI-X, currently disabled by default.

The current implementation still uses a single queue but already establishes
a different handler for link interrupts. This is done in preparation for
multi-queues support.

Based on a bigger diff from haesbaert@ and on the FreeBSD code.

Tested by Hrvoje Popovski and jmatthew@, ok jmatthew@


# 1.75 20-Feb-2020 mpi

Introduce the concept of queue to prepare supporting multiple of them.

Move the tx/rx descriptors to dedicated structures similar to what already
exist in ix(4).

Only one queue is currently used, no real architectural change introduced
in this diff.

Extracted from a big diff from haesbaert@ via patrick@.

Tested by Hrvoje Popovski and jmatthew@, ok jmatthew@


Revision tags: OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.74 01-Mar-2019 dlg

use a timeout to refill the rx ring when it's empty.

em had rxr, but didn't use a timeout cos it claimed to generate an
RX overflow interrupt when packets fell off slots in the ring. turns
out that's a lie on at least one chip, so add the timeout like other
drivers.

this was hit by mlarkin@, who had nfs and bufs steal all the packets
and memory for packets from em, which didn't recover after the
memory had been released back to the system.


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE OPENBSD_6_3_BASE OPENBSD_6_4_BASE
# 1.73 27-Oct-2016 dlg

tell ix and em to use 2k+ETHER_ALIGN clusters for rx on all archs.

this means that the ethernet header and therefore its payload will
be aligned correctly for the stack. without this em and ix are
sufferring a 30 to 40 percent hit in forwarding performance because
the ethernet stack expects to be able to prepend 8 bytes for an
ethernet header so it can gaurantee its alignment. because em and
ix only had 6 bytes where the ethernet header was, it always prepends
an mbuf which turns out to be expensive. this way the prepend will
be cheap because the 8 byte space will exist.

2k+ETHER_ALIGN clusters will end up using the newly created mcl2k2
pool.

the regression was isolated and the fix tested by hrvoje popovski.
ok mikeb@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.72 18-Feb-2016 bluhm

Add support for the Intel i219 network chip to the em(4) driver.
from Christian Ehrhardt; input jsg@; OK deraadt@ sthen@ mpi@ jsg@
tested by sthen@ jca@ benno@ bluhm@


# 1.71 11-Jan-2016 dlg

do further work on the em transmit path to simplify the code.

noone could understand how em_txeof worked, so i rewrote it.

this also gets rid of the sc_tx_desc_free var that needed atomic
ops. space to use in em_start and space to free in em_txeof is now
calculated from the producer and consumer.

testers have reported better responsiveness with this. somehow.
if em issues persist after this, im rolling back to pre-mpsafe changes.


# 1.70 07-Jan-2016 dlg

rename em_buffers to em_packets.

shorten a bunch of variable names while here.


# 1.69 07-Jan-2016 dlg

rename the rx and tx ring softc vars.


# 1.68 07-Jan-2016 dlg

prefix the rx and tx ring softc members with sc_


# 1.67 07-Jan-2016 dlg

dma_paddr in struct em_dma_alloc is unused, so gc it.


# 1.66 07-Jan-2016 dlg

unify the dma tag into sc_dmat in em_softc.


# 1.65 07-Jan-2016 dlg

sprinkle DEVNAME


# 1.64 07-Jan-2016 dlg

rename the struct arpcom interface_data in em_softc to sc_ac.

makes it more consistent with the rest of the tree.


# 1.63 07-Jan-2016 dlg

rename em_softc sc_dv to sc_dev. like ALL OUR OTHER DRIVERS.


# 1.62 07-Jan-2016 dlg

tweak em to make it mpsafe, both for interrupts and if_start.

this is mostly work by kettenis and claudio, with further work from
me to make the transmit side from the stack mpsafe.

there's a watchdog issue that will be worked on in tree after this
change.

tested by hrvoje popovski and gregor best
ok mpi@ claudio@ deraadt@ jmatthew@


# 1.61 24-Nov-2015 mpi

You only need <net/if_dl.h> if you're using LLADDR() or a sockaddr_dl.


# 1.60 20-Nov-2015 mpi

Missed in previous, pointed by benoit@


# 1.59 14-Nov-2015 mpi

Do not include <net/if_vlan_var.h> when it's not necessary.

Because of the VLAN hacks in mpw(4) this file still contains the definition
of "struct ifvlan" which depends on <sys/refcnt.h> which in turns pull
<sys/atomic.h>...


# 1.58 30-Sep-2015 kettenis

Run the tx completion path without the kernel held. This makes the
"fast path" through the interrupt handler not grab the kernel lock anymore.
This removes the code that attempts to reclaim tx descriptors from em_start().
Keeping that code would have complicated the locking. The need to reclaim
tx descriptors that way should have largely disappeared now that the interrupt
handler doesn't have to wait on the kernel lock.

ok mpi@
tested by many


# 1.57 19-Sep-2015 kettenis

Avoid using a mutex in the rx completion path. Instead rely on
intr_barrier(9) to avoid having the interrupt handler touch the rx data
structures while we're brining down the interface. This actually reverts
many of the changes in rev. 1.300.

ok mikeb@


# 1.56 26-Aug-2015 kettenis

Get rid if em_align. This approach used to make sense, but now that the
hardware rx mtu always gets set to the maximum supported value we will hit
it for every received packet. Instead, use a larger mbuf cluster size on
strict alignment architectures such that we can always m_adj to make sure the
packets are properly aligned. This wastes some memory but simplifies things
considerably. Hopefully we can reduce the spillage in the near future by
taking advantage of recent improvements in the pool code.

ok mpi@, mikeb@, dlg@


# 1.55 21-Aug-2015 kettenis

Run the part of the interrupt handler that does rx completion without holding
the kernel lock.

ok mpi@, dlg@


Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
# 1.54 26-Dec-2014 tedu

unifdef INET. missed a few headers in previous rounds


Revision tags: OPENBSD_5_6_BASE
# 1.53 22-Jul-2014 mpi

Fewer <netinet/in_systm.h>


# 1.52 10-Jul-2014 deraadt

remove most of the boolean_t infection outside uvm/ddb/pmap; ok jsg


# 1.51 08-Jul-2014 dlg

cut things that relied on mclgeti for rx ring accounting/restriction over
to using if_rxr.

cut the reporting systat did over to the rxr ioctl.

tested as much as i can on alpha, amd64, and sparc64.
mpi@ has run it on macppc.
ok mpi@


Revision tags: OPENBSD_5_5_BASE
# 1.50 07-Aug-2013 bluhm

Most network drivers include netinet/in_var.h, but apparently they
don't have to. Just remove these include lines.
Compiled on amd64 i386 sparc64; OK henning@ mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.49 16-Apr-2013 deraadt

spelling errors; Diego Casati


Revision tags: OPENBSD_4_9_BASE OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.48 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


Revision tags: OPENBSD_4_8_BASE
# 1.47 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


Revision tags: OPENBSD_4_7_BASE
# 1.46 25-Nov-2009 dms

Add support for em(4) interfaces found on intel EP80579 SoC. The MAC part is
basicly 82545, but the PHY's are separated form the chip and they are accessed
through a special PCI device called GCU which has the MDIO interface. Since
there is no direct relationship between MAC and PHY, so for the moment they
are assigned to each other the way its done on Axiomtek NA-200, that was
danted to us by them.

This also adds a device driver for the GCU.

tested by me on Axiomtek board
reviewed by claudio@, kettenis@, deraadt@
'commit that as is' deraadt@


# 1.45 10-Aug-2009 deraadt

A few more simple cases of shutdown hooks which only call xxstop, when
we now know the interface has already been stopped


Revision tags: OPENBSD_4_6_BASE
# 1.44 05-Jun-2009 naddy

tidy up promiscuous mode and multicast handling; from Brad; ok sthen@


Revision tags: OPENBSD_4_5_BASE
# 1.43 15-Dec-2008 brad

revert 1.20 now that the new allocator is used to control the number of
RX buffers allocated.

ok dlg@


# 1.42 05-Dec-2008 brad

Garbage collect now unused field in the softc struct again.


# 1.41 03-Dec-2008 dlg

recommit the use of the new mbuf cluster allocator.

this starts em up with 4 mbufs on the rx ring, which will then grow as
usage demands. this also allows em to take advantage of the new livelock
mitigation code as well as freeing up a boatload of kernel memory.

this version of the diff makes sure we only ever post the last descriptor
we filled to the hardware, rather than the whole ring when bringing the
interface up. it has been tested by users who got panics with the previous
diff without trouble.


# 1.40 29-Nov-2008 sthen

revert 1.197 if_em.c, 1.38/1.39 if_em.h, requested by dlg, until a bug
reported on misc@ can be tracked down.

identical diff from jsing.


# 1.39 28-Nov-2008 brad

Garbage collect now unused field in the softc struct.


# 1.38 26-Nov-2008 dlg

rework the filling of the rx ring. this switches us to having the cluster
allocation limited by per ifp statistics, ie, we're not guaranteed to have
mbufs in every slot on the rx ring.

instead of filling the ring with 256 mbufs all the time (about 512KB of
kva) when em is brought up, we give it 4. as demand grows we increase the
number of mbufs allowed on the ring. i will bet most users wont go above
50ish mbufs, so we're saving them 400KB of kva.

tested by many, including one on sparc64
ok claudio@ deraadt@ henning@ krw@


Revision tags: OPENBSD_4_4_BASE
# 1.37 22-Jul-2008 martynas

more negotation -> negotiation; ok sthen@


Revision tags: OPENBSD_4_3_BASE
# 1.36 21-Oct-2007 brad

Allow for the adjustment of the number of RX descriptors
for the newer generations of em(4) chipsets independently
from the first two generations (82542/82543). The first
two generations have hardware errata limiting the upper
maximum to 256 descriptors. The number of RX descriptors
has not been adjusted yet.

ok beck@ henning@ dlg@


Revision tags: OPENBSD_4_2_BASE
# 1.35 30-May-2007 ckuethe

Move the knob for the interrupt throttling register next to the knobs for
the other interrupt moderation schemes.
ok beck drahn


Revision tags: OPENBSD_4_1_BASE
# 1.34 18-Nov-2006 brad

fix comments


# 1.33 17-Nov-2006 brad

Add a lower TX threshold value and use this when checking the number of
available TX descriptors in the case that em_encap() has tried to reclaim
descriptors.

From Jack Vogel@Intel

Tested by brad@, mk@, Gabriel Kihlman <gk at stacken dot kth dot se>,
Johan Mson Lindman <tybollt at solace dot mh dot se>
Tested on amd64/i386/sparc64


# 1.32 14-Nov-2006 brad

Rework the transmit register handling. In em_encap() store the index of
the EOP descriptor in the first descriptor of the packet. In em_txeof()
search for the DD bit set only in the EOP descriptors, embedding the
cleanup of all packet's descriptors into the inner loop.

This change is important for future chips, where the DD bit is going
to be set only in the EOP descriptors.

From Jack Vogel@Intel

Tested by brad@, mk@, reyk@, Gabriel Kihlman <gk at stacken dot kth dot se>,
Johan Mson Lindman <tybollt at solace dot mh dot se>, Jason Dixon and a few
others.
Tested on i386/amd64/sparc64.


# 1.31 10-Nov-2006 brad

Pre-allocate the TX DMA maps intead of creating and destroying a DMA map
per packet sent.

Tested by brad@, ckuethe@, Gabriel Kihlman <gk at stacken dot kth dot se>
and Tim Wiess <tim at nop dot cx>.
Tested with amd64/i386/sparc64.

ok damien@


# 1.30 06-Nov-2006 brad

Sync up to Intel's latest FreeBSD em driver (6.2.9). Adds support
for a few newer Intel PCIe boards, some code removal and cleaning
and a few bug fixes.

From: Jack Vogel@Intel

Tested by mk@ wilfried@ brad@ dlg@, Marc Winiger, Gabriel Kihlman,
Jason Dixon, Johan Mson Lindman, and a few other end users.

Tested with 82543, 82544, 82540, 82545, 82541, 82547, 82546 and 82573.


# 1.29 17-Sep-2006 brad

Overhaul RX path to recover from mbuf cluster allocation failure.
- Create a spare DMA map for RX handler to recover from
bus_dmamap_load() failure.
- Make sure to update status bit in RX descriptors even if we failed
to allocate a new buffer.
- Don't blindly unload DMA map. Reuse loaded DMA map if received
packet has errors.

From yongari@FreeBSD
Tested by myself and a number of end-users on i386/amd64/sparc64


# 1.28 17-Sep-2006 brad

revert revision 1.131, the code in question was later found to not ensure
the proper alignment requirement for the VLAN layer on strict alignment
architectures. This would result in Jumbo's working fine as long as VLANs
were not in use. If VLANs were in use and a packet comes in with a size
of 2046 bytes or larger, it would be corrupted as it came up through the
VLAN layer. Also check the hw max frame size, instead of the MTU, so the
alignment fixup is done as appropriate.

Fixes PR 5185.
Tested by Rui DeSousa with macppc and myself with alpha/sparc64.


Revision tags: OPENBSD_4_0_BASE
# 1.27 04-Aug-2006 brad

branches: 1.27.2;
- merge em/ixgb_disable_promisc() into em/ixgb_set_promisc().
- rearrange interface flags ioctl handler.


# 1.26 07-Jul-2006 brad

Sync up to Intel's latest FreeBSD em driver (6.0.5). Adds support
for new chipset revisions embedded in the ESB2 and ICH8 core logic
chipsets.

The previous attempt at commiting this included an unrelated change
to how the I/O base address was being set and this was the cause of
the breakage.

From: Intel's web-site


# 1.25 05-Jul-2006 brad

revert back to the older driver as this causes some breakage.


# 1.24 03-Jul-2006 brad

Sync up to Intel's latest FreeBSD em driver (6.0.5). Adds support
for new chipset revisions embedded in the ESB2 and ICH8 core logic
chipsets.

From: Intel's web-site


# 1.23 05-Mar-2006 brad

Sprinkle some tabs and a little cleaning.


Revision tags: OPENBSD_3_9_BASE
# 1.22 22-Feb-2006 brad

For 82544 and newer chips increase the number of TX descriptors to 512.


# 1.21 10-Dec-2005 brad

add a shutdown function and register it with shutdownhook_establish().


# 1.20 10-Dec-2005 brad

remove a bit of unused code.

Pointed out by Andrey Matveev <evol at online dot ptt dot ru> through noticing
a missing splx which pointed out the fact that code is unused to me.


# 1.19 18-Nov-2005 brad

PCIX -> PCI-X in a few comments


# 1.18 13-Nov-2005 brad

- Introduce two more stat counters, counting number of RX
overruns and number of watchdog timeouts.
- Do not increase if->if_oerrors in em_watchdog(), since
this leads to counter slipping back, when if->if_oerrors
is recalculated in em_update_stats_counters(). Instead
increase watchdog counter in em_watchdog() and take it
into account in em_update_stats_counters().

From glebius FreeBSD

ok dlg@


# 1.17 24-Oct-2005 brad

Revamp interrupt handling in em(4) driver:

o Do not mask the RX overrun interrupt.

o Rewrite em_intr():
- Axe EM_MAX_INTR.
- Cycle acknowledging interrupts and processing
packets until zero interrupt cause register is
read.
- If RX overrun comes in log this fact.

From glebius FreeBSD

ok krw@ beck@


# 1.16 21-Oct-2005 brad

Remove unused global adapter linked list.

From FreeBSD


# 1.15 15-Oct-2005 brad

- put spl's right in the code and remove the macros
- remove splassert()'s
- remove empty bus_dma_tag_destroy macro from code and header


Revision tags: OPENBSD_3_8_BASE
# 1.14 16-Jul-2005 brad

move headers and remove some FreeBSD specific stuff.


# 1.13 16-Jul-2005 brad

fix support for interrupt mitigation.

ok nate@


# 1.12 02-Jul-2005 deraadt

sync


# 1.11 04-May-2005 brad

remove #ifdef __OpenBSD__


# 1.10 27-Mar-2005 brad

remove FreeBSD ifdef bloat.

ok krw@


Revision tags: OPENBSD_3_7_BASE
# 1.9 08-Dec-2004 markus

powerhook: em_init on resume


# 1.8 16-Nov-2004 brad

- Added fix for 82547 which corrects an issue with Jumbo frames larger than 10k.
- Corrected TBI workaround.
- Corrected incorrect LED operation issues.

From FreeBSD

ok deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.7 18-Jun-2004 mcbride

On architectures which have strict alignment, shift the entire mbuf chain by
ETHER_ALIGN bytes when jumbo packets are enabled (mtu > ETHERMTU).

ok henric@ (slightly different diff)


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.6 19-May-2004 brad

remove duplication, use ETHER_ALIGN from if_ether.h


# 1.5 26-Apr-2004 deraadt

oh we need to model check and not crank > 256 for older cards... do that later


# 1.4 26-Apr-2004 deraadt

this driver had 256 clusters for receive buffers. move to 512, to increase
performance, if the interface is up. at boot time, allocate only 12 though
... though we note that em_stop() frees them all. perhaps some are used to
talk to other parts of the engine though at runtime... tested by mcbride and
beck


# 1.3 18-Apr-2004 henric

Sync with FreeBSD's "em".


Revision tags: OPENBSD_3_4_BASE OPENBSD_3_5_BASE
# 1.2 13-Jun-2003 henric

Sync with FreeBSD's "em".

ok deraadt@


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_A UBC_SYNC_B
# 1.1 24-Sep-2002 nate

branches: 1.1.4; 1.1.8;
Driver for Intel PRO/1000 gigabit ethernet adapters.
This driver should work with all current models of gigabit ethernet adapters.

Driver written by Intel
Ported from FreeBSD by Henric Jungheim <henric@attbi.com>
bus_dma and endian support by me.


# 1.79 14-Dec-2021 patrick

Implement support for selecting SGMII or SerDes mode depending on the
plugged-in SFP transceiver and for reading out transceiver information
via ifconfig(8). To read from the SFP, we need to let the card issue
I2C transfers. Additionally we need I2C to read/write to the PHY when
MDIO is not available. Depending on the SFP's supported media types
we can decide which mode to use.

This fixes hardware-initialization and link-up problems with some em(4)
Fiber NIC and SFP combinations.

Tested by dlg@ and been in snaps for quite a while
ok dlg@ jmatthew@


Revision tags: OPENBSD_6_8_BASE OPENBSD_6_9_BASE OPENBSD_7_0_BASE
# 1.78 13-Jul-2020 dlg

add kstat support for reading hardware counters.

this replaces the existing counters implementation, which just
collected the stats in the softc, but didn't really provide a way
for a person to read them.

em counters get cleared on read. a lot of them are 32bit, so to
avoid overflow the counters are polled and the newly accumulated
values are added to some 64 bit counters in software.

tested by hrvoje popovski and SAITOH Masanobu
ok mpi@

i missed these files when i committed src/sys/dev/pci/if_em.c r1.356.
thanks to jsg for pointing this out.


Revision tags: OPENBSD_6_7_BASE
# 1.77 22-Apr-2020 mpi

Use FOREACH_QUEUE() where nothing else is required to support multi-queues.

Tested by Hrvoje Popovski and jmatthew@, ok jmatthew@


# 1.76 23-Mar-2020 mpi

Make it possible to use em(4) with MSI-X, currently disabled by default.

The current implementation still uses a single queue but already establishes
a different handler for link interrupts. This is done in preparation for
multi-queues support.

Based on a bigger diff from haesbaert@ and on the FreeBSD code.

Tested by Hrvoje Popovski and jmatthew@, ok jmatthew@


# 1.75 20-Feb-2020 mpi

Introduce the concept of queue to prepare supporting multiple of them.

Move the tx/rx descriptors to dedicated structures similar to what already
exist in ix(4).

Only one queue is currently used, no real architectural change introduced
in this diff.

Extracted from a big diff from haesbaert@ via patrick@.

Tested by Hrvoje Popovski and jmatthew@, ok jmatthew@


Revision tags: OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.74 01-Mar-2019 dlg

use a timeout to refill the rx ring when it's empty.

em had rxr, but didn't use a timeout cos it claimed to generate an
RX overflow interrupt when packets fell off slots in the ring. turns
out that's a lie on at least one chip, so add the timeout like other
drivers.

this was hit by mlarkin@, who had nfs and bufs steal all the packets
and memory for packets from em, which didn't recover after the
memory had been released back to the system.


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE OPENBSD_6_3_BASE OPENBSD_6_4_BASE
# 1.73 27-Oct-2016 dlg

tell ix and em to use 2k+ETHER_ALIGN clusters for rx on all archs.

this means that the ethernet header and therefore its payload will
be aligned correctly for the stack. without this em and ix are
sufferring a 30 to 40 percent hit in forwarding performance because
the ethernet stack expects to be able to prepend 8 bytes for an
ethernet header so it can gaurantee its alignment. because em and
ix only had 6 bytes where the ethernet header was, it always prepends
an mbuf which turns out to be expensive. this way the prepend will
be cheap because the 8 byte space will exist.

2k+ETHER_ALIGN clusters will end up using the newly created mcl2k2
pool.

the regression was isolated and the fix tested by hrvoje popovski.
ok mikeb@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.72 18-Feb-2016 bluhm

Add support for the Intel i219 network chip to the em(4) driver.
from Christian Ehrhardt; input jsg@; OK deraadt@ sthen@ mpi@ jsg@
tested by sthen@ jca@ benno@ bluhm@


# 1.71 11-Jan-2016 dlg

do further work on the em transmit path to simplify the code.

noone could understand how em_txeof worked, so i rewrote it.

this also gets rid of the sc_tx_desc_free var that needed atomic
ops. space to use in em_start and space to free in em_txeof is now
calculated from the producer and consumer.

testers have reported better responsiveness with this. somehow.
if em issues persist after this, im rolling back to pre-mpsafe changes.


# 1.70 07-Jan-2016 dlg

rename em_buffers to em_packets.

shorten a bunch of variable names while here.


# 1.69 07-Jan-2016 dlg

rename the rx and tx ring softc vars.


# 1.68 07-Jan-2016 dlg

prefix the rx and tx ring softc members with sc_


# 1.67 07-Jan-2016 dlg

dma_paddr in struct em_dma_alloc is unused, so gc it.


# 1.66 07-Jan-2016 dlg

unify the dma tag into sc_dmat in em_softc.


# 1.65 07-Jan-2016 dlg

sprinkle DEVNAME


# 1.64 07-Jan-2016 dlg

rename the struct arpcom interface_data in em_softc to sc_ac.

makes it more consistent with the rest of the tree.


# 1.63 07-Jan-2016 dlg

rename em_softc sc_dv to sc_dev. like ALL OUR OTHER DRIVERS.


# 1.62 07-Jan-2016 dlg

tweak em to make it mpsafe, both for interrupts and if_start.

this is mostly work by kettenis and claudio, with further work from
me to make the transmit side from the stack mpsafe.

there's a watchdog issue that will be worked on in tree after this
change.

tested by hrvoje popovski and gregor best
ok mpi@ claudio@ deraadt@ jmatthew@


# 1.61 24-Nov-2015 mpi

You only need <net/if_dl.h> if you're using LLADDR() or a sockaddr_dl.


# 1.60 20-Nov-2015 mpi

Missed in previous, pointed by benoit@


# 1.59 14-Nov-2015 mpi

Do not include <net/if_vlan_var.h> when it's not necessary.

Because of the VLAN hacks in mpw(4) this file still contains the definition
of "struct ifvlan" which depends on <sys/refcnt.h> which in turns pull
<sys/atomic.h>...


# 1.58 30-Sep-2015 kettenis

Run the tx completion path without the kernel held. This makes the
"fast path" through the interrupt handler not grab the kernel lock anymore.
This removes the code that attempts to reclaim tx descriptors from em_start().
Keeping that code would have complicated the locking. The need to reclaim
tx descriptors that way should have largely disappeared now that the interrupt
handler doesn't have to wait on the kernel lock.

ok mpi@
tested by many


# 1.57 19-Sep-2015 kettenis

Avoid using a mutex in the rx completion path. Instead rely on
intr_barrier(9) to avoid having the interrupt handler touch the rx data
structures while we're brining down the interface. This actually reverts
many of the changes in rev. 1.300.

ok mikeb@


# 1.56 26-Aug-2015 kettenis

Get rid if em_align. This approach used to make sense, but now that the
hardware rx mtu always gets set to the maximum supported value we will hit
it for every received packet. Instead, use a larger mbuf cluster size on
strict alignment architectures such that we can always m_adj to make sure the
packets are properly aligned. This wastes some memory but simplifies things
considerably. Hopefully we can reduce the spillage in the near future by
taking advantage of recent improvements in the pool code.

ok mpi@, mikeb@, dlg@


# 1.55 21-Aug-2015 kettenis

Run the part of the interrupt handler that does rx completion without holding
the kernel lock.

ok mpi@, dlg@


Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
# 1.54 26-Dec-2014 tedu

unifdef INET. missed a few headers in previous rounds


Revision tags: OPENBSD_5_6_BASE
# 1.53 22-Jul-2014 mpi

Fewer <netinet/in_systm.h>


# 1.52 10-Jul-2014 deraadt

remove most of the boolean_t infection outside uvm/ddb/pmap; ok jsg


# 1.51 08-Jul-2014 dlg

cut things that relied on mclgeti for rx ring accounting/restriction over
to using if_rxr.

cut the reporting systat did over to the rxr ioctl.

tested as much as i can on alpha, amd64, and sparc64.
mpi@ has run it on macppc.
ok mpi@


Revision tags: OPENBSD_5_5_BASE
# 1.50 07-Aug-2013 bluhm

Most network drivers include netinet/in_var.h, but apparently they
don't have to. Just remove these include lines.
Compiled on amd64 i386 sparc64; OK henning@ mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.49 16-Apr-2013 deraadt

spelling errors; Diego Casati


Revision tags: OPENBSD_4_9_BASE OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.48 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


Revision tags: OPENBSD_4_8_BASE
# 1.47 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


Revision tags: OPENBSD_4_7_BASE
# 1.46 25-Nov-2009 dms

Add support for em(4) interfaces found on intel EP80579 SoC. The MAC part is
basicly 82545, but the PHY's are separated form the chip and they are accessed
through a special PCI device called GCU which has the MDIO interface. Since
there is no direct relationship between MAC and PHY, so for the moment they
are assigned to each other the way its done on Axiomtek NA-200, that was
danted to us by them.

This also adds a device driver for the GCU.

tested by me on Axiomtek board
reviewed by claudio@, kettenis@, deraadt@
'commit that as is' deraadt@


# 1.45 10-Aug-2009 deraadt

A few more simple cases of shutdown hooks which only call xxstop, when
we now know the interface has already been stopped


Revision tags: OPENBSD_4_6_BASE
# 1.44 05-Jun-2009 naddy

tidy up promiscuous mode and multicast handling; from Brad; ok sthen@


Revision tags: OPENBSD_4_5_BASE
# 1.43 15-Dec-2008 brad

revert 1.20 now that the new allocator is used to control the number of
RX buffers allocated.

ok dlg@


# 1.42 05-Dec-2008 brad

Garbage collect now unused field in the softc struct again.


# 1.41 03-Dec-2008 dlg

recommit the use of the new mbuf cluster allocator.

this starts em up with 4 mbufs on the rx ring, which will then grow as
usage demands. this also allows em to take advantage of the new livelock
mitigation code as well as freeing up a boatload of kernel memory.

this version of the diff makes sure we only ever post the last descriptor
we filled to the hardware, rather than the whole ring when bringing the
interface up. it has been tested by users who got panics with the previous
diff without trouble.


# 1.40 29-Nov-2008 sthen

revert 1.197 if_em.c, 1.38/1.39 if_em.h, requested by dlg, until a bug
reported on misc@ can be tracked down.

identical diff from jsing.


# 1.39 28-Nov-2008 brad

Garbage collect now unused field in the softc struct.


# 1.38 26-Nov-2008 dlg

rework the filling of the rx ring. this switches us to having the cluster
allocation limited by per ifp statistics, ie, we're not guaranteed to have
mbufs in every slot on the rx ring.

instead of filling the ring with 256 mbufs all the time (about 512KB of
kva) when em is brought up, we give it 4. as demand grows we increase the
number of mbufs allowed on the ring. i will bet most users wont go above
50ish mbufs, so we're saving them 400KB of kva.

tested by many, including one on sparc64
ok claudio@ deraadt@ henning@ krw@


Revision tags: OPENBSD_4_4_BASE
# 1.37 22-Jul-2008 martynas

more negotation -> negotiation; ok sthen@


Revision tags: OPENBSD_4_3_BASE
# 1.36 21-Oct-2007 brad

Allow for the adjustment of the number of RX descriptors
for the newer generations of em(4) chipsets independently
from the first two generations (82542/82543). The first
two generations have hardware errata limiting the upper
maximum to 256 descriptors. The number of RX descriptors
has not been adjusted yet.

ok beck@ henning@ dlg@


Revision tags: OPENBSD_4_2_BASE
# 1.35 30-May-2007 ckuethe

Move the knob for the interrupt throttling register next to the knobs for
the other interrupt moderation schemes.
ok beck drahn


Revision tags: OPENBSD_4_1_BASE
# 1.34 18-Nov-2006 brad

fix comments


# 1.33 17-Nov-2006 brad

Add a lower TX threshold value and use this when checking the number of
available TX descriptors in the case that em_encap() has tried to reclaim
descriptors.

From Jack Vogel@Intel

Tested by brad@, mk@, Gabriel Kihlman <gk at stacken dot kth dot se>,
Johan Mson Lindman <tybollt at solace dot mh dot se>
Tested on amd64/i386/sparc64


# 1.32 14-Nov-2006 brad

Rework the transmit register handling. In em_encap() store the index of
the EOP descriptor in the first descriptor of the packet. In em_txeof()
search for the DD bit set only in the EOP descriptors, embedding the
cleanup of all packet's descriptors into the inner loop.

This change is important for future chips, where the DD bit is going
to be set only in the EOP descriptors.

From Jack Vogel@Intel

Tested by brad@, mk@, reyk@, Gabriel Kihlman <gk at stacken dot kth dot se>,
Johan Mson Lindman <tybollt at solace dot mh dot se>, Jason Dixon and a few
others.
Tested on i386/amd64/sparc64.


# 1.31 10-Nov-2006 brad

Pre-allocate the TX DMA maps intead of creating and destroying a DMA map
per packet sent.

Tested by brad@, ckuethe@, Gabriel Kihlman <gk at stacken dot kth dot se>
and Tim Wiess <tim at nop dot cx>.
Tested with amd64/i386/sparc64.

ok damien@


# 1.30 06-Nov-2006 brad

Sync up to Intel's latest FreeBSD em driver (6.2.9). Adds support
for a few newer Intel PCIe boards, some code removal and cleaning
and a few bug fixes.

From: Jack Vogel@Intel

Tested by mk@ wilfried@ brad@ dlg@, Marc Winiger, Gabriel Kihlman,
Jason Dixon, Johan Mson Lindman, and a few other end users.

Tested with 82543, 82544, 82540, 82545, 82541, 82547, 82546 and 82573.


# 1.29 17-Sep-2006 brad

Overhaul RX path to recover from mbuf cluster allocation failure.
- Create a spare DMA map for RX handler to recover from
bus_dmamap_load() failure.
- Make sure to update status bit in RX descriptors even if we failed
to allocate a new buffer.
- Don't blindly unload DMA map. Reuse loaded DMA map if received
packet has errors.

From yongari@FreeBSD
Tested by myself and a number of end-users on i386/amd64/sparc64


# 1.28 17-Sep-2006 brad

revert revision 1.131, the code in question was later found to not ensure
the proper alignment requirement for the VLAN layer on strict alignment
architectures. This would result in Jumbo's working fine as long as VLANs
were not in use. If VLANs were in use and a packet comes in with a size
of 2046 bytes or larger, it would be corrupted as it came up through the
VLAN layer. Also check the hw max frame size, instead of the MTU, so the
alignment fixup is done as appropriate.

Fixes PR 5185.
Tested by Rui DeSousa with macppc and myself with alpha/sparc64.


Revision tags: OPENBSD_4_0_BASE
# 1.27 04-Aug-2006 brad

branches: 1.27.2;
- merge em/ixgb_disable_promisc() into em/ixgb_set_promisc().
- rearrange interface flags ioctl handler.


# 1.26 07-Jul-2006 brad

Sync up to Intel's latest FreeBSD em driver (6.0.5). Adds support
for new chipset revisions embedded in the ESB2 and ICH8 core logic
chipsets.

The previous attempt at commiting this included an unrelated change
to how the I/O base address was being set and this was the cause of
the breakage.

From: Intel's web-site


# 1.25 05-Jul-2006 brad

revert back to the older driver as this causes some breakage.


# 1.24 03-Jul-2006 brad

Sync up to Intel's latest FreeBSD em driver (6.0.5). Adds support
for new chipset revisions embedded in the ESB2 and ICH8 core logic
chipsets.

From: Intel's web-site


# 1.23 05-Mar-2006 brad

Sprinkle some tabs and a little cleaning.


Revision tags: OPENBSD_3_9_BASE
# 1.22 22-Feb-2006 brad

For 82544 and newer chips increase the number of TX descriptors to 512.


# 1.21 10-Dec-2005 brad

add a shutdown function and register it with shutdownhook_establish().


# 1.20 10-Dec-2005 brad

remove a bit of unused code.

Pointed out by Andrey Matveev <evol at online dot ptt dot ru> through noticing
a missing splx which pointed out the fact that code is unused to me.


# 1.19 18-Nov-2005 brad

PCIX -> PCI-X in a few comments


# 1.18 13-Nov-2005 brad

- Introduce two more stat counters, counting number of RX
overruns and number of watchdog timeouts.
- Do not increase if->if_oerrors in em_watchdog(), since
this leads to counter slipping back, when if->if_oerrors
is recalculated in em_update_stats_counters(). Instead
increase watchdog counter in em_watchdog() and take it
into account in em_update_stats_counters().

From glebius FreeBSD

ok dlg@


# 1.17 24-Oct-2005 brad

Revamp interrupt handling in em(4) driver:

o Do not mask the RX overrun interrupt.

o Rewrite em_intr():
- Axe EM_MAX_INTR.
- Cycle acknowledging interrupts and processing
packets until zero interrupt cause register is
read.
- If RX overrun comes in log this fact.

From glebius FreeBSD

ok krw@ beck@


# 1.16 21-Oct-2005 brad

Remove unused global adapter linked list.

From FreeBSD


# 1.15 15-Oct-2005 brad

- put spl's right in the code and remove the macros
- remove splassert()'s
- remove empty bus_dma_tag_destroy macro from code and header


Revision tags: OPENBSD_3_8_BASE
# 1.14 16-Jul-2005 brad

move headers and remove some FreeBSD specific stuff.


# 1.13 16-Jul-2005 brad

fix support for interrupt mitigation.

ok nate@


# 1.12 02-Jul-2005 deraadt

sync


# 1.11 04-May-2005 brad

remove #ifdef __OpenBSD__


# 1.10 27-Mar-2005 brad

remove FreeBSD ifdef bloat.

ok krw@


Revision tags: OPENBSD_3_7_BASE
# 1.9 08-Dec-2004 markus

powerhook: em_init on resume


# 1.8 16-Nov-2004 brad

- Added fix for 82547 which corrects an issue with Jumbo frames larger than 10k.
- Corrected TBI workaround.
- Corrected incorrect LED operation issues.

From FreeBSD

ok deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.7 18-Jun-2004 mcbride

On architectures which have strict alignment, shift the entire mbuf chain by
ETHER_ALIGN bytes when jumbo packets are enabled (mtu > ETHERMTU).

ok henric@ (slightly different diff)


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.6 19-May-2004 brad

remove duplication, use ETHER_ALIGN from if_ether.h


# 1.5 26-Apr-2004 deraadt

oh we need to model check and not crank > 256 for older cards... do that later


# 1.4 26-Apr-2004 deraadt

this driver had 256 clusters for receive buffers. move to 512, to increase
performance, if the interface is up. at boot time, allocate only 12 though
... though we note that em_stop() frees them all. perhaps some are used to
talk to other parts of the engine though at runtime... tested by mcbride and
beck


# 1.3 18-Apr-2004 henric

Sync with FreeBSD's "em".


Revision tags: OPENBSD_3_4_BASE OPENBSD_3_5_BASE
# 1.2 13-Jun-2003 henric

Sync with FreeBSD's "em".

ok deraadt@


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_A UBC_SYNC_B
# 1.1 24-Sep-2002 nate

branches: 1.1.4; 1.1.8;
Driver for Intel PRO/1000 gigabit ethernet adapters.
This driver should work with all current models of gigabit ethernet adapters.

Driver written by Intel
Ported from FreeBSD by Henric Jungheim <henric@attbi.com>
bus_dma and endian support by me.


# 1.78 13-Jul-2020 dlg

add kstat support for reading hardware counters.

this replaces the existing counters implementation, which just
collected the stats in the softc, but didn't really provide a way
for a person to read them.

em counters get cleared on read. a lot of them are 32bit, so to
avoid overflow the counters are polled and the newly accumulated
values are added to some 64 bit counters in software.

tested by hrvoje popovski and SAITOH Masanobu
ok mpi@

i missed these files when i committed src/sys/dev/pci/if_em.c r1.356.
thanks to jsg for pointing this out.


Revision tags: OPENBSD_6_7_BASE
# 1.77 22-Apr-2020 mpi

Use FOREACH_QUEUE() where nothing else is required to support multi-queues.

Tested by Hrvoje Popovski and jmatthew@, ok jmatthew@


# 1.76 23-Mar-2020 mpi

Make it possible to use em(4) with MSI-X, currently disabled by default.

The current implementation still uses a single queue but already establishes
a different handler for link interrupts. This is done in preparation for
multi-queues support.

Based on a bigger diff from haesbaert@ and on the FreeBSD code.

Tested by Hrvoje Popovski and jmatthew@, ok jmatthew@


# 1.75 20-Feb-2020 mpi

Introduce the concept of queue to prepare supporting multiple of them.

Move the tx/rx descriptors to dedicated structures similar to what already
exist in ix(4).

Only one queue is currently used, no real architectural change introduced
in this diff.

Extracted from a big diff from haesbaert@ via patrick@.

Tested by Hrvoje Popovski and jmatthew@, ok jmatthew@


Revision tags: OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.74 01-Mar-2019 dlg

use a timeout to refill the rx ring when it's empty.

em had rxr, but didn't use a timeout cos it claimed to generate an
RX overflow interrupt when packets fell off slots in the ring. turns
out that's a lie on at least one chip, so add the timeout like other
drivers.

this was hit by mlarkin@, who had nfs and bufs steal all the packets
and memory for packets from em, which didn't recover after the
memory had been released back to the system.


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE OPENBSD_6_3_BASE OPENBSD_6_4_BASE
# 1.73 27-Oct-2016 dlg

tell ix and em to use 2k+ETHER_ALIGN clusters for rx on all archs.

this means that the ethernet header and therefore its payload will
be aligned correctly for the stack. without this em and ix are
sufferring a 30 to 40 percent hit in forwarding performance because
the ethernet stack expects to be able to prepend 8 bytes for an
ethernet header so it can gaurantee its alignment. because em and
ix only had 6 bytes where the ethernet header was, it always prepends
an mbuf which turns out to be expensive. this way the prepend will
be cheap because the 8 byte space will exist.

2k+ETHER_ALIGN clusters will end up using the newly created mcl2k2
pool.

the regression was isolated and the fix tested by hrvoje popovski.
ok mikeb@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.72 18-Feb-2016 bluhm

Add support for the Intel i219 network chip to the em(4) driver.
from Christian Ehrhardt; input jsg@; OK deraadt@ sthen@ mpi@ jsg@
tested by sthen@ jca@ benno@ bluhm@


# 1.71 11-Jan-2016 dlg

do further work on the em transmit path to simplify the code.

noone could understand how em_txeof worked, so i rewrote it.

this also gets rid of the sc_tx_desc_free var that needed atomic
ops. space to use in em_start and space to free in em_txeof is now
calculated from the producer and consumer.

testers have reported better responsiveness with this. somehow.
if em issues persist after this, im rolling back to pre-mpsafe changes.


# 1.70 07-Jan-2016 dlg

rename em_buffers to em_packets.

shorten a bunch of variable names while here.


# 1.69 07-Jan-2016 dlg

rename the rx and tx ring softc vars.


# 1.68 07-Jan-2016 dlg

prefix the rx and tx ring softc members with sc_


# 1.67 07-Jan-2016 dlg

dma_paddr in struct em_dma_alloc is unused, so gc it.


# 1.66 07-Jan-2016 dlg

unify the dma tag into sc_dmat in em_softc.


# 1.65 07-Jan-2016 dlg

sprinkle DEVNAME


# 1.64 07-Jan-2016 dlg

rename the struct arpcom interface_data in em_softc to sc_ac.

makes it more consistent with the rest of the tree.


# 1.63 07-Jan-2016 dlg

rename em_softc sc_dv to sc_dev. like ALL OUR OTHER DRIVERS.


# 1.62 07-Jan-2016 dlg

tweak em to make it mpsafe, both for interrupts and if_start.

this is mostly work by kettenis and claudio, with further work from
me to make the transmit side from the stack mpsafe.

there's a watchdog issue that will be worked on in tree after this
change.

tested by hrvoje popovski and gregor best
ok mpi@ claudio@ deraadt@ jmatthew@


# 1.61 24-Nov-2015 mpi

You only need <net/if_dl.h> if you're using LLADDR() or a sockaddr_dl.


# 1.60 20-Nov-2015 mpi

Missed in previous, pointed by benoit@


# 1.59 14-Nov-2015 mpi

Do not include <net/if_vlan_var.h> when it's not necessary.

Because of the VLAN hacks in mpw(4) this file still contains the definition
of "struct ifvlan" which depends on <sys/refcnt.h> which in turns pull
<sys/atomic.h>...


# 1.58 30-Sep-2015 kettenis

Run the tx completion path without the kernel held. This makes the
"fast path" through the interrupt handler not grab the kernel lock anymore.
This removes the code that attempts to reclaim tx descriptors from em_start().
Keeping that code would have complicated the locking. The need to reclaim
tx descriptors that way should have largely disappeared now that the interrupt
handler doesn't have to wait on the kernel lock.

ok mpi@
tested by many


# 1.57 19-Sep-2015 kettenis

Avoid using a mutex in the rx completion path. Instead rely on
intr_barrier(9) to avoid having the interrupt handler touch the rx data
structures while we're brining down the interface. This actually reverts
many of the changes in rev. 1.300.

ok mikeb@


# 1.56 26-Aug-2015 kettenis

Get rid if em_align. This approach used to make sense, but now that the
hardware rx mtu always gets set to the maximum supported value we will hit
it for every received packet. Instead, use a larger mbuf cluster size on
strict alignment architectures such that we can always m_adj to make sure the
packets are properly aligned. This wastes some memory but simplifies things
considerably. Hopefully we can reduce the spillage in the near future by
taking advantage of recent improvements in the pool code.

ok mpi@, mikeb@, dlg@


# 1.55 21-Aug-2015 kettenis

Run the part of the interrupt handler that does rx completion without holding
the kernel lock.

ok mpi@, dlg@


Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
# 1.54 26-Dec-2014 tedu

unifdef INET. missed a few headers in previous rounds


Revision tags: OPENBSD_5_6_BASE
# 1.53 22-Jul-2014 mpi

Fewer <netinet/in_systm.h>


# 1.52 10-Jul-2014 deraadt

remove most of the boolean_t infection outside uvm/ddb/pmap; ok jsg


# 1.51 08-Jul-2014 dlg

cut things that relied on mclgeti for rx ring accounting/restriction over
to using if_rxr.

cut the reporting systat did over to the rxr ioctl.

tested as much as i can on alpha, amd64, and sparc64.
mpi@ has run it on macppc.
ok mpi@


Revision tags: OPENBSD_5_5_BASE
# 1.50 07-Aug-2013 bluhm

Most network drivers include netinet/in_var.h, but apparently they
don't have to. Just remove these include lines.
Compiled on amd64 i386 sparc64; OK henning@ mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.49 16-Apr-2013 deraadt

spelling errors; Diego Casati


Revision tags: OPENBSD_4_9_BASE OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.48 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


Revision tags: OPENBSD_4_8_BASE
# 1.47 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


Revision tags: OPENBSD_4_7_BASE
# 1.46 25-Nov-2009 dms

Add support for em(4) interfaces found on intel EP80579 SoC. The MAC part is
basicly 82545, but the PHY's are separated form the chip and they are accessed
through a special PCI device called GCU which has the MDIO interface. Since
there is no direct relationship between MAC and PHY, so for the moment they
are assigned to each other the way its done on Axiomtek NA-200, that was
danted to us by them.

This also adds a device driver for the GCU.

tested by me on Axiomtek board
reviewed by claudio@, kettenis@, deraadt@
'commit that as is' deraadt@


# 1.45 10-Aug-2009 deraadt

A few more simple cases of shutdown hooks which only call xxstop, when
we now know the interface has already been stopped


Revision tags: OPENBSD_4_6_BASE
# 1.44 05-Jun-2009 naddy

tidy up promiscuous mode and multicast handling; from Brad; ok sthen@


Revision tags: OPENBSD_4_5_BASE
# 1.43 15-Dec-2008 brad

revert 1.20 now that the new allocator is used to control the number of
RX buffers allocated.

ok dlg@


# 1.42 05-Dec-2008 brad

Garbage collect now unused field in the softc struct again.


# 1.41 03-Dec-2008 dlg

recommit the use of the new mbuf cluster allocator.

this starts em up with 4 mbufs on the rx ring, which will then grow as
usage demands. this also allows em to take advantage of the new livelock
mitigation code as well as freeing up a boatload of kernel memory.

this version of the diff makes sure we only ever post the last descriptor
we filled to the hardware, rather than the whole ring when bringing the
interface up. it has been tested by users who got panics with the previous
diff without trouble.


# 1.40 29-Nov-2008 sthen

revert 1.197 if_em.c, 1.38/1.39 if_em.h, requested by dlg, until a bug
reported on misc@ can be tracked down.

identical diff from jsing.


# 1.39 28-Nov-2008 brad

Garbage collect now unused field in the softc struct.


# 1.38 26-Nov-2008 dlg

rework the filling of the rx ring. this switches us to having the cluster
allocation limited by per ifp statistics, ie, we're not guaranteed to have
mbufs in every slot on the rx ring.

instead of filling the ring with 256 mbufs all the time (about 512KB of
kva) when em is brought up, we give it 4. as demand grows we increase the
number of mbufs allowed on the ring. i will bet most users wont go above
50ish mbufs, so we're saving them 400KB of kva.

tested by many, including one on sparc64
ok claudio@ deraadt@ henning@ krw@


Revision tags: OPENBSD_4_4_BASE
# 1.37 22-Jul-2008 martynas

more negotation -> negotiation; ok sthen@


Revision tags: OPENBSD_4_3_BASE
# 1.36 21-Oct-2007 brad

Allow for the adjustment of the number of RX descriptors
for the newer generations of em(4) chipsets independently
from the first two generations (82542/82543). The first
two generations have hardware errata limiting the upper
maximum to 256 descriptors. The number of RX descriptors
has not been adjusted yet.

ok beck@ henning@ dlg@


Revision tags: OPENBSD_4_2_BASE
# 1.35 30-May-2007 ckuethe

Move the knob for the interrupt throttling register next to the knobs for
the other interrupt moderation schemes.
ok beck drahn


Revision tags: OPENBSD_4_1_BASE
# 1.34 18-Nov-2006 brad

fix comments


# 1.33 17-Nov-2006 brad

Add a lower TX threshold value and use this when checking the number of
available TX descriptors in the case that em_encap() has tried to reclaim
descriptors.

From Jack Vogel@Intel

Tested by brad@, mk@, Gabriel Kihlman <gk at stacken dot kth dot se>,
Johan Mson Lindman <tybollt at solace dot mh dot se>
Tested on amd64/i386/sparc64


# 1.32 14-Nov-2006 brad

Rework the transmit register handling. In em_encap() store the index of
the EOP descriptor in the first descriptor of the packet. In em_txeof()
search for the DD bit set only in the EOP descriptors, embedding the
cleanup of all packet's descriptors into the inner loop.

This change is important for future chips, where the DD bit is going
to be set only in the EOP descriptors.

From Jack Vogel@Intel

Tested by brad@, mk@, reyk@, Gabriel Kihlman <gk at stacken dot kth dot se>,
Johan Mson Lindman <tybollt at solace dot mh dot se>, Jason Dixon and a few
others.
Tested on i386/amd64/sparc64.


# 1.31 10-Nov-2006 brad

Pre-allocate the TX DMA maps intead of creating and destroying a DMA map
per packet sent.

Tested by brad@, ckuethe@, Gabriel Kihlman <gk at stacken dot kth dot se>
and Tim Wiess <tim at nop dot cx>.
Tested with amd64/i386/sparc64.

ok damien@


# 1.30 06-Nov-2006 brad

Sync up to Intel's latest FreeBSD em driver (6.2.9). Adds support
for a few newer Intel PCIe boards, some code removal and cleaning
and a few bug fixes.

From: Jack Vogel@Intel

Tested by mk@ wilfried@ brad@ dlg@, Marc Winiger, Gabriel Kihlman,
Jason Dixon, Johan Mson Lindman, and a few other end users.

Tested with 82543, 82544, 82540, 82545, 82541, 82547, 82546 and 82573.


# 1.29 17-Sep-2006 brad

Overhaul RX path to recover from mbuf cluster allocation failure.
- Create a spare DMA map for RX handler to recover from
bus_dmamap_load() failure.
- Make sure to update status bit in RX descriptors even if we failed
to allocate a new buffer.
- Don't blindly unload DMA map. Reuse loaded DMA map if received
packet has errors.

From yongari@FreeBSD
Tested by myself and a number of end-users on i386/amd64/sparc64


# 1.28 17-Sep-2006 brad

revert revision 1.131, the code in question was later found to not ensure
the proper alignment requirement for the VLAN layer on strict alignment
architectures. This would result in Jumbo's working fine as long as VLANs
were not in use. If VLANs were in use and a packet comes in with a size
of 2046 bytes or larger, it would be corrupted as it came up through the
VLAN layer. Also check the hw max frame size, instead of the MTU, so the
alignment fixup is done as appropriate.

Fixes PR 5185.
Tested by Rui DeSousa with macppc and myself with alpha/sparc64.


Revision tags: OPENBSD_4_0_BASE
# 1.27 04-Aug-2006 brad

branches: 1.27.2;
- merge em/ixgb_disable_promisc() into em/ixgb_set_promisc().
- rearrange interface flags ioctl handler.


# 1.26 07-Jul-2006 brad

Sync up to Intel's latest FreeBSD em driver (6.0.5). Adds support
for new chipset revisions embedded in the ESB2 and ICH8 core logic
chipsets.

The previous attempt at commiting this included an unrelated change
to how the I/O base address was being set and this was the cause of
the breakage.

From: Intel's web-site


# 1.25 05-Jul-2006 brad

revert back to the older driver as this causes some breakage.


# 1.24 03-Jul-2006 brad

Sync up to Intel's latest FreeBSD em driver (6.0.5). Adds support
for new chipset revisions embedded in the ESB2 and ICH8 core logic
chipsets.

From: Intel's web-site


# 1.23 05-Mar-2006 brad

Sprinkle some tabs and a little cleaning.


Revision tags: OPENBSD_3_9_BASE
# 1.22 22-Feb-2006 brad

For 82544 and newer chips increase the number of TX descriptors to 512.


# 1.21 10-Dec-2005 brad

add a shutdown function and register it with shutdownhook_establish().


# 1.20 10-Dec-2005 brad

remove a bit of unused code.

Pointed out by Andrey Matveev <evol at online dot ptt dot ru> through noticing
a missing splx which pointed out the fact that code is unused to me.


# 1.19 18-Nov-2005 brad

PCIX -> PCI-X in a few comments


# 1.18 13-Nov-2005 brad

- Introduce two more stat counters, counting number of RX
overruns and number of watchdog timeouts.
- Do not increase if->if_oerrors in em_watchdog(), since
this leads to counter slipping back, when if->if_oerrors
is recalculated in em_update_stats_counters(). Instead
increase watchdog counter in em_watchdog() and take it
into account in em_update_stats_counters().

From glebius FreeBSD

ok dlg@


# 1.17 24-Oct-2005 brad

Revamp interrupt handling in em(4) driver:

o Do not mask the RX overrun interrupt.

o Rewrite em_intr():
- Axe EM_MAX_INTR.
- Cycle acknowledging interrupts and processing
packets until zero interrupt cause register is
read.
- If RX overrun comes in log this fact.

From glebius FreeBSD

ok krw@ beck@


# 1.16 21-Oct-2005 brad

Remove unused global adapter linked list.

From FreeBSD


# 1.15 15-Oct-2005 brad

- put spl's right in the code and remove the macros
- remove splassert()'s
- remove empty bus_dma_tag_destroy macro from code and header


Revision tags: OPENBSD_3_8_BASE
# 1.14 16-Jul-2005 brad

move headers and remove some FreeBSD specific stuff.


# 1.13 16-Jul-2005 brad

fix support for interrupt mitigation.

ok nate@


# 1.12 02-Jul-2005 deraadt

sync


# 1.11 04-May-2005 brad

remove #ifdef __OpenBSD__


# 1.10 27-Mar-2005 brad

remove FreeBSD ifdef bloat.

ok krw@


Revision tags: OPENBSD_3_7_BASE
# 1.9 08-Dec-2004 markus

powerhook: em_init on resume


# 1.8 16-Nov-2004 brad

- Added fix for 82547 which corrects an issue with Jumbo frames larger than 10k.
- Corrected TBI workaround.
- Corrected incorrect LED operation issues.

From FreeBSD

ok deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.7 18-Jun-2004 mcbride

On architectures which have strict alignment, shift the entire mbuf chain by
ETHER_ALIGN bytes when jumbo packets are enabled (mtu > ETHERMTU).

ok henric@ (slightly different diff)


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.6 19-May-2004 brad

remove duplication, use ETHER_ALIGN from if_ether.h


# 1.5 26-Apr-2004 deraadt

oh we need to model check and not crank > 256 for older cards... do that later


# 1.4 26-Apr-2004 deraadt

this driver had 256 clusters for receive buffers. move to 512, to increase
performance, if the interface is up. at boot time, allocate only 12 though
... though we note that em_stop() frees them all. perhaps some are used to
talk to other parts of the engine though at runtime... tested by mcbride and
beck


# 1.3 18-Apr-2004 henric

Sync with FreeBSD's "em".


Revision tags: OPENBSD_3_4_BASE OPENBSD_3_5_BASE
# 1.2 13-Jun-2003 henric

Sync with FreeBSD's "em".

ok deraadt@


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_A UBC_SYNC_B
# 1.1 24-Sep-2002 nate

branches: 1.1.4; 1.1.8;
Driver for Intel PRO/1000 gigabit ethernet adapters.
This driver should work with all current models of gigabit ethernet adapters.

Driver written by Intel
Ported from FreeBSD by Henric Jungheim <henric@attbi.com>
bus_dma and endian support by me.


# 1.77 22-Apr-2020 mpi

Use FOREACH_QUEUE() where nothing else is required to support multi-queues.

Tested by Hrvoje Popovski and jmatthew@, ok jmatthew@


# 1.76 23-Mar-2020 mpi

Make it possible to use em(4) with MSI-X, currently disabled by default.

The current implementation still uses a single queue but already establishes
a different handler for link interrupts. This is done in preparation for
multi-queues support.

Based on a bigger diff from haesbaert@ and on the FreeBSD code.

Tested by Hrvoje Popovski and jmatthew@, ok jmatthew@


# 1.75 20-Feb-2020 mpi

Introduce the concept of queue to prepare supporting multiple of them.

Move the tx/rx descriptors to dedicated structures similar to what already
exist in ix(4).

Only one queue is currently used, no real architectural change introduced
in this diff.

Extracted from a big diff from haesbaert@ via patrick@.

Tested by Hrvoje Popovski and jmatthew@, ok jmatthew@


Revision tags: OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.74 01-Mar-2019 dlg

use a timeout to refill the rx ring when it's empty.

em had rxr, but didn't use a timeout cos it claimed to generate an
RX overflow interrupt when packets fell off slots in the ring. turns
out that's a lie on at least one chip, so add the timeout like other
drivers.

this was hit by mlarkin@, who had nfs and bufs steal all the packets
and memory for packets from em, which didn't recover after the
memory had been released back to the system.


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE OPENBSD_6_3_BASE OPENBSD_6_4_BASE
# 1.73 27-Oct-2016 dlg

tell ix and em to use 2k+ETHER_ALIGN clusters for rx on all archs.

this means that the ethernet header and therefore its payload will
be aligned correctly for the stack. without this em and ix are
sufferring a 30 to 40 percent hit in forwarding performance because
the ethernet stack expects to be able to prepend 8 bytes for an
ethernet header so it can gaurantee its alignment. because em and
ix only had 6 bytes where the ethernet header was, it always prepends
an mbuf which turns out to be expensive. this way the prepend will
be cheap because the 8 byte space will exist.

2k+ETHER_ALIGN clusters will end up using the newly created mcl2k2
pool.

the regression was isolated and the fix tested by hrvoje popovski.
ok mikeb@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.72 18-Feb-2016 bluhm

Add support for the Intel i219 network chip to the em(4) driver.
from Christian Ehrhardt; input jsg@; OK deraadt@ sthen@ mpi@ jsg@
tested by sthen@ jca@ benno@ bluhm@


# 1.71 11-Jan-2016 dlg

do further work on the em transmit path to simplify the code.

noone could understand how em_txeof worked, so i rewrote it.

this also gets rid of the sc_tx_desc_free var that needed atomic
ops. space to use in em_start and space to free in em_txeof is now
calculated from the producer and consumer.

testers have reported better responsiveness with this. somehow.
if em issues persist after this, im rolling back to pre-mpsafe changes.


# 1.70 07-Jan-2016 dlg

rename em_buffers to em_packets.

shorten a bunch of variable names while here.


# 1.69 07-Jan-2016 dlg

rename the rx and tx ring softc vars.


# 1.68 07-Jan-2016 dlg

prefix the rx and tx ring softc members with sc_


# 1.67 07-Jan-2016 dlg

dma_paddr in struct em_dma_alloc is unused, so gc it.


# 1.66 07-Jan-2016 dlg

unify the dma tag into sc_dmat in em_softc.


# 1.65 07-Jan-2016 dlg

sprinkle DEVNAME


# 1.64 07-Jan-2016 dlg

rename the struct arpcom interface_data in em_softc to sc_ac.

makes it more consistent with the rest of the tree.


# 1.63 07-Jan-2016 dlg

rename em_softc sc_dv to sc_dev. like ALL OUR OTHER DRIVERS.


# 1.62 07-Jan-2016 dlg

tweak em to make it mpsafe, both for interrupts and if_start.

this is mostly work by kettenis and claudio, with further work from
me to make the transmit side from the stack mpsafe.

there's a watchdog issue that will be worked on in tree after this
change.

tested by hrvoje popovski and gregor best
ok mpi@ claudio@ deraadt@ jmatthew@


# 1.61 24-Nov-2015 mpi

You only need <net/if_dl.h> if you're using LLADDR() or a sockaddr_dl.


# 1.60 20-Nov-2015 mpi

Missed in previous, pointed by benoit@


# 1.59 14-Nov-2015 mpi

Do not include <net/if_vlan_var.h> when it's not necessary.

Because of the VLAN hacks in mpw(4) this file still contains the definition
of "struct ifvlan" which depends on <sys/refcnt.h> which in turns pull
<sys/atomic.h>...


# 1.58 30-Sep-2015 kettenis

Run the tx completion path without the kernel held. This makes the
"fast path" through the interrupt handler not grab the kernel lock anymore.
This removes the code that attempts to reclaim tx descriptors from em_start().
Keeping that code would have complicated the locking. The need to reclaim
tx descriptors that way should have largely disappeared now that the interrupt
handler doesn't have to wait on the kernel lock.

ok mpi@
tested by many


# 1.57 19-Sep-2015 kettenis

Avoid using a mutex in the rx completion path. Instead rely on
intr_barrier(9) to avoid having the interrupt handler touch the rx data
structures while we're brining down the interface. This actually reverts
many of the changes in rev. 1.300.

ok mikeb@


# 1.56 26-Aug-2015 kettenis

Get rid if em_align. This approach used to make sense, but now that the
hardware rx mtu always gets set to the maximum supported value we will hit
it for every received packet. Instead, use a larger mbuf cluster size on
strict alignment architectures such that we can always m_adj to make sure the
packets are properly aligned. This wastes some memory but simplifies things
considerably. Hopefully we can reduce the spillage in the near future by
taking advantage of recent improvements in the pool code.

ok mpi@, mikeb@, dlg@


# 1.55 21-Aug-2015 kettenis

Run the part of the interrupt handler that does rx completion without holding
the kernel lock.

ok mpi@, dlg@


Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
# 1.54 26-Dec-2014 tedu

unifdef INET. missed a few headers in previous rounds


Revision tags: OPENBSD_5_6_BASE
# 1.53 22-Jul-2014 mpi

Fewer <netinet/in_systm.h>


# 1.52 10-Jul-2014 deraadt

remove most of the boolean_t infection outside uvm/ddb/pmap; ok jsg


# 1.51 08-Jul-2014 dlg

cut things that relied on mclgeti for rx ring accounting/restriction over
to using if_rxr.

cut the reporting systat did over to the rxr ioctl.

tested as much as i can on alpha, amd64, and sparc64.
mpi@ has run it on macppc.
ok mpi@


Revision tags: OPENBSD_5_5_BASE
# 1.50 07-Aug-2013 bluhm

Most network drivers include netinet/in_var.h, but apparently they
don't have to. Just remove these include lines.
Compiled on amd64 i386 sparc64; OK henning@ mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.49 16-Apr-2013 deraadt

spelling errors; Diego Casati


Revision tags: OPENBSD_4_9_BASE OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.48 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


Revision tags: OPENBSD_4_8_BASE
# 1.47 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


Revision tags: OPENBSD_4_7_BASE
# 1.46 25-Nov-2009 dms

Add support for em(4) interfaces found on intel EP80579 SoC. The MAC part is
basicly 82545, but the PHY's are separated form the chip and they are accessed
through a special PCI device called GCU which has the MDIO interface. Since
there is no direct relationship between MAC and PHY, so for the moment they
are assigned to each other the way its done on Axiomtek NA-200, that was
danted to us by them.

This also adds a device driver for the GCU.

tested by me on Axiomtek board
reviewed by claudio@, kettenis@, deraadt@
'commit that as is' deraadt@


# 1.45 10-Aug-2009 deraadt

A few more simple cases of shutdown hooks which only call xxstop, when
we now know the interface has already been stopped


Revision tags: OPENBSD_4_6_BASE
# 1.44 05-Jun-2009 naddy

tidy up promiscuous mode and multicast handling; from Brad; ok sthen@


Revision tags: OPENBSD_4_5_BASE
# 1.43 15-Dec-2008 brad

revert 1.20 now that the new allocator is used to control the number of
RX buffers allocated.

ok dlg@


# 1.42 05-Dec-2008 brad

Garbage collect now unused field in the softc struct again.


# 1.41 03-Dec-2008 dlg

recommit the use of the new mbuf cluster allocator.

this starts em up with 4 mbufs on the rx ring, which will then grow as
usage demands. this also allows em to take advantage of the new livelock
mitigation code as well as freeing up a boatload of kernel memory.

this version of the diff makes sure we only ever post the last descriptor
we filled to the hardware, rather than the whole ring when bringing the
interface up. it has been tested by users who got panics with the previous
diff without trouble.


# 1.40 29-Nov-2008 sthen

revert 1.197 if_em.c, 1.38/1.39 if_em.h, requested by dlg, until a bug
reported on misc@ can be tracked down.

identical diff from jsing.


# 1.39 28-Nov-2008 brad

Garbage collect now unused field in the softc struct.


# 1.38 26-Nov-2008 dlg

rework the filling of the rx ring. this switches us to having the cluster
allocation limited by per ifp statistics, ie, we're not guaranteed to have
mbufs in every slot on the rx ring.

instead of filling the ring with 256 mbufs all the time (about 512KB of
kva) when em is brought up, we give it 4. as demand grows we increase the
number of mbufs allowed on the ring. i will bet most users wont go above
50ish mbufs, so we're saving them 400KB of kva.

tested by many, including one on sparc64
ok claudio@ deraadt@ henning@ krw@


Revision tags: OPENBSD_4_4_BASE
# 1.37 22-Jul-2008 martynas

more negotation -> negotiation; ok sthen@


Revision tags: OPENBSD_4_3_BASE
# 1.36 21-Oct-2007 brad

Allow for the adjustment of the number of RX descriptors
for the newer generations of em(4) chipsets independently
from the first two generations (82542/82543). The first
two generations have hardware errata limiting the upper
maximum to 256 descriptors. The number of RX descriptors
has not been adjusted yet.

ok beck@ henning@ dlg@


Revision tags: OPENBSD_4_2_BASE
# 1.35 30-May-2007 ckuethe

Move the knob for the interrupt throttling register next to the knobs for
the other interrupt moderation schemes.
ok beck drahn


Revision tags: OPENBSD_4_1_BASE
# 1.34 18-Nov-2006 brad

fix comments


# 1.33 17-Nov-2006 brad

Add a lower TX threshold value and use this when checking the number of
available TX descriptors in the case that em_encap() has tried to reclaim
descriptors.

From Jack Vogel@Intel

Tested by brad@, mk@, Gabriel Kihlman <gk at stacken dot kth dot se>,
Johan Mson Lindman <tybollt at solace dot mh dot se>
Tested on amd64/i386/sparc64


# 1.32 14-Nov-2006 brad

Rework the transmit register handling. In em_encap() store the index of
the EOP descriptor in the first descriptor of the packet. In em_txeof()
search for the DD bit set only in the EOP descriptors, embedding the
cleanup of all packet's descriptors into the inner loop.

This change is important for future chips, where the DD bit is going
to be set only in the EOP descriptors.

From Jack Vogel@Intel

Tested by brad@, mk@, reyk@, Gabriel Kihlman <gk at stacken dot kth dot se>,
Johan Mson Lindman <tybollt at solace dot mh dot se>, Jason Dixon and a few
others.
Tested on i386/amd64/sparc64.


# 1.31 10-Nov-2006 brad

Pre-allocate the TX DMA maps intead of creating and destroying a DMA map
per packet sent.

Tested by brad@, ckuethe@, Gabriel Kihlman <gk at stacken dot kth dot se>
and Tim Wiess <tim at nop dot cx>.
Tested with amd64/i386/sparc64.

ok damien@


# 1.30 06-Nov-2006 brad

Sync up to Intel's latest FreeBSD em driver (6.2.9). Adds support
for a few newer Intel PCIe boards, some code removal and cleaning
and a few bug fixes.

From: Jack Vogel@Intel

Tested by mk@ wilfried@ brad@ dlg@, Marc Winiger, Gabriel Kihlman,
Jason Dixon, Johan Mson Lindman, and a few other end users.

Tested with 82543, 82544, 82540, 82545, 82541, 82547, 82546 and 82573.


# 1.29 17-Sep-2006 brad

Overhaul RX path to recover from mbuf cluster allocation failure.
- Create a spare DMA map for RX handler to recover from
bus_dmamap_load() failure.
- Make sure to update status bit in RX descriptors even if we failed
to allocate a new buffer.
- Don't blindly unload DMA map. Reuse loaded DMA map if received
packet has errors.

From yongari@FreeBSD
Tested by myself and a number of end-users on i386/amd64/sparc64


# 1.28 17-Sep-2006 brad

revert revision 1.131, the code in question was later found to not ensure
the proper alignment requirement for the VLAN layer on strict alignment
architectures. This would result in Jumbo's working fine as long as VLANs
were not in use. If VLANs were in use and a packet comes in with a size
of 2046 bytes or larger, it would be corrupted as it came up through the
VLAN layer. Also check the hw max frame size, instead of the MTU, so the
alignment fixup is done as appropriate.

Fixes PR 5185.
Tested by Rui DeSousa with macppc and myself with alpha/sparc64.


Revision tags: OPENBSD_4_0_BASE
# 1.27 04-Aug-2006 brad

branches: 1.27.2;
- merge em/ixgb_disable_promisc() into em/ixgb_set_promisc().
- rearrange interface flags ioctl handler.


# 1.26 07-Jul-2006 brad

Sync up to Intel's latest FreeBSD em driver (6.0.5). Adds support
for new chipset revisions embedded in the ESB2 and ICH8 core logic
chipsets.

The previous attempt at commiting this included an unrelated change
to how the I/O base address was being set and this was the cause of
the breakage.

From: Intel's web-site


# 1.25 05-Jul-2006 brad

revert back to the older driver as this causes some breakage.


# 1.24 03-Jul-2006 brad

Sync up to Intel's latest FreeBSD em driver (6.0.5). Adds support
for new chipset revisions embedded in the ESB2 and ICH8 core logic
chipsets.

From: Intel's web-site


# 1.23 05-Mar-2006 brad

Sprinkle some tabs and a little cleaning.


Revision tags: OPENBSD_3_9_BASE
# 1.22 22-Feb-2006 brad

For 82544 and newer chips increase the number of TX descriptors to 512.


# 1.21 10-Dec-2005 brad

add a shutdown function and register it with shutdownhook_establish().


# 1.20 10-Dec-2005 brad

remove a bit of unused code.

Pointed out by Andrey Matveev <evol at online dot ptt dot ru> through noticing
a missing splx which pointed out the fact that code is unused to me.


# 1.19 18-Nov-2005 brad

PCIX -> PCI-X in a few comments


# 1.18 13-Nov-2005 brad

- Introduce two more stat counters, counting number of RX
overruns and number of watchdog timeouts.
- Do not increase if->if_oerrors in em_watchdog(), since
this leads to counter slipping back, when if->if_oerrors
is recalculated in em_update_stats_counters(). Instead
increase watchdog counter in em_watchdog() and take it
into account in em_update_stats_counters().

From glebius FreeBSD

ok dlg@


# 1.17 24-Oct-2005 brad

Revamp interrupt handling in em(4) driver:

o Do not mask the RX overrun interrupt.

o Rewrite em_intr():
- Axe EM_MAX_INTR.
- Cycle acknowledging interrupts and processing
packets until zero interrupt cause register is
read.
- If RX overrun comes in log this fact.

From glebius FreeBSD

ok krw@ beck@


# 1.16 21-Oct-2005 brad

Remove unused global adapter linked list.

From FreeBSD


# 1.15 15-Oct-2005 brad

- put spl's right in the code and remove the macros
- remove splassert()'s
- remove empty bus_dma_tag_destroy macro from code and header


Revision tags: OPENBSD_3_8_BASE
# 1.14 16-Jul-2005 brad

move headers and remove some FreeBSD specific stuff.


# 1.13 16-Jul-2005 brad

fix support for interrupt mitigation.

ok nate@


# 1.12 02-Jul-2005 deraadt

sync


# 1.11 04-May-2005 brad

remove #ifdef __OpenBSD__


# 1.10 27-Mar-2005 brad

remove FreeBSD ifdef bloat.

ok krw@


Revision tags: OPENBSD_3_7_BASE
# 1.9 08-Dec-2004 markus

powerhook: em_init on resume


# 1.8 16-Nov-2004 brad

- Added fix for 82547 which corrects an issue with Jumbo frames larger than 10k.
- Corrected TBI workaround.
- Corrected incorrect LED operation issues.

From FreeBSD

ok deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.7 18-Jun-2004 mcbride

On architectures which have strict alignment, shift the entire mbuf chain by
ETHER_ALIGN bytes when jumbo packets are enabled (mtu > ETHERMTU).

ok henric@ (slightly different diff)


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.6 19-May-2004 brad

remove duplication, use ETHER_ALIGN from if_ether.h


# 1.5 26-Apr-2004 deraadt

oh we need to model check and not crank > 256 for older cards... do that later


# 1.4 26-Apr-2004 deraadt

this driver had 256 clusters for receive buffers. move to 512, to increase
performance, if the interface is up. at boot time, allocate only 12 though
... though we note that em_stop() frees them all. perhaps some are used to
talk to other parts of the engine though at runtime... tested by mcbride and
beck


# 1.3 18-Apr-2004 henric

Sync with FreeBSD's "em".


Revision tags: OPENBSD_3_4_BASE OPENBSD_3_5_BASE
# 1.2 13-Jun-2003 henric

Sync with FreeBSD's "em".

ok deraadt@


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_A UBC_SYNC_B
# 1.1 24-Sep-2002 nate

branches: 1.1.4; 1.1.8;
Driver for Intel PRO/1000 gigabit ethernet adapters.
This driver should work with all current models of gigabit ethernet adapters.

Driver written by Intel
Ported from FreeBSD by Henric Jungheim <henric@attbi.com>
bus_dma and endian support by me.


# 1.76 23-Mar-2020 mpi

Make it possible to use em(4) with MSI-X, currently disabled by default.

The current implementation still uses a single queue but already establishes
a different handler for link interrupts. This is done in preparation for
multi-queues support.

Based on a bigger diff from haesbaert@ and on the FreeBSD code.

Tested by Hrvoje Popovski and jmatthew@, ok jmatthew@


# 1.75 20-Feb-2020 mpi

Introduce the concept of queue to prepare supporting multiple of them.

Move the tx/rx descriptors to dedicated structures similar to what already
exist in ix(4).

Only one queue is currently used, no real architectural change introduced
in this diff.

Extracted from a big diff from haesbaert@ via patrick@.

Tested by Hrvoje Popovski and jmatthew@, ok jmatthew@


Revision tags: OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.74 01-Mar-2019 dlg

use a timeout to refill the rx ring when it's empty.

em had rxr, but didn't use a timeout cos it claimed to generate an
RX overflow interrupt when packets fell off slots in the ring. turns
out that's a lie on at least one chip, so add the timeout like other
drivers.

this was hit by mlarkin@, who had nfs and bufs steal all the packets
and memory for packets from em, which didn't recover after the
memory had been released back to the system.


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE OPENBSD_6_3_BASE OPENBSD_6_4_BASE
# 1.73 27-Oct-2016 dlg

tell ix and em to use 2k+ETHER_ALIGN clusters for rx on all archs.

this means that the ethernet header and therefore its payload will
be aligned correctly for the stack. without this em and ix are
sufferring a 30 to 40 percent hit in forwarding performance because
the ethernet stack expects to be able to prepend 8 bytes for an
ethernet header so it can gaurantee its alignment. because em and
ix only had 6 bytes where the ethernet header was, it always prepends
an mbuf which turns out to be expensive. this way the prepend will
be cheap because the 8 byte space will exist.

2k+ETHER_ALIGN clusters will end up using the newly created mcl2k2
pool.

the regression was isolated and the fix tested by hrvoje popovski.
ok mikeb@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.72 18-Feb-2016 bluhm

Add support for the Intel i219 network chip to the em(4) driver.
from Christian Ehrhardt; input jsg@; OK deraadt@ sthen@ mpi@ jsg@
tested by sthen@ jca@ benno@ bluhm@


# 1.71 11-Jan-2016 dlg

do further work on the em transmit path to simplify the code.

noone could understand how em_txeof worked, so i rewrote it.

this also gets rid of the sc_tx_desc_free var that needed atomic
ops. space to use in em_start and space to free in em_txeof is now
calculated from the producer and consumer.

testers have reported better responsiveness with this. somehow.
if em issues persist after this, im rolling back to pre-mpsafe changes.


# 1.70 07-Jan-2016 dlg

rename em_buffers to em_packets.

shorten a bunch of variable names while here.


# 1.69 07-Jan-2016 dlg

rename the rx and tx ring softc vars.


# 1.68 07-Jan-2016 dlg

prefix the rx and tx ring softc members with sc_


# 1.67 07-Jan-2016 dlg

dma_paddr in struct em_dma_alloc is unused, so gc it.


# 1.66 07-Jan-2016 dlg

unify the dma tag into sc_dmat in em_softc.


# 1.65 07-Jan-2016 dlg

sprinkle DEVNAME


# 1.64 07-Jan-2016 dlg

rename the struct arpcom interface_data in em_softc to sc_ac.

makes it more consistent with the rest of the tree.


# 1.63 07-Jan-2016 dlg

rename em_softc sc_dv to sc_dev. like ALL OUR OTHER DRIVERS.


# 1.62 07-Jan-2016 dlg

tweak em to make it mpsafe, both for interrupts and if_start.

this is mostly work by kettenis and claudio, with further work from
me to make the transmit side from the stack mpsafe.

there's a watchdog issue that will be worked on in tree after this
change.

tested by hrvoje popovski and gregor best
ok mpi@ claudio@ deraadt@ jmatthew@


# 1.61 24-Nov-2015 mpi

You only need <net/if_dl.h> if you're using LLADDR() or a sockaddr_dl.


# 1.60 20-Nov-2015 mpi

Missed in previous, pointed by benoit@


# 1.59 14-Nov-2015 mpi

Do not include <net/if_vlan_var.h> when it's not necessary.

Because of the VLAN hacks in mpw(4) this file still contains the definition
of "struct ifvlan" which depends on <sys/refcnt.h> which in turns pull
<sys/atomic.h>...


# 1.58 30-Sep-2015 kettenis

Run the tx completion path without the kernel held. This makes the
"fast path" through the interrupt handler not grab the kernel lock anymore.
This removes the code that attempts to reclaim tx descriptors from em_start().
Keeping that code would have complicated the locking. The need to reclaim
tx descriptors that way should have largely disappeared now that the interrupt
handler doesn't have to wait on the kernel lock.

ok mpi@
tested by many


# 1.57 19-Sep-2015 kettenis

Avoid using a mutex in the rx completion path. Instead rely on
intr_barrier(9) to avoid having the interrupt handler touch the rx data
structures while we're brining down the interface. This actually reverts
many of the changes in rev. 1.300.

ok mikeb@


# 1.56 26-Aug-2015 kettenis

Get rid if em_align. This approach used to make sense, but now that the
hardware rx mtu always gets set to the maximum supported value we will hit
it for every received packet. Instead, use a larger mbuf cluster size on
strict alignment architectures such that we can always m_adj to make sure the
packets are properly aligned. This wastes some memory but simplifies things
considerably. Hopefully we can reduce the spillage in the near future by
taking advantage of recent improvements in the pool code.

ok mpi@, mikeb@, dlg@


# 1.55 21-Aug-2015 kettenis

Run the part of the interrupt handler that does rx completion without holding
the kernel lock.

ok mpi@, dlg@


Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
# 1.54 26-Dec-2014 tedu

unifdef INET. missed a few headers in previous rounds


Revision tags: OPENBSD_5_6_BASE
# 1.53 22-Jul-2014 mpi

Fewer <netinet/in_systm.h>


# 1.52 10-Jul-2014 deraadt

remove most of the boolean_t infection outside uvm/ddb/pmap; ok jsg


# 1.51 08-Jul-2014 dlg

cut things that relied on mclgeti for rx ring accounting/restriction over
to using if_rxr.

cut the reporting systat did over to the rxr ioctl.

tested as much as i can on alpha, amd64, and sparc64.
mpi@ has run it on macppc.
ok mpi@


Revision tags: OPENBSD_5_5_BASE
# 1.50 07-Aug-2013 bluhm

Most network drivers include netinet/in_var.h, but apparently they
don't have to. Just remove these include lines.
Compiled on amd64 i386 sparc64; OK henning@ mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.49 16-Apr-2013 deraadt

spelling errors; Diego Casati


Revision tags: OPENBSD_4_9_BASE OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.48 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


Revision tags: OPENBSD_4_8_BASE
# 1.47 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


Revision tags: OPENBSD_4_7_BASE
# 1.46 25-Nov-2009 dms

Add support for em(4) interfaces found on intel EP80579 SoC. The MAC part is
basicly 82545, but the PHY's are separated form the chip and they are accessed
through a special PCI device called GCU which has the MDIO interface. Since
there is no direct relationship between MAC and PHY, so for the moment they
are assigned to each other the way its done on Axiomtek NA-200, that was
danted to us by them.

This also adds a device driver for the GCU.

tested by me on Axiomtek board
reviewed by claudio@, kettenis@, deraadt@
'commit that as is' deraadt@


# 1.45 10-Aug-2009 deraadt

A few more simple cases of shutdown hooks which only call xxstop, when
we now know the interface has already been stopped


Revision tags: OPENBSD_4_6_BASE
# 1.44 05-Jun-2009 naddy

tidy up promiscuous mode and multicast handling; from Brad; ok sthen@


Revision tags: OPENBSD_4_5_BASE
# 1.43 15-Dec-2008 brad

revert 1.20 now that the new allocator is used to control the number of
RX buffers allocated.

ok dlg@


# 1.42 05-Dec-2008 brad

Garbage collect now unused field in the softc struct again.


# 1.41 03-Dec-2008 dlg

recommit the use of the new mbuf cluster allocator.

this starts em up with 4 mbufs on the rx ring, which will then grow as
usage demands. this also allows em to take advantage of the new livelock
mitigation code as well as freeing up a boatload of kernel memory.

this version of the diff makes sure we only ever post the last descriptor
we filled to the hardware, rather than the whole ring when bringing the
interface up. it has been tested by users who got panics with the previous
diff without trouble.


# 1.40 29-Nov-2008 sthen

revert 1.197 if_em.c, 1.38/1.39 if_em.h, requested by dlg, until a bug
reported on misc@ can be tracked down.

identical diff from jsing.


# 1.39 28-Nov-2008 brad

Garbage collect now unused field in the softc struct.


# 1.38 26-Nov-2008 dlg

rework the filling of the rx ring. this switches us to having the cluster
allocation limited by per ifp statistics, ie, we're not guaranteed to have
mbufs in every slot on the rx ring.

instead of filling the ring with 256 mbufs all the time (about 512KB of
kva) when em is brought up, we give it 4. as demand grows we increase the
number of mbufs allowed on the ring. i will bet most users wont go above
50ish mbufs, so we're saving them 400KB of kva.

tested by many, including one on sparc64
ok claudio@ deraadt@ henning@ krw@


Revision tags: OPENBSD_4_4_BASE
# 1.37 22-Jul-2008 martynas

more negotation -> negotiation; ok sthen@


Revision tags: OPENBSD_4_3_BASE
# 1.36 21-Oct-2007 brad

Allow for the adjustment of the number of RX descriptors
for the newer generations of em(4) chipsets independently
from the first two generations (82542/82543). The first
two generations have hardware errata limiting the upper
maximum to 256 descriptors. The number of RX descriptors
has not been adjusted yet.

ok beck@ henning@ dlg@


Revision tags: OPENBSD_4_2_BASE
# 1.35 30-May-2007 ckuethe

Move the knob for the interrupt throttling register next to the knobs for
the other interrupt moderation schemes.
ok beck drahn


Revision tags: OPENBSD_4_1_BASE
# 1.34 18-Nov-2006 brad

fix comments


# 1.33 17-Nov-2006 brad

Add a lower TX threshold value and use this when checking the number of
available TX descriptors in the case that em_encap() has tried to reclaim
descriptors.

From Jack Vogel@Intel

Tested by brad@, mk@, Gabriel Kihlman <gk at stacken dot kth dot se>,
Johan Mson Lindman <tybollt at solace dot mh dot se>
Tested on amd64/i386/sparc64


# 1.32 14-Nov-2006 brad

Rework the transmit register handling. In em_encap() store the index of
the EOP descriptor in the first descriptor of the packet. In em_txeof()
search for the DD bit set only in the EOP descriptors, embedding the
cleanup of all packet's descriptors into the inner loop.

This change is important for future chips, where the DD bit is going
to be set only in the EOP descriptors.

From Jack Vogel@Intel

Tested by brad@, mk@, reyk@, Gabriel Kihlman <gk at stacken dot kth dot se>,
Johan Mson Lindman <tybollt at solace dot mh dot se>, Jason Dixon and a few
others.
Tested on i386/amd64/sparc64.


# 1.31 10-Nov-2006 brad

Pre-allocate the TX DMA maps intead of creating and destroying a DMA map
per packet sent.

Tested by brad@, ckuethe@, Gabriel Kihlman <gk at stacken dot kth dot se>
and Tim Wiess <tim at nop dot cx>.
Tested with amd64/i386/sparc64.

ok damien@


# 1.30 06-Nov-2006 brad

Sync up to Intel's latest FreeBSD em driver (6.2.9). Adds support
for a few newer Intel PCIe boards, some code removal and cleaning
and a few bug fixes.

From: Jack Vogel@Intel

Tested by mk@ wilfried@ brad@ dlg@, Marc Winiger, Gabriel Kihlman,
Jason Dixon, Johan Mson Lindman, and a few other end users.

Tested with 82543, 82544, 82540, 82545, 82541, 82547, 82546 and 82573.


# 1.29 17-Sep-2006 brad

Overhaul RX path to recover from mbuf cluster allocation failure.
- Create a spare DMA map for RX handler to recover from
bus_dmamap_load() failure.
- Make sure to update status bit in RX descriptors even if we failed
to allocate a new buffer.
- Don't blindly unload DMA map. Reuse loaded DMA map if received
packet has errors.

From yongari@FreeBSD
Tested by myself and a number of end-users on i386/amd64/sparc64


# 1.28 17-Sep-2006 brad

revert revision 1.131, the code in question was later found to not ensure
the proper alignment requirement for the VLAN layer on strict alignment
architectures. This would result in Jumbo's working fine as long as VLANs
were not in use. If VLANs were in use and a packet comes in with a size
of 2046 bytes or larger, it would be corrupted as it came up through the
VLAN layer. Also check the hw max frame size, instead of the MTU, so the
alignment fixup is done as appropriate.

Fixes PR 5185.
Tested by Rui DeSousa with macppc and myself with alpha/sparc64.


Revision tags: OPENBSD_4_0_BASE
# 1.27 04-Aug-2006 brad

branches: 1.27.2;
- merge em/ixgb_disable_promisc() into em/ixgb_set_promisc().
- rearrange interface flags ioctl handler.


# 1.26 07-Jul-2006 brad

Sync up to Intel's latest FreeBSD em driver (6.0.5). Adds support
for new chipset revisions embedded in the ESB2 and ICH8 core logic
chipsets.

The previous attempt at commiting this included an unrelated change
to how the I/O base address was being set and this was the cause of
the breakage.

From: Intel's web-site


# 1.25 05-Jul-2006 brad

revert back to the older driver as this causes some breakage.


# 1.24 03-Jul-2006 brad

Sync up to Intel's latest FreeBSD em driver (6.0.5). Adds support
for new chipset revisions embedded in the ESB2 and ICH8 core logic
chipsets.

From: Intel's web-site


# 1.23 05-Mar-2006 brad

Sprinkle some tabs and a little cleaning.


Revision tags: OPENBSD_3_9_BASE
# 1.22 22-Feb-2006 brad

For 82544 and newer chips increase the number of TX descriptors to 512.


# 1.21 10-Dec-2005 brad

add a shutdown function and register it with shutdownhook_establish().


# 1.20 10-Dec-2005 brad

remove a bit of unused code.

Pointed out by Andrey Matveev <evol at online dot ptt dot ru> through noticing
a missing splx which pointed out the fact that code is unused to me.


# 1.19 18-Nov-2005 brad

PCIX -> PCI-X in a few comments


# 1.18 13-Nov-2005 brad

- Introduce two more stat counters, counting number of RX
overruns and number of watchdog timeouts.
- Do not increase if->if_oerrors in em_watchdog(), since
this leads to counter slipping back, when if->if_oerrors
is recalculated in em_update_stats_counters(). Instead
increase watchdog counter in em_watchdog() and take it
into account in em_update_stats_counters().

From glebius FreeBSD

ok dlg@


# 1.17 24-Oct-2005 brad

Revamp interrupt handling in em(4) driver:

o Do not mask the RX overrun interrupt.

o Rewrite em_intr():
- Axe EM_MAX_INTR.
- Cycle acknowledging interrupts and processing
packets until zero interrupt cause register is
read.
- If RX overrun comes in log this fact.

From glebius FreeBSD

ok krw@ beck@


# 1.16 21-Oct-2005 brad

Remove unused global adapter linked list.

From FreeBSD


# 1.15 15-Oct-2005 brad

- put spl's right in the code and remove the macros
- remove splassert()'s
- remove empty bus_dma_tag_destroy macro from code and header


Revision tags: OPENBSD_3_8_BASE
# 1.14 16-Jul-2005 brad

move headers and remove some FreeBSD specific stuff.


# 1.13 16-Jul-2005 brad

fix support for interrupt mitigation.

ok nate@


# 1.12 02-Jul-2005 deraadt

sync


# 1.11 04-May-2005 brad

remove #ifdef __OpenBSD__


# 1.10 27-Mar-2005 brad

remove FreeBSD ifdef bloat.

ok krw@


Revision tags: OPENBSD_3_7_BASE
# 1.9 08-Dec-2004 markus

powerhook: em_init on resume


# 1.8 16-Nov-2004 brad

- Added fix for 82547 which corrects an issue with Jumbo frames larger than 10k.
- Corrected TBI workaround.
- Corrected incorrect LED operation issues.

From FreeBSD

ok deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.7 18-Jun-2004 mcbride

On architectures which have strict alignment, shift the entire mbuf chain by
ETHER_ALIGN bytes when jumbo packets are enabled (mtu > ETHERMTU).

ok henric@ (slightly different diff)


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.6 19-May-2004 brad

remove duplication, use ETHER_ALIGN from if_ether.h


# 1.5 26-Apr-2004 deraadt

oh we need to model check and not crank > 256 for older cards... do that later


# 1.4 26-Apr-2004 deraadt

this driver had 256 clusters for receive buffers. move to 512, to increase
performance, if the interface is up. at boot time, allocate only 12 though
... though we note that em_stop() frees them all. perhaps some are used to
talk to other parts of the engine though at runtime... tested by mcbride and
beck


# 1.3 18-Apr-2004 henric

Sync with FreeBSD's "em".


Revision tags: OPENBSD_3_4_BASE OPENBSD_3_5_BASE
# 1.2 13-Jun-2003 henric

Sync with FreeBSD's "em".

ok deraadt@


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_A UBC_SYNC_B
# 1.1 24-Sep-2002 nate

branches: 1.1.4; 1.1.8;
Driver for Intel PRO/1000 gigabit ethernet adapters.
This driver should work with all current models of gigabit ethernet adapters.

Driver written by Intel
Ported from FreeBSD by Henric Jungheim <henric@attbi.com>
bus_dma and endian support by me.


# 1.75 20-Feb-2020 mpi

Introduce the concept of queue to prepare supporting multiple of them.

Move the tx/rx descriptors to dedicated structures similar to what already
exist in ix(4).

Only one queue is currently used, no real architectural change introduced
in this diff.

Extracted from a big diff from haesbaert@ via patrick@.

Tested by Hrvoje Popovski and jmatthew@, ok jmatthew@


Revision tags: OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.74 01-Mar-2019 dlg

use a timeout to refill the rx ring when it's empty.

em had rxr, but didn't use a timeout cos it claimed to generate an
RX overflow interrupt when packets fell off slots in the ring. turns
out that's a lie on at least one chip, so add the timeout like other
drivers.

this was hit by mlarkin@, who had nfs and bufs steal all the packets
and memory for packets from em, which didn't recover after the
memory had been released back to the system.


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE OPENBSD_6_3_BASE OPENBSD_6_4_BASE
# 1.73 27-Oct-2016 dlg

tell ix and em to use 2k+ETHER_ALIGN clusters for rx on all archs.

this means that the ethernet header and therefore its payload will
be aligned correctly for the stack. without this em and ix are
sufferring a 30 to 40 percent hit in forwarding performance because
the ethernet stack expects to be able to prepend 8 bytes for an
ethernet header so it can gaurantee its alignment. because em and
ix only had 6 bytes where the ethernet header was, it always prepends
an mbuf which turns out to be expensive. this way the prepend will
be cheap because the 8 byte space will exist.

2k+ETHER_ALIGN clusters will end up using the newly created mcl2k2
pool.

the regression was isolated and the fix tested by hrvoje popovski.
ok mikeb@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.72 18-Feb-2016 bluhm

Add support for the Intel i219 network chip to the em(4) driver.
from Christian Ehrhardt; input jsg@; OK deraadt@ sthen@ mpi@ jsg@
tested by sthen@ jca@ benno@ bluhm@


# 1.71 11-Jan-2016 dlg

do further work on the em transmit path to simplify the code.

noone could understand how em_txeof worked, so i rewrote it.

this also gets rid of the sc_tx_desc_free var that needed atomic
ops. space to use in em_start and space to free in em_txeof is now
calculated from the producer and consumer.

testers have reported better responsiveness with this. somehow.
if em issues persist after this, im rolling back to pre-mpsafe changes.


# 1.70 07-Jan-2016 dlg

rename em_buffers to em_packets.

shorten a bunch of variable names while here.


# 1.69 07-Jan-2016 dlg

rename the rx and tx ring softc vars.


# 1.68 07-Jan-2016 dlg

prefix the rx and tx ring softc members with sc_


# 1.67 07-Jan-2016 dlg

dma_paddr in struct em_dma_alloc is unused, so gc it.


# 1.66 07-Jan-2016 dlg

unify the dma tag into sc_dmat in em_softc.


# 1.65 07-Jan-2016 dlg

sprinkle DEVNAME


# 1.64 07-Jan-2016 dlg

rename the struct arpcom interface_data in em_softc to sc_ac.

makes it more consistent with the rest of the tree.


# 1.63 07-Jan-2016 dlg

rename em_softc sc_dv to sc_dev. like ALL OUR OTHER DRIVERS.


# 1.62 07-Jan-2016 dlg

tweak em to make it mpsafe, both for interrupts and if_start.

this is mostly work by kettenis and claudio, with further work from
me to make the transmit side from the stack mpsafe.

there's a watchdog issue that will be worked on in tree after this
change.

tested by hrvoje popovski and gregor best
ok mpi@ claudio@ deraadt@ jmatthew@


# 1.61 24-Nov-2015 mpi

You only need <net/if_dl.h> if you're using LLADDR() or a sockaddr_dl.


# 1.60 20-Nov-2015 mpi

Missed in previous, pointed by benoit@


# 1.59 14-Nov-2015 mpi

Do not include <net/if_vlan_var.h> when it's not necessary.

Because of the VLAN hacks in mpw(4) this file still contains the definition
of "struct ifvlan" which depends on <sys/refcnt.h> which in turns pull
<sys/atomic.h>...


# 1.58 30-Sep-2015 kettenis

Run the tx completion path without the kernel held. This makes the
"fast path" through the interrupt handler not grab the kernel lock anymore.
This removes the code that attempts to reclaim tx descriptors from em_start().
Keeping that code would have complicated the locking. The need to reclaim
tx descriptors that way should have largely disappeared now that the interrupt
handler doesn't have to wait on the kernel lock.

ok mpi@
tested by many


# 1.57 19-Sep-2015 kettenis

Avoid using a mutex in the rx completion path. Instead rely on
intr_barrier(9) to avoid having the interrupt handler touch the rx data
structures while we're brining down the interface. This actually reverts
many of the changes in rev. 1.300.

ok mikeb@


# 1.56 26-Aug-2015 kettenis

Get rid if em_align. This approach used to make sense, but now that the
hardware rx mtu always gets set to the maximum supported value we will hit
it for every received packet. Instead, use a larger mbuf cluster size on
strict alignment architectures such that we can always m_adj to make sure the
packets are properly aligned. This wastes some memory but simplifies things
considerably. Hopefully we can reduce the spillage in the near future by
taking advantage of recent improvements in the pool code.

ok mpi@, mikeb@, dlg@


# 1.55 21-Aug-2015 kettenis

Run the part of the interrupt handler that does rx completion without holding
the kernel lock.

ok mpi@, dlg@


Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
# 1.54 26-Dec-2014 tedu

unifdef INET. missed a few headers in previous rounds


Revision tags: OPENBSD_5_6_BASE
# 1.53 22-Jul-2014 mpi

Fewer <netinet/in_systm.h>


# 1.52 10-Jul-2014 deraadt

remove most of the boolean_t infection outside uvm/ddb/pmap; ok jsg


# 1.51 08-Jul-2014 dlg

cut things that relied on mclgeti for rx ring accounting/restriction over
to using if_rxr.

cut the reporting systat did over to the rxr ioctl.

tested as much as i can on alpha, amd64, and sparc64.
mpi@ has run it on macppc.
ok mpi@


Revision tags: OPENBSD_5_5_BASE
# 1.50 07-Aug-2013 bluhm

Most network drivers include netinet/in_var.h, but apparently they
don't have to. Just remove these include lines.
Compiled on amd64 i386 sparc64; OK henning@ mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.49 16-Apr-2013 deraadt

spelling errors; Diego Casati


Revision tags: OPENBSD_4_9_BASE OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.48 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


Revision tags: OPENBSD_4_8_BASE
# 1.47 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


Revision tags: OPENBSD_4_7_BASE
# 1.46 25-Nov-2009 dms

Add support for em(4) interfaces found on intel EP80579 SoC. The MAC part is
basicly 82545, but the PHY's are separated form the chip and they are accessed
through a special PCI device called GCU which has the MDIO interface. Since
there is no direct relationship between MAC and PHY, so for the moment they
are assigned to each other the way its done on Axiomtek NA-200, that was
danted to us by them.

This also adds a device driver for the GCU.

tested by me on Axiomtek board
reviewed by claudio@, kettenis@, deraadt@
'commit that as is' deraadt@


# 1.45 10-Aug-2009 deraadt

A few more simple cases of shutdown hooks which only call xxstop, when
we now know the interface has already been stopped


Revision tags: OPENBSD_4_6_BASE
# 1.44 05-Jun-2009 naddy

tidy up promiscuous mode and multicast handling; from Brad; ok sthen@


Revision tags: OPENBSD_4_5_BASE
# 1.43 15-Dec-2008 brad

revert 1.20 now that the new allocator is used to control the number of
RX buffers allocated.

ok dlg@


# 1.42 05-Dec-2008 brad

Garbage collect now unused field in the softc struct again.


# 1.41 03-Dec-2008 dlg

recommit the use of the new mbuf cluster allocator.

this starts em up with 4 mbufs on the rx ring, which will then grow as
usage demands. this also allows em to take advantage of the new livelock
mitigation code as well as freeing up a boatload of kernel memory.

this version of the diff makes sure we only ever post the last descriptor
we filled to the hardware, rather than the whole ring when bringing the
interface up. it has been tested by users who got panics with the previous
diff without trouble.


# 1.40 29-Nov-2008 sthen

revert 1.197 if_em.c, 1.38/1.39 if_em.h, requested by dlg, until a bug
reported on misc@ can be tracked down.

identical diff from jsing.


# 1.39 28-Nov-2008 brad

Garbage collect now unused field in the softc struct.


# 1.38 26-Nov-2008 dlg

rework the filling of the rx ring. this switches us to having the cluster
allocation limited by per ifp statistics, ie, we're not guaranteed to have
mbufs in every slot on the rx ring.

instead of filling the ring with 256 mbufs all the time (about 512KB of
kva) when em is brought up, we give it 4. as demand grows we increase the
number of mbufs allowed on the ring. i will bet most users wont go above
50ish mbufs, so we're saving them 400KB of kva.

tested by many, including one on sparc64
ok claudio@ deraadt@ henning@ krw@


Revision tags: OPENBSD_4_4_BASE
# 1.37 22-Jul-2008 martynas

more negotation -> negotiation; ok sthen@


Revision tags: OPENBSD_4_3_BASE
# 1.36 21-Oct-2007 brad

Allow for the adjustment of the number of RX descriptors
for the newer generations of em(4) chipsets independently
from the first two generations (82542/82543). The first
two generations have hardware errata limiting the upper
maximum to 256 descriptors. The number of RX descriptors
has not been adjusted yet.

ok beck@ henning@ dlg@


Revision tags: OPENBSD_4_2_BASE
# 1.35 30-May-2007 ckuethe

Move the knob for the interrupt throttling register next to the knobs for
the other interrupt moderation schemes.
ok beck drahn


Revision tags: OPENBSD_4_1_BASE
# 1.34 18-Nov-2006 brad

fix comments


# 1.33 17-Nov-2006 brad

Add a lower TX threshold value and use this when checking the number of
available TX descriptors in the case that em_encap() has tried to reclaim
descriptors.

From Jack Vogel@Intel

Tested by brad@, mk@, Gabriel Kihlman <gk at stacken dot kth dot se>,
Johan Mson Lindman <tybollt at solace dot mh dot se>
Tested on amd64/i386/sparc64


# 1.32 14-Nov-2006 brad

Rework the transmit register handling. In em_encap() store the index of
the EOP descriptor in the first descriptor of the packet. In em_txeof()
search for the DD bit set only in the EOP descriptors, embedding the
cleanup of all packet's descriptors into the inner loop.

This change is important for future chips, where the DD bit is going
to be set only in the EOP descriptors.

From Jack Vogel@Intel

Tested by brad@, mk@, reyk@, Gabriel Kihlman <gk at stacken dot kth dot se>,
Johan Mson Lindman <tybollt at solace dot mh dot se>, Jason Dixon and a few
others.
Tested on i386/amd64/sparc64.


# 1.31 10-Nov-2006 brad

Pre-allocate the TX DMA maps intead of creating and destroying a DMA map
per packet sent.

Tested by brad@, ckuethe@, Gabriel Kihlman <gk at stacken dot kth dot se>
and Tim Wiess <tim at nop dot cx>.
Tested with amd64/i386/sparc64.

ok damien@


# 1.30 06-Nov-2006 brad

Sync up to Intel's latest FreeBSD em driver (6.2.9). Adds support
for a few newer Intel PCIe boards, some code removal and cleaning
and a few bug fixes.

From: Jack Vogel@Intel

Tested by mk@ wilfried@ brad@ dlg@, Marc Winiger, Gabriel Kihlman,
Jason Dixon, Johan Mson Lindman, and a few other end users.

Tested with 82543, 82544, 82540, 82545, 82541, 82547, 82546 and 82573.


# 1.29 17-Sep-2006 brad

Overhaul RX path to recover from mbuf cluster allocation failure.
- Create a spare DMA map for RX handler to recover from
bus_dmamap_load() failure.
- Make sure to update status bit in RX descriptors even if we failed
to allocate a new buffer.
- Don't blindly unload DMA map. Reuse loaded DMA map if received
packet has errors.

From yongari@FreeBSD
Tested by myself and a number of end-users on i386/amd64/sparc64


# 1.28 17-Sep-2006 brad

revert revision 1.131, the code in question was later found to not ensure
the proper alignment requirement for the VLAN layer on strict alignment
architectures. This would result in Jumbo's working fine as long as VLANs
were not in use. If VLANs were in use and a packet comes in with a size
of 2046 bytes or larger, it would be corrupted as it came up through the
VLAN layer. Also check the hw max frame size, instead of the MTU, so the
alignment fixup is done as appropriate.

Fixes PR 5185.
Tested by Rui DeSousa with macppc and myself with alpha/sparc64.


Revision tags: OPENBSD_4_0_BASE
# 1.27 04-Aug-2006 brad

branches: 1.27.2;
- merge em/ixgb_disable_promisc() into em/ixgb_set_promisc().
- rearrange interface flags ioctl handler.


# 1.26 07-Jul-2006 brad

Sync up to Intel's latest FreeBSD em driver (6.0.5). Adds support
for new chipset revisions embedded in the ESB2 and ICH8 core logic
chipsets.

The previous attempt at commiting this included an unrelated change
to how the I/O base address was being set and this was the cause of
the breakage.

From: Intel's web-site


# 1.25 05-Jul-2006 brad

revert back to the older driver as this causes some breakage.


# 1.24 03-Jul-2006 brad

Sync up to Intel's latest FreeBSD em driver (6.0.5). Adds support
for new chipset revisions embedded in the ESB2 and ICH8 core logic
chipsets.

From: Intel's web-site


# 1.23 05-Mar-2006 brad

Sprinkle some tabs and a little cleaning.


Revision tags: OPENBSD_3_9_BASE
# 1.22 22-Feb-2006 brad

For 82544 and newer chips increase the number of TX descriptors to 512.


# 1.21 10-Dec-2005 brad

add a shutdown function and register it with shutdownhook_establish().


# 1.20 10-Dec-2005 brad

remove a bit of unused code.

Pointed out by Andrey Matveev <evol at online dot ptt dot ru> through noticing
a missing splx which pointed out the fact that code is unused to me.


# 1.19 18-Nov-2005 brad

PCIX -> PCI-X in a few comments


# 1.18 13-Nov-2005 brad

- Introduce two more stat counters, counting number of RX
overruns and number of watchdog timeouts.
- Do not increase if->if_oerrors in em_watchdog(), since
this leads to counter slipping back, when if->if_oerrors
is recalculated in em_update_stats_counters(). Instead
increase watchdog counter in em_watchdog() and take it
into account in em_update_stats_counters().

From glebius FreeBSD

ok dlg@


# 1.17 24-Oct-2005 brad

Revamp interrupt handling in em(4) driver:

o Do not mask the RX overrun interrupt.

o Rewrite em_intr():
- Axe EM_MAX_INTR.
- Cycle acknowledging interrupts and processing
packets until zero interrupt cause register is
read.
- If RX overrun comes in log this fact.

From glebius FreeBSD

ok krw@ beck@


# 1.16 21-Oct-2005 brad

Remove unused global adapter linked list.

From FreeBSD


# 1.15 15-Oct-2005 brad

- put spl's right in the code and remove the macros
- remove splassert()'s
- remove empty bus_dma_tag_destroy macro from code and header


Revision tags: OPENBSD_3_8_BASE
# 1.14 16-Jul-2005 brad

move headers and remove some FreeBSD specific stuff.


# 1.13 16-Jul-2005 brad

fix support for interrupt mitigation.

ok nate@


# 1.12 02-Jul-2005 deraadt

sync


# 1.11 04-May-2005 brad

remove #ifdef __OpenBSD__


# 1.10 27-Mar-2005 brad

remove FreeBSD ifdef bloat.

ok krw@


Revision tags: OPENBSD_3_7_BASE
# 1.9 08-Dec-2004 markus

powerhook: em_init on resume


# 1.8 16-Nov-2004 brad

- Added fix for 82547 which corrects an issue with Jumbo frames larger than 10k.
- Corrected TBI workaround.
- Corrected incorrect LED operation issues.

From FreeBSD

ok deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.7 18-Jun-2004 mcbride

On architectures which have strict alignment, shift the entire mbuf chain by
ETHER_ALIGN bytes when jumbo packets are enabled (mtu > ETHERMTU).

ok henric@ (slightly different diff)


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.6 19-May-2004 brad

remove duplication, use ETHER_ALIGN from if_ether.h


# 1.5 26-Apr-2004 deraadt

oh we need to model check and not crank > 256 for older cards... do that later


# 1.4 26-Apr-2004 deraadt

this driver had 256 clusters for receive buffers. move to 512, to increase
performance, if the interface is up. at boot time, allocate only 12 though
... though we note that em_stop() frees them all. perhaps some are used to
talk to other parts of the engine though at runtime... tested by mcbride and
beck


# 1.3 18-Apr-2004 henric

Sync with FreeBSD's "em".


Revision tags: OPENBSD_3_4_BASE OPENBSD_3_5_BASE
# 1.2 13-Jun-2003 henric

Sync with FreeBSD's "em".

ok deraadt@


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_A UBC_SYNC_B
# 1.1 24-Sep-2002 nate

branches: 1.1.4; 1.1.8;
Driver for Intel PRO/1000 gigabit ethernet adapters.
This driver should work with all current models of gigabit ethernet adapters.

Driver written by Intel
Ported from FreeBSD by Henric Jungheim <henric@attbi.com>
bus_dma and endian support by me.


# 1.74 01-Mar-2019 dlg

use a timeout to refill the rx ring when it's empty.

em had rxr, but didn't use a timeout cos it claimed to generate an
RX overflow interrupt when packets fell off slots in the ring. turns
out that's a lie on at least one chip, so add the timeout like other
drivers.

this was hit by mlarkin@, who had nfs and bufs steal all the packets
and memory for packets from em, which didn't recover after the
memory had been released back to the system.


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE OPENBSD_6_3_BASE OPENBSD_6_4_BASE
# 1.73 27-Oct-2016 dlg

tell ix and em to use 2k+ETHER_ALIGN clusters for rx on all archs.

this means that the ethernet header and therefore its payload will
be aligned correctly for the stack. without this em and ix are
sufferring a 30 to 40 percent hit in forwarding performance because
the ethernet stack expects to be able to prepend 8 bytes for an
ethernet header so it can gaurantee its alignment. because em and
ix only had 6 bytes where the ethernet header was, it always prepends
an mbuf which turns out to be expensive. this way the prepend will
be cheap because the 8 byte space will exist.

2k+ETHER_ALIGN clusters will end up using the newly created mcl2k2
pool.

the regression was isolated and the fix tested by hrvoje popovski.
ok mikeb@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.72 18-Feb-2016 bluhm

Add support for the Intel i219 network chip to the em(4) driver.
from Christian Ehrhardt; input jsg@; OK deraadt@ sthen@ mpi@ jsg@
tested by sthen@ jca@ benno@ bluhm@


# 1.71 11-Jan-2016 dlg

do further work on the em transmit path to simplify the code.

noone could understand how em_txeof worked, so i rewrote it.

this also gets rid of the sc_tx_desc_free var that needed atomic
ops. space to use in em_start and space to free in em_txeof is now
calculated from the producer and consumer.

testers have reported better responsiveness with this. somehow.
if em issues persist after this, im rolling back to pre-mpsafe changes.


# 1.70 07-Jan-2016 dlg

rename em_buffers to em_packets.

shorten a bunch of variable names while here.


# 1.69 07-Jan-2016 dlg

rename the rx and tx ring softc vars.


# 1.68 07-Jan-2016 dlg

prefix the rx and tx ring softc members with sc_


# 1.67 07-Jan-2016 dlg

dma_paddr in struct em_dma_alloc is unused, so gc it.


# 1.66 07-Jan-2016 dlg

unify the dma tag into sc_dmat in em_softc.


# 1.65 07-Jan-2016 dlg

sprinkle DEVNAME


# 1.64 07-Jan-2016 dlg

rename the struct arpcom interface_data in em_softc to sc_ac.

makes it more consistent with the rest of the tree.


# 1.63 07-Jan-2016 dlg

rename em_softc sc_dv to sc_dev. like ALL OUR OTHER DRIVERS.


# 1.62 07-Jan-2016 dlg

tweak em to make it mpsafe, both for interrupts and if_start.

this is mostly work by kettenis and claudio, with further work from
me to make the transmit side from the stack mpsafe.

there's a watchdog issue that will be worked on in tree after this
change.

tested by hrvoje popovski and gregor best
ok mpi@ claudio@ deraadt@ jmatthew@


# 1.61 24-Nov-2015 mpi

You only need <net/if_dl.h> if you're using LLADDR() or a sockaddr_dl.


# 1.60 20-Nov-2015 mpi

Missed in previous, pointed by benoit@


# 1.59 14-Nov-2015 mpi

Do not include <net/if_vlan_var.h> when it's not necessary.

Because of the VLAN hacks in mpw(4) this file still contains the definition
of "struct ifvlan" which depends on <sys/refcnt.h> which in turns pull
<sys/atomic.h>...


# 1.58 30-Sep-2015 kettenis

Run the tx completion path without the kernel held. This makes the
"fast path" through the interrupt handler not grab the kernel lock anymore.
This removes the code that attempts to reclaim tx descriptors from em_start().
Keeping that code would have complicated the locking. The need to reclaim
tx descriptors that way should have largely disappeared now that the interrupt
handler doesn't have to wait on the kernel lock.

ok mpi@
tested by many


# 1.57 19-Sep-2015 kettenis

Avoid using a mutex in the rx completion path. Instead rely on
intr_barrier(9) to avoid having the interrupt handler touch the rx data
structures while we're brining down the interface. This actually reverts
many of the changes in rev. 1.300.

ok mikeb@


# 1.56 26-Aug-2015 kettenis

Get rid if em_align. This approach used to make sense, but now that the
hardware rx mtu always gets set to the maximum supported value we will hit
it for every received packet. Instead, use a larger mbuf cluster size on
strict alignment architectures such that we can always m_adj to make sure the
packets are properly aligned. This wastes some memory but simplifies things
considerably. Hopefully we can reduce the spillage in the near future by
taking advantage of recent improvements in the pool code.

ok mpi@, mikeb@, dlg@


# 1.55 21-Aug-2015 kettenis

Run the part of the interrupt handler that does rx completion without holding
the kernel lock.

ok mpi@, dlg@


Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
# 1.54 26-Dec-2014 tedu

unifdef INET. missed a few headers in previous rounds


Revision tags: OPENBSD_5_6_BASE
# 1.53 22-Jul-2014 mpi

Fewer <netinet/in_systm.h>


# 1.52 10-Jul-2014 deraadt

remove most of the boolean_t infection outside uvm/ddb/pmap; ok jsg


# 1.51 08-Jul-2014 dlg

cut things that relied on mclgeti for rx ring accounting/restriction over
to using if_rxr.

cut the reporting systat did over to the rxr ioctl.

tested as much as i can on alpha, amd64, and sparc64.
mpi@ has run it on macppc.
ok mpi@


Revision tags: OPENBSD_5_5_BASE
# 1.50 07-Aug-2013 bluhm

Most network drivers include netinet/in_var.h, but apparently they
don't have to. Just remove these include lines.
Compiled on amd64 i386 sparc64; OK henning@ mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.49 16-Apr-2013 deraadt

spelling errors; Diego Casati


Revision tags: OPENBSD_4_9_BASE OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.48 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


Revision tags: OPENBSD_4_8_BASE
# 1.47 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


Revision tags: OPENBSD_4_7_BASE
# 1.46 25-Nov-2009 dms

Add support for em(4) interfaces found on intel EP80579 SoC. The MAC part is
basicly 82545, but the PHY's are separated form the chip and they are accessed
through a special PCI device called GCU which has the MDIO interface. Since
there is no direct relationship between MAC and PHY, so for the moment they
are assigned to each other the way its done on Axiomtek NA-200, that was
danted to us by them.

This also adds a device driver for the GCU.

tested by me on Axiomtek board
reviewed by claudio@, kettenis@, deraadt@
'commit that as is' deraadt@


# 1.45 10-Aug-2009 deraadt

A few more simple cases of shutdown hooks which only call xxstop, when
we now know the interface has already been stopped


Revision tags: OPENBSD_4_6_BASE
# 1.44 05-Jun-2009 naddy

tidy up promiscuous mode and multicast handling; from Brad; ok sthen@


Revision tags: OPENBSD_4_5_BASE
# 1.43 15-Dec-2008 brad

revert 1.20 now that the new allocator is used to control the number of
RX buffers allocated.

ok dlg@


# 1.42 05-Dec-2008 brad

Garbage collect now unused field in the softc struct again.


# 1.41 03-Dec-2008 dlg

recommit the use of the new mbuf cluster allocator.

this starts em up with 4 mbufs on the rx ring, which will then grow as
usage demands. this also allows em to take advantage of the new livelock
mitigation code as well as freeing up a boatload of kernel memory.

this version of the diff makes sure we only ever post the last descriptor
we filled to the hardware, rather than the whole ring when bringing the
interface up. it has been tested by users who got panics with the previous
diff without trouble.


# 1.40 29-Nov-2008 sthen

revert 1.197 if_em.c, 1.38/1.39 if_em.h, requested by dlg, until a bug
reported on misc@ can be tracked down.

identical diff from jsing.


# 1.39 28-Nov-2008 brad

Garbage collect now unused field in the softc struct.


# 1.38 26-Nov-2008 dlg

rework the filling of the rx ring. this switches us to having the cluster
allocation limited by per ifp statistics, ie, we're not guaranteed to have
mbufs in every slot on the rx ring.

instead of filling the ring with 256 mbufs all the time (about 512KB of
kva) when em is brought up, we give it 4. as demand grows we increase the
number of mbufs allowed on the ring. i will bet most users wont go above
50ish mbufs, so we're saving them 400KB of kva.

tested by many, including one on sparc64
ok claudio@ deraadt@ henning@ krw@


Revision tags: OPENBSD_4_4_BASE
# 1.37 22-Jul-2008 martynas

more negotation -> negotiation; ok sthen@


Revision tags: OPENBSD_4_3_BASE
# 1.36 21-Oct-2007 brad

Allow for the adjustment of the number of RX descriptors
for the newer generations of em(4) chipsets independently
from the first two generations (82542/82543). The first
two generations have hardware errata limiting the upper
maximum to 256 descriptors. The number of RX descriptors
has not been adjusted yet.

ok beck@ henning@ dlg@


Revision tags: OPENBSD_4_2_BASE
# 1.35 30-May-2007 ckuethe

Move the knob for the interrupt throttling register next to the knobs for
the other interrupt moderation schemes.
ok beck drahn


Revision tags: OPENBSD_4_1_BASE
# 1.34 18-Nov-2006 brad

fix comments


# 1.33 17-Nov-2006 brad

Add a lower TX threshold value and use this when checking the number of
available TX descriptors in the case that em_encap() has tried to reclaim
descriptors.

From Jack Vogel@Intel

Tested by brad@, mk@, Gabriel Kihlman <gk at stacken dot kth dot se>,
Johan Mson Lindman <tybollt at solace dot mh dot se>
Tested on amd64/i386/sparc64


# 1.32 14-Nov-2006 brad

Rework the transmit register handling. In em_encap() store the index of
the EOP descriptor in the first descriptor of the packet. In em_txeof()
search for the DD bit set only in the EOP descriptors, embedding the
cleanup of all packet's descriptors into the inner loop.

This change is important for future chips, where the DD bit is going
to be set only in the EOP descriptors.

From Jack Vogel@Intel

Tested by brad@, mk@, reyk@, Gabriel Kihlman <gk at stacken dot kth dot se>,
Johan Mson Lindman <tybollt at solace dot mh dot se>, Jason Dixon and a few
others.
Tested on i386/amd64/sparc64.


# 1.31 10-Nov-2006 brad

Pre-allocate the TX DMA maps intead of creating and destroying a DMA map
per packet sent.

Tested by brad@, ckuethe@, Gabriel Kihlman <gk at stacken dot kth dot se>
and Tim Wiess <tim at nop dot cx>.
Tested with amd64/i386/sparc64.

ok damien@


# 1.30 06-Nov-2006 brad

Sync up to Intel's latest FreeBSD em driver (6.2.9). Adds support
for a few newer Intel PCIe boards, some code removal and cleaning
and a few bug fixes.

From: Jack Vogel@Intel

Tested by mk@ wilfried@ brad@ dlg@, Marc Winiger, Gabriel Kihlman,
Jason Dixon, Johan Mson Lindman, and a few other end users.

Tested with 82543, 82544, 82540, 82545, 82541, 82547, 82546 and 82573.


# 1.29 17-Sep-2006 brad

Overhaul RX path to recover from mbuf cluster allocation failure.
- Create a spare DMA map for RX handler to recover from
bus_dmamap_load() failure.
- Make sure to update status bit in RX descriptors even if we failed
to allocate a new buffer.
- Don't blindly unload DMA map. Reuse loaded DMA map if received
packet has errors.

From yongari@FreeBSD
Tested by myself and a number of end-users on i386/amd64/sparc64


# 1.28 17-Sep-2006 brad

revert revision 1.131, the code in question was later found to not ensure
the proper alignment requirement for the VLAN layer on strict alignment
architectures. This would result in Jumbo's working fine as long as VLANs
were not in use. If VLANs were in use and a packet comes in with a size
of 2046 bytes or larger, it would be corrupted as it came up through the
VLAN layer. Also check the hw max frame size, instead of the MTU, so the
alignment fixup is done as appropriate.

Fixes PR 5185.
Tested by Rui DeSousa with macppc and myself with alpha/sparc64.


Revision tags: OPENBSD_4_0_BASE
# 1.27 04-Aug-2006 brad

branches: 1.27.2;
- merge em/ixgb_disable_promisc() into em/ixgb_set_promisc().
- rearrange interface flags ioctl handler.


# 1.26 07-Jul-2006 brad

Sync up to Intel's latest FreeBSD em driver (6.0.5). Adds support
for new chipset revisions embedded in the ESB2 and ICH8 core logic
chipsets.

The previous attempt at commiting this included an unrelated change
to how the I/O base address was being set and this was the cause of
the breakage.

From: Intel's web-site


# 1.25 05-Jul-2006 brad

revert back to the older driver as this causes some breakage.


# 1.24 03-Jul-2006 brad

Sync up to Intel's latest FreeBSD em driver (6.0.5). Adds support
for new chipset revisions embedded in the ESB2 and ICH8 core logic
chipsets.

From: Intel's web-site


# 1.23 05-Mar-2006 brad

Sprinkle some tabs and a little cleaning.


Revision tags: OPENBSD_3_9_BASE
# 1.22 22-Feb-2006 brad

For 82544 and newer chips increase the number of TX descriptors to 512.


# 1.21 10-Dec-2005 brad

add a shutdown function and register it with shutdownhook_establish().


# 1.20 10-Dec-2005 brad

remove a bit of unused code.

Pointed out by Andrey Matveev <evol at online dot ptt dot ru> through noticing
a missing splx which pointed out the fact that code is unused to me.


# 1.19 18-Nov-2005 brad

PCIX -> PCI-X in a few comments


# 1.18 13-Nov-2005 brad

- Introduce two more stat counters, counting number of RX
overruns and number of watchdog timeouts.
- Do not increase if->if_oerrors in em_watchdog(), since
this leads to counter slipping back, when if->if_oerrors
is recalculated in em_update_stats_counters(). Instead
increase watchdog counter in em_watchdog() and take it
into account in em_update_stats_counters().

From glebius FreeBSD

ok dlg@


# 1.17 24-Oct-2005 brad

Revamp interrupt handling in em(4) driver:

o Do not mask the RX overrun interrupt.

o Rewrite em_intr():
- Axe EM_MAX_INTR.
- Cycle acknowledging interrupts and processing
packets until zero interrupt cause register is
read.
- If RX overrun comes in log this fact.

From glebius FreeBSD

ok krw@ beck@


# 1.16 21-Oct-2005 brad

Remove unused global adapter linked list.

From FreeBSD


# 1.15 15-Oct-2005 brad

- put spl's right in the code and remove the macros
- remove splassert()'s
- remove empty bus_dma_tag_destroy macro from code and header


Revision tags: OPENBSD_3_8_BASE
# 1.14 16-Jul-2005 brad

move headers and remove some FreeBSD specific stuff.


# 1.13 16-Jul-2005 brad

fix support for interrupt mitigation.

ok nate@


# 1.12 02-Jul-2005 deraadt

sync


# 1.11 04-May-2005 brad

remove #ifdef __OpenBSD__


# 1.10 27-Mar-2005 brad

remove FreeBSD ifdef bloat.

ok krw@


Revision tags: OPENBSD_3_7_BASE
# 1.9 08-Dec-2004 markus

powerhook: em_init on resume


# 1.8 16-Nov-2004 brad

- Added fix for 82547 which corrects an issue with Jumbo frames larger than 10k.
- Corrected TBI workaround.
- Corrected incorrect LED operation issues.

From FreeBSD

ok deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.7 18-Jun-2004 mcbride

On architectures which have strict alignment, shift the entire mbuf chain by
ETHER_ALIGN bytes when jumbo packets are enabled (mtu > ETHERMTU).

ok henric@ (slightly different diff)


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.6 19-May-2004 brad

remove duplication, use ETHER_ALIGN from if_ether.h


# 1.5 26-Apr-2004 deraadt

oh we need to model check and not crank > 256 for older cards... do that later


# 1.4 26-Apr-2004 deraadt

this driver had 256 clusters for receive buffers. move to 512, to increase
performance, if the interface is up. at boot time, allocate only 12 though
... though we note that em_stop() frees them all. perhaps some are used to
talk to other parts of the engine though at runtime... tested by mcbride and
beck


# 1.3 18-Apr-2004 henric

Sync with FreeBSD's "em".


Revision tags: OPENBSD_3_4_BASE OPENBSD_3_5_BASE
# 1.2 13-Jun-2003 henric

Sync with FreeBSD's "em".

ok deraadt@


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_A UBC_SYNC_B
# 1.1 24-Sep-2002 nate

branches: 1.1.4; 1.1.8;
Driver for Intel PRO/1000 gigabit ethernet adapters.
This driver should work with all current models of gigabit ethernet adapters.

Driver written by Intel
Ported from FreeBSD by Henric Jungheim <henric@attbi.com>
bus_dma and endian support by me.


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE
# 1.73 27-Oct-2016 dlg

tell ix and em to use 2k+ETHER_ALIGN clusters for rx on all archs.

this means that the ethernet header and therefore its payload will
be aligned correctly for the stack. without this em and ix are
sufferring a 30 to 40 percent hit in forwarding performance because
the ethernet stack expects to be able to prepend 8 bytes for an
ethernet header so it can gaurantee its alignment. because em and
ix only had 6 bytes where the ethernet header was, it always prepends
an mbuf which turns out to be expensive. this way the prepend will
be cheap because the 8 byte space will exist.

2k+ETHER_ALIGN clusters will end up using the newly created mcl2k2
pool.

the regression was isolated and the fix tested by hrvoje popovski.
ok mikeb@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.72 18-Feb-2016 bluhm

Add support for the Intel i219 network chip to the em(4) driver.
from Christian Ehrhardt; input jsg@; OK deraadt@ sthen@ mpi@ jsg@
tested by sthen@ jca@ benno@ bluhm@


# 1.71 11-Jan-2016 dlg

do further work on the em transmit path to simplify the code.

noone could understand how em_txeof worked, so i rewrote it.

this also gets rid of the sc_tx_desc_free var that needed atomic
ops. space to use in em_start and space to free in em_txeof is now
calculated from the producer and consumer.

testers have reported better responsiveness with this. somehow.
if em issues persist after this, im rolling back to pre-mpsafe changes.


# 1.70 07-Jan-2016 dlg

rename em_buffers to em_packets.

shorten a bunch of variable names while here.


# 1.69 07-Jan-2016 dlg

rename the rx and tx ring softc vars.


# 1.68 07-Jan-2016 dlg

prefix the rx and tx ring softc members with sc_


# 1.67 07-Jan-2016 dlg

dma_paddr in struct em_dma_alloc is unused, so gc it.


# 1.66 07-Jan-2016 dlg

unify the dma tag into sc_dmat in em_softc.


# 1.65 07-Jan-2016 dlg

sprinkle DEVNAME


# 1.64 07-Jan-2016 dlg

rename the struct arpcom interface_data in em_softc to sc_ac.

makes it more consistent with the rest of the tree.


# 1.63 07-Jan-2016 dlg

rename em_softc sc_dv to sc_dev. like ALL OUR OTHER DRIVERS.


# 1.62 07-Jan-2016 dlg

tweak em to make it mpsafe, both for interrupts and if_start.

this is mostly work by kettenis and claudio, with further work from
me to make the transmit side from the stack mpsafe.

there's a watchdog issue that will be worked on in tree after this
change.

tested by hrvoje popovski and gregor best
ok mpi@ claudio@ deraadt@ jmatthew@


# 1.61 24-Nov-2015 mpi

You only need <net/if_dl.h> if you're using LLADDR() or a sockaddr_dl.


# 1.60 20-Nov-2015 mpi

Missed in previous, pointed by benoit@


# 1.59 14-Nov-2015 mpi

Do not include <net/if_vlan_var.h> when it's not necessary.

Because of the VLAN hacks in mpw(4) this file still contains the definition
of "struct ifvlan" which depends on <sys/refcnt.h> which in turns pull
<sys/atomic.h>...


# 1.58 30-Sep-2015 kettenis

Run the tx completion path without the kernel held. This makes the
"fast path" through the interrupt handler not grab the kernel lock anymore.
This removes the code that attempts to reclaim tx descriptors from em_start().
Keeping that code would have complicated the locking. The need to reclaim
tx descriptors that way should have largely disappeared now that the interrupt
handler doesn't have to wait on the kernel lock.

ok mpi@
tested by many


# 1.57 19-Sep-2015 kettenis

Avoid using a mutex in the rx completion path. Instead rely on
intr_barrier(9) to avoid having the interrupt handler touch the rx data
structures while we're brining down the interface. This actually reverts
many of the changes in rev. 1.300.

ok mikeb@


# 1.56 26-Aug-2015 kettenis

Get rid if em_align. This approach used to make sense, but now that the
hardware rx mtu always gets set to the maximum supported value we will hit
it for every received packet. Instead, use a larger mbuf cluster size on
strict alignment architectures such that we can always m_adj to make sure the
packets are properly aligned. This wastes some memory but simplifies things
considerably. Hopefully we can reduce the spillage in the near future by
taking advantage of recent improvements in the pool code.

ok mpi@, mikeb@, dlg@


# 1.55 21-Aug-2015 kettenis

Run the part of the interrupt handler that does rx completion without holding
the kernel lock.

ok mpi@, dlg@


Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
# 1.54 26-Dec-2014 tedu

unifdef INET. missed a few headers in previous rounds


Revision tags: OPENBSD_5_6_BASE
# 1.53 22-Jul-2014 mpi

Fewer <netinet/in_systm.h>


# 1.52 10-Jul-2014 deraadt

remove most of the boolean_t infection outside uvm/ddb/pmap; ok jsg


# 1.51 08-Jul-2014 dlg

cut things that relied on mclgeti for rx ring accounting/restriction over
to using if_rxr.

cut the reporting systat did over to the rxr ioctl.

tested as much as i can on alpha, amd64, and sparc64.
mpi@ has run it on macppc.
ok mpi@


Revision tags: OPENBSD_5_5_BASE
# 1.50 07-Aug-2013 bluhm

Most network drivers include netinet/in_var.h, but apparently they
don't have to. Just remove these include lines.
Compiled on amd64 i386 sparc64; OK henning@ mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.49 16-Apr-2013 deraadt

spelling errors; Diego Casati


Revision tags: OPENBSD_4_9_BASE OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.48 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


Revision tags: OPENBSD_4_8_BASE
# 1.47 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


Revision tags: OPENBSD_4_7_BASE
# 1.46 25-Nov-2009 dms

Add support for em(4) interfaces found on intel EP80579 SoC. The MAC part is
basicly 82545, but the PHY's are separated form the chip and they are accessed
through a special PCI device called GCU which has the MDIO interface. Since
there is no direct relationship between MAC and PHY, so for the moment they
are assigned to each other the way its done on Axiomtek NA-200, that was
danted to us by them.

This also adds a device driver for the GCU.

tested by me on Axiomtek board
reviewed by claudio@, kettenis@, deraadt@
'commit that as is' deraadt@


# 1.45 10-Aug-2009 deraadt

A few more simple cases of shutdown hooks which only call xxstop, when
we now know the interface has already been stopped


Revision tags: OPENBSD_4_6_BASE
# 1.44 05-Jun-2009 naddy

tidy up promiscuous mode and multicast handling; from Brad; ok sthen@


Revision tags: OPENBSD_4_5_BASE
# 1.43 15-Dec-2008 brad

revert 1.20 now that the new allocator is used to control the number of
RX buffers allocated.

ok dlg@


# 1.42 05-Dec-2008 brad

Garbage collect now unused field in the softc struct again.


# 1.41 03-Dec-2008 dlg

recommit the use of the new mbuf cluster allocator.

this starts em up with 4 mbufs on the rx ring, which will then grow as
usage demands. this also allows em to take advantage of the new livelock
mitigation code as well as freeing up a boatload of kernel memory.

this version of the diff makes sure we only ever post the last descriptor
we filled to the hardware, rather than the whole ring when bringing the
interface up. it has been tested by users who got panics with the previous
diff without trouble.


# 1.40 29-Nov-2008 sthen

revert 1.197 if_em.c, 1.38/1.39 if_em.h, requested by dlg, until a bug
reported on misc@ can be tracked down.

identical diff from jsing.


# 1.39 28-Nov-2008 brad

Garbage collect now unused field in the softc struct.


# 1.38 26-Nov-2008 dlg

rework the filling of the rx ring. this switches us to having the cluster
allocation limited by per ifp statistics, ie, we're not guaranteed to have
mbufs in every slot on the rx ring.

instead of filling the ring with 256 mbufs all the time (about 512KB of
kva) when em is brought up, we give it 4. as demand grows we increase the
number of mbufs allowed on the ring. i will bet most users wont go above
50ish mbufs, so we're saving them 400KB of kva.

tested by many, including one on sparc64
ok claudio@ deraadt@ henning@ krw@


Revision tags: OPENBSD_4_4_BASE
# 1.37 22-Jul-2008 martynas

more negotation -> negotiation; ok sthen@


Revision tags: OPENBSD_4_3_BASE
# 1.36 21-Oct-2007 brad

Allow for the adjustment of the number of RX descriptors
for the newer generations of em(4) chipsets independently
from the first two generations (82542/82543). The first
two generations have hardware errata limiting the upper
maximum to 256 descriptors. The number of RX descriptors
has not been adjusted yet.

ok beck@ henning@ dlg@


Revision tags: OPENBSD_4_2_BASE
# 1.35 30-May-2007 ckuethe

Move the knob for the interrupt throttling register next to the knobs for
the other interrupt moderation schemes.
ok beck drahn


Revision tags: OPENBSD_4_1_BASE
# 1.34 18-Nov-2006 brad

fix comments


# 1.33 17-Nov-2006 brad

Add a lower TX threshold value and use this when checking the number of
available TX descriptors in the case that em_encap() has tried to reclaim
descriptors.

From Jack Vogel@Intel

Tested by brad@, mk@, Gabriel Kihlman <gk at stacken dot kth dot se>,
Johan Mson Lindman <tybollt at solace dot mh dot se>
Tested on amd64/i386/sparc64


# 1.32 14-Nov-2006 brad

Rework the transmit register handling. In em_encap() store the index of
the EOP descriptor in the first descriptor of the packet. In em_txeof()
search for the DD bit set only in the EOP descriptors, embedding the
cleanup of all packet's descriptors into the inner loop.

This change is important for future chips, where the DD bit is going
to be set only in the EOP descriptors.

From Jack Vogel@Intel

Tested by brad@, mk@, reyk@, Gabriel Kihlman <gk at stacken dot kth dot se>,
Johan Mson Lindman <tybollt at solace dot mh dot se>, Jason Dixon and a few
others.
Tested on i386/amd64/sparc64.


# 1.31 10-Nov-2006 brad

Pre-allocate the TX DMA maps intead of creating and destroying a DMA map
per packet sent.

Tested by brad@, ckuethe@, Gabriel Kihlman <gk at stacken dot kth dot se>
and Tim Wiess <tim at nop dot cx>.
Tested with amd64/i386/sparc64.

ok damien@


# 1.30 06-Nov-2006 brad

Sync up to Intel's latest FreeBSD em driver (6.2.9). Adds support
for a few newer Intel PCIe boards, some code removal and cleaning
and a few bug fixes.

From: Jack Vogel@Intel

Tested by mk@ wilfried@ brad@ dlg@, Marc Winiger, Gabriel Kihlman,
Jason Dixon, Johan Mson Lindman, and a few other end users.

Tested with 82543, 82544, 82540, 82545, 82541, 82547, 82546 and 82573.


# 1.29 17-Sep-2006 brad

Overhaul RX path to recover from mbuf cluster allocation failure.
- Create a spare DMA map for RX handler to recover from
bus_dmamap_load() failure.
- Make sure to update status bit in RX descriptors even if we failed
to allocate a new buffer.
- Don't blindly unload DMA map. Reuse loaded DMA map if received
packet has errors.

From yongari@FreeBSD
Tested by myself and a number of end-users on i386/amd64/sparc64


# 1.28 17-Sep-2006 brad

revert revision 1.131, the code in question was later found to not ensure
the proper alignment requirement for the VLAN layer on strict alignment
architectures. This would result in Jumbo's working fine as long as VLANs
were not in use. If VLANs were in use and a packet comes in with a size
of 2046 bytes or larger, it would be corrupted as it came up through the
VLAN layer. Also check the hw max frame size, instead of the MTU, so the
alignment fixup is done as appropriate.

Fixes PR 5185.
Tested by Rui DeSousa with macppc and myself with alpha/sparc64.


Revision tags: OPENBSD_4_0_BASE
# 1.27 04-Aug-2006 brad

branches: 1.27.2;
- merge em/ixgb_disable_promisc() into em/ixgb_set_promisc().
- rearrange interface flags ioctl handler.


# 1.26 07-Jul-2006 brad

Sync up to Intel's latest FreeBSD em driver (6.0.5). Adds support
for new chipset revisions embedded in the ESB2 and ICH8 core logic
chipsets.

The previous attempt at commiting this included an unrelated change
to how the I/O base address was being set and this was the cause of
the breakage.

From: Intel's web-site


# 1.25 05-Jul-2006 brad

revert back to the older driver as this causes some breakage.


# 1.24 03-Jul-2006 brad

Sync up to Intel's latest FreeBSD em driver (6.0.5). Adds support
for new chipset revisions embedded in the ESB2 and ICH8 core logic
chipsets.

From: Intel's web-site


# 1.23 05-Mar-2006 brad

Sprinkle some tabs and a little cleaning.


Revision tags: OPENBSD_3_9_BASE
# 1.22 22-Feb-2006 brad

For 82544 and newer chips increase the number of TX descriptors to 512.


# 1.21 10-Dec-2005 brad

add a shutdown function and register it with shutdownhook_establish().


# 1.20 10-Dec-2005 brad

remove a bit of unused code.

Pointed out by Andrey Matveev <evol at online dot ptt dot ru> through noticing
a missing splx which pointed out the fact that code is unused to me.


# 1.19 18-Nov-2005 brad

PCIX -> PCI-X in a few comments


# 1.18 13-Nov-2005 brad

- Introduce two more stat counters, counting number of RX
overruns and number of watchdog timeouts.
- Do not increase if->if_oerrors in em_watchdog(), since
this leads to counter slipping back, when if->if_oerrors
is recalculated in em_update_stats_counters(). Instead
increase watchdog counter in em_watchdog() and take it
into account in em_update_stats_counters().

From glebius FreeBSD

ok dlg@


# 1.17 24-Oct-2005 brad

Revamp interrupt handling in em(4) driver:

o Do not mask the RX overrun interrupt.

o Rewrite em_intr():
- Axe EM_MAX_INTR.
- Cycle acknowledging interrupts and processing
packets until zero interrupt cause register is
read.
- If RX overrun comes in log this fact.

From glebius FreeBSD

ok krw@ beck@


# 1.16 21-Oct-2005 brad

Remove unused global adapter linked list.

From FreeBSD


# 1.15 15-Oct-2005 brad

- put spl's right in the code and remove the macros
- remove splassert()'s
- remove empty bus_dma_tag_destroy macro from code and header


Revision tags: OPENBSD_3_8_BASE
# 1.14 16-Jul-2005 brad

move headers and remove some FreeBSD specific stuff.


# 1.13 16-Jul-2005 brad

fix support for interrupt mitigation.

ok nate@


# 1.12 02-Jul-2005 deraadt

sync


# 1.11 04-May-2005 brad

remove #ifdef __OpenBSD__


# 1.10 27-Mar-2005 brad

remove FreeBSD ifdef bloat.

ok krw@


Revision tags: OPENBSD_3_7_BASE
# 1.9 08-Dec-2004 markus

powerhook: em_init on resume


# 1.8 16-Nov-2004 brad

- Added fix for 82547 which corrects an issue with Jumbo frames larger than 10k.
- Corrected TBI workaround.
- Corrected incorrect LED operation issues.

From FreeBSD

ok deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.7 18-Jun-2004 mcbride

On architectures which have strict alignment, shift the entire mbuf chain by
ETHER_ALIGN bytes when jumbo packets are enabled (mtu > ETHERMTU).

ok henric@ (slightly different diff)


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.6 19-May-2004 brad

remove duplication, use ETHER_ALIGN from if_ether.h


# 1.5 26-Apr-2004 deraadt

oh we need to model check and not crank > 256 for older cards... do that later


# 1.4 26-Apr-2004 deraadt

this driver had 256 clusters for receive buffers. move to 512, to increase
performance, if the interface is up. at boot time, allocate only 12 though
... though we note that em_stop() frees them all. perhaps some are used to
talk to other parts of the engine though at runtime... tested by mcbride and
beck


# 1.3 18-Apr-2004 henric

Sync with FreeBSD's "em".


Revision tags: OPENBSD_3_4_BASE OPENBSD_3_5_BASE
# 1.2 13-Jun-2003 henric

Sync with FreeBSD's "em".

ok deraadt@


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_A UBC_SYNC_B
# 1.1 24-Sep-2002 nate

branches: 1.1.4; 1.1.8;
Driver for Intel PRO/1000 gigabit ethernet adapters.
This driver should work with all current models of gigabit ethernet adapters.

Driver written by Intel
Ported from FreeBSD by Henric Jungheim <henric@attbi.com>
bus_dma and endian support by me.