History log of /freebsd-9.3-release/sys/dev/e1000/if_em.c
Revision Date Author Comments
(<<< Hide modified files)
(Show modified files >>>)
# 267654 19-Jun-2014 gjb

Copy stable/9 to releng/9.3 as part of the 9.3-RELEASE cycle.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation

# 262153 18-Feb-2014 luigi

MFH: sync the netmap code with the one in HEAD
(enhanced VALE switch, netmap pipes, emulated netmap mode).
See details in the log for svn 261909.


# 257768 06-Nov-2013 luigi

Merge from head: sync the netmap code with the one in HEAD


# 254383 15-Aug-2013 jfv

MFC r254262 Further improve the msix setup, make sure pci_alloc_msix() gives us
the vectors we requested, and fall back to MSI when not, also release
any allocated resources before the fallback.


# 254382 15-Aug-2013 jfv

MFC r254008 Make the fallback from MSIX to MSI interrupt usage more graceful.


# 254306 13-Aug-2013 scottl

Merge r254263:

Update PCI drivers to no longer look at the MEMIO-enabled bit in the PCI
command register. The lazy BAR allocation code in FreeBSD sometimes
disables this bit when it detects a range conflict, and will re-enable
it on demand when a driver allocates the BAR. Thus, the bit is no longer
a reliable indication of capability, and should not be checked. This
results in the elimination of a lot of code from drivers, and also gives
the opportunity to simplify a lot of drivers to use a helper API to set
the busmaster enable bit.

This changes fixes some recent reports of disk controllers and their
associated drives/enclosures disappearing during boot.

Candidate for 9.2

Submitted by: jhb
Reviewed by: jfv, marius, adrian, achim


# 253374 15-Jul-2013 jfv

MFC: r253284, r253285, r253303:

Correct the Intel network driver module builds. They were not
defining INET or INET6, and in the case of ixgbe this will cause
a panic in the TSO setup code, but in all cases the ioctl behavior
is different, this change makes the module and static consistent.

Approved by: re


# 252899 06-Jul-2013 jfv

MFC e1000 driver revisions: 248906,248908,249074,249339,249509
250108,250109,250168,250413,250414


# 250458 10-May-2013 luigi

MFC: sync the version of netmap with the one in HEAD, including device
drivers (mostly simplifying the code in the interrupt handlers).

On passing, also merge r250414, which is related to netmap
and the use of lem/em in virtual machines.


# 248292 14-Mar-2013 jfv

MFC of the E1000 drivers including revisions:
------------------------------------------------------------------------
r238765 | luigi | 2012-07-25 04:28:15 -0700 (Wed, 25 Jul 2012) | 7 lines
Use legacy interrupts as a default. This gives up to 10% speedup
when used in qemu (and this driver is for non-PCIe cards,
so probably its largest use is in virtualized environments).
------------------------------------------------------------------------
r238770 | luigi | 2012-07-25 05:51:33 -0700 (Wed, 25 Jul 2012) | 4 lines
remove some extra testing code that slipped into the previous commit
------------------------------------------------------------------------
r238953 | jfv | 2012-07-31 11:44:10 -0700 (Tue, 31 Jul 2012) | 6 lines
Clean up some unused leftover code from em
Make IRQ style a tuneable
Fix lock handling in the interrupt handler
------------------------------------------------------------------------
r238981 | sbruno | 2012-08-01 17:00:34 -0700 (Wed, 01 Aug 2012) | 9 lines
CPU_NEXT() already handles wrapping around to the beginning. Also, in a
system with sparse CPU IDs, you can have a valid CPU ID > mp_ncpus (e.g. if
you have two CPUs 0 and 4, with mp_maxid == 4 and mp_ncpus == 2).
------------------------------------------------------------------------
r239105 | jfv | 2012-08-06 13:44:05 -0700 (Mon, 06 Aug 2012) | 5 lines
Correct the mq_start routine to avoid out-of-order
packet delivery, always enqueue when possible. Also
correct the DEPLETED test as multiple bits might be
set. Thanks to Randall Stewart for the changes!
------------------------------------------------------------------------
r239109 | jfv | 2012-08-06 15:43:49 -0700 (Mon, 06 Aug 2012) | 6 lines
Make the polling interface in igb able to handle
multiqueue, and correct the rxdone handling. Update
the polling man page to include igb as well.
------------------------------------------------------------------------
r239304 | jfv | 2012-08-15 10:12:40 -0700 (Wed, 15 Aug 2012) | 10 lines
Customer report of a panic on boot due to the old
"m_getjcl:invalid cluster type" that occurred some
time back with the igb driver. This happens often when
booting over the net. I believe the NIC hardware is left
in a warm state when handed over to the driver, and a stray
RX interrupt happens earlier than the code is prepared for
it to happen. This change was verified to fix the problem,
its kind of a bandaid... but it is similar to what was done
in the igb code.
------------------------------------------------------------------------
r240693 | gavin | 2012-09-19 05:27:23 -0700 (Wed, 19 Sep 2012) | 5 lines
Switch some PCI register reads from using magic numbers to using the names
defined in pcireg.h
------------------------------------------------------------------------
r241856 | eadler | 2012-10-21 20:41:14 -0700 (Sun, 21 Oct 2012) | 7 lines
Now that device disabling is generic, remove extraneous code from the
device drivers that used to provide this feature.
------------------------------------------------------------------------
r241885 | eadler | 2012-10-22 06:06:09 -0700 (Mon, 22 Oct 2012) | 7 lines
This isn't functionally identical. In some cases a hint to disable
unit 0 would in fact disable all units. This reverts r241856
------------------------------------------------------------------------
r243570 | glebius | 2012-11-26 12:03:57 -0800 (Mon, 26 Nov 2012) | 14 lines
drbr_enqueue() awlays consumes mbuf, no matter did it
fail or not. The mbuf pointer is no longer valid, so
can't be reused after.
Fix igb_mq_start() where mbuf pointer was used after
drbr_enqueue().
This eventually leads us to all invocations of
igb_mq_start_locked() called with third argument as NULL.
This allows us to simplify this function.
------------------------------------------------------------------------
r245334 | smh | 2013-01-12 08:05:55 -0800 (Sat, 12 Jan 2013) | 9 lines
Fixed mbuf free when receive structures fail to allocate.
This prevents quad igb card on high core machines, without any nmbcluster or
igb queue tuning wedging the boot process if all nics are configured.
------------------------------------------------------------------------
r246128 | sbz | 2013-01-30 10:01:20 -0800 (Wed, 30 Jan 2013) | 5 lines
Use DEVMETHOD_END macro defined in sys/bus.h instead of {0, 0} sentinel on device_method_t arrays
------------------------------------------------------------------------
r246482 | rrs | 2013-02-07 07:20:54 -0800 (Thu, 07 Feb 2013) | 30 lines
This fixes a out-of-order problem with several
of the newer drivers. The basic problem was
that the driver was pulling the mbuf off the
drbr ring and then when sending with xmit(), encounting
a full transmit ring. Thus the lower layer
xmit() function would return an error, and the
drivers would then append the data back on to the ring.
For TCP this is a horrible scenario sure to bring
on a fast-retransmit.

The fix is to use drbr_peek() to pull the data pointer
but not remove it from the ring. If it fails then
we either call the new drbr_putback or drbr_advance
method. Advance moves it forward (we do this sometimes
when the xmit() function frees the mbuf). When
we succeed we always call advance. The
putback will always copy the mbuf back to the top
of the ring. Note that the putback *cannot* be used
with a drbr_dequeue() only with drbr_peek(). We most
of the time, in putback, would not need to copy it
back since most likey the mbuf is still the same, but
sometimes xmit() functions will change the mbuf via
a pullup or other call. So the optimial case for
the single consumer is to always copy it back. If
we ever do a multiple_consumer (for lagg?) we
will need a test and atomic in the put back possibly
a seperate putback_mc() in the ring buf.
------------------------------------------------------------------------
r247064 | jfv | 2013-02-20 16:25:45 -0800 (Wed, 20 Feb 2013) | 19 lines
Refresh on the shared code for the E1000 drivers.
- bear with me, there are lots of white space changes, I would not
do them, but I am a mere consumer of this stuff and if these drivers
are to stay in shape they need to be taken.

em driver changes: support for the new i217/i218 interfaces

igb driver changes:
- TX mq start has a quick turnaround to the stack
- Link/media handling improvement
- When link status changes happen the current flow control state
will now be displayed.
- A few white space/style changes.

lem driver changes:
- the shared code uncovered a bogus write to the RLPML register
(which does not exist in this hardware) in the vlan code,this
is removed.
------------------------------------------------------------------------


# 248078 08-Mar-2013 marius

MFC: r243857 (partial)

Mechanically substitute flags from historic mbuf allocator with
malloc(9) flags in sys/dev.


# 243440 23-Nov-2012 glebius

Merge r241037 from head:
The drbr(9) API appeared to be so unclear, that most drivers in
tree used it incorrectly, which lead to inaccurate overrated
if_obytes accounting. The drbr(9) used to update ifnet stats on
drbr_enqueue(), which is not accurate since enqueuing doesn't
imply successful processing by driver. Dequeuing neither mean
that. Most drivers also called drbr_stats_update() which did
accounting again, leading to doubled if_obytes statistics. And
in case of severe transmitting, when a packet could be several
times enqueued and dequeued it could have been accounted several
times.

o Thus, make drbr(9) API thinner. Now drbr(9) merely chooses between
ALTQ queueing or buf_ring(9) queueing.
- It doesn't touch the buf_ring stats any more.
- It doesn't touch ifnet stats anymore.
- drbr_stats_update() no longer exists.

o buf_ring(9) handles its stats itself:
- It handles br_drops itself.
- br_prod_bytes stats are dropped. Rationale: no one ever
reads them but update of a common counter on every packet
negatively affects performance due to excessive cache
invalidation.
- buf_ring_enqueue_bytes() reduced to buf_ring_enqueue(), since
we no longer account bytes.

o Drivers handle their stats theirselves: if_obytes, if_omcasts.

o mlx4(4), igb(4), em(4), vxge(4), oce(4) and ixv(4) no longer
use drbr_stats_update(), and update ifnet stats theirselves.

o bxe(4) was the most correct driver, it didn't call
drbr_stats_update(), thus it was the only driver accurate under
moderate load. Now it also maintains stats itself.

o ixgbe(4) had already taken stats from hardware, so just
- drop software stats updating.
- take multicast packet count from hardware as well.

o mxge(4) just no longer needs NO_SLOW_STATS define.

o cxgb(4), cxgbe(4) need no change, since they obtain stats
from hardware.

Reviewed by: jfv, gnn


# 242015 24-Oct-2012 gavin

Merge r240680 from head:

Align the PCI Express #defines with the style used for the PCI-X
#defines. This has the advantage that it makes the names more
compact, and also allows us to correct the non-uniform naming of
the PCIM_LINK_* defines, making them all consistent amongst themselves.

This is a mostly mechanical rename:
s/PCIR_EXPRESS_/PCIER_/g
s/PCIM_EXP_/PCIEM_/g
s/PCIM_LINK_/PCIEM_LINK_/g

In this MFC, #defines have been added for the old names to assist
out-of-tree drivers.


# 241366 09-Oct-2012 sbruno

MFC r240879

This patch fixes a nit in the em, lem, and igb driver statistics. Increment
adapter->dropped_pkts instead of if_ierrors because if_ierrors is
overwritten by hw stats collection.

Submitted by: Andrew Boyer <aboyer@averesystems.com>
Reviewed by: Jack F Vogel <jfv@freebsd.org>


# 238262 08-Jul-2012 jfv

MFC of the e1000 drivers: 236406,238148,238151,238181, and 238214

Approved by:re


# 235527 16-May-2012 jfv

MFC of the e1000 drivers: revisions include
227309,228281,228386,228387,228393,228405,
228415,228788,228803,229606,229767,229939,
230023,230024,230742,231796,232238,233708,
234154,234665,235256


# 225736 22-Sep-2011 kensmith

Copy head to stable/9 as part of 9.0-RELEASE release cycle.

Approved by: re (implicit)


# 223676 29-Jun-2011 jhb

- Add read-only sysctls for all of the tunables supported by the igb and
em drivers.
- Make the per-instance 'enable_aim' sysctl truly per-instance by having it
change a per-instance variable (which is used to control AIM) rather
than having all of the per-instance sysctls operate on a single global
variable.

Reviewed by: jfv (earlier version)
MFC after: 1 week


# 221505 05-May-2011 jfv

Add an initialization to the error variable, without
this there is a rare return path that bogusly appears
to fail when it should not. Also white space correction.

Thanks to Arnaud Lacombe for noticing the problem.


# 220254 01-Apr-2011 jfv

Fix to an error condition case, when an mbuf chain
get's defragged due to a mapping failure the header
pointers will be invalidated and can result in a
TSO or other failure down the line. So, when the
remapping occurs force a retry thru the offload
calculation code. Thanks to Andrew Boyer for discovering
this and cooking up the fix!!


# 220251 01-Apr-2011 jfv

Change the refresh_mbuf logic slightly, add an inline
to calculate the outstanding descriptors that need to be
refreshed at any time, and use THAT in rxeof to determine
if refreshing needs to be done. Also change the local_timer
to simply fire off the appropriate interrupt rather than
schedule a tasklet, its simpler.

MFC in two weeks


# 219902 23-Mar-2011 jhb

Do a sweep of the tree replacing calls to pci_find_extcap() with calls to
pci_find_cap() instead.


# 219753 18-Mar-2011 jfv

This delta updates the em driver to version 7.2.2 which has
been undergoing test for some weeks. This improves the RX
mbuf handling to avoid system hang due to depletion. Thanks
to all those who have been testing the code, and to Beezar
Liu for the design changes.

Next the igb driver is updated for similar RX changes, but
also to add new features support for our upcoming i350 family
of adapters.

MFC after a week


# 217591 19-Jan-2011 jfv

Fix for kern/152853, pullup at the wrong point
is breaking UDP. Thanks to Petr Lampa for the
patch.


# 217556 18-Jan-2011 mdf

Specify a CTLTYPE_FOO so that a future sysctl(8) change does not need
to rely on the format string.


# 217318 12-Jan-2011 mdf

sysctl(9) cleanup checkpoint: amd64 GENERIC builds cleanly.

Commit the Intel drivers.


# 217295 11-Jan-2011 jfv

A couple problems discovered by Andrew Boyer:
- failure code in em_xmit got mangled along the way
and was not properly handling errors.
- local timer code had a leftover UNLOCK call that
should be removed.

MFC after 3 days


# 216176 04-Dec-2010 jfv

Correct build error.


# 216172 04-Dec-2010 jfv

Small cut and paste bug in flow control string fixed.
Second, correct the discard/refresh_mbufs code to behave
more like igb, there have been panics due to discards and
this should fix them.

MFC after: 3 days


# 215808 24-Nov-2010 jfv

The purpose of this change is to add a routine to
disable ASPM L0S and L1 LINK states on 82573, 82574,
and 82583. The theory is that this is behind certain
hangs being experienced by some customers.

Also included a small optimization in the rxeof routine
that was in my internal code.

Change the PBA size for pchlan, it was incorrect.

MFC after: 3 days


# 214646 01-Nov-2010 jfv

Sync the lem code up with the vlan and other fixes in em.
Delete a unneeded test from the beginning of em_xmit.
CRITICAL: shared code fix for 82574, a mutex might not be
released, this can cause hangs.


# 214441 27-Oct-2010 jfv

In the data setup code for doing offloads the
ip and tcp pointers were not reset after some
pullups. In practice this led to an NFS mount
failure when using UDP reported by Kevin Lo,
thanks Kevin. Fix from yongari, thank you!


# 214363 25-Oct-2010 jfv

Bug fix delta to the em driver:
- Chasin down bogus watchdogs has led to an improved
design to this handling, the hang decision takes
place in the tx cleanup, with only a simple report
check in local_timer. Our tests have shown no false
watchdogs with this code.
- VLAN fixes from jhb, the shadow vfta should be per
interface, but as global it was not. Thanks John.
- Bug fixes in the support for new PCH2 hardware.
- Thanks for all the help and feedback on the driver,
changes to lem with be coming shortly as well.


# 213234 27-Sep-2010 jfv

Update code from Intel:
- Sync shared code with Intel internal
- New client chipset support added
- em driver - fixes to 82574, limit queues to 1 but use MSIX
- em driver - large changes in TX checksum offload and tso
code, thanks to yongari.
- some small changes for watchdog issues.
- igb driver - local timer watchdog code was missing locking
this and a couple other watchdog related fixes.
- bug in rx discard found by Andrew Boyer, check for null pointer

MFC: a week


# 212902 20-Sep-2010 jhb

Tweak the stats exported by the e1000 drivers:
- Add a single sysctl procedure to all three drivers to read an arbitrary
register (the register is passed as arg2). Use it to replace existing
routines in igb(4) that used a separate routine for each register, and
to add support for missing stats in em(4) and lem(4).
- Move the 'rx_overruns' and 'watchdog_timeouts' stats out of the MAC stats
section as they are driver stats, not MAC counters.
- Simplify the code that creates per-queue stats in igb(4) to use a single
loop and remove duplicated code.
- Properly read all 64 bits of the 'good octets received/transmitted' in
em(4) and lem(4).
- Actually read the interrupt count registers in em(4), and drop the
'host to card' sysctl stats from em(4) as they are not implemented in
any of the hardware this driver supports.
- Restore several stats to em(4) that were lost in the earlier stats
conversion including per-queue stats.
- Export several MAC stats in em(4) that were exported in igb(4) but not
in em(4).
- Export stats in lem(4) using individual sysctls as in em(4) and igb(4).

Reviewed by: jfv
MFC after: 1 week


# 212304 07-Sep-2010 jfv

Code correction in refresh_mbufs, just continuing
without index recalc was wrong.


# 212303 07-Sep-2010 jfv

Tighten up the rx mbuf refresh code, there were some
discrepencies from the igb version which was the target.

Change the message when neither MSI or MSIX are enabled
and a fallback to Legacy interrupts happen, the existing
message was confusing.


# 211913 27-Aug-2010 yongari

Do not allocate multicast array memory in multicast filter
configuration function. For failed memory allocations, em(4)/lem(4)
called panic(9) which is not acceptable on production box.
igb(4)/ixgb(4)/ix(4) allocated the required memory in stack which
consumed 768 bytes of stack memory which looks too big.

To address these issues, allocate multicast array memory in device
attach time and make multicast configuration success under any
conditions. This change also removes the excessive use of memory in
stack.

Reviewed by: jfv


# 211909 27-Aug-2010 yongari

If em(4) failed to allocate RX buffers, do not call panic(9).
Just showing some buffer allocation error is more appropriate
action for drivers. This should fix occasional panic reported on
em(4) when driver encountered resource shortage.

Reviewed by: jfv


# 211907 27-Aug-2010 yongari

Do not call voluntary panic(9) in case of if_alloc() failure.

Reviewed by: jfv


# 209959 12-Jul-2010 jfv

Fix for a panic when TX checksum offload is done and
a packet has only a header in the first mbuf, the
checksum code will dereference a pointer into the
non-existing IP header. Do a check for the size and
pullup if needed. Thanks to Michael Tuexen for this
fix.

MFC: asap - should be in 8.1 IMHO


# 209259 17-Jun-2010 jfv

Two stats were duplicated, thanks to Andrew Boyer
for pointing this out.


# 209242 16-Jun-2010 gnn

Move statistics into the sysctl tree making it easier to find
and use them.
Add previously hidden statistics, some of which include interrupt
and host/card communication counters.


# 209238 16-Jun-2010 jfv

Changes from John Baldwin adding to last commit,
change rxeof api for poll friendliness, and
eliminate unnecessary link tasklet use. Thanks John!


# 208117 15-May-2010 marius

Fix a mismerge in r206001.

PR: 146614
Approved by: jfv (implicit)
MFC afer: 3 days


# 208103 14-May-2010 jfv

Small changes preparing for MFC, need to conditionalize
the buf_ring_free call, and lem is missing the WOL change
put into em.


# 207337 28-Apr-2010 jfv

Address the LOD that some are seeing, put the RX lock
back in rxeof (I could see little point in taking it out),
and now release it before the stack entry.

Also, make it so the 82574 does not configure for multiqueue
when its not used in the stack.


# 207331 28-Apr-2010 jfv

Change default WOL back to MAGIC only, having
multicast enabled causes problems in man environments.


# 206629 14-Apr-2010 jfv

Add a missing fragment in the tx msix handler to invoke
another if all work is not done.

Sync the igb driver with changes suggested by yongari and
made in em, these made sense to be in both drivers.


# 206460 10-Apr-2010 jfv

The lock move in rxeof necessitated a couple
more places to do the locking, fixes a panic.


# 206447 10-Apr-2010 jfv

Correct broken build.


# 206437 09-Apr-2010 jfv

A few more changes from yongari:
- code flow in handler could let interrupt be
reenabled when not wanted.
- change where the RX lock is taken to improve
performance.
- adapter->msix is true for MSI systems also,
it needs to explicitly test for 82574, good one :)


# 206429 09-Apr-2010 jfv

Incorporate suggested improvements from yongari.

Also, from feedback, make the multiqueue code an
option (EM_MULTIQUEUE) that is off by default.
Problems have been seen with UDP when its on.


# 206403 08-Apr-2010 jfv

Three changes:
- add CRC stripping to the RX side, this was handled
by some obscure code in rxeof previously, its easier
to simply have the hardware strip it now.
- Add back an ALTQ change that slipped between the cracks
- Add an update to the watchdog_time in the xmit code, not
doing this in ixgbe caused problems, think its needed here
as well.


# 206388 07-Apr-2010 jfv

Important fix got clobbered in the em driver, keeping
VLAN HWFILTER from being used by default, this breaks
stacked pseudo devices, and as it turns out, also breaks
virtual machines that happen to use VLANS (didn't know that
before :). Put the fix back into the em driver, and for good
measure add the same code to the igb driver where it should
have been anyway.


# 206001 31-Mar-2010 marius

Hook the identification LEDs of igb(4), lem(4) and em(4) devices up with
led(4) so they can be lit or f.e. made blink via `echo f2 > /dev/led/em0`
for localization purposes.

Approved by: jfv
MFC afer: 1 week (after r205869)


# 205884 30-Mar-2010 jfv

Fix lint build problem.


# 205869 29-Mar-2010 jfv

Update to igb and em:

em revision 7.0.0:
- Using driver devclass, seperate legacy (pre-pcie) code
into a seperate source file. This will at least help
protect against regression issues. It compiles along
with em, and is transparent to end use, devices in each
appear to be 'emX'. When using em in a modular form this
also allows the legacy stuff to be defined out.
- Add tx and rx rings as in igb, in the 82574 this becomes
actual multiqueue for the first time (2 queues) while in
other PCIE adapters its just make code cleaner.
- Add RX mbuf handling logic that matches igb, this will
eliminate packet drops due to temporary mbuf shortage.

igb revision 1.9.3:
- Following the ixgbe code, use a new approach in what
was called 'get_buf', the routine now has been made
independent of rxeof, it now does the update to the
engine TDT register, this design allows temporary
mbuf resources to become non-critical, not requiring
a packet to be discarded, instead it just returns and
does not increment the tail pointer.
- With the above change it was also unnecessary to keep
'spare' maps around, since we do not have the discard
issue.
- Performance tweaks and improvements to the code also.

MFC in a week


# 203834 13-Feb-2010 mlaier

Fix drbr and altq interaction:
- introduce drbr_needs_enqueue that returns whether the interface/br needs
an enqueue operation: returns true if altq is enabled or there are
already packets in the ring (as we need to maintain packet order)
- update all drbr consumers
- fix drbr_flush
- avoid using the driver queue (IFQ_DRV_*) in the altq case as the
multiqueue consumer does not provide enough protection, serialize altq
interaction with the main queue lock
- make drbr_dequeue_cond work with altq

Discussed with: kmacy, yongari, jfv
MFC after: 4 weeks


# 203354 01-Feb-2010 jfv

A few minor changes: add altq option header, add missing conditional
around a buf_ring call that will break 7.3, and thanks to Fabien Thomas
add POLLING support for igb and a minor related fix in the em driver.


# 203179 29-Jan-2010 jfv

Fix for kern/141646: when stacking pseudo drivers like
lagg and vlan the vlan attach/detach event is not being
handed down to em, this caused some init code not to run,
and thus VLANs did not work. Ultimately having the event
get propagated would be nice, but for now the solution is
to have HWFILTER off by default, when this is the case
VLANs will work, ifconfig can be used to turn it on and
then get HW tag filtering.


# 203051 26-Jan-2010 jfv

Missing a fix for the new watchdog handling.


# 203049 26-Jan-2010 jfv

Update the 1G drivers, shared code sync with Intel,
igb now has a queue notion that has a single interrupt
with an RX/TX pair, this will reduce the total interrupts
seen on a system. Both em and igb have a new watchdog
method. igb has fixes from Pyun Yong-Hyeon that have
improved stability, thank you :)

I wish to MFC this for 7.3 asap, please test if able.


# 201758 07-Jan-2010 mbr

Remove extraneous semicolons, no functional changes.

Submitted by: Marc Balmer <marc@msys.ch>
MFC after: 1 week


# 200243 07-Dec-2009 jfv

Resync with Intel versions of both the em and igb
drivers. These add new hardware support, most importantly
the pch (i5 chipset) in the em driver. Also, both drivers
now have the simplified (and I hope improved) watchdog
code. The igb driver uses the new RX cleanup that I
first implemented in ixgbe.

em - version 6.9.24
igb - version 1.8.4


# 197078 10-Sep-2009 jfv

Fix build complaint from previous checkin


# 197073 10-Sep-2009 jfv

Fix for pr 138516
An mbuf is not requeued when a xmit fails.

MFC: 3 days


# 196970 08-Sep-2009 phk

Revert previous commit and add myself to the list of people who should
know better than to commit with a cat in the area.


# 196969 08-Sep-2009 phk

Add necessary include.


# 196386 19-Aug-2009 delphij

Temporarily enhance em(4) and igb(4) hack to take account for IFF_NOARP.
Without this changeset there will be no way to prevent these NICs from
sending ARP, which is harmful in server farms that is configured as
"Direct Server Return" behind a load balancer.

A better fix would remove the whole hack completely but it would be
later than 8.0-RELEASE.

Reviewed by: jfv, yongari
Approved by: re (kib)


# 195857 24-Jul-2009 jfv

Improvement on the last change, this gives a precise
way to tell the one and only interface that a vlan
event is for. Thanks to John Baldwin for the patch.

Approved by: re


# 195851 24-Jul-2009 jfv

This delta fixes two bugs:
- When a vlan event occurs a check was not made that
the event was actually for the interface, thus resulting
in a panic. All three drivers have this vulnerability. Add
a check for this condition.
- Secondly, there was a duplicate buf_ring free in the em
driver resulting in a panic on unload. Remove.

Approved by: re


# 195168 29-Jun-2009 jfv

Type problem when FreeBSD is in a virtualized environment, the
result was when the RX index wrapped it was converted into some
sort of gibberish and written into the RDT register, effectively
killing the RX side of the thing :)

Approved by: re


# 195049 26-Jun-2009 rwatson

Use if_maddr_rlock()/if_maddr_runlock() rather than IF_ADDR_LOCK()/
IF_ADDR_UNLOCK() across network device drivers when accessing the
per-interface multicast address list, if_multiaddrs. This will
allow us to change the locking strategy without affecting our driver
programming interface or binary interface.

For two wireless drivers, remove unnecessary locking, since they
don't actually access the multicast address list.

Approved by: re (kib)
MFC after: 6 weeks


# 194865 24-Jun-2009 jfv

Updates for both the em and igb drivers, add support
for multiqueue tx, shared code updates, new device
support, and some bug fixes.


# 193096 30-May-2009 attilio

When user_frac in the polling subsystem is low it is going to busy the
CPU for too long period than necessary. Additively, interfaces are kept
polled (in the tick) even if no more packets are available.
In order to avoid such situations a new generic mechanism can be
implemented in proactive way, keeping track of the time spent on any
packet and fragmenting the time for any tick, stopping the processing
as soon as possible.

In order to implement such mechanism, the polling handler needs to
change, returning the number of packets processed.
While the intended logic is not part of this patch, the polling KPI is
broken by this commit, adding an int return value and the new flag
IFCAP_POLLING_NOCOUNT (which will signal that the return value is
meaningless for the installed handler and checking should be skipped).

Bump __FreeBSD_version in order to signal such situation.

Reviewed by: emaste
Sponsored by: Sandvine Incorporated


# 192081 14-May-2009 kmacy

Call drbr_stats_update to update ifp stats directly when we bypass the buf_ring on transmit


# 191612 27-Apr-2009 kmacy

fix typo in conditional


# 191611 27-Apr-2009 kmacy

collapse the two em_start_locked routines in to one


# 191580 27-Apr-2009 jfv

Correct fat finger mistake


# 191566 27-Apr-2009 jfv

Thanks for Michael Tuexen for tracking down a path where
the watchdog timer was not being rearmed in txeof, and also
a missing case in the new code.

MFC after: 2 weeks


# 191442 23-Apr-2009 kmacy

fix typo


# 191441 23-Apr-2009 kmacy

fix panic when using msix

Pointed out by Nate Whitehorn


# 191440 23-Apr-2009 kmacy

Make sure the ALTQ case is handle correctly by using drbr_dequeue


# 191162 16-Apr-2009 kmacy

call base if_qflush routine to flush if_snd


# 191038 14-Apr-2009 kmacy

- define em_transmit and em_qflush
- make buF_ring usage conditional but enabled by default

Reviewed by: jfv


# 190872 09-Apr-2009 jfv

This delta syncs the em and igb drivers with Intel,
adds header split and SCTP support into the igb driver.
Various small improvements and fixes.

MFC after: 2 weeks


# 185748 07-Dec-2008 thompsa

Restore opt_inet.h include which was lost in the last commit.


# 185353 26-Nov-2008 jfv

This delta is primarily a fix for es2lan devices that
will sometimes fail to initialize problem due to a lock
contention with management hardware. However, in order to
deliver that fix it was necessary to take a shared code
update as a whole, and this required scattered changes in
the core code to be compatible.

The em driver now has VLAN HW support added as the igb
driver had previously.

MFC after: ASAP - in time for 7.1 RELEASE


# 184717 06-Nov-2008 bz

Hide AF_INET specific ioctl handling under #ifdef INET.

MFC after: 2 months


# 181027 30-Jul-2008 jfv

Merge of the source for igb and em into dev/e1000, this
proved to be necessary to make the static drivers work
in EITHER/OR or BOTH configurations. Modules will still
build in sys/modules/igb or em as before.

This also updates the igb driver for support for the 82576
adapter, adds shared code fixes, and etc....

MFC after: ASAP


# 179181 21-May-2008 jfv

Thanks to report from Neil Hoggarth I found a missing UNLOCK in
the watchdog code. This delta also incorporates some missing PCI
IDs that got added.

PR 122928 - might be fixed by this, no verification at this point.


# 179136 19-May-2008 jfv

This small change will allow this driver in HEAD to build
on 6.3 as well as 7 :)


# 178523 25-Apr-2008 jfv

This delta has a few important items:

PR 122839 is fixed in both em and in igb

Second, the issue on building modules since the static kernel
build changes is now resolved. I was not able to get the fancier
directory hierarchy working, but this works, both em and igb
build as modules now.

Third, there is now support in em for two new NICs, Hartwell
(or 82574) is a low cost PCIE dual port adapter that has MSIX,
for this release it uses 3 vectors only, RX, TX, and LINK. In
the next release I will add a second TX and RX queue. Also, there
is support here for ICH10, the followon to ICH9. Both of these are
early releases, general availability will follow soon.

Fourth: On Hartwell and ICH10 we now have IEEE 1588 PTP support,
I have implemented this in a provisional way so that early adopters
may try and comment on the functionality. The IOCTL structure may
change. This feature is off by default, you need to edit the Makefile
and add the EM_TIMESYNC define to get the code.

Enjoy all!!


# 177867 02-Apr-2008 jfv

This update primarily addresses the ability to have both the em
and the igb driver static in the kernel. But it also reflects
some other bug fixes in my development stream at Intel.
PR 122373 is also fixed in this code.


# 176667 29-Feb-2008 jfv

This change introduces a split to the Intel E1000 driver, now rather than
just em, there is an igb driver (this follows behavior with our Linux drivers).
All adapters up to the 82575 are supported in em, and new client/desktop support
will continue to be in that adapter.

The igb driver is for new server NICs like the 82575 and its followons.
Advanced features for virtualization and performance will be in this driver.

Also, both drivers now have shared code that is up to the latest we have
released. Some stylistic changes as well.

Enjoy :)


# 174060 28-Nov-2007 jfv

Add COHERENT to descriptor mem allocation for the
benefit of ARM (request from Olivier Houchard), its
a noop on most architectures and goodness on those
that use it.


# 173952 26-Nov-2007 jfv

Fix for a reported panic in certain circumstances. When
calling em_stop() now make sure the TX lock is held as
well as CORE.


# 173820 21-Nov-2007 ru

Take out em_poll() prototype from under EM_FAST_IRQ control.

Reported by: tindebox compiling a LINT kernel


# 173789 20-Nov-2007 jfv

One nit, FAST handling is now in #ifdef's for compatibility
between RELEASES, but we want it on by default in 7 and later,
add that define, and take out a fragment left from a workaround
being removed.


# 173788 20-Nov-2007 jfv

Driver version 6.7.3
- Bring HEAD up to the latest shared code
- Fix TSO problem using limited MSS and forwarding
- Dual lock implementation
- New device support
- For my ease, this code can compile in either 6.x or later
- brings this driver in sync with the 6.3


# 172138 10-Sep-2007 jfv

A number of small fixes:
- duplicate #define in header, thanks to Kevin Lo for pointing out.
- incorrect BUSMASTER enable logic, thanks Patrick Oeschger
- 82543 fails due to bogus IO BAR logic
- Allow 82571 to use MSI interrupts
- Checksum Offload for UDP not working on 82575

Approved by:re


# 171744 06-Aug-2007 rwatson

Remove the now-unused NET_{LOCK,UNLOCK,ASSERT}_GIANT() macros, which
previously conditionally acquired Giant based on debug.mpsafenet. As that
has now been removed, they are no longer required. Removing them
significantly simplifies error-handling in the socket layer, eliminated
quite a bit of unwinding of locking in error cases.

While here clean up the now unneeded opt_net.h, which previously was used
for the NET_WITH_GIANT kernel option. Clean up some related gotos for
consistency.

Reviewed by: bz, csjp
Tested by: kris
Approved by: re (kensmith)


# 171624 27-Jul-2007 cognet

Use coherent mapping for DMA on arm. This is propably suitable for the
other archs, but I can't test it so I made it conditionnal on __arm__
for now.

Approved by: re (blanket)


# 170171 31-May-2007 jfv

Couple of the fixes needed revising. The ICH8 autoneg was still broken,
this change both simplifies the code and plugs a hole where the devise
was reset without keeping the management controller at bay :) Second,
the 82571 LAA reset problem was incomplete, this addition is necessary.
Just one of those days :)


# 170141 30-May-2007 jfv

A few small but significant fixes:
- Coverity Prevent(tm) CID 1906 a bogus use of bzero where unneeded.
- ICH8 systems autoneg to 100 rather than 1000, this can also be
seen in 82573, the logic was backwards.
- On new 82575 quadports half duplex tx speed is slow... this was due
to overwriting TCTL reg rather than adding bits.


# 169955 24-May-2007 jfv

Fix for PR 112937, thanks to Ruslan Ermilov. I am still
a bit confused how the 'link flap' was connected to the
'get' rather than 'set' address, but this seems the right
thing to do here.


# 169918 23-May-2007 jfv

Two minor fixes, keep old 82542 from using jumbo frames, and add
missing htole64 in encap code.

Reviewed by:Pdeuskar
Approved by:Pdeuskar


# 169637 16-May-2007 jfv

Couple of changes, back down on last TSO change, instead make old
adapter list still capable, but only PCI-E adapters are now enabled.
The user can enable older PCI-X or PCI adapters using ifconfig.
Secondly, Arthur Hartwig pointed out my MSI change was not working
correctly, changed to something that now does. Thanks Arthur.
There was also a fundamental bug in the 82575 MSIX code, the MSIX
registers had to be mapped, opps :)

Rubber-stamped by: Pdeuskar


# 169589 15-May-2007 jfv

This delta adds two bug fixes: one that makes HW Offload logic in
legacy codepath match the 82575, without this we were seeing bridging
fail on 82546 adapters. Secondly, I have limited TSO to PCI Express
adapters, I meant to do this and it got dropped in the earlier delta.
Next, I am dropping in the latest shared code from our development
team, consensus was that this should be done frequently, so I am :)

Approved by: pdeuskar


# 169483 11-May-2007 jfv

Mistake in the logic deciding what adapters need
to map the IO BAR. Causing the driver to fail on
th 82542.

Reviewed by:pdeuskar
Approved by:pdeuskar


# 169397 08-May-2007 jfv

A couple bug fixes that I've had internally at Intel. First is a long
time workaround for problems with 82571 adapters and LAAs, one port
getting reset can cause the other to have its RAR[0] also reset,
thus overwriting an LAA. This fix works around it by also keeping
the address in the last array member.

The other bug is specific to the new 575 adapter, its transmit code
logic in handling hwassists was too crude, it broken when doing
bridges. I am much happier with the new logic,we may want to change
the legacy path at some point to something similar.

Reviewed by: pdeuskar
Approved by: pdeuskar


# 169248 04-May-2007 rwatson

$FreeBSD$ tags are not compilable C code; wrap in either __FBSDID() or
in comments for .c and .h files respectively. Jack may want to clean up
style or other aspects once he's up and about again, but this gets the
kernel compiling.


# 169240 03-May-2007 jfv

Merge in the new driver (6.5.0) of Intel. This has a new
shared code infrastructure that is family specific and
modular. There is also support for our latest gigabit
nic, the 82575 that is MSI/X and multiqueue capable.

The new shared code changes some interfaces to the core
code but testing at Intel has been going on for months,
it is fairly stable.

I have attempted to be careful in retaining any fixes that
CURRENT had and we did not, I apologize in advance if any
thing gets clobbered, I'm sure I'll hear about it :)

Approved by pdeuskar


# 167098 28-Feb-2007 ru

Revert previous change and take back a pointy hat.


# 167096 28-Feb-2007 ru

Fix panic on boot caused by setting up a NULL interrupt handler.

Submitted by: Goran Gajic
Pointy hat to: piso


# 166901 23-Feb-2007 piso

o break newbus api: add a new argument of type driver_filter_t to
bus_setup_intr()

o add an int return code to all fast handlers

o retire INTR_FAST/IH_FAST

For more info: http://docs.freebsd.org/cgi/getmsg.cgi?fetch=465712+0+current/freebsd-current

Reviewed by: many
Approved by: re@


# 164547 23-Nov-2006 kmacy

remove no longer correct comment above em_read_pcie_cap_reg


# 164546 23-Nov-2006 kmacy

Move magic PCIe workaround constant to header - add appropriate comment

Suggested by: jfvogel


# 164534 22-Nov-2006 kmacy

Fix TSO support on sun4v

- incorporate csjp's fix for a mishandled endian conversion
- convert PAGE_SIZE to 4096 for PCIe adapter workaround (my page size is not 4k)
- implement em_read_pcie_cap_reg where we set the max read size on pcie to 4k (taken from mxge)

Reviewed by: scottl and jfvogel


# 164397 18-Nov-2006 csjp

Implement new ETHER_BPF_MTAP macro. Roll back the various changes
made to accommodate the chip being in promiscuous mode while
offloading VLAN tag processing to the hardware. We can now
properly handle the absence of VLAN tags from hardware stripping.

Reviewed by: rwatson, andre
MFC after: 1 month


# 164305 15-Nov-2006 jhb

Add MSI support to em(4), bce(4), and mpt(4). For now, we only support
devices that support a maximum of 1 message, and we use that 1 message
instead of the INTx rid 0 IRQ with the same interrupt handler, etc.


# 164126 09-Nov-2006 glebius

Instead of using the legacy if_timer/if_watchdog interface create
our own watchdog that piggybacks on the em_local_timer() routine.

We suppose that the if_timer/if_watchdog interface should be
obsoleted, since it doesn't fit the modern SMP network stack.
NIC drivers should create their own watchdogs, that check and
clear the timers always holding driver's lock.

In collaboration with: jfv, scottl


# 163881 01-Nov-2006 jhb

Fix compile botch in the last panic botch fix. :(

Pointy hat: jhb
Reported by: brueffer


# 163876 01-Nov-2006 jhb

Fix botch in last commit (I tested on 6.x which doesn't have TSO):
- Test the mac_type rather than if_hwassist (since ifp doesn't exist yet)
to determine if the adapter supports TSO and thus to change the sizes
for the bus_dma tag.

Reviewed by: glebius


# 163828 31-Oct-2006 jhb

Allocate receive and transmit data structures during attach() and free them
during detach() similar to other NIC drivers rather than allocating them
during init() and freeing them during stop():
- Move creation of tx bus_dma tag amd maps and tx_buffer_area from
em_setup_transmit_structures() to em_allocate_transmit_structures().
- Call em_allocate_xxx_structures() in em_attach().
- Only call em_free_xxx_structures() in em_detach().
- Change em_setup_xxx_structures() to free any existing tx or rx buffers
and in the case of rx repopulate the ring with newer buffers.

Reviewed by: jfv


# 163827 31-Oct-2006 jhb

- Use callout_init_mtx() to close various callout-related races.
- Drain the two timers in detach.
- Check IFF_DRV_RUNNING in the link task and bail w/o doing anything if
it is clear.

Reviewed by: jfv, scottl


# 163826 31-Oct-2006 glebius

Rework the transmit register handling. In em_encap() store index of
the EOP descriptor in the first descriptor of the packet. And then
in em_txeof() search for DD bits set only in the EOP descriptors,
embedding the cleanup of all packet's descriptors into inner loop.

This change is important for future chips, where DD bit is going
to be set only on the EOP descriptors.

Submitted by: jfv


# 163824 31-Oct-2006 glebius

Merge new vendor release - 6.2.9.

Details:
o if_em.c changes:
- Added several new PCI ids.
- Check em_check_phy_reset_block() before doing SIOCSIFMEDIA ioctl.
- Don't touch TARC registers, they are now handled in shared
code in if_em_hw.c.
- Move RDH and RDT setting to the end of
em_initialize_receive_unit().
- Declare em_read_pcie_cap_reg(), now empty.
o if_em_hw.c dropped in from vendor, then restored rev. 1.15.
o if_em_hw.h dropped in from vendor, then modified:
- Added RX overrun interrupt flag to interrupt enable mask.
- Remove declarations of em_io_read(), em_io_write().

Approved by: jfv


# 163730 28-Oct-2006 jfv

Backout bogus checkin to HEAD
Approved by: scottl


# 163724 27-Oct-2006 jfv

This is the merge of the Intel 6.2.9 driver. It provides all new shared code,
new device support, and it is hoped a more stable driver for 6.2. RELEASE.
This checkin was discussed and approved today by RE, scottl, jhb, and pdeuskar


# 162819 29-Sep-2006 andre

Back out rev. 1.152 as it was breaking vlan tag insertion when vlan tag
stripping was disabled due to being in promisc mode. This is a hardware
bug. Update comment to explicitly state the reason the manual vlan tag
insertion in this case. See rev. 1.53 for further information as well.

Noticed by: jhb


# 162790 29-Sep-2006 andre

Small style and comment adjustments.

Reviewed by: jfv


# 162789 29-Sep-2006 andre

Remove manual vlan header insertion in em_encap(). It is unnecessary as the
generic vlan_start() takes care of it when vlan hardware insertion is disabled.

In em_set_promisc() add a note that BPF may also be enabled without going into
promisc mode.

Reviewed by: jfv


# 162785 29-Sep-2006 andre

Change em_transmit_checksum_setup() to deal with already inserted vlan headers,
IP options and add skeleton IPv6 support. The new code structure can also be
easily enhanced to support new/more protocols (SCTP) in the future.

Reviewed by: jfv


# 162784 29-Sep-2006 andre

Change em_tso_setup() to deal with already inserted vlan headers, IP options
and add skeleton IPv6 support. The new code structure can also be easily
enhanced to support new/more protocols (SCTP) and IP fragmentation in the
future.

In em_encap() only try to do TSO if 'dotso' is true.

Reviewed by: jfv


# 162783 29-Sep-2006 andre

Only advertize IFCAP_TSO4 capabilities. IPv6 is not yet supported.

Reviewed by: jfv


# 162782 29-Sep-2006 andre

Handle all error cases from bus_dmamap_load_mbuf_sg(). Those are:

- EFBIG means the mbuf chain was too long and bus_dma ran out of segments.
Defragment the mbuf chain and try again. (Already existed, not changed.)
- ENOMEM means bus_dma could not obtain enough bounce buffers at this point
in time. Defer sending and try again later.
- All other errors, in particular EINVAL, are fatal and prevent the mbuf
chain from ever going through. Drop it and report error.
- Checking (nsegs == 0) is unnecessary as bus_dmamap_load_mbuf_sg() always
reports an error if it is < 1.

This prevents broken packets from clogging the interface queue indefinately.

Discussed with: scottl
Reviewed by: jfv


# 162532 21-Sep-2006 andre

Move the initialization of the hardware capabilities in em_init_locked()
before em_setup_transmit_structures() as it needs this information to
properly set up TSO parameters.

Reviewed by: jfv


# 162425 18-Sep-2006 andre

Don't forget to add curly braces when doing more than one line of actions
after a 'if' statement.

Pointy hat to: andre


# 162375 17-Sep-2006 andre

Move ethernet VLAN tags from mtags to its own mbuf packet header field
m_pkthdr.ether_vlan. The presence of the M_VLANTAG flag on the mbuf
signifies the presence and validity of its content.

Drivers that support hardware VLAN tag stripping fill in the received
VLAN tag (containing both vlan and priority information) into the
ether_vtag mbuf packet header field:

m->m_pkthdr.ether_vtag = vlan_id; /* ntohs()? */
m->m_flags |= M_VLANTAG;

to mark the packet m with the specified VLAN tag.

On output the driver should check the mbuf for the M_VLANTAG flag to
see if a VLAN tag is present and valid:

if (m->m_flags & M_VLANTAG) {
... = m->m_pkthdr.ether_vtag; /* htons()? */
... pass tag to hardware ...
}

VLAN tags are stored in host byte order. Byte swapping may be necessary.

(Note: This driver conversion was mechanic and did not add or remove any
byte swapping in the drivers.)

Remove zone_mtag_vlan UMA zone and MTAG_VLAN definition. No more tag
memory allocation have to be done.

Reviewed by: thompsa, yar
Sponsored by: TCP/IP Optimization Fundraise 2005


# 162235 11-Sep-2006 pdeuskar

Fix issues found by Coverity (223392, 223393) due to TSO additions

Submitted by: Matthew Jacob


# 162206 10-Sep-2006 pdeuskar

Fix style(9) issues in the TSO specific changes.

Pointed out by: jmallett


# 162187 09-Sep-2006 pdeuskar

Second attempt at fixing module build

Pointyhat: pdeuskar


# 162186 09-Sep-2006 pdeuskar

Fix build breakage while compiling em as a module.


# 162171 09-Sep-2006 pdeuskar

Add support for TSO. Thanks to Andre for adding support in the stack
and Jack Vogel for driver changes.

Submitted by: Jack Vogel


# 161928 02-Sep-2006 jmg

add a newbus method for obtaining the bus's bus_dma_tag_t... This is
required by arches like sparc64 (not yet implemented) and sun4v where there
are seperate IOMMU's for each PCI bus... For all other arches, it will
end up returning NULL, which makes it a no-op...

Convert a few drivers (the ones we've been working w/ on sun4v) to the
new convection... Eventually all drivers will need to replace the parent
tag of NULL, w/ bus_get_dma_tag(dev), though dev is usually different for
each driver, and will require hand inspection...

Reviewed by: scottl (earlier version)


# 161823 01-Sep-2006 jhb

Comment tweaks.


# 161822 01-Sep-2006 jhb

- Use pci_enable_busmaster() and pci_enable_io() to update the command
register. This really shouldn't be using pci_enable_io() directly as
bus_alloc_resource() does it already, but the cached copy of the
command word needs to be correct so the enable/disable mwi functions
work properly.
- Use pci bus accessors to read revision ID and subvendor IDs.

Reviewed by: jvogel


# 161821 01-Sep-2006 jhb

Add locking to the ifmedia callouts.

Reviewed by: jvogel, yongari


# 161810 01-Sep-2006 glebius

Fix my error in rev. 1.109.

Submitted by: jhb
Pointy hat to: glebius


# 161777 31-Aug-2006 jhb

Compare the correct field against NULL when determining whether or not to
do bus_teardown_intr().


# 161521 22-Aug-2006 yongari

It seems that em(4) misses Tx completion interrupts under certain
conditions. The cause of missing Tx completion interrupts comes from
Tx interrupt moderation mechanism(delayed interrupts) or chipset bug.
If Tx interrupt moderation mechanism is the cause of false watchdog
timeout error we should have to fix all device drivers that have Tx
interrupt moderation capability. We may need more investigation
for this issue. Anyway, the fix is the same for both cases.

This should fix occasional watchdog timeout errors seen on a few
systems.

Reported by: -net, Patrick M. Hausen < hausen AT punkt DOT de >
Tested by: Patrick M. Hausen < hausen AT punkt DOT de >


# 161372 16-Aug-2006 yongari

Don't update Rx descriptor status in two different functions.

Suggested by: pdeuskar
Reviewed by: pdeuskar


# 161278 14-Aug-2006 glebius

Change hardcoded and incorrect number with correct define. This change is a
nop, since E1000_FDX_COLLISION_DISTANCE == E1000_HDX_COLLISION_DISTANCE.

PR: kern/101000
Submitted by: Doug Havir


# 161267 14-Aug-2006 yongari

Make em(4) handle too many fragmented frame with m_defrag(9).
Previously em(4) requeued the failed mbuf chains from
bus_dmamap_load_mbuf_sg(9) failure to resend it later. However,
bus_dmamap_load_mbuf_sg(9) may never complete its request as the
fragmented frames can have more than EM_MAX_SCATTER segments.
To handle the above EFBIG case, defragment the frame with m_defrag(9)
and free the mbuf chain if it can't deframent the chain due to
resource shortage.

Reviewed by glebius (with improvements)


# 161266 13-Aug-2006 yongari

Overhaul Rx path to recover from mbuf cluster allocation failure.
o Create one more spare DMA map for Rx handler to recover from
bus_dmamap_load_mbuf_sg(9) failure.
o Make sure to update status bit in Rx descriptors even if we failed
to allocate a new buffer. Previously it resulted in stuck condition
and em_handle_rxtx task took up all available CPU cycles.
o Don't blindly unload DMA map. Reuse loaded DMA map if received
packet has errors. This would speed up Rx processing a bit under
heavy load as it does not need to reload DMA map in case of error.
(bus_dmamap_load_mbuf_sg(9) is the most expensive call in driver
context.)
o Update if_iqdrops counter if it can't allocate a mbuf cluster.
With this change it's now possible to see queue dropped packets
with netstat(1).
o Update mbuf_cluster_failed counter if fixup code failed to
allocate mbuf header.
o Return ENOBUFS instead of ENOMEM in case of Rx fixup failure.
o Make adapter->lmp NULL in case of Rx fixup failure. Strictly
specking it's not necessary for correct operation but it makes
the intention clear.
o Remove now unused dropped_pkts member in softc.

With these changes em(4) should survive mbuf cluster allocation
failure on Rx path.

Reviewed by: pdeuskar, glebius (with improvements)


# 161265 13-Aug-2006 yongari

Apply alignment fixup only when programmed frame size is greater than
MCLBYTES - ETHER_ALIGN. Previously it applied the alignment fixup code
for oversized frames which would result in reduced performance on
strict alignment archs.


# 161205 11-Aug-2006 glebius

Merge in new driver from Intel, version 6.1.4. It adds support for
82571EB quad port copper NIC and has few minor fixes.

Details:
- if_em.c. Merged manually, viewing diff between new vendor
driver and previous one.
- if_em_hw.c. Dropped in from vendor, and then restored
revision 1.15.


# 161134 09-Aug-2006 pdeuskar

10/100 PHY shouldn't support gigabit media types.

Submitted by: brad (brad@comstyle.com)
Obtained from: OpenBSD
MFC after: 1 week


# 160964 04-Aug-2006 yar

Commit the results of the typo hunt by Darren Pilgrim.
This change affects documentation and comments only,
no real code involved.

PR: misc/101245
Submitted by: Darren Pilgrim <darren pilgrim bitfreak org>
Tested by: md5(1)
MFC after: 1 week


# 160956 03-Aug-2006 pdeuskar

Revert back changes to made in rev 1.109 of if_em.c which were unnecessary.
This makes it easier for us to get the changes into -current and to -stable quickly.


# 160949 03-Aug-2006 glebius

Merge in new driver from Intel, version 6.0.5. It adds support for
80003 NICs and NICs found on ICH8 mobos, and improves support for
already known chips.

Details:
- if_em.c. Merged manually, viewing diff between new vendor
driver and previous one. This was an easy task, because
most changes between 5.1.5 and 6.0.5 are bugfixes taken
from FreeBSD.
- if_em_hw.h. Dropped in from vendor, and then restored
revisions 1.16, 1.17, 1.18.
- if_em_hw.c. Dropped in from vendor, and then restored
revision 1.15.
- if_em_osdep.h. Added new required macros from vendor file
and add a hack against define namespace mangling in
if_em_hw.h. Intel made another hack, but I prefer mine.


# 160734 26-Jul-2006 yongari

Prepending an mbuf after loading a DMA map results in unexpected
result. So, modify mbuf chains before loading a DMA map.


# 160733 26-Jul-2006 yongari

Nuke invalid use of BUS_DMA_ALLOCNOW.


# 160732 26-Jul-2006 yongari

Make sure to use the same DMA map in DMA map load/unload operations
by remembering a map used in bus_dmamap_load_mbuf_sg(9). I have
no idea how it could ever worked before.
This fixes a warning generated by a diagnostic check in sun4v
iommu driver.

Reported by: jb
Tested by: jb(sun4v)


# 160519 20-Jul-2006 yongari

Since resetting hardware takes a very long time and results in link
renegotiation, we only initialize the hardware only when it is
absolutely required. Process SIOCGIFADDR ioctl in em(4) when we know
an IPv4 address is added. Handling SIOCGIFADDR in a driver is
layering violation but it seems that there is no easy way without
rewritting hardware initialization code to reduce settle time after
reset.

This should fix a long standing bug which didn't send ARP packet when
interface address is changed or an alias address is added. Another
effect of this fix is it doesn't need additional delays anymore when
adding an alias address to the interface.
While I'm here add a new if_flags into softc which remembers current
prgroammed interface flags and make use of it when we have to program
promiscuous mode.

Tested by: Atanas <atanas AT asd DOT aplus DOT net>
Analyzed by: rwatson
Discussed with: -stable


# 160518 20-Jul-2006 yongari

Protect EEPROM access with the driver lock.


# 160517 20-Jul-2006 yongari

Honor IFF_DRV_OACTIVE in em_start_locked().


# 159330 06-Jun-2006 glebius

The procedure of raceless switching between polling mode and
taskqueued interrupt mode is going to be quite complex. Since
the polling mode is considered legacy feature for em(4) driver,
the decision is made to make polling and new interrupt handler
mutually exclusive, selected at compile time.

If kernel is compiled with DEVICE_POLLING, the fast taskqueued
interrupt handler code is disabled and the em_poll() and legacy
em_intr() functions are enabled. Otherwise, legacy functions
are disabled and only em_intr_fast() code is compiled.

Discussed with: scottl


# 157566 06-Apr-2006 glebius

Merge in new driver from Intel, version 5.1.5. Adds support for some
new chips and improves support for already supported ones.

Some details, important for future merges:
- if_em.c merged manually, viewing diff between new vendor
driver and previous one.
- if_em_hw.h dropped in from vendor, and then restored revisions
1.16, 1.17, 1.18.
- if_em_hw.c dropped in from vendor, and then two liner change made,
that restores support for two rare chips.


# 155911 22-Feb-2006 glebius

Back out 1.112,1.113. I don't have enough resources to fix breakages
introduced by this change.


# 155718 15-Feb-2006 glebius

Fix fallout from last commit - we need to program the MAC address in em_init().


# 155715 15-Feb-2006 glebius

em_hardware_init() in em_init() is not needed, and leads to annoying
link flap.

Submitted by: ru, Mike Tancsa


# 155713 15-Feb-2006 glebius

Set ifp->if_baudrate according to current speed.


# 155712 15-Feb-2006 glebius

- Rename em_print_link_status() to em_update_link_status().
- In em_attach() remove em_check_for_link(). Not needed here, since
already done in em_hardware_init().
- In em_attach() replace the printing block with call to
em_update_link_status().
- Remove modification of sc->link_state from em_hardware_init() and
from em_media_status(). This makes em_update_link_status() a
single point of change. Call em_update_link_status() where needed.


# 155709 15-Feb-2006 glebius

- Second style(9) megacleanup.
- Rename "adapter" to "sc"/"softc", to be like other drivers.

(-13 Kb less source code)


# 155674 14-Feb-2006 glebius

Move includes from if_em.h to if_em.c and sort them.


# 155472 09-Feb-2006 glebius

Fix two important typos in watchdog handling:

- Restart watchdog if we *did* processed any descriptors. [1]
- Log the watchdog event if the link is *up*. [2]

PR: kern/92948 [1]
Submitted by: Mihail Balikov <mihail.balikov interbgc.com> [1]
PR: kern/92895 [2]
Submitted by: Vladimir Ivanov <wawa yandex-team.ru> [2]


# 155426 07-Feb-2006 glebius

Since em(4) taskqueue is a new network context, we need to conditionally
lock Giant here.

Submitted by: Andrey V. Elsukov <bu7cher yandex.ru>


# 155052 30-Jan-2006 glebius

This driver can do hardware VLAN tagging + checksum offloading.

In collaboration with: Mihail Balikov <mihail.balikov interbgc.com>


# 154954 28-Jan-2006 scottl

Squash another invalid use of BUS_DMA_ALLOCNOW.

MFC After: 3 days


# 154663 21-Jan-2006 mux

Fix a race condition by initializing the taskqueue before registering
the fast interrupt handler that uses it. This fixes a panic at boot
time when em_intr_fast() calls taskqueue_enqueue().


# 154571 20-Jan-2006 glebius

An attemp to make driver more readable and attaractive for further
hacking:
- Remove all spaces at eol.
- Improve style(9) in most frequently edited functions.
- In em_encap() push variables for 82544 workaround in the block
where they are only used.
- In em_get_buf() remove unused variable.


# 154333 13-Jan-2006 scottl

Add the following to the taskqueue api:

taskqueue_start_threads(struct taskqueue **, int count, int pri,
const char *name, ...);

This allows the creation of 1 or more threads that will service a single
taskqueue. Also rework the taskqueue_create() API to remove the API change
that was introduced a while back. Creating a taskqueue doesn't rely on
the presence of a process structure, and the proc mechanics are much better
encapsulated in taskqueue_start_threads(). Also clean up the
taskqueue_terminate() and taskqueue_free() functions to safely drain
pending tasks and remove all associated threads.

The TASKQUEUE_DEFINE and TASKQUEUE_DEFINE_THREAD macros have been changed
to use the new API, but drivers compiled against the old definitions will
still work. Thus, recompiling drivers is not a strict requirement.


# 154291 13-Jan-2006 scottl

Fix the interrupt race for real. Don't register the interrupt until after
the the interface has been configured. I'm not sure how this could ever
have worked before, but it should be fixed now. Also break out the interrupt
degresitration function into it's own step.


# 154286 13-Jan-2006 scottl

Disable interrupts while we are setting up the handler. The interrupt really
shouldn't be set up or enabled until much later, but that will be investigated
at a later time.


# 154204 10-Jan-2006 scottl

Significant performance improvements for the if_em driver:

- Only update the rx ring consumer pointer after running through the rx loop,
not with each iteration through the loop.
- If possible, use a fast interupt handler instead of an ithread handler. Use
the interrupt handler to check and squelch the interrupt, then schedule a
taskqueue to do the actual work. This has three benefits:
- Eliminates the 'interrupt aliasing' problem found in many chipsets by
allowing the driver to mask the interrupt in the NIC instead of the
OS masking the interrupt in the APIC.
- Allows the driver to control the amount of work done in the interrupt
handler. This results in what I call 'adaptive polling', where you get
the latency benefits of a quick response to interrupts with the
interrupt mitigation and work partitioning of polling. Polling is still
an option in the driver, but I consider it orthogonal to this work.
- Don't hold the driver lock in the RX handler. The handler and all data
associated is effectively serialized already. This eliminates the cost of
dropping and reaquiring the lock for every receieved packet. The result
is much lower contention for the driver lock, resulting in lower CPU usage
and lower latency for interactive workloads.

The amount of work done in the taskqueue is controlled by the sysctl
dev.em.N.rx_processing_limit

and tunable
hw.em.rx_process_limit

Setting these to -1 effectively removes the limit.

The fast interrupt and taskqueue can be disabled by defining NO_EM_FASTINTR.
This work has been shown to increase fast-forwarding from ~570 kpps to
~750 kpps (note that the same NIC hardware seems unable to transmit more than
800 kpps, so this increase appears to be limited almost solely by the
hardware). Gains have been shown in other workloads, ranging from better
performance to elimination of over-saturation livelocks.

Thanks to Andre Opperman for his time and resources from his network
performance project in performing much of the testing. Thanks to Gleb
Smirnoff and Danny Braniss for their help in testing also.


# 153783 28-Dec-2005 glebius

A style nit.


# 153781 28-Dec-2005 glebius

Tidy up em_resume():
- Don't call em_init_locked() twice.
- Collapse two if() blocks into one.


# 153729 26-Dec-2005 glebius

Add simple suspend and resume methods. We call em_stop() on suspend and
em_init() on resume. With this change the network is ready right after
resume, without half minute lag.

Tested by: Jacques Garrigue


# 153635 22-Dec-2005 glebius

Add a quirk to fix resume on some laptops.

Reported by: joe
Reported by: Huang wen hui <huang gddsn.org.cn>
Reported by: Jacques Garrigue <garrigue math.nagoya-u.ac.jp>
PR: kern/89825


# 153512 18-Dec-2005 glebius

- Fix VLAN_INPUT_TAG() macro, so that it doesn't touch mtag in
case if memory allocation failed.
- Remove fourth argument from VLAN_INPUT_TAG(), that was used
incorrectly in almost all drivers. Indicate failure with
mbuf value of NULL.

In collaboration with: yongari, ru, sam


# 153474 16-Dec-2005 yongari

Add jumbo frame support for architectures with strict alignment.

Reviewed by: glebius


# 153012 02-Dec-2005 glebius

On the 82571 and newer chipset the ICR register is meaningful only
if the E1000_ICR_INT_ASSERTED bit is set.

Submitted by: Jack Vogel


# 152774 24-Nov-2005 cognet

Remember the bus_dmamap_t where we loaded the mbuf, and sync this map instead
of tx_buffer->map, or we could end up syncing the wrong map.


# 152740 23-Nov-2005 glebius

Merge in new driver version from Intel - 3.2.18.

The most important change is support for adapters based on
82571 and 82572 chips.

Tested on: 82547EI on i386
Tested on: 82540EM on sparc64


# 152645 21-Nov-2005 yongari

busdma cleanup for em(4).
- don't force busdma to pre-allocate bounce pages for parent tag.
- use system supplied roundup2 macro instead of rolling its own version.
- TX/RX decriptor length should be multiple of 128. There is no
no need to expand the size with the multiple of 4096.
- don't create/destroy DMA maps in TX/RX handlers. Use pre-allocated
DMA maps. Since creating DMA maps on sparc64 is time consuming
operations(resource mananger overhead), this change should boost
performance on sparc64. I could get > 2x speedup on Ultra60.
- TX/RX descriptors could be aligned on 128 boundary. Aligning them
on PAGE_SIZE is waste of resource.
- don't blindly create TX DMA tag with size of MCLBYTES * 8. The size
is only valid under jumbo frame environments. Instead of using the
hardcoded value, re-compute necessary size on the fly.
- RX side bus_dmamap_load_mbuf_sg(9) support.
- remove unused macro EM_ROUNDUP and constant EM_MMBA.

Reviewed by: scottl
Tested by: glebius


# 152545 17-Nov-2005 glebius

- Backout last change, since it is memory overkill for a non busy host or
for a notebook with em(4) adapter.
- Introduce tunables em.hw.txd and em.hw.rxd, which allow administrator
to configure number of transmit and receive descriptors.
- Check em.hw.txd and em.hw.rxd against hardware limits [*] and require
them to be multiple of 128.

[*] According to comments in if_em.h the 82540EM/82541ER chips can handle
more than 256 descriptors. Since we don't have this hardware to test,
we decided to mimic NetBSD wm(4) driver, that limits these chips to
256 descriptors.

In collaboration with: yongari


# 152315 11-Nov-2005 ru

- Store pointer to the link-level address right in "struct ifnet"
rather than in ifindex_table[]; all (except one) accesses are
through ifp anyway. IF_LLADDR() works faster, and all (except
one) ifaddr_byindex() users were converted to use ifp->if_addr.

- Stop storing a (pointer to) Ethernet address in "struct arpcom",
and drop the IFP2ENADDR() macro; all users have been converted
to use IF_LLADDR() instead.


# 152276 10-Nov-2005 glebius

Give a try to autoconfiguring the number of transmit and receive
descriptors depending on chip revision.


# 152247 09-Nov-2005 glebius

- Introduce two more stat counters, counting number of RX
overruns and number of watchdog timeouts.
- Do not log(9) RX overrun events, since this pessimizes
things under load [1].
- Do not increase if->if_oerrors in em_watchdog(), since
this leads to counter slipping back, when if->if_oerrors
is recalculated in em_update_stats_counters(). Instead
increase watchdog counter in em_watchdog() and take it
into account in em_update_stats_counters().

Submitted by: ade [1]


# 152225 09-Nov-2005 yongari

Make em(4) work on big-endian architectures.
- disable jumbo frame support on strict alignment architectures due
to the limitation of hardware. The driver needs a fix-up code for
RX side. The fix will show up in near future.
- fix endian issue for 82544 on PCI-X bus. I couldn't test this as
I don't have the NIC/hardware.
- prefer PCIR_BAR to hardcoded EM_MMBA.
- Properly checks for for 64bit BAR [1]
- replace inl/outl with bus_space(9) [1]
- fix endian issue on VLAN handling.
- reorder header files and remove unnecessary one.

Reviewed by: cognet
No response from: pdeuskar, tackerman
Obtained from: OpenBSD [1]


# 151903 31-Oct-2005 rwatson

Put probe-time printf of adapter speed and duplex behind bootverbose:
since the link takes a bit to negotiate, the information is pretty
much never available during the probe. As such, the boot output
pretty much always prints N/A for speed and duplex. Since we print
out the output of ifconfig during the user space boot, this early
boot information is also generally redundant, and added to the noise.

MFC after: 2 weeks


# 151495 20-Oct-2005 glebius

Some more minor cleanups of em(4) driver:
- Destroy mutex in case of attach failure. [1]
- Lock properly em_watchdog(). [1]
- Lock properly em_sysctl_int_delay(). [1]
- Remove unused global adapter linked list.
- Remove unused dma_size field from struct em_dma_alloc.
- Do not touch interface statistics, that must be edited
only by upper layers. [1]

Submitted by: yongari [1]


# 151494 20-Oct-2005 glebius

Revamp interrupt handling in em(4) driver:

o Do not mask the RX overrun interrupt.

o Rewrite em_intr():
- Axe EM_MAX_INTR.
- Cycle acknowledging interrupts and processing
packets until zero interrupt cause register is
read.
- If RX overrun comes in log this fact. [ NetBSD also
resets adapter in this case, but my tests showed that
this is not needed and only pessimizes behavior under
heavy load. ]
- Since almost all functions is rewritten, style the
remaining lines.

This fixes em(4) interfaces wedging under high load.

In collaboration with: wpaul, cognet
Obtained from: NetBSD


# 151466 19-Oct-2005 glebius

In the em_process_receive_interrupts() cycle check the IFF_DRV_RUNNING
flag. This fixes panic, when 'ifconfig em0 down' was called and it calls
em_stop() while the em_process_receive_interrupts() has temporarily
dropped the lock.


# 151432 17-Oct-2005 cognet

- Use BUS_DMASYNC_PREWRITE in em_get_buf(), as the adapter is about to read
the descriptors set.
- In em_process_receive_interrupts(), call bus_dmamap_sync() for the
descriptors set each time we modify one descriptor, instead of doing it only
at the function exit, to make sure the adapters know he can re-use the
descriptor.
This helps on arm with write-back data cache (and possibly on other arches
with bounce pages, I don't know) under heavy network load. Without this,
if we attempt to process more than num_rx_desc descriptors, the adapter
would just stop processing rx interrupts.


# 151314 14-Oct-2005 glebius

From the PR:

The receive function em_process_receive_interrupts() unlocks the
adapter while ether_input() processes the packet, and then locks
it back. In the meantime, em_init() may be called, either from
em_watchdog() from softclock interrupt or from the ifconfig(8)
program. The em_init() resets the card, in particular it sets
adapter->next_rx_desc_to_check to 0 and resets hardware RX Head
and Tail descriptor pointers. The loop in
em_process_receive_interrupts() does not expect these things to
change, and a mess may result.

This fixes long wedges of em(4) interfaces receive part under high
load and IP fastforwarding enabled.

PR: kern/87418
Submitted by: Dmitrij Tejblum <tejblum yandex-team.ru>


# 151312 14-Oct-2005 glebius

Cleanup from __FreeBSD_version.


# 150968 05-Oct-2005 glebius

- Don't pollute opt_global.h with DEVICE_POLLING and introduce
opt_device_polling.h
- Include opt_device_polling.h into appropriate files.
- Embrace with HAVE_KERNEL_OPTION_HEADERS the include in the files that
can be compiled as loadable modules.

Reviewed by: bde


# 150789 01-Oct-2005 glebius

Big polling(4) cleanup.

o Axe poll in trap.

o Axe IFF_POLLING flag from if_flags.

o Rework revision 1.21 (Giant removal), in such a way that
poll_mtx is not dropped during call to polling handler.
This fixes problem with idle polling.

o Make registration and deregistration from polling in a
functional way, insted of next tick/interrupt.

o Obsolete kern.polling.enable. Polling is turned on/off
with ifconfig.

Detailed kern_poll.c changes:
- Remove polling handler flags, introduced in 1.21. The are not
needed now.
- Forget and do not check if_flags, if_capenable and if_drv_flags.
- Call all registered polling handlers unconditionally.
- Do not drop poll_mtx, when entering polling handlers.
- In ether_poll() NET_LOCK_GIANT prior to locking poll_mtx.
- In netisr_poll() axe the block, where polling code asks drivers
to unregister.
- In netisr_poll() and ether_poll() do polling always, if any
handlers are present.
- In ether_poll_[de]register() remove a lot of error hiding code. Assert
that arguments are correct, instead.
- In ether_poll_[de]register() use standard return values in case of
error or success.
- Introduce poll_switch() that is a sysctl handler for kern.polling.enable.
poll_switch() goes through interface list and enabled/disables polling.
A message that kern.polling.enable is deprecated is printed.

Detailed driver changes:
- On attach driver announces IFCAP_POLLING in if_capabilities, but
not in if_capenable.
- On detach driver calls ether_poll_deregister() if polling is enabled.
- In polling handler driver obtains its lock and checks IFF_DRV_RUNNING
flag. If there is no, then unlocks and returns.
- In ioctl handler driver checks for IFCAP_POLLING flag requested to
be set or cleared. Driver first calls ether_poll_[de]register(), then
obtains driver lock and [dis/en]ables interrupts.
- In interrupt handler driver checks IFCAP_POLLING flag in if_capenable.
If present, then returns.This is important to protect from spurious
interrupts.

Reviewed by: ru, sam, jhb


# 150710 29-Sep-2005 glebius

In em_process_receive_interrupts() store and clear adapter->fmt. This
make function reenterable. In the runtime the race is masked by serializing
of em_process_receive_interrupts() either by interrupt thread, or by
polling. The race can be triggered when polling is switched on or off.


# 150388 20-Sep-2005 glebius

Remove queue check from last commit. In most cases there is smth in queue,
when start function is called.

Reviewed by: ru


# 150380 20-Sep-2005 glebius

Check IFF_DRV_RUNNING and presense of packets in queue before calling
em_start_locked(). This fixes panic on shutdown with active traffic
passing through router.

Sponsored by: Rambler


# 150306 19-Sep-2005 imp

Make sure that we call if_free(ifp) after bus_teardown_intr. Since we
could get an interrupt after we free the ifp, and the interrupt
handler depended on the ifp being still alive, this could, in theory,
cause a crash. Eliminate this possibility by moving the if_free to
after the bus_teardown_intr() call.


# 150124 14-Sep-2005 ru

Fix "Memory modified after free" panic on detach, caused by accessing
already freed struct ifnet.


# 148887 09-Aug-2005 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days


# 148654 02-Aug-2005 rwatson

Modify device drivers supporting multicast addresses to lock if_addr_mtx
over iteration of their multicast address lists when synchronizing the
hardware address filter with the network stack-maintained list.

Problem reported by: Ed Maste (emaste at phaedrus dot sandvine dot ca>
MFC after: 1 week


# 148636 02-Aug-2005 ru

Add missing ether_poll_deregister(). This is still not enough to
kldunload/kldload without a panic. The same (but worse) problem
is also present in ixgb(4).


# 147256 10-Jun-2005 brooks

Stop embedding struct ifnet at the top of driver softcs. Instead the
struct ifnet or the layer 2 common structure it was embedded in have
been replaced with a struct ifnet pointer to be filled by a call to the
new function, if_alloc(). The layer 2 common structure is also allocated
via if_alloc() based on the interface type. It is hung off the new
struct ifnet member, if_l2com.

This change removes the size of these structures from the kernel ABI and
will allow us to better manage them as interfaces come and go.

Other changes of note:
- Struct arpcom is no longer referenced in normal interface code.
Instead the Ethernet address is accessed via the IFP2ENADDR() macro.
To enforce this ac_enaddr has been renamed to _ac_enaddr.
- The second argument to ether_ifattach is now always the mac address
from driver private storage rather than sometimes being ac_enaddr.

Reviewed by: sobomax, sam


# 146662 26-May-2005 tackerman

Changes to update driver with latest Intel driver version 2.1.7
- Changed from using explicit devices id to using descriptive labels.
- Added support for 82573 and 82546 Quad adapters.
- Corrected support for 82547EI and 82541ER (mac_type was not assigned)
- Removed #ifdef DBG_STATS and extraneous code.

if_em_hw.c/if_em_hw.h
- Added support for 82573 and 82546 Quad adapters.
- Brought forward Intel's most current mac and phy changes.


# 144652 05-Apr-2005 glebius

Run em_local_timer() once per second instead of running it once per 2 seconds.
This makes gathering of error stats more precise, and netstat(1) output look
right.

Reviewed by: tackerman


# 143161 05-Mar-2005 imp

Use BUS_PROBE_DEFAULT for pci probe return value


# 141298 04-Feb-2005 glebius

Call if_link_state_change() when link status changes.

PR: kern/76890
Reviewed by: rwatson, sam


# 140872 26-Jan-2005 yar

Forced commit to note that in the previous commit message
(rev#1.59) I mistyped references to rev#1.58 as `rev#1.85'.
Shame on me.

Pointed out by: will


# 140859 26-Jan-2005 yar

Respect the current setting of IFCAP_VLAN_HWTAGGING on
the interface when going to toggle VLAN support for
internal reasons. If the IFCAP_VLAN_HWTAGGING bit is
cleared, we should rely on the (re)init routine to turn
VLAN support off and never touch the relevant hardware bits.

This applies to other capability bits, too. The user
obviously has a reason for clearing a capability bit,
e.g., if his particular NIC is buggy and hangs if a
certain hardware capability is turned on even for a
fraction of a second.

The flag adapter->em_insert_vlan_header still is set or
reset irrespective of the IFCAP_VLAN_HWTAGGING setting,
as before, in order to handle the case when a user sets
promiscuous mode on an interface first and later turns
its IFCAP_VLAN_HWTAGGING bit on.

This change might look orthogonal to rev#1.85, but in fact
it is not. It introduces bugfixes that hopefully will make
implementing the general scheme mentioned in the commit
message of rev#1.85 easier.


# 140857 26-Jan-2005 rwatson

Disable use of hardware VLAN tagging and stripping in if_em in the default
configuration: it appears to work properly in the non-promiscuous case, but
we've not yet implemented a more general solution that maintains full
functionality with promiscuous mode enabled. While my hope is that we can
get one implemented soon, this will improve functionality substantially in
the mean time.

MFC after: 3 days


# 140318 15-Jan-2005 scottl

Convert if_em to the new bus_dmamap_load_sg() interface. The old callback
was really just a waste of cycles, so this streamlines it considerably.


# 139549 01-Jan-2005 tackerman

Corrected a workaround that should only be applied to one adapter. Workaround
was causing device hangs when incorrectly applied to other adapters.

PR: kern/66634


# 139548 01-Jan-2005 tackerman

Added device id support for Intel 82541ER and 82546GB dual port PCIE adapter.

PR: None


# 137700 14-Nov-2004 rwatson

Further refine the if_em vlan fix in if_em.c:1.53:

- Because em_encap() can now fail in a way that leaves us without an
mbuf chain, potentially set *m_headp to NULL if that happens, so that
the caller can do the right thing. This case can occur when we try
to prepend the vlan header mbuf but can't allocate additional memory.

- Modify the caller of em_encap() to detect a NULL m_head and not try
to queue the mbuf if that happens.

- When em_encap() fails, make sure to call bus_dmamap_destroy() to
clean up.


# 137609 12-Nov-2004 rwatson

Correct a bug in the if_em driver relating to the use of vlans with
promiscuous mode introduced in 1.45, which programs the em card not
to strip or prepend tags when in promiscuous mode without also
modifying behavior to manually prepend a vlan header in the event
that the card isn't doing it on transmit. Due to a feature of card
operation, if the global VLAN prepend/strip register isn't set,
setting the VLAN tag flag on individual packet descriptors will
cause the packet to be transmitted using ISL encapsulation rather
than 802.1Q VLAN encapsulation.

This fix causes em_encap() to prepend the header by tracking whether
the card is configured to temporarily disable prepending/stripping
due to promiscuous mode. As a result, entering promiscuous mode on
the parent em interface no longer causes vlans to appear to "wedge"
or transmit ISL-encapsulated frames, which typically will not be
configured/spoken by the other endpoints on the VLAN trunk. This
bug may also exist in other drivers, and the additional vlan
encapsulation logic should be abstracted and centralized in
if_vlan.c if so.

RELENG_5_3 candidate.

MFC after: 1 week
Tested by: pjd, rwatson
Reported by: astesin at ukrtelecom dot net
Reported by: Mike Tancsa <mike at sentex dot net>
Reported by: Iasen Kostov <tbyte at OTEL dot net>


# 137575 11-Nov-2004 bms

Move per-instance sysctls under the per-device-instance tree.

Reviewed by: mux
Prodded by: rwatson


# 137155 03-Nov-2004 phk

Put the "Link is up/down" printfs behind bootverbose. gigE is not so uncommon
that we need to tell people about every cable in the network anymore. It can
be enabled for debugging purposes with "boot -v".


# 136718 19-Oct-2004 mux

Add missing bus_dmamap_sync() calls. If you are using an architecture
with a weak memory model or x86 + PAE (or more specifically, your
driver is using bounce pages) and you have had problems with em(4),
this may fix it. At least this is needed to have em(4) work properly
on FreeBSD/arm.

Original version by: cognet
Reviewed by: tackerman
Tested by: cognet


# 136687 19-Oct-2004 scottl

Forced commit to note that the previous change also elimates calls to
bus_dmamap_create|destroy for the rx and tx descriptor buffers. Since these
buffers are created with bus_dmamem_alloc(), there is no reason to also
create a map, and doing so just wastes memory.


# 136685 19-Oct-2004 scottl

Use an alignment of 1 instead of PAGE_SIZE for the rx and tx buffer tags.
Since the e1000 DMA engines hava no constraints on the alignment of buffer
transfers, there is no reason to tell busdma that there is. This save a
minimum of 1 malloc call per packet, which translates to eliminating 4 locks.
It also means that buffers are not needlessly bounced when transfered. The
end result is a 38% improvement in pps in a 4 way bridging environment.

Obtained from: Sandvine, Inc.


# 136300 09-Oct-2004 scottl

Don't count RNBC (internal buffer full) towards the RX error count since it's
not really an error.

Submitted by: Gerrit Nagelhout


# 135937 29-Sep-2004 mlaier

Fix typeo. Should read ***!***IFQ_DRV_IS_EMPTY.
This might fix some of the trouble around em(4) filling up its buffers.

Submitted by: mtm
Pointy hat to: mlaier
MFC after: 2 days


# 134619 01-Sep-2004 pdeuskar

Added support for Intel PRO/1000 GT Desktop Adapter(Device ID 8086 107C)
Removed support for Intel 82541ER
Added fix for 82547 which corrects an issue with Jumbo frames larger than 10k.
Added fix for vlan tagged frames not being properly bridged.
Corrected TBI workaround.
Corrected incorrect LED operation issues

Submitted by: tackerman (Tony Ackerman)
MFC after: 2 weeks


# 131455 02-Jul-2004 mlaier

Bring in the first chunk of altq driver modifications. This covers the
following drivers: bfe(4), em(4), fxp(4), lnc(4), tun(4), de(4) rl(4),
sis(4) and xl(4)

More patches are pending on: http://peoples.freebsd.org/~mlaier/ Please take
a look and tell me if "your" driver is missing, so I can fix this.

Tested-by: many
No-objection: -current, -net


# 130079 04-Jun-2004 yar

Implement support for controlling VLAN_HWTAGGING through ioctl(SIOCSIFCAP).
This includes not only toggling the flag in if_capenable, but also really
reconfiguring the hardware.

Approved by: tackerman (as the em(4) maintainer)


# 129616 23-May-2004 mux

We don't need to initialize if_output, ether_ifattach() does it
for us.


# 129481 20-May-2004 yar

Stylistic changes around the previous commit:

- since the number of supported capabilities is growing,
set bits in if_cap* in a consistent way;

- unexpand(1) leading SPACE characters.


# 129479 20-May-2004 yar

Set the VLAN bits in if_capenable as well as in if_capabilities
because VLAN hardware features are enabled in em(4) by default.

Note: Currently vlan(4) has a bug that it consults
if_capabilities, not if_capenable. This will be fixed
after all the network drivers set VLAN bits in
if_capenable properly.


# 128139 11-Apr-2004 ru

Implemented per-interface polling(4) control.


# 127135 17-Mar-2004 njl

Convert callers to the new bus_alloc_resource_any(9) API.

Submitted by: Mark Santcroos <marks@ripe.net>
Reviewed by: imp, dfr, bde


# 125673 10-Feb-2004 pdeuskar

Only reset the phy when it is absolutely required.
This should fix the issues with long *init* times when
you do ifconfig em0 alias.

MFC after: 3 days


# 123225 07-Dec-2003 deischen

Don't call em_stop() from the watchdog since it requires the controller
mutex to be locked. It is redundant since em_init() is called and this
correctly locks the mutex and calls em_stop().

5.2 release candidate since this can cause a panic if the watchdog
expires.

Tested by: kuriyama


# 123115 02-Dec-2003 pdeuskar

Use if_flags to check for IFF_POLLING instead of if_ipending.

Submitted by: jroberson (Jeff Roberson)
Approved by: re (scottl)


# 122681 14-Nov-2003 pdeuskar

- Code cleanup
- In the receive routine handle the case where last descriptor could have
less than 4 bytes of data.
- Handle race between detach/ioctl routine.

MFC after: 3 days


# 121816 31-Oct-2003 brooks

Replace the if_name and if_unit members of struct ifnet with new members
if_xname, if_dname, and if_dunit. if_xname is the name of the interface
and if_dname/unit are the driver name and instance.

This change paves the way for interface renaming and enhanced pseudo
device creation and configuration symantics.

Approved By: re (in principle)
Reviewed By: njl, imp
Tested On: i386, amd64, sparc64
Obtained From: NetBSD (if_xname)


# 121106 15-Oct-2003 deischen

Add a wrapper for a function that takes and releases the adapter
lock around a call to the original function. Make the timeout
function in callout_reset() use the wrapped function to avoid a
lock assertion panic.

Reviewed by: sam
Reported by: cgiordano@ids.net


# 120989 10-Oct-2003 sam

locking fixups:

o correct recursive locking when polling and in em_82547_move_tail
o destroy mutex on detach
o add EM_LOCK_ASSERT and similar macros for creating+deleteing the mtx

Submitted by: Daniel Eischen <eischen@vigrid.com>


# 120364 22-Sep-2003 sam

add locking

Reviewed by: Prafulla Deuskar <pdeuskar@FreeBSD.ORG>
Sponsored by: FreeBSD Foundation


# 119509 27-Aug-2003 pdeuskar

Add support for new devices.
Bug Fixes:
- Allow users to use LAA
- Remember promiscuous mode settings while bridging
- Allow gratuitous arp's to be sent

PR: 52966/54488
MFC after: 1 week


# 118314 01-Aug-2003 jdp

Add facilities for tuning the "em" driver's interrupt delays without
recompiling the driver. See the comments near the top of "if_em.h"
for descriptions of these delays. Four new loader tunables control
the system-wide default values:

hw.em.tx_int_delay
hw.em.rx_int_delay
hw.em.tx_abs_int_delay
hw.em.rx_abs_int_delay

The tunables are specified in microseconds. The valid range is
0-67108 usec., and 0 means that the timer is disabled.

There are also four new sysctls (actually, a set of four for each
"em" device in the system) to query and change the interrupt delays
after the system is up:

hw.em0.tx_int_delay
hw.em0.rx_int_delay
hw.em0.tx_abs_int_delay (not present for 82542/3/4 adapters)
hw.em0.rx_abs_int_delay (not present for 82542/3/4 adapters)

It seems to be OK to change these values even while the adapter is
passing traffic.

Approved by: Prafulla Deuskar <pdeuskar@FreeBSD.ORG>
MFC after: 4 weeks


# 117126 01-Jul-2003 scottl

Mega busdma API commit.

Add two new arguments to bus_dma_tag_create(): lockfunc and lockfuncarg.
Lockfunc allows a driver to provide a function for managing its locking
semantics while using busdma. At the moment, this is used for the
asynchronous busdma_swi and callback mechanism. Two lockfunc implementations
are provided: busdma_lock_mutex() performs standard mutex operations on the
mutex that is specified from lockfuncarg. dftl_lock() is a panic
implementation and is defaulted to when NULL, NULL are passed to
bus_dma_tag_create(). The only time that NULL, NULL should ever be used is
when the driver ensures that bus_dmamap_load() will not be deferred.
Drivers that do not provide their own locking can pass
busdma_lock_mutex,&Giant args in order to preserve the former behaviour.

sparc64 and powerpc do not provide real busdma_swi functions, so this is
largely a noop on those platforms. The busdma_swi on is64 is not properly
locked yet, so warnings will be emitted on this platform when busdma
callback deferrals happen.

If anyone gets panics or warnings from dflt_lock() being called, please
let me know right away.

Reviewed by: tmm, gibbs


# 115878 05-Jun-2003 pdeuskar

Add support for Quad port adapter
Add sysctl's to display statistics/debug_info
Set WAIT_FOR_AUTONEG_DEFAULT to zero by default
Increment packet in/out statistics inline instead of every two seconds.

MFC after: 3 days


# 114776 06-May-2003 des

Fix a printf() format error which broke the ia64 GENERIC build.


# 114567 03-May-2003 pdeuskar

- Fix breakage on PAE enabled kernel
- Don't use vtophys when you can get physical address using bus_dma API

Submitted by: jake (Jake Burkholder)


# 114554 02-May-2003 pdeuskar

- Bus DMA'fy the driver
- Use htole* macros where appropriate so that the driver could work on non-x86 architectures
- Use m_getcl() instead of MGETHDR/MCLGET macros
Submitted by: sam (Sam Leffler)


# 113673 18-Apr-2003 pdeuskar

Tell the upper layer(s) that we support long frames.
Not doing this caused the vlan mtu to be reduced by 4 bytes.

Submitted by: Doug Ambrisko (ambrisko)
MFC after: 1 day


# 113506 15-Apr-2003 mdodd

- Express hard dependencies on bus (pci, isa, pccard) and
network layer (ether).
- Don't abuse module names to facilitate ifconfig module loading;
such abuse isn't really needed. (And if we do need type information
associated with a module then we should make it explicit and not
use hacks.)


# 112472 21-Mar-2003 pdeuskar

Added support for 82541 and 82547 based adapters.
- These have Intel gigabit PHY
- 82547 uses CSA interface

MFC after: 1 week


# 111119 19-Feb-2003 imp

Back out M_* changes, per decision of the TRB.

Approved by: trb


# 109623 21-Jan-2003 alfred

Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.


# 108229 23-Dec-2002 pdeuskar

- Move to array based indexing for TX/RX descriptor/buffer management
- Added support for ITR (interrupt throttle register). This feature is available on
adapters based on 82545 and above
- Fixed problem with vlan support when traffic has priority bits set. (kern/45907)

PR: kern/45907
MFC after: 1 week


# 107243 25-Nov-2002 luigi

Fix IFF_ALLMULTI handling.

Reviewed by: pdeuskar (maintainer)
Approved by: re


# 107242 25-Nov-2002 luigi

Add polling support to the "em" driver.

Reviewed by: pdeuskar (maintainer)
Approved by: re


# 106937 14-Nov-2002 sam

network interface driver changes:

o don't strip the Ethernet header from inbound packets; pass packets
up the stack intact (required significant changes to some drivers)
o reference common definitions in net/ethernet.h (e.g. ETHER_ALIGN)
o track ether_ifattach/ether_ifdetach API changes
o track bpf changes (use BPF_TAP and BPF_MTAP)
o track vlan changes (ifnet capabilities, revised processing scheme, etc.)
o use if_input to pass packets "up"
o call ether_ioctl for default handling of ioctls

Reviewed by: many
Approved by: re


# 106649 08-Nov-2002 pdeuskar

- Set RS (Report Status) bit on all descriptors of a packet instead of just the last one.
- Set RDTR to zero by default instead of 28.
- Fixed a problem with TX hangs with jumbo frames when number of fragments in the mbuf chain
is large.
- Added support for 82540EP based cards.

MFC after: 3 days


# 103895 24-Sep-2002 pdeuskar

Corrected license in the source files. It should say "MUST" instead of "MAY".

MFC after: 2 days


# 102452 26-Aug-2002 pdeuskar

Back out TX/RX descriptor/buffer management changes from earier commit.
We are having panics with the driver under stress with jumbo frames.
Unfortunately we didnot catch it during our regular test cycle.
I am going to MFC the backout immediately.


# 102242 21-Aug-2002 pdeuskar

TX/RX descriptor/buffer management changes.
Use array based scheme instead of queueing macros.

Submitted by: Luigi Rizzo (rizzo@icir.org)
MFC after: 3 days


# 100184 16-Jul-2002 pdeuskar

- Use IO mode to reset the controller (82544 and beyond)
- Read the Mac address only once during attach.
(This fixes the failover issue observed using the bonding driver)

MFC after: 3 days


# 97785 03-Jun-2002 pdeuskar

Added support for 82545EM and 82546EB based adapters.
Added Vlan support.

MFC after: 1 week


# 95962 02-May-2002 pdeuskar

Make em driver compilable on IA64/alpha.

Submitted by: peter
MFC after: 3 days


# 95673 28-Apr-2002 phk

Follow NetBSD and s/IFM_1000_TX/IFM_1000_T/


# 93914 05-Apr-2002 pdeuskar

Added support for 82540EM based cards.
Cosmetic changes to make code more unix-like.

MFC after: 1 week


# 92739 20-Mar-2002 alfred

Remove __P.


# 90628 13-Feb-2002 pdeuskar

- Added support for receive in multiple
descriptors. This simplifies code for jumbo frames.
- Cleaned up coding conventions to make code more unix-like.
- Cleaned up code in if_em_fxhw.c and if_em_phy.c.
Added relevant comments.

MFC after: 1 week


# 87450 06-Dec-2001 pdeuskar

Fixed two problems:
1. Changed incorrect conditional in fxhw.c which would never
evaluate to true. Thanks to John Polstra for pointing that out.
2. Write to PCI config space by default, enabling memory access and
bus master enable.

Submitted by:Prafulla Deuskar
MFC after:3 days


# 87189 02-Dec-2001 pdeuskar

This is the first commit of the Intel gigabit driver for
PRO/1000 cards.

Submitted by:Prafulla Deuskar
Reviewed by: Paul Saab
MFC after:1 week