History log of /openbsd-current/sys/dev/pci/if_bnx.c
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# 1.133 10-Nov-2023 bluhm

Make ifq and ifiq interface MP safe.

Rename ifq_set_maxlen() to ifq_init_maxlen(). This function neither
uses WRITE_ONCE() nor a mutex and is called before the ifq mutex
is initialized. The new name expresses that it should be used only
during interface attach when there is no concurrency.

Protect ifq_len(), ifq_empty(), ifiq_len(), and ifiq_empty() with
READ_ONCE(). They can be used without lock as they only read a
single integer.

OK dlg@


Revision tags: OPENBSD_7_1_BASE OPENBSD_7_2_BASE OPENBSD_7_3_BASE OPENBSD_7_4_BASE
# 1.132 11-Mar-2022 mpi

Constify struct cfattach.


# 1.131 09-Jan-2022 jsg

spelling
feedback and ok tb@ jmc@ ok ratchov@


Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
# 1.130 12-Dec-2020 jan

Rename the macro MCLGETI to MCLGETL and removes the dead parameter ifp.

OK dlg@, bluhm@
No Opinion mpi@
Not against it claudio@


Revision tags: OPENBSD_6_8_BASE
# 1.129 10-Jul-2020 patrick

Change users of IFQ_SET_MAXLEN() and IFQ_IS_EMPTY() to use the "new" API.

ok dlg@ tobhe@


# 1.128 22-Jun-2020 dlg

use ifiq_input and use it's return value to apply backpressure to rxrs.

this is a step toward deprecating softclock based livelock detection.


# 1.127 17-May-2020 jsg

fix typo in a comment

from Delyan Raychev


Revision tags: OPENBSD_6_7_BASE
# 1.126 06-Dec-2019 dlg

enable the full use of jumbos and remove IFCAP_VLAN_MTU.

The chip can do 9008 byte packets (not including the ethernet
header), but only 9004 if you want to enable IFCAP_VLAN_MTU. by not
enabling IFCAP_VLAN_MTU, we let other protocols (eg, mpls or svlan)
use the extra bytes if they want.

using the extra bytes for the hardmtu instead of for IFCAP_VLAN_MTU
works a bit better with how aggr(4) is set up at the moment because
aggr does not pass IFCAP_VLAN_MTU through from its ports, which
means vlan(4) on aggr(4) cannot see the flag and use the extra
bytes.

this was figured out by hrvoje popovski in a discussion with pedro
caetano on the "issues configuring vlan on top of aggr device" on
misc@.
hrvoje also tested the diff and made sure the full use of jumbos
works for things like ping packets with DF set.
jmatthew skimmed the diff and didnt see anything obviously wrong too


Revision tags: OPENBSD_6_3_BASE OPENBSD_6_4_BASE OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.125 10-Mar-2018 sthen

raise bnx(4)'s rxring lwm to 16, ok deraadt

(I've had this diff locally for a long time on port build machines to
avoid NFS stalls.)


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE
# 1.124 24-Jan-2017 dlg

add support for multiple transmit ifqueues per network interface.

an ifq to transmit a packet is picked by the current traffic
conditioner (ie, priq or hfsc) by providing an index into an array
of ifqs. by default interfaces get a single ifq but can ask for
more using if_attach_queues().

the vast majority of our drivers still think there's a 1:1 mapping
between interfaces and transmit queues, so their if_start routines
take an ifnet pointer instead of a pointer to the ifqueue struct.
instead of changing all the drivers in the tree, drivers can opt
into using an if_qstart routine and setting the IFXF_MPSAFE flag.
the stack provides a compatability wrapper from the new if_qstart
handler to the previous if_start handlers if IFXF_MPSAFE isnt set.

enabling hfsc on an interface configures it to transmit everything
through the first ifq. any other ifqs are left configured as priq,
but unused, when hfsc is enabled.

getting this in now so everyone can kick the tyres.

ok mpi@ visa@ (who provided some tweaks for cnmac).


# 1.123 22-Jan-2017 dlg

move counting if_opackets next to counting if_obytes in if_enqueue.

this means packets are consistently counted in one place, unlike the
many and various ways that drivers thought they should do it.

ok mpi@ deraadt@


Revision tags: OPENBSD_6_0_BASE
# 1.122 05-May-2016 jmatthew

r1.10 of if_bnx.c effectively removed the limit on the number of segments in
the tx dma maps, apparently to allow heavily fragmented packets to be sent.

The tx ring accounting in bnx_start assumed that the longest fragment chain
we'd see was BNX_MAX_SEGMENTS, so sending a heavily fragmented packet when the
ring was already full could cause it to overflow.

In the 10 years since r1.10, we've started defragmenting packets if they
won't fit in the dma map, so we can limit the maps to BNX_MAX_SEGMENTS again.
While we're here, ensure there's always at least one slot on the tx ring free,
for consistency between drivers.

Fixes packet corruption seen by otto@
ok mpi@ dlg@


# 1.121 13-Apr-2016 mpi

G/C IFQ_SET_READY().


Revision tags: OPENBSD_5_9_BASE
# 1.120 11-Dec-2015 mpi

branches: 1.120.2;
Replace mountroothook_establish(9) by config_mountroot(9) a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


# 1.119 10-Dec-2015 dlg

mark bnx_start as mpsafe.

tweak it to use ifq_restart so ifq_clr_oactive is serialised with start.

ok jmatthew@


# 1.118 05-Dec-2015 jmatthew

Make the bnx interrupt handler mpsafe, and perform rx and tx completion
outside the kernel lock.

Remove tx descriptor lists (essentially backing out if_bnx.c r1.77),
add an interrupt barrier in bnx_stop, check the rx ring state before receiving
packets, adjust the tx counter with atomic operations, and rework bnx_start
to check for ring space before dequeueing and drop the packet if bnx_encap
fails.

tested on BCM5708 by me and on BCM5709 by Hrvoje Popovski
ok dlg@


# 1.117 25-Nov-2015 dlg

replace IFF_OACTIVE manipulation with mpsafe operations.

there are two things shared between the network stack and drivers
in the send path: the send queue and the IFF_OACTIVE flag. the send
queue is now protected by a mutex. this diff makes the oactive
functionality mpsafe too.

IFF_OACTIVE is part of if_flags. there are two problems with that.
firstly, if_flags is a short and we dont have any MI atomic operations
to manipulate a short. secondly, while we could make the IFF_OACTIVE
operates mpsafe, all changes to other flags would have to be made
safe at the same time, otherwise a read-modify-write cycle on their
updates could clobber the oactive change.

instead, this moves the oactive mark into struct ifqueue and provides
an API for changing it. there's ifq_set_oactive, ifq_clr_oactive,
and ifq_is_oactive. these are modelled on ifsq_set_oactive,
ifsq_clr_oactive, and ifsq_is_oactive in dragonflybsd.

this diff includes changes to all the drivers manipulating IFF_OACTIVE
to now use the ifsq_{set,clr_is}_oactive API too.

ok kettenis@ mpi@ jmatthew@ deraadt@


# 1.116 20-Nov-2015 dlg

shuffle struct ifqueue so in flight mbufs are protected by a mutex.

the code is refactored so the IFQ macros call newly implemented ifq
functions. the ifq code is split so each discipline (priq and hfsc
in our case) is an opaque set of operations that the common ifq
code can call. the common code does the locking, accounting (ifq_len
manipulation), and freeing of the mbuf if the disciplines enqueue
function rejects it. theyre kind of like bufqs in the block layer
with their fifo and nscan disciplines.

the new api also supports atomic switching of disciplines at runtime.
the hfsc setup in pf_ioctl.c has been tweaked to build a complete
hfsc_if structure which it attaches to the send queue in a single
operation, rather than attaching to the interface up front and
building up a list of queues.

the send queue is now mutexed, which raises the expectation that
packets can be enqueued or purged on one cpu while another cpu is
dequeueing them in a driver for transmission. a lot of drivers use
IFQ_POLL to peek at an mbuf and attempt to fit it on the ring before
committing to it with a later IFQ_DEQUEUE operation. if the mbuf
gets freed in between the POLL and DEQUEUE operations, fireworks
will ensue.

to avoid this, the ifq api introduces ifq_deq_begin, ifq_deq_rollback,
and ifq_deq_commit. ifq_deq_begin allows a driver to take the ifq
mutex and get a reference to the mbuf they wish to try and tx. if
there's space, they can ifq_deq_commit it to remove the mbuf and
release the mutex. if there's no space, ifq_deq_rollback simply
releases the mutex. this api was developed to make updating the
drivers using IFQ_POLL easy, instead of having to do significant
semantic changes to avoid POLL that we cannot test on all the
hardware.

the common code has been tested pretty hard, and all the driver
modifications are straightforward except for de(4). if that breaks
it can be dealt with later.

ok mpi@ jmatthew@


# 1.115 25-Oct-2015 mpi

arp_ifinit() is no longer needed.


# 1.114 10-Sep-2015 deraadt

sizes for free(); ok sthen


# 1.113 04-Sep-2015 kettenis

The bnx_tx_pool gets used from interrupt context, so drop the explicit
backend allocoter here without passing PR_WAITOK to pool_init(9).

ok mikeb@


Revision tags: OPENBSD_5_8_BASE
# 1.112 24-Jul-2015 dlg

if we free the mbuf in the rx path, clear the pointer to it so we dont
try and queue it for the stack and cause a use after free.

found by maxime villard and brainy


# 1.111 24-Jun-2015 mpi

Increment if_ipackets in if_input().

Note that pseudo-drivers not using if_input() are not affected by this
conversion.

ok mikeb@, kettenis@, claudio@, dlg@


# 1.110 10-Mar-2015 mpi

Convert to if_input().

Tested and ok sthen@, ok dlg@


Revision tags: OPENBSD_5_7_BASE
# 1.109 27-Jan-2015 dlg

remove the second void * argument on tasks.

when workqs were introduced, we provided a second argument so you
could pass a thing and some context to work on it in. there were
very few things that took advantage of the second argument, so when
i introduced pools i suggested removing it. since tasks were meant
to replace workqs, it was requested that we keep the second argument
to make porting from workqs to tasks easier.

now that workqs are gone, i had a look at the use of the second
argument again and found only one good use of it (vdsp(4) on sparc64
if you're interested) and a tiny handful of questionable uses. the
vast majority of tasks only used a single argument. i have since
modified all tasks that used two args to only use one, so now we
can remove the second argument.

so this is a mechanical change. all tasks only passed NULL as their
second argument, so we can just remove it.

ok krw@


# 1.108 22-Dec-2014 tedu

unifdef INET


Revision tags: OPENBSD_5_6_BASE
# 1.107 18-Jul-2014 dlg

implement EFBIG handling for heavily fragmented packets on the tx path.

ok claudio@


# 1.106 12-Jul-2014 tedu

add a size argument to free. will be used soon, but for now default to 0.
after discussions with beck deraadt kettenis.


# 1.105 09-Jul-2014 dlg

dont try to be smart about avoiding the use of too many descriptors
when filling the rx ring. trust the hwm.

problem found by sthen@


# 1.104 08-Jul-2014 dlg

cut things that relied on mclgeti for rx ring accounting/restriction over
to using if_rxr.

cut the reporting systat did over to the rxr ioctl.

tested as much as i can on alpha, amd64, and sparc64.
mpi@ has run it on macppc.
ok mpi@


Revision tags: OPENBSD_5_5_BASE
# 1.103 30-Oct-2013 dlg

replace the workq bits to supply new tx pkt descriptors with a task.

tested locally on a dell poweredge 2950


# 1.102 23-Oct-2013 brad

Enable TX checksum offload.

ok naddy@ sthen@


Revision tags: OPENBSD_5_4_BASE
# 1.101 28-Mar-2013 brad

Let mii_attach() know where the PHY is located instead of scanning
for it since we know where it will be anyway and remove the code
from the MII bus read/write functions to force reading/writing
from the predetermined location. Copied from bge(4) and this is
what the upstream FreeBSD bce(4) driver has done once FreBSD
gained a mii_attach().

ok dlg@ sthen@


Revision tags: OPENBSD_5_3_BASE
# 1.100 13-Jan-2013 brad

Enable flow control support with 5708S/5709S adapters.

ok dlg@


# 1.99 10-Dec-2012 mikeb

Under some circumstances (currently only reproducible with IPsec)
bnx can be left w/o clusters on the receive ring and will stall.
To prevent that schedule a timeout if refill fails. Bug was
reported by jj@, fix tested by me, ok dlg


# 1.98 05-Dec-2012 deraadt

Remove excessive sys/cdefs.h inclusion
ok guenther millert kettenis


Revision tags: OPENBSD_5_2_BASE
# 1.97 05-Jul-2012 phessler

Add flow control to bnx(4)

Tested on 5706, 5708, 5709, 5716 chipsets.

From Brad

OK phessler@, sthen@, mikeb@,


# 1.96 14-May-2012 mikeb

fixup "couldn't establish interrupt" error printf; from brad, ok phessler


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE
# 1.95 22-Jun-2011 tedu

kill a few more casts that aren't helpful. ok krw miod


# 1.94 18-Apr-2011 dlg

ido not disable interrupts in the isr and then enable them again
when leaving. when you're handling an interrupt it is masked.
whacking the chip is work for no gain.

modify the interrupt handler so it only processes the rings once
rather than looping over them until it runs out of work to do

looping in the isr is bad for several reasons:

firstly, the chip does interrupt mitigation so you have a
decent/predictable amount of work to do in the isr. your first loop
will do that chunk of work (ie, it pulls off 50ish packets), and
then the successive looping aggressively pull one or two packets
off the rx ring. these extra loops work against the benefit that
interrupt mitigation provides.

bus space reads are slow. we should avoid doing them where possible
(but we should always do them when necessary).

doing the loop 5 times per isr works against the mclgeti semantics.
it knows a nic is busy and therefore needs more rx descriptors by
watching to see when the nic uses all of its descriptors between
interrupts. if we're aggressively pulling packets off by looping
in the isr then we're skewing this check.

ok deraadt@


# 1.93 13-Apr-2011 dlg

to quote from the gospel of bus_dma.9:

Synchronization operations are expressed from the perspective of the host
RAM, e.g., a device -> memory operation is a READ and a memory -> device
operation is a WRITE.

the status block that the isr reads is written to by the device.
the chip writes to memory, it is therefore a READ.

this also adds the preread sync when the map is set up and the postread
sync when the map is torn down for better symmetry. there are probably
more issues like this in the code, but this is a start.

discovered while discussing another diff with claudio@


# 1.92 05-Apr-2011 henning

mechanic rename M_{TCP|UDP}V4_CSUM_OUT -> M_{TCP|UDP}_CSUM_OUT
ok claudio krw


# 1.91 03-Apr-2011 jasper

use nitems(); no binary change for drivers that are compiled on amd64.

ok claudio@


Revision tags: OPENBSD_4_9_BASE
# 1.90 20-Sep-2010 deraadt

Stop doing shutdown hooks in network drivers where possible. We already
take all interfaces down, via their xxstop routines. Claudio and I have
verified that none of the shutdown hooks do much extra beyond what xxstop
was already doing; it is largely a pile of junk.
ok claudio, some early comments by sthen; also read by matthew, jsg


Revision tags: OPENBSD_4_8_BASE
# 1.89 03-Aug-2010 jsg

Correct use of logical and where binary and was intended.
Spotted by lint, but mirrors a similiar change in the
original FreeBSD code from over a year ago.

ok deraadt@


# 1.88 24-May-2010 sthen

Support fibre PHY on BCM5709S. From FreeBSD via Brad.
Tested by Brad on: BCM5706, BCM5708C
Tested by me on: BCM5716 (BCM5709 PHY)


# 1.87 19-May-2010 oga

BUS_DMA_ZERO instead of alloc, map, bzero.

ok krw@


Revision tags: OPENBSD_4_7_BASE
# 1.86 23-Nov-2009 claudio

bnx(4) is a bit special. The chip itself is capable of swapping endianess
so there is no need for htoleXX calls. The only thing needed is the correct
layout of the DMA-ed structures. Additionally it uses PAGE_SIZE but assumed
that it is always 4k. Fix the macros that failed to respect that so that it
works on 8k PAGE_SIZE systems. This makes bnx(4) work on sparc64.
Tested on amd64 by dlg@. OK dlg@, deraadt@


# 1.85 09-Nov-2009 dlg

Link state change interrupt was not generated due to a missing bit in
the MAC event register.

fix from atte dot peltomaki at iki dot fi
tested by me on 5708 and 5709


# 1.84 13-Aug-2009 jasper

- consistify cfdriver for the ethernet drivers (0 -> NULL)

ok dlg@


# 1.83 09-Aug-2009 deraadt

MCLGETI() will now allocate a mbuf header if it is not provided, thus
reducing the amount of splnet/splx dancing required.. especially in the
worst case (of m_cldrop)
ok dlg kettenis damien


# 1.82 06-Aug-2009 sthen

Add device id for BCM5716S, tidy whitespace. From Brad.


Revision tags: OPENBSD_4_6_BASE
# 1.81 03-Jul-2009 dlg

this is a rather large change to add support for the BCM5709.

the 5709s use a the b09 firmwares, which is different to the b06 used by
all the other chips supported by bnx. the majority of the diff comes from
special handling for some indirect reads and writes, and because it needs
more host memory to operate with.

ive tried to keep the cosmetic changes to a minimum.

"go for it" deraadt@


# 1.80 03-Jul-2009 dlg

newer bnx chips use a separate firmware to the "old" ones. this updates
the b06 firmware for the older chips, and adds the b09 firmware. there are
three variants of the rv2p code thats loaded onto the chips, so this has
been split out into separate firmware files as well.

the driver has been updated to handle the split firmwares, and to easily
allow loading of the different versions. this change only supports the
loading of the firmwares for the currently supported chips.

after this change you must build the new firmwares and install them as well
as your new kernel.

"go to it" deraadt@


# 1.79 20-Jun-2009 naddy

Rewrite the interface flag handling case code and update the receive
filter handling to take advantage of ac_multirangecnt and have correct
IFF_ALLMULTI handling. From Brad.


# 1.78 22-Apr-2009 dlg

dont need to zero the tx pkt pool structure before initting it now that
pool_init does its job properly.


# 1.77 22-Apr-2009 dlg

replace arrays of dmamaps and mbuf pointers used to manage packets
on the tx rings (one mbuf ptr/dmamap array entry was created for
every tx descriptor slot at attach time) with a dynamically grown
list of mbuf pointers and dmamaps.

bnx used to have 512 dmamaps/mbuf pointers for the tx ring, now my
system is running with 8 under moderate load.

the big bonus from this is that the dmamap handling is greatly
simplified.

reyk@ likes this a lot


# 1.76 20-Apr-2009 dlg

when transmitting packets, put the dmamap we used for the packet into the
last descriptor slot in the ring. the tx completion code expects the dmamap
to be there so it can unload it.

ok reyk@


# 1.75 20-Apr-2009 reyk

fix dma map unmapping and unloading in the tx cleanup path.

ok dlg@


# 1.74 14-Apr-2009 kettenis

Don't free an mbuf that's still on the TX queue. While there sanitize the
function signature of bnx_tx_encap() such that people don't get weird ideas
like this again.

ok dlg@


# 1.73 09-Apr-2009 dlg

white space fixes


# 1.72 30-Mar-2009 dlg

switch to MCLGETI.

this conversion is the easiest ive done so far. the mbuf allocation wrapper
in the driver already had code to handle a failing cluster allocator as
part of a test harness, now we test that code all the time with MCLGETI.

ok kettenis@
tested by phessler@


Revision tags: OPENBSD_4_5_BASE
# 1.71 28-Nov-2008 brad

Eliminate the redundant bits of code for MTU and multicast handling
from the individual drivers now that ether_ioctl() handles this.

Shrinks the i386 kernels by..
RAMDISK - 2176 bytes
RAMDISKB - 1504 bytes
RAMDISKC - 736 bytes

Tested by naddy@/okan@/sthen@/brad@/todd@/jmc@ and lots of users.
Build tested on almost all archs by todd@/brad@

ok naddy@


# 1.70 09-Nov-2008 naddy

Introduce bpf_mtap_ether(), which for the benefit of bpf listeners
creates the VLAN encapsulation from the tag stored in the mbuf
header. Idea from FreeBSD, input from claudio@ and canacar@.

Switch all hardware VLAN enabled drivers to the new function.

ok claudio@


# 1.69 19-Oct-2008 brad

Re-add support for RX VLAN tag stripping.


# 1.68 16-Oct-2008 naddy

Switch the existing TX VLAN hardware support over to having the
tag in the header. Convert TX tagging in the drivers.

Help and ok brad@


# 1.67 16-Oct-2008 naddy

Convert RX tag stripping to storing the tag in the mbuf header and
enable RX tag stripping for re(4).

ok brad@


# 1.66 02-Oct-2008 brad

First step towards cleaning up the Ethernet driver ioctl handling.
Move calling ether_ioctl() from the top of the ioctl function, which
at the moment does absolutely nothing, to the default switch case.
Thus allowing drivers to define their own ioctl handlers and then
falling back on ether_ioctl(). The only functional change this results
in at the moment is having all Ethernet drivers returning the proper
errno of ENOTTY instead of EINVAL/ENXIO when encountering unknown
ioctl's.

Shrinks the i386 kernels by..
RAMDISK - 1024 bytes
RAMDISKB - 1120 bytes
RAMDISKC - 832 bytes

Tested by martin@/jsing@/todd@/brad@
Build tested on almost all archs by todd@/brad@

ok jsing@


# 1.65 10-Sep-2008 blambert

Convert timeout_add() calls using multiples of hz to timeout_add_sec()

Really just the low-hanging fruit of (hopefully) forthcoming timeout
conversions.

ok art@, krw@


Revision tags: OPENBSD_4_4_BASE
# 1.64 24-Jun-2008 brad

Fixed a problem that would cause errors (especially when in low memory
systems) because the RX chain was corrupted when an mbuf was mapped to
an unexpected number of buffers.

From davidch@FreeBSD


# 1.63 13-Jun-2008 brad

fix compilation with BNX_DEBUG.


# 1.62 13-Jun-2008 brad

Remove slack space for RX/TX chains since it only covers sloppy coding.

From davidch @ FreeBSD


# 1.61 08-Jun-2008 reyk

don't declare foo_driver_version[] strings and turn them into defines,
nothing uses them and it saves a few bytes in the kernel.

ok claudio@


# 1.60 29-May-2008 brad

- Add a debug message to mention when a 2.5Gb adapter is found.
- Change invalid PHY address debug message in bnx_miibus_write_reg()
from warn level to verbose.
- Add two new softc fields and store the shared and port hw config
data.

From FreeBSD

ok dlg@


# 1.59 23-May-2008 brad

Simplify the combination use of pci_mapreg_type()/pci_mapreg_map() as
suggested by dlg@ awhile ago.

ok dlg@


Revision tags: OPENBSD_4_3_BASE
# 1.58 28-Feb-2008 brad

Add initial bits for fiber support with the BCM5706/BCM5708 chipsets.

Tested with copper adapters by brad@, johan@ and Jung <moorang at gmail dot com>

ok dlg@


# 1.57 22-Feb-2008 kettenis

Avoid unaligned PCI config space access.

ok brad@


# 1.56 17-Feb-2008 brad

Remove the check for non-production bnx(4) chipsets. These chipsets are
not officially "supported" and could have errata which the driver does
not workaround but they should more or less work.

Tested by marco@ with a BCM5708 B0 chipset.

ok marco@ dlg@


# 1.55 25-Nov-2007 dlg

IF_Gbps(2.5) is wrong.

ok claudio@


# 1.54 28-Aug-2007 deraadt

unify firmware load failure messages; ok mglocker


Revision tags: OPENBSD_4_2_BASE
# 1.53 04-Jul-2007 krw

Revert r1.42 of if_bnx.c, "Enable IPv4 transmit TCP/UDP checksum
offload", and associated man page change. To use IPv4 transmit TCP/UDP
checksum offloading you must again define BNX_CSUM.

As requested by mbalmer@ via deraadt@ on suggestion of reyk@ in
response to PR #5437.


# 1.52 22-May-2007 reyk

Add the BCM5709 PCI device Id. It is disabled for now since we do not
support SerDes-based (1000base-SX fibre) bnx(4) devices yet. The
reason is simple - we do not have any fibre bnx(4) to test and port
the SerDes changes from the other bnx drivers.

From brad found in the Linux driver


# 1.51 22-May-2007 jasper

adress -> address

from brad
ok claudio@


# 1.50 22-May-2007 ray

Use BNX_PRINTF instead of printf with missing argument.

OK reyk@, earlier version OK tedu@, dlg@, and miod@.


# 1.49 21-May-2007 reyk

fix bnx vlan tagging in the rx path; do not attach the vlan tag twice
if the firmware has been told to keep it and copy the tag in network
byte order in the other case.

ok mcbride@ dlg@


Revision tags: OPENBSD_4_1_BASE
# 1.48 05-Mar-2007 reyk

remove jumbo frame support by replacing MEXTALLOC with MCLGET, and
simplify the VLAN code.

this will close PR 5356 (system panics under high load).

From claudio@ who is currently not around to commit this fix

tested and ok by mcbride@, reyk@, todd@, Paul Hirsch, and brad


# 1.47 03-Mar-2007 reyk

instead of establishing the interrupt in the mounthook, move it back
to the attach function and set a flag in the mounthook to start
accepting interrupts (there are possible problems with establishing
interrupts after the ioapics are enabled in i386 GENERIC.MP).

also suggested by kettenis
tested by mcbride, me, and some others
ok dlg@


# 1.46 03-Mar-2007 todd

Replacing some spaces with tabs and some typo fixes
from brad@


# 1.45 02-Mar-2007 reyk

oops, this is $OpenBSD$


# 1.44 02-Mar-2007 reyk

- remove the code to bring down the PHY in bnx_stop(), it's wrong
(ifm_data isn't updated) and lead to a panic in mii_phy_setmedia(),
or reading past the end mii_media_table[].
- make sure the dma_map matches the mbuf in the rx structures. We would
sync/unload the wrong map, leading to a DIAGNOSTIC panic, or eventually
leaking memory when bounce buffers are needed.

From NetBSD

ok marco@, brad@


# 1.43 30-Jan-2007 krw

Allow the bnx(4) driver to make use of all of the available hardware
multicast hash slots. The bnx(4) hardware supports 8 slots instead of
4 like the bge(4) hardware.

From Mike Karels via FreeBSD

Tested by Brad, biorn@ and Johan M:son Lindman


# 1.42 27-Jan-2007 krw

Enable transmit TCP/UDP checksum offload.

From Brad, tested by Brad, biorn@ and Johan M:son Lindman.


# 1.41 21-Jan-2007 mcbride

Remove bogus check for old firmware.

Identical fixes from myself and brad@, also reported by chefren@pi.net.


# 1.40 20-Jan-2007 dlg

move the interrupt establishment till after everything in the softc is
set up and allocated (which happens in a mountroothook). this prevents an
early call to the interrupt handler from causing a null deref when trying
to look into the unallocated regions.

found by mcbride when ciss and bnx were sharing an interrupt. mounting
root caused interrupts before the bnx was properly set up.

"commit your fix" mcbride@


# 1.39 19-Jan-2007 mcbride

bnx_init() takes a pointer to sc, not ifp.


# 1.38 10-Jan-2007 deraadt

change firmware byte order to be same on all architectures
THIS MEANS YOU NEED TO UPDATE YOUR FIRMWARE FILE BEFORE BOOTING WITH
A NEW KERNEL
tested by marco, biorn


# 1.37 24-Dec-2006 reyk

use the right size when loading the rx/tx descriptor bus dma maps.

from the NetBSD port

tested by bion@ and others from tech@
ok marco@ brad@


# 1.36 26-Nov-2006 brad

commented out entry for the BCM5709.


# 1.35 20-Nov-2006 brad

only try to do HW checksum offload for TCP and UDP.


# 1.34 20-Nov-2006 brad

Due to an incorrect macro, it appears that the driver has always been
accidentally truncating off the VLAN tag field in the TX descriptor. Fix
this by splitting up the vlan_tag and flags fields into separate fields,
and handling them appropriately.

From scottl@FreeBSD


# 1.33 19-Nov-2006 brad

In bnx_start, check the used_tx_bd count rather than the descriptors
mbuf pointer to see if the transmit ring is full. The mbuf pointer
is set only in the last descriptor of a multi-descriptor packet.
By relying on the mbuf pointers of the earlier descriptors, the
driver would sometimes overwrite a descriptor belonging to a
packet that wasn't completed yet. Also, tx_chain_prod wasn't
updated inside the loop, causing the wrong descriptor to be checked
after the first iteration. The upshot of all this was the loss of
some transmitted packets at medium to high packet rates.

In bnx_tx_encap, remove a couple of old statements that shuffled
around the tx_mbuf_map pointers. These now correspond 1-to-1 with
the transmit descriptors, and they are not supposed to be changed.

Correct a couple of inaccurate comments.

From jdp@FreeBSD


# 1.32 26-Oct-2006 brad

do the minimal initialization of the firmware so that ASF always
works.

From ambrisko@FreeBSD


# 1.31 25-Oct-2006 brad

replace a few more instances of hand rolled code with the
LIST_FOREACH macro.


# 1.30 22-Oct-2006 brad

now with the right revision of this diff which compiles. ok pedro, mglocker.

- Ensure that at least 16 TX descriptors are kept unused in the ring.
- Use more complete error handling for TX load problems.

From scottl@FreeBSD


# 1.29 21-Oct-2006 deraadt

does not compile


# 1.28 21-Oct-2006 brad

- Ensure that at least 16 TX descriptors are kept unused in the ring.
- Use more complete error handling for TX load problems.

From scottl@FreeBSD


# 1.27 19-Oct-2006 brad

make the exit label naming scheme match the current function names, removes
a FreeBSD-ism from the original driver.


# 1.26 19-Oct-2006 brad

Overhaul the transmit path:
- Eliminate the bnx_dmamap_arg structure.
- Refactor the loop that fills the buffer descriptor so that it can be done
with a single set of logic in a single loop instead of two sets of logic.
- Eliminate the need to cache and pass descriptor indexes between the start
loop and the encap function.
- Change the start loop to always check the ifnet sendq for more work.

From scottl@FreeBSD


# 1.25 14-Oct-2006 brad

- Simplify the arguments to bnx_tx_encap.
- Don't copy the bd_chain head pointers into temporary objects, they are
available globally.

From scottl@FreeBSD


# 1.24 04-Oct-2006 deraadt

Use loadfirmware(9) to get /etc/firmware/bnx instead of hard-coding a
gigantic firmware into the kernel; checked by brad


Revision tags: OPENBSD_4_0_BASE
# 1.23 25-Aug-2006 brad

don't need to clear if_timer during attach.


# 1.22 21-Aug-2006 deraadt

ramdisks do not have vlan, drop mbuf; ok brad


# 1.21 21-Aug-2006 brad

simplfy code a bit and fix comments, this is the MRU being set not the
MTU.


# 1.20 21-Aug-2006 brad

enable Jumbo support.


# 1.19 20-Aug-2006 brad

remove a comment.


# 1.18 20-Aug-2006 brad

cosmetic tweaks.


# 1.17 20-Aug-2006 brad

- replace IF_DEQUEUE/IF_PREPEND with IFQ_POLL/IFQ_DEQUEUE.
- enable RX checksum offload.
- remove some unused code.


# 1.16 19-Aug-2006 brad

set the capabilities VLAN MTU flag.


# 1.15 14-Aug-2006 marco

And some more KNF.


# 1.14 14-Aug-2006 marco

KNF


# 1.13 14-Aug-2006 marco

More KNF; no functional change.


# 1.12 14-Aug-2006 marco

First in a series of KNF. No functional change.


# 1.11 14-Aug-2006 brad

disable debugging.


# 1.10 14-Aug-2006 marco

Change bus_dmamap_create to use the appropriate values. This fixes the
issues brad was seeing. Help from jason.

ok brad.


# 1.9 13-Aug-2006 marco

Get rid of _HI & _LO macros altogether since they used a wrong idiom.
This was pointed out by mickey The driver now uses the same idiom as mpi.

help from miod, mickey and kettenis

ok brad


# 1.8 13-Aug-2006 brad

fix a typo, BNX_DRBUG -> BNX_DEBUG


# 1.7 10-Aug-2006 brad

unmap memory address space in bnx_release_resources().


# 1.6 10-Aug-2006 brad

cosmetic tweaking.


# 1.5 10-Aug-2006 brad

remove typedef's.


# 1.4 09-Aug-2006 marco

Reorder dmamap & dmamem to match man page.
Redo detection of _LO & _HI macro; help from miod and jordan.
ok beck brad


# 1.3 26-Jun-2006 brad

do not allow a Jumbo size MTU yet.


# 1.2 26-Jun-2006 brad

relocate the firmware per Theo's request.


# 1.1 26-Jun-2006 brad

Add a rough initial port of the bce driver from FreeBSD, which provides
support for the new line of Broadcom NetXtreme II Gigabit PCI-X and PCIe
controllers, though renamed to bnx. This is work in progress, there
are some known issues. With help from Reyk with the bus_dma code.

Thanks to David Christensen at Broadcom for the driver and for providing
some PCI-X and PCIe adapters.

ok deraadt@


# 1.132 11-Mar-2022 mpi

Constify struct cfattach.


# 1.131 09-Jan-2022 jsg

spelling
feedback and ok tb@ jmc@ ok ratchov@


Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
# 1.130 12-Dec-2020 jan

Rename the macro MCLGETI to MCLGETL and removes the dead parameter ifp.

OK dlg@, bluhm@
No Opinion mpi@
Not against it claudio@


Revision tags: OPENBSD_6_8_BASE
# 1.129 10-Jul-2020 patrick

Change users of IFQ_SET_MAXLEN() and IFQ_IS_EMPTY() to use the "new" API.

ok dlg@ tobhe@


# 1.128 22-Jun-2020 dlg

use ifiq_input and use it's return value to apply backpressure to rxrs.

this is a step toward deprecating softclock based livelock detection.


# 1.127 17-May-2020 jsg

fix typo in a comment

from Delyan Raychev


Revision tags: OPENBSD_6_7_BASE
# 1.126 06-Dec-2019 dlg

enable the full use of jumbos and remove IFCAP_VLAN_MTU.

The chip can do 9008 byte packets (not including the ethernet
header), but only 9004 if you want to enable IFCAP_VLAN_MTU. by not
enabling IFCAP_VLAN_MTU, we let other protocols (eg, mpls or svlan)
use the extra bytes if they want.

using the extra bytes for the hardmtu instead of for IFCAP_VLAN_MTU
works a bit better with how aggr(4) is set up at the moment because
aggr does not pass IFCAP_VLAN_MTU through from its ports, which
means vlan(4) on aggr(4) cannot see the flag and use the extra
bytes.

this was figured out by hrvoje popovski in a discussion with pedro
caetano on the "issues configuring vlan on top of aggr device" on
misc@.
hrvoje also tested the diff and made sure the full use of jumbos
works for things like ping packets with DF set.
jmatthew skimmed the diff and didnt see anything obviously wrong too


Revision tags: OPENBSD_6_3_BASE OPENBSD_6_4_BASE OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.125 10-Mar-2018 sthen

raise bnx(4)'s rxring lwm to 16, ok deraadt

(I've had this diff locally for a long time on port build machines to
avoid NFS stalls.)


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE
# 1.124 24-Jan-2017 dlg

add support for multiple transmit ifqueues per network interface.

an ifq to transmit a packet is picked by the current traffic
conditioner (ie, priq or hfsc) by providing an index into an array
of ifqs. by default interfaces get a single ifq but can ask for
more using if_attach_queues().

the vast majority of our drivers still think there's a 1:1 mapping
between interfaces and transmit queues, so their if_start routines
take an ifnet pointer instead of a pointer to the ifqueue struct.
instead of changing all the drivers in the tree, drivers can opt
into using an if_qstart routine and setting the IFXF_MPSAFE flag.
the stack provides a compatability wrapper from the new if_qstart
handler to the previous if_start handlers if IFXF_MPSAFE isnt set.

enabling hfsc on an interface configures it to transmit everything
through the first ifq. any other ifqs are left configured as priq,
but unused, when hfsc is enabled.

getting this in now so everyone can kick the tyres.

ok mpi@ visa@ (who provided some tweaks for cnmac).


# 1.123 22-Jan-2017 dlg

move counting if_opackets next to counting if_obytes in if_enqueue.

this means packets are consistently counted in one place, unlike the
many and various ways that drivers thought they should do it.

ok mpi@ deraadt@


Revision tags: OPENBSD_6_0_BASE
# 1.122 05-May-2016 jmatthew

r1.10 of if_bnx.c effectively removed the limit on the number of segments in
the tx dma maps, apparently to allow heavily fragmented packets to be sent.

The tx ring accounting in bnx_start assumed that the longest fragment chain
we'd see was BNX_MAX_SEGMENTS, so sending a heavily fragmented packet when the
ring was already full could cause it to overflow.

In the 10 years since r1.10, we've started defragmenting packets if they
won't fit in the dma map, so we can limit the maps to BNX_MAX_SEGMENTS again.
While we're here, ensure there's always at least one slot on the tx ring free,
for consistency between drivers.

Fixes packet corruption seen by otto@
ok mpi@ dlg@


# 1.121 13-Apr-2016 mpi

G/C IFQ_SET_READY().


Revision tags: OPENBSD_5_9_BASE
# 1.120 11-Dec-2015 mpi

branches: 1.120.2;
Replace mountroothook_establish(9) by config_mountroot(9) a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


# 1.119 10-Dec-2015 dlg

mark bnx_start as mpsafe.

tweak it to use ifq_restart so ifq_clr_oactive is serialised with start.

ok jmatthew@


# 1.118 05-Dec-2015 jmatthew

Make the bnx interrupt handler mpsafe, and perform rx and tx completion
outside the kernel lock.

Remove tx descriptor lists (essentially backing out if_bnx.c r1.77),
add an interrupt barrier in bnx_stop, check the rx ring state before receiving
packets, adjust the tx counter with atomic operations, and rework bnx_start
to check for ring space before dequeueing and drop the packet if bnx_encap
fails.

tested on BCM5708 by me and on BCM5709 by Hrvoje Popovski
ok dlg@


# 1.117 25-Nov-2015 dlg

replace IFF_OACTIVE manipulation with mpsafe operations.

there are two things shared between the network stack and drivers
in the send path: the send queue and the IFF_OACTIVE flag. the send
queue is now protected by a mutex. this diff makes the oactive
functionality mpsafe too.

IFF_OACTIVE is part of if_flags. there are two problems with that.
firstly, if_flags is a short and we dont have any MI atomic operations
to manipulate a short. secondly, while we could make the IFF_OACTIVE
operates mpsafe, all changes to other flags would have to be made
safe at the same time, otherwise a read-modify-write cycle on their
updates could clobber the oactive change.

instead, this moves the oactive mark into struct ifqueue and provides
an API for changing it. there's ifq_set_oactive, ifq_clr_oactive,
and ifq_is_oactive. these are modelled on ifsq_set_oactive,
ifsq_clr_oactive, and ifsq_is_oactive in dragonflybsd.

this diff includes changes to all the drivers manipulating IFF_OACTIVE
to now use the ifsq_{set,clr_is}_oactive API too.

ok kettenis@ mpi@ jmatthew@ deraadt@


# 1.116 20-Nov-2015 dlg

shuffle struct ifqueue so in flight mbufs are protected by a mutex.

the code is refactored so the IFQ macros call newly implemented ifq
functions. the ifq code is split so each discipline (priq and hfsc
in our case) is an opaque set of operations that the common ifq
code can call. the common code does the locking, accounting (ifq_len
manipulation), and freeing of the mbuf if the disciplines enqueue
function rejects it. theyre kind of like bufqs in the block layer
with their fifo and nscan disciplines.

the new api also supports atomic switching of disciplines at runtime.
the hfsc setup in pf_ioctl.c has been tweaked to build a complete
hfsc_if structure which it attaches to the send queue in a single
operation, rather than attaching to the interface up front and
building up a list of queues.

the send queue is now mutexed, which raises the expectation that
packets can be enqueued or purged on one cpu while another cpu is
dequeueing them in a driver for transmission. a lot of drivers use
IFQ_POLL to peek at an mbuf and attempt to fit it on the ring before
committing to it with a later IFQ_DEQUEUE operation. if the mbuf
gets freed in between the POLL and DEQUEUE operations, fireworks
will ensue.

to avoid this, the ifq api introduces ifq_deq_begin, ifq_deq_rollback,
and ifq_deq_commit. ifq_deq_begin allows a driver to take the ifq
mutex and get a reference to the mbuf they wish to try and tx. if
there's space, they can ifq_deq_commit it to remove the mbuf and
release the mutex. if there's no space, ifq_deq_rollback simply
releases the mutex. this api was developed to make updating the
drivers using IFQ_POLL easy, instead of having to do significant
semantic changes to avoid POLL that we cannot test on all the
hardware.

the common code has been tested pretty hard, and all the driver
modifications are straightforward except for de(4). if that breaks
it can be dealt with later.

ok mpi@ jmatthew@


# 1.115 25-Oct-2015 mpi

arp_ifinit() is no longer needed.


# 1.114 10-Sep-2015 deraadt

sizes for free(); ok sthen


# 1.113 04-Sep-2015 kettenis

The bnx_tx_pool gets used from interrupt context, so drop the explicit
backend allocoter here without passing PR_WAITOK to pool_init(9).

ok mikeb@


Revision tags: OPENBSD_5_8_BASE
# 1.112 24-Jul-2015 dlg

if we free the mbuf in the rx path, clear the pointer to it so we dont
try and queue it for the stack and cause a use after free.

found by maxime villard and brainy


# 1.111 24-Jun-2015 mpi

Increment if_ipackets in if_input().

Note that pseudo-drivers not using if_input() are not affected by this
conversion.

ok mikeb@, kettenis@, claudio@, dlg@


# 1.110 10-Mar-2015 mpi

Convert to if_input().

Tested and ok sthen@, ok dlg@


Revision tags: OPENBSD_5_7_BASE
# 1.109 27-Jan-2015 dlg

remove the second void * argument on tasks.

when workqs were introduced, we provided a second argument so you
could pass a thing and some context to work on it in. there were
very few things that took advantage of the second argument, so when
i introduced pools i suggested removing it. since tasks were meant
to replace workqs, it was requested that we keep the second argument
to make porting from workqs to tasks easier.

now that workqs are gone, i had a look at the use of the second
argument again and found only one good use of it (vdsp(4) on sparc64
if you're interested) and a tiny handful of questionable uses. the
vast majority of tasks only used a single argument. i have since
modified all tasks that used two args to only use one, so now we
can remove the second argument.

so this is a mechanical change. all tasks only passed NULL as their
second argument, so we can just remove it.

ok krw@


# 1.108 22-Dec-2014 tedu

unifdef INET


Revision tags: OPENBSD_5_6_BASE
# 1.107 18-Jul-2014 dlg

implement EFBIG handling for heavily fragmented packets on the tx path.

ok claudio@


# 1.106 12-Jul-2014 tedu

add a size argument to free. will be used soon, but for now default to 0.
after discussions with beck deraadt kettenis.


# 1.105 09-Jul-2014 dlg

dont try to be smart about avoiding the use of too many descriptors
when filling the rx ring. trust the hwm.

problem found by sthen@


# 1.104 08-Jul-2014 dlg

cut things that relied on mclgeti for rx ring accounting/restriction over
to using if_rxr.

cut the reporting systat did over to the rxr ioctl.

tested as much as i can on alpha, amd64, and sparc64.
mpi@ has run it on macppc.
ok mpi@


Revision tags: OPENBSD_5_5_BASE
# 1.103 30-Oct-2013 dlg

replace the workq bits to supply new tx pkt descriptors with a task.

tested locally on a dell poweredge 2950


# 1.102 23-Oct-2013 brad

Enable TX checksum offload.

ok naddy@ sthen@


Revision tags: OPENBSD_5_4_BASE
# 1.101 28-Mar-2013 brad

Let mii_attach() know where the PHY is located instead of scanning
for it since we know where it will be anyway and remove the code
from the MII bus read/write functions to force reading/writing
from the predetermined location. Copied from bge(4) and this is
what the upstream FreeBSD bce(4) driver has done once FreBSD
gained a mii_attach().

ok dlg@ sthen@


Revision tags: OPENBSD_5_3_BASE
# 1.100 13-Jan-2013 brad

Enable flow control support with 5708S/5709S adapters.

ok dlg@


# 1.99 10-Dec-2012 mikeb

Under some circumstances (currently only reproducible with IPsec)
bnx can be left w/o clusters on the receive ring and will stall.
To prevent that schedule a timeout if refill fails. Bug was
reported by jj@, fix tested by me, ok dlg


# 1.98 05-Dec-2012 deraadt

Remove excessive sys/cdefs.h inclusion
ok guenther millert kettenis


Revision tags: OPENBSD_5_2_BASE
# 1.97 05-Jul-2012 phessler

Add flow control to bnx(4)

Tested on 5706, 5708, 5709, 5716 chipsets.

From Brad

OK phessler@, sthen@, mikeb@,


# 1.96 14-May-2012 mikeb

fixup "couldn't establish interrupt" error printf; from brad, ok phessler


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE
# 1.95 22-Jun-2011 tedu

kill a few more casts that aren't helpful. ok krw miod


# 1.94 18-Apr-2011 dlg

ido not disable interrupts in the isr and then enable them again
when leaving. when you're handling an interrupt it is masked.
whacking the chip is work for no gain.

modify the interrupt handler so it only processes the rings once
rather than looping over them until it runs out of work to do

looping in the isr is bad for several reasons:

firstly, the chip does interrupt mitigation so you have a
decent/predictable amount of work to do in the isr. your first loop
will do that chunk of work (ie, it pulls off 50ish packets), and
then the successive looping aggressively pull one or two packets
off the rx ring. these extra loops work against the benefit that
interrupt mitigation provides.

bus space reads are slow. we should avoid doing them where possible
(but we should always do them when necessary).

doing the loop 5 times per isr works against the mclgeti semantics.
it knows a nic is busy and therefore needs more rx descriptors by
watching to see when the nic uses all of its descriptors between
interrupts. if we're aggressively pulling packets off by looping
in the isr then we're skewing this check.

ok deraadt@


# 1.93 13-Apr-2011 dlg

to quote from the gospel of bus_dma.9:

Synchronization operations are expressed from the perspective of the host
RAM, e.g., a device -> memory operation is a READ and a memory -> device
operation is a WRITE.

the status block that the isr reads is written to by the device.
the chip writes to memory, it is therefore a READ.

this also adds the preread sync when the map is set up and the postread
sync when the map is torn down for better symmetry. there are probably
more issues like this in the code, but this is a start.

discovered while discussing another diff with claudio@


# 1.92 05-Apr-2011 henning

mechanic rename M_{TCP|UDP}V4_CSUM_OUT -> M_{TCP|UDP}_CSUM_OUT
ok claudio krw


# 1.91 03-Apr-2011 jasper

use nitems(); no binary change for drivers that are compiled on amd64.

ok claudio@


Revision tags: OPENBSD_4_9_BASE
# 1.90 20-Sep-2010 deraadt

Stop doing shutdown hooks in network drivers where possible. We already
take all interfaces down, via their xxstop routines. Claudio and I have
verified that none of the shutdown hooks do much extra beyond what xxstop
was already doing; it is largely a pile of junk.
ok claudio, some early comments by sthen; also read by matthew, jsg


Revision tags: OPENBSD_4_8_BASE
# 1.89 03-Aug-2010 jsg

Correct use of logical and where binary and was intended.
Spotted by lint, but mirrors a similiar change in the
original FreeBSD code from over a year ago.

ok deraadt@


# 1.88 24-May-2010 sthen

Support fibre PHY on BCM5709S. From FreeBSD via Brad.
Tested by Brad on: BCM5706, BCM5708C
Tested by me on: BCM5716 (BCM5709 PHY)


# 1.87 19-May-2010 oga

BUS_DMA_ZERO instead of alloc, map, bzero.

ok krw@


Revision tags: OPENBSD_4_7_BASE
# 1.86 23-Nov-2009 claudio

bnx(4) is a bit special. The chip itself is capable of swapping endianess
so there is no need for htoleXX calls. The only thing needed is the correct
layout of the DMA-ed structures. Additionally it uses PAGE_SIZE but assumed
that it is always 4k. Fix the macros that failed to respect that so that it
works on 8k PAGE_SIZE systems. This makes bnx(4) work on sparc64.
Tested on amd64 by dlg@. OK dlg@, deraadt@


# 1.85 09-Nov-2009 dlg

Link state change interrupt was not generated due to a missing bit in
the MAC event register.

fix from atte dot peltomaki at iki dot fi
tested by me on 5708 and 5709


# 1.84 13-Aug-2009 jasper

- consistify cfdriver for the ethernet drivers (0 -> NULL)

ok dlg@


# 1.83 09-Aug-2009 deraadt

MCLGETI() will now allocate a mbuf header if it is not provided, thus
reducing the amount of splnet/splx dancing required.. especially in the
worst case (of m_cldrop)
ok dlg kettenis damien


# 1.82 06-Aug-2009 sthen

Add device id for BCM5716S, tidy whitespace. From Brad.


Revision tags: OPENBSD_4_6_BASE
# 1.81 03-Jul-2009 dlg

this is a rather large change to add support for the BCM5709.

the 5709s use a the b09 firmwares, which is different to the b06 used by
all the other chips supported by bnx. the majority of the diff comes from
special handling for some indirect reads and writes, and because it needs
more host memory to operate with.

ive tried to keep the cosmetic changes to a minimum.

"go for it" deraadt@


# 1.80 03-Jul-2009 dlg

newer bnx chips use a separate firmware to the "old" ones. this updates
the b06 firmware for the older chips, and adds the b09 firmware. there are
three variants of the rv2p code thats loaded onto the chips, so this has
been split out into separate firmware files as well.

the driver has been updated to handle the split firmwares, and to easily
allow loading of the different versions. this change only supports the
loading of the firmwares for the currently supported chips.

after this change you must build the new firmwares and install them as well
as your new kernel.

"go to it" deraadt@


# 1.79 20-Jun-2009 naddy

Rewrite the interface flag handling case code and update the receive
filter handling to take advantage of ac_multirangecnt and have correct
IFF_ALLMULTI handling. From Brad.


# 1.78 22-Apr-2009 dlg

dont need to zero the tx pkt pool structure before initting it now that
pool_init does its job properly.


# 1.77 22-Apr-2009 dlg

replace arrays of dmamaps and mbuf pointers used to manage packets
on the tx rings (one mbuf ptr/dmamap array entry was created for
every tx descriptor slot at attach time) with a dynamically grown
list of mbuf pointers and dmamaps.

bnx used to have 512 dmamaps/mbuf pointers for the tx ring, now my
system is running with 8 under moderate load.

the big bonus from this is that the dmamap handling is greatly
simplified.

reyk@ likes this a lot


# 1.76 20-Apr-2009 dlg

when transmitting packets, put the dmamap we used for the packet into the
last descriptor slot in the ring. the tx completion code expects the dmamap
to be there so it can unload it.

ok reyk@


# 1.75 20-Apr-2009 reyk

fix dma map unmapping and unloading in the tx cleanup path.

ok dlg@


# 1.74 14-Apr-2009 kettenis

Don't free an mbuf that's still on the TX queue. While there sanitize the
function signature of bnx_tx_encap() such that people don't get weird ideas
like this again.

ok dlg@


# 1.73 09-Apr-2009 dlg

white space fixes


# 1.72 30-Mar-2009 dlg

switch to MCLGETI.

this conversion is the easiest ive done so far. the mbuf allocation wrapper
in the driver already had code to handle a failing cluster allocator as
part of a test harness, now we test that code all the time with MCLGETI.

ok kettenis@
tested by phessler@


Revision tags: OPENBSD_4_5_BASE
# 1.71 28-Nov-2008 brad

Eliminate the redundant bits of code for MTU and multicast handling
from the individual drivers now that ether_ioctl() handles this.

Shrinks the i386 kernels by..
RAMDISK - 2176 bytes
RAMDISKB - 1504 bytes
RAMDISKC - 736 bytes

Tested by naddy@/okan@/sthen@/brad@/todd@/jmc@ and lots of users.
Build tested on almost all archs by todd@/brad@

ok naddy@


# 1.70 09-Nov-2008 naddy

Introduce bpf_mtap_ether(), which for the benefit of bpf listeners
creates the VLAN encapsulation from the tag stored in the mbuf
header. Idea from FreeBSD, input from claudio@ and canacar@.

Switch all hardware VLAN enabled drivers to the new function.

ok claudio@


# 1.69 19-Oct-2008 brad

Re-add support for RX VLAN tag stripping.


# 1.68 16-Oct-2008 naddy

Switch the existing TX VLAN hardware support over to having the
tag in the header. Convert TX tagging in the drivers.

Help and ok brad@


# 1.67 16-Oct-2008 naddy

Convert RX tag stripping to storing the tag in the mbuf header and
enable RX tag stripping for re(4).

ok brad@


# 1.66 02-Oct-2008 brad

First step towards cleaning up the Ethernet driver ioctl handling.
Move calling ether_ioctl() from the top of the ioctl function, which
at the moment does absolutely nothing, to the default switch case.
Thus allowing drivers to define their own ioctl handlers and then
falling back on ether_ioctl(). The only functional change this results
in at the moment is having all Ethernet drivers returning the proper
errno of ENOTTY instead of EINVAL/ENXIO when encountering unknown
ioctl's.

Shrinks the i386 kernels by..
RAMDISK - 1024 bytes
RAMDISKB - 1120 bytes
RAMDISKC - 832 bytes

Tested by martin@/jsing@/todd@/brad@
Build tested on almost all archs by todd@/brad@

ok jsing@


# 1.65 10-Sep-2008 blambert

Convert timeout_add() calls using multiples of hz to timeout_add_sec()

Really just the low-hanging fruit of (hopefully) forthcoming timeout
conversions.

ok art@, krw@


Revision tags: OPENBSD_4_4_BASE
# 1.64 24-Jun-2008 brad

Fixed a problem that would cause errors (especially when in low memory
systems) because the RX chain was corrupted when an mbuf was mapped to
an unexpected number of buffers.

From davidch@FreeBSD


# 1.63 13-Jun-2008 brad

fix compilation with BNX_DEBUG.


# 1.62 13-Jun-2008 brad

Remove slack space for RX/TX chains since it only covers sloppy coding.

From davidch @ FreeBSD


# 1.61 08-Jun-2008 reyk

don't declare foo_driver_version[] strings and turn them into defines,
nothing uses them and it saves a few bytes in the kernel.

ok claudio@


# 1.60 29-May-2008 brad

- Add a debug message to mention when a 2.5Gb adapter is found.
- Change invalid PHY address debug message in bnx_miibus_write_reg()
from warn level to verbose.
- Add two new softc fields and store the shared and port hw config
data.

From FreeBSD

ok dlg@


# 1.59 23-May-2008 brad

Simplify the combination use of pci_mapreg_type()/pci_mapreg_map() as
suggested by dlg@ awhile ago.

ok dlg@


Revision tags: OPENBSD_4_3_BASE
# 1.58 28-Feb-2008 brad

Add initial bits for fiber support with the BCM5706/BCM5708 chipsets.

Tested with copper adapters by brad@, johan@ and Jung <moorang at gmail dot com>

ok dlg@


# 1.57 22-Feb-2008 kettenis

Avoid unaligned PCI config space access.

ok brad@


# 1.56 17-Feb-2008 brad

Remove the check for non-production bnx(4) chipsets. These chipsets are
not officially "supported" and could have errata which the driver does
not workaround but they should more or less work.

Tested by marco@ with a BCM5708 B0 chipset.

ok marco@ dlg@


# 1.55 25-Nov-2007 dlg

IF_Gbps(2.5) is wrong.

ok claudio@


# 1.54 28-Aug-2007 deraadt

unify firmware load failure messages; ok mglocker


Revision tags: OPENBSD_4_2_BASE
# 1.53 04-Jul-2007 krw

Revert r1.42 of if_bnx.c, "Enable IPv4 transmit TCP/UDP checksum
offload", and associated man page change. To use IPv4 transmit TCP/UDP
checksum offloading you must again define BNX_CSUM.

As requested by mbalmer@ via deraadt@ on suggestion of reyk@ in
response to PR #5437.


# 1.52 22-May-2007 reyk

Add the BCM5709 PCI device Id. It is disabled for now since we do not
support SerDes-based (1000base-SX fibre) bnx(4) devices yet. The
reason is simple - we do not have any fibre bnx(4) to test and port
the SerDes changes from the other bnx drivers.

From brad found in the Linux driver


# 1.51 22-May-2007 jasper

adress -> address

from brad
ok claudio@


# 1.50 22-May-2007 ray

Use BNX_PRINTF instead of printf with missing argument.

OK reyk@, earlier version OK tedu@, dlg@, and miod@.


# 1.49 21-May-2007 reyk

fix bnx vlan tagging in the rx path; do not attach the vlan tag twice
if the firmware has been told to keep it and copy the tag in network
byte order in the other case.

ok mcbride@ dlg@


Revision tags: OPENBSD_4_1_BASE
# 1.48 05-Mar-2007 reyk

remove jumbo frame support by replacing MEXTALLOC with MCLGET, and
simplify the VLAN code.

this will close PR 5356 (system panics under high load).

From claudio@ who is currently not around to commit this fix

tested and ok by mcbride@, reyk@, todd@, Paul Hirsch, and brad


# 1.47 03-Mar-2007 reyk

instead of establishing the interrupt in the mounthook, move it back
to the attach function and set a flag in the mounthook to start
accepting interrupts (there are possible problems with establishing
interrupts after the ioapics are enabled in i386 GENERIC.MP).

also suggested by kettenis
tested by mcbride, me, and some others
ok dlg@


# 1.46 03-Mar-2007 todd

Replacing some spaces with tabs and some typo fixes
from brad@


# 1.45 02-Mar-2007 reyk

oops, this is $OpenBSD$


# 1.44 02-Mar-2007 reyk

- remove the code to bring down the PHY in bnx_stop(), it's wrong
(ifm_data isn't updated) and lead to a panic in mii_phy_setmedia(),
or reading past the end mii_media_table[].
- make sure the dma_map matches the mbuf in the rx structures. We would
sync/unload the wrong map, leading to a DIAGNOSTIC panic, or eventually
leaking memory when bounce buffers are needed.

From NetBSD

ok marco@, brad@


# 1.43 30-Jan-2007 krw

Allow the bnx(4) driver to make use of all of the available hardware
multicast hash slots. The bnx(4) hardware supports 8 slots instead of
4 like the bge(4) hardware.

From Mike Karels via FreeBSD

Tested by Brad, biorn@ and Johan M:son Lindman


# 1.42 27-Jan-2007 krw

Enable transmit TCP/UDP checksum offload.

From Brad, tested by Brad, biorn@ and Johan M:son Lindman.


# 1.41 21-Jan-2007 mcbride

Remove bogus check for old firmware.

Identical fixes from myself and brad@, also reported by chefren@pi.net.


# 1.40 20-Jan-2007 dlg

move the interrupt establishment till after everything in the softc is
set up and allocated (which happens in a mountroothook). this prevents an
early call to the interrupt handler from causing a null deref when trying
to look into the unallocated regions.

found by mcbride when ciss and bnx were sharing an interrupt. mounting
root caused interrupts before the bnx was properly set up.

"commit your fix" mcbride@


# 1.39 19-Jan-2007 mcbride

bnx_init() takes a pointer to sc, not ifp.


# 1.38 10-Jan-2007 deraadt

change firmware byte order to be same on all architectures
THIS MEANS YOU NEED TO UPDATE YOUR FIRMWARE FILE BEFORE BOOTING WITH
A NEW KERNEL
tested by marco, biorn


# 1.37 24-Dec-2006 reyk

use the right size when loading the rx/tx descriptor bus dma maps.

from the NetBSD port

tested by bion@ and others from tech@
ok marco@ brad@


# 1.36 26-Nov-2006 brad

commented out entry for the BCM5709.


# 1.35 20-Nov-2006 brad

only try to do HW checksum offload for TCP and UDP.


# 1.34 20-Nov-2006 brad

Due to an incorrect macro, it appears that the driver has always been
accidentally truncating off the VLAN tag field in the TX descriptor. Fix
this by splitting up the vlan_tag and flags fields into separate fields,
and handling them appropriately.

From scottl@FreeBSD


# 1.33 19-Nov-2006 brad

In bnx_start, check the used_tx_bd count rather than the descriptors
mbuf pointer to see if the transmit ring is full. The mbuf pointer
is set only in the last descriptor of a multi-descriptor packet.
By relying on the mbuf pointers of the earlier descriptors, the
driver would sometimes overwrite a descriptor belonging to a
packet that wasn't completed yet. Also, tx_chain_prod wasn't
updated inside the loop, causing the wrong descriptor to be checked
after the first iteration. The upshot of all this was the loss of
some transmitted packets at medium to high packet rates.

In bnx_tx_encap, remove a couple of old statements that shuffled
around the tx_mbuf_map pointers. These now correspond 1-to-1 with
the transmit descriptors, and they are not supposed to be changed.

Correct a couple of inaccurate comments.

From jdp@FreeBSD


# 1.32 26-Oct-2006 brad

do the minimal initialization of the firmware so that ASF always
works.

From ambrisko@FreeBSD


# 1.31 25-Oct-2006 brad

replace a few more instances of hand rolled code with the
LIST_FOREACH macro.


# 1.30 22-Oct-2006 brad

now with the right revision of this diff which compiles. ok pedro, mglocker.

- Ensure that at least 16 TX descriptors are kept unused in the ring.
- Use more complete error handling for TX load problems.

From scottl@FreeBSD


# 1.29 21-Oct-2006 deraadt

does not compile


# 1.28 21-Oct-2006 brad

- Ensure that at least 16 TX descriptors are kept unused in the ring.
- Use more complete error handling for TX load problems.

From scottl@FreeBSD


# 1.27 19-Oct-2006 brad

make the exit label naming scheme match the current function names, removes
a FreeBSD-ism from the original driver.


# 1.26 19-Oct-2006 brad

Overhaul the transmit path:
- Eliminate the bnx_dmamap_arg structure.
- Refactor the loop that fills the buffer descriptor so that it can be done
with a single set of logic in a single loop instead of two sets of logic.
- Eliminate the need to cache and pass descriptor indexes between the start
loop and the encap function.
- Change the start loop to always check the ifnet sendq for more work.

From scottl@FreeBSD


# 1.25 14-Oct-2006 brad

- Simplify the arguments to bnx_tx_encap.
- Don't copy the bd_chain head pointers into temporary objects, they are
available globally.

From scottl@FreeBSD


# 1.24 04-Oct-2006 deraadt

Use loadfirmware(9) to get /etc/firmware/bnx instead of hard-coding a
gigantic firmware into the kernel; checked by brad


Revision tags: OPENBSD_4_0_BASE
# 1.23 25-Aug-2006 brad

don't need to clear if_timer during attach.


# 1.22 21-Aug-2006 deraadt

ramdisks do not have vlan, drop mbuf; ok brad


# 1.21 21-Aug-2006 brad

simplfy code a bit and fix comments, this is the MRU being set not the
MTU.


# 1.20 21-Aug-2006 brad

enable Jumbo support.


# 1.19 20-Aug-2006 brad

remove a comment.


# 1.18 20-Aug-2006 brad

cosmetic tweaks.


# 1.17 20-Aug-2006 brad

- replace IF_DEQUEUE/IF_PREPEND with IFQ_POLL/IFQ_DEQUEUE.
- enable RX checksum offload.
- remove some unused code.


# 1.16 19-Aug-2006 brad

set the capabilities VLAN MTU flag.


# 1.15 14-Aug-2006 marco

And some more KNF.


# 1.14 14-Aug-2006 marco

KNF


# 1.13 14-Aug-2006 marco

More KNF; no functional change.


# 1.12 14-Aug-2006 marco

First in a series of KNF. No functional change.


# 1.11 14-Aug-2006 brad

disable debugging.


# 1.10 14-Aug-2006 marco

Change bus_dmamap_create to use the appropriate values. This fixes the
issues brad was seeing. Help from jason.

ok brad.


# 1.9 13-Aug-2006 marco

Get rid of _HI & _LO macros altogether since they used a wrong idiom.
This was pointed out by mickey The driver now uses the same idiom as mpi.

help from miod, mickey and kettenis

ok brad


# 1.8 13-Aug-2006 brad

fix a typo, BNX_DRBUG -> BNX_DEBUG


# 1.7 10-Aug-2006 brad

unmap memory address space in bnx_release_resources().


# 1.6 10-Aug-2006 brad

cosmetic tweaking.


# 1.5 10-Aug-2006 brad

remove typedef's.


# 1.4 09-Aug-2006 marco

Reorder dmamap & dmamem to match man page.
Redo detection of _LO & _HI macro; help from miod and jordan.
ok beck brad


# 1.3 26-Jun-2006 brad

do not allow a Jumbo size MTU yet.


# 1.2 26-Jun-2006 brad

relocate the firmware per Theo's request.


# 1.1 26-Jun-2006 brad

Add a rough initial port of the bce driver from FreeBSD, which provides
support for the new line of Broadcom NetXtreme II Gigabit PCI-X and PCIe
controllers, though renamed to bnx. This is work in progress, there
are some known issues. With help from Reyk with the bus_dma code.

Thanks to David Christensen at Broadcom for the driver and for providing
some PCI-X and PCIe adapters.

ok deraadt@


# 1.131 09-Jan-2022 jsg

spelling
feedback and ok tb@ jmc@ ok ratchov@


Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
# 1.130 12-Dec-2020 jan

Rename the macro MCLGETI to MCLGETL and removes the dead parameter ifp.

OK dlg@, bluhm@
No Opinion mpi@
Not against it claudio@


Revision tags: OPENBSD_6_8_BASE
# 1.129 10-Jul-2020 patrick

Change users of IFQ_SET_MAXLEN() and IFQ_IS_EMPTY() to use the "new" API.

ok dlg@ tobhe@


# 1.128 22-Jun-2020 dlg

use ifiq_input and use it's return value to apply backpressure to rxrs.

this is a step toward deprecating softclock based livelock detection.


# 1.127 17-May-2020 jsg

fix typo in a comment

from Delyan Raychev


Revision tags: OPENBSD_6_7_BASE
# 1.126 06-Dec-2019 dlg

enable the full use of jumbos and remove IFCAP_VLAN_MTU.

The chip can do 9008 byte packets (not including the ethernet
header), but only 9004 if you want to enable IFCAP_VLAN_MTU. by not
enabling IFCAP_VLAN_MTU, we let other protocols (eg, mpls or svlan)
use the extra bytes if they want.

using the extra bytes for the hardmtu instead of for IFCAP_VLAN_MTU
works a bit better with how aggr(4) is set up at the moment because
aggr does not pass IFCAP_VLAN_MTU through from its ports, which
means vlan(4) on aggr(4) cannot see the flag and use the extra
bytes.

this was figured out by hrvoje popovski in a discussion with pedro
caetano on the "issues configuring vlan on top of aggr device" on
misc@.
hrvoje also tested the diff and made sure the full use of jumbos
works for things like ping packets with DF set.
jmatthew skimmed the diff and didnt see anything obviously wrong too


Revision tags: OPENBSD_6_3_BASE OPENBSD_6_4_BASE OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.125 10-Mar-2018 sthen

raise bnx(4)'s rxring lwm to 16, ok deraadt

(I've had this diff locally for a long time on port build machines to
avoid NFS stalls.)


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE
# 1.124 24-Jan-2017 dlg

add support for multiple transmit ifqueues per network interface.

an ifq to transmit a packet is picked by the current traffic
conditioner (ie, priq or hfsc) by providing an index into an array
of ifqs. by default interfaces get a single ifq but can ask for
more using if_attach_queues().

the vast majority of our drivers still think there's a 1:1 mapping
between interfaces and transmit queues, so their if_start routines
take an ifnet pointer instead of a pointer to the ifqueue struct.
instead of changing all the drivers in the tree, drivers can opt
into using an if_qstart routine and setting the IFXF_MPSAFE flag.
the stack provides a compatability wrapper from the new if_qstart
handler to the previous if_start handlers if IFXF_MPSAFE isnt set.

enabling hfsc on an interface configures it to transmit everything
through the first ifq. any other ifqs are left configured as priq,
but unused, when hfsc is enabled.

getting this in now so everyone can kick the tyres.

ok mpi@ visa@ (who provided some tweaks for cnmac).


# 1.123 22-Jan-2017 dlg

move counting if_opackets next to counting if_obytes in if_enqueue.

this means packets are consistently counted in one place, unlike the
many and various ways that drivers thought they should do it.

ok mpi@ deraadt@


Revision tags: OPENBSD_6_0_BASE
# 1.122 05-May-2016 jmatthew

r1.10 of if_bnx.c effectively removed the limit on the number of segments in
the tx dma maps, apparently to allow heavily fragmented packets to be sent.

The tx ring accounting in bnx_start assumed that the longest fragment chain
we'd see was BNX_MAX_SEGMENTS, so sending a heavily fragmented packet when the
ring was already full could cause it to overflow.

In the 10 years since r1.10, we've started defragmenting packets if they
won't fit in the dma map, so we can limit the maps to BNX_MAX_SEGMENTS again.
While we're here, ensure there's always at least one slot on the tx ring free,
for consistency between drivers.

Fixes packet corruption seen by otto@
ok mpi@ dlg@


# 1.121 13-Apr-2016 mpi

G/C IFQ_SET_READY().


Revision tags: OPENBSD_5_9_BASE
# 1.120 11-Dec-2015 mpi

branches: 1.120.2;
Replace mountroothook_establish(9) by config_mountroot(9) a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


# 1.119 10-Dec-2015 dlg

mark bnx_start as mpsafe.

tweak it to use ifq_restart so ifq_clr_oactive is serialised with start.

ok jmatthew@


# 1.118 05-Dec-2015 jmatthew

Make the bnx interrupt handler mpsafe, and perform rx and tx completion
outside the kernel lock.

Remove tx descriptor lists (essentially backing out if_bnx.c r1.77),
add an interrupt barrier in bnx_stop, check the rx ring state before receiving
packets, adjust the tx counter with atomic operations, and rework bnx_start
to check for ring space before dequeueing and drop the packet if bnx_encap
fails.

tested on BCM5708 by me and on BCM5709 by Hrvoje Popovski
ok dlg@


# 1.117 25-Nov-2015 dlg

replace IFF_OACTIVE manipulation with mpsafe operations.

there are two things shared between the network stack and drivers
in the send path: the send queue and the IFF_OACTIVE flag. the send
queue is now protected by a mutex. this diff makes the oactive
functionality mpsafe too.

IFF_OACTIVE is part of if_flags. there are two problems with that.
firstly, if_flags is a short and we dont have any MI atomic operations
to manipulate a short. secondly, while we could make the IFF_OACTIVE
operates mpsafe, all changes to other flags would have to be made
safe at the same time, otherwise a read-modify-write cycle on their
updates could clobber the oactive change.

instead, this moves the oactive mark into struct ifqueue and provides
an API for changing it. there's ifq_set_oactive, ifq_clr_oactive,
and ifq_is_oactive. these are modelled on ifsq_set_oactive,
ifsq_clr_oactive, and ifsq_is_oactive in dragonflybsd.

this diff includes changes to all the drivers manipulating IFF_OACTIVE
to now use the ifsq_{set,clr_is}_oactive API too.

ok kettenis@ mpi@ jmatthew@ deraadt@


# 1.116 20-Nov-2015 dlg

shuffle struct ifqueue so in flight mbufs are protected by a mutex.

the code is refactored so the IFQ macros call newly implemented ifq
functions. the ifq code is split so each discipline (priq and hfsc
in our case) is an opaque set of operations that the common ifq
code can call. the common code does the locking, accounting (ifq_len
manipulation), and freeing of the mbuf if the disciplines enqueue
function rejects it. theyre kind of like bufqs in the block layer
with their fifo and nscan disciplines.

the new api also supports atomic switching of disciplines at runtime.
the hfsc setup in pf_ioctl.c has been tweaked to build a complete
hfsc_if structure which it attaches to the send queue in a single
operation, rather than attaching to the interface up front and
building up a list of queues.

the send queue is now mutexed, which raises the expectation that
packets can be enqueued or purged on one cpu while another cpu is
dequeueing them in a driver for transmission. a lot of drivers use
IFQ_POLL to peek at an mbuf and attempt to fit it on the ring before
committing to it with a later IFQ_DEQUEUE operation. if the mbuf
gets freed in between the POLL and DEQUEUE operations, fireworks
will ensue.

to avoid this, the ifq api introduces ifq_deq_begin, ifq_deq_rollback,
and ifq_deq_commit. ifq_deq_begin allows a driver to take the ifq
mutex and get a reference to the mbuf they wish to try and tx. if
there's space, they can ifq_deq_commit it to remove the mbuf and
release the mutex. if there's no space, ifq_deq_rollback simply
releases the mutex. this api was developed to make updating the
drivers using IFQ_POLL easy, instead of having to do significant
semantic changes to avoid POLL that we cannot test on all the
hardware.

the common code has been tested pretty hard, and all the driver
modifications are straightforward except for de(4). if that breaks
it can be dealt with later.

ok mpi@ jmatthew@


# 1.115 25-Oct-2015 mpi

arp_ifinit() is no longer needed.


# 1.114 10-Sep-2015 deraadt

sizes for free(); ok sthen


# 1.113 04-Sep-2015 kettenis

The bnx_tx_pool gets used from interrupt context, so drop the explicit
backend allocoter here without passing PR_WAITOK to pool_init(9).

ok mikeb@


Revision tags: OPENBSD_5_8_BASE
# 1.112 24-Jul-2015 dlg

if we free the mbuf in the rx path, clear the pointer to it so we dont
try and queue it for the stack and cause a use after free.

found by maxime villard and brainy


# 1.111 24-Jun-2015 mpi

Increment if_ipackets in if_input().

Note that pseudo-drivers not using if_input() are not affected by this
conversion.

ok mikeb@, kettenis@, claudio@, dlg@


# 1.110 10-Mar-2015 mpi

Convert to if_input().

Tested and ok sthen@, ok dlg@


Revision tags: OPENBSD_5_7_BASE
# 1.109 27-Jan-2015 dlg

remove the second void * argument on tasks.

when workqs were introduced, we provided a second argument so you
could pass a thing and some context to work on it in. there were
very few things that took advantage of the second argument, so when
i introduced pools i suggested removing it. since tasks were meant
to replace workqs, it was requested that we keep the second argument
to make porting from workqs to tasks easier.

now that workqs are gone, i had a look at the use of the second
argument again and found only one good use of it (vdsp(4) on sparc64
if you're interested) and a tiny handful of questionable uses. the
vast majority of tasks only used a single argument. i have since
modified all tasks that used two args to only use one, so now we
can remove the second argument.

so this is a mechanical change. all tasks only passed NULL as their
second argument, so we can just remove it.

ok krw@


# 1.108 22-Dec-2014 tedu

unifdef INET


Revision tags: OPENBSD_5_6_BASE
# 1.107 18-Jul-2014 dlg

implement EFBIG handling for heavily fragmented packets on the tx path.

ok claudio@


# 1.106 12-Jul-2014 tedu

add a size argument to free. will be used soon, but for now default to 0.
after discussions with beck deraadt kettenis.


# 1.105 09-Jul-2014 dlg

dont try to be smart about avoiding the use of too many descriptors
when filling the rx ring. trust the hwm.

problem found by sthen@


# 1.104 08-Jul-2014 dlg

cut things that relied on mclgeti for rx ring accounting/restriction over
to using if_rxr.

cut the reporting systat did over to the rxr ioctl.

tested as much as i can on alpha, amd64, and sparc64.
mpi@ has run it on macppc.
ok mpi@


Revision tags: OPENBSD_5_5_BASE
# 1.103 30-Oct-2013 dlg

replace the workq bits to supply new tx pkt descriptors with a task.

tested locally on a dell poweredge 2950


# 1.102 23-Oct-2013 brad

Enable TX checksum offload.

ok naddy@ sthen@


Revision tags: OPENBSD_5_4_BASE
# 1.101 28-Mar-2013 brad

Let mii_attach() know where the PHY is located instead of scanning
for it since we know where it will be anyway and remove the code
from the MII bus read/write functions to force reading/writing
from the predetermined location. Copied from bge(4) and this is
what the upstream FreeBSD bce(4) driver has done once FreBSD
gained a mii_attach().

ok dlg@ sthen@


Revision tags: OPENBSD_5_3_BASE
# 1.100 13-Jan-2013 brad

Enable flow control support with 5708S/5709S adapters.

ok dlg@


# 1.99 10-Dec-2012 mikeb

Under some circumstances (currently only reproducible with IPsec)
bnx can be left w/o clusters on the receive ring and will stall.
To prevent that schedule a timeout if refill fails. Bug was
reported by jj@, fix tested by me, ok dlg


# 1.98 05-Dec-2012 deraadt

Remove excessive sys/cdefs.h inclusion
ok guenther millert kettenis


Revision tags: OPENBSD_5_2_BASE
# 1.97 05-Jul-2012 phessler

Add flow control to bnx(4)

Tested on 5706, 5708, 5709, 5716 chipsets.

From Brad

OK phessler@, sthen@, mikeb@,


# 1.96 14-May-2012 mikeb

fixup "couldn't establish interrupt" error printf; from brad, ok phessler


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE
# 1.95 22-Jun-2011 tedu

kill a few more casts that aren't helpful. ok krw miod


# 1.94 18-Apr-2011 dlg

ido not disable interrupts in the isr and then enable them again
when leaving. when you're handling an interrupt it is masked.
whacking the chip is work for no gain.

modify the interrupt handler so it only processes the rings once
rather than looping over them until it runs out of work to do

looping in the isr is bad for several reasons:

firstly, the chip does interrupt mitigation so you have a
decent/predictable amount of work to do in the isr. your first loop
will do that chunk of work (ie, it pulls off 50ish packets), and
then the successive looping aggressively pull one or two packets
off the rx ring. these extra loops work against the benefit that
interrupt mitigation provides.

bus space reads are slow. we should avoid doing them where possible
(but we should always do them when necessary).

doing the loop 5 times per isr works against the mclgeti semantics.
it knows a nic is busy and therefore needs more rx descriptors by
watching to see when the nic uses all of its descriptors between
interrupts. if we're aggressively pulling packets off by looping
in the isr then we're skewing this check.

ok deraadt@


# 1.93 13-Apr-2011 dlg

to quote from the gospel of bus_dma.9:

Synchronization operations are expressed from the perspective of the host
RAM, e.g., a device -> memory operation is a READ and a memory -> device
operation is a WRITE.

the status block that the isr reads is written to by the device.
the chip writes to memory, it is therefore a READ.

this also adds the preread sync when the map is set up and the postread
sync when the map is torn down for better symmetry. there are probably
more issues like this in the code, but this is a start.

discovered while discussing another diff with claudio@


# 1.92 05-Apr-2011 henning

mechanic rename M_{TCP|UDP}V4_CSUM_OUT -> M_{TCP|UDP}_CSUM_OUT
ok claudio krw


# 1.91 03-Apr-2011 jasper

use nitems(); no binary change for drivers that are compiled on amd64.

ok claudio@


Revision tags: OPENBSD_4_9_BASE
# 1.90 20-Sep-2010 deraadt

Stop doing shutdown hooks in network drivers where possible. We already
take all interfaces down, via their xxstop routines. Claudio and I have
verified that none of the shutdown hooks do much extra beyond what xxstop
was already doing; it is largely a pile of junk.
ok claudio, some early comments by sthen; also read by matthew, jsg


Revision tags: OPENBSD_4_8_BASE
# 1.89 03-Aug-2010 jsg

Correct use of logical and where binary and was intended.
Spotted by lint, but mirrors a similiar change in the
original FreeBSD code from over a year ago.

ok deraadt@


# 1.88 24-May-2010 sthen

Support fibre PHY on BCM5709S. From FreeBSD via Brad.
Tested by Brad on: BCM5706, BCM5708C
Tested by me on: BCM5716 (BCM5709 PHY)


# 1.87 19-May-2010 oga

BUS_DMA_ZERO instead of alloc, map, bzero.

ok krw@


Revision tags: OPENBSD_4_7_BASE
# 1.86 23-Nov-2009 claudio

bnx(4) is a bit special. The chip itself is capable of swapping endianess
so there is no need for htoleXX calls. The only thing needed is the correct
layout of the DMA-ed structures. Additionally it uses PAGE_SIZE but assumed
that it is always 4k. Fix the macros that failed to respect that so that it
works on 8k PAGE_SIZE systems. This makes bnx(4) work on sparc64.
Tested on amd64 by dlg@. OK dlg@, deraadt@


# 1.85 09-Nov-2009 dlg

Link state change interrupt was not generated due to a missing bit in
the MAC event register.

fix from atte dot peltomaki at iki dot fi
tested by me on 5708 and 5709


# 1.84 13-Aug-2009 jasper

- consistify cfdriver for the ethernet drivers (0 -> NULL)

ok dlg@


# 1.83 09-Aug-2009 deraadt

MCLGETI() will now allocate a mbuf header if it is not provided, thus
reducing the amount of splnet/splx dancing required.. especially in the
worst case (of m_cldrop)
ok dlg kettenis damien


# 1.82 06-Aug-2009 sthen

Add device id for BCM5716S, tidy whitespace. From Brad.


Revision tags: OPENBSD_4_6_BASE
# 1.81 03-Jul-2009 dlg

this is a rather large change to add support for the BCM5709.

the 5709s use a the b09 firmwares, which is different to the b06 used by
all the other chips supported by bnx. the majority of the diff comes from
special handling for some indirect reads and writes, and because it needs
more host memory to operate with.

ive tried to keep the cosmetic changes to a minimum.

"go for it" deraadt@


# 1.80 03-Jul-2009 dlg

newer bnx chips use a separate firmware to the "old" ones. this updates
the b06 firmware for the older chips, and adds the b09 firmware. there are
three variants of the rv2p code thats loaded onto the chips, so this has
been split out into separate firmware files as well.

the driver has been updated to handle the split firmwares, and to easily
allow loading of the different versions. this change only supports the
loading of the firmwares for the currently supported chips.

after this change you must build the new firmwares and install them as well
as your new kernel.

"go to it" deraadt@


# 1.79 20-Jun-2009 naddy

Rewrite the interface flag handling case code and update the receive
filter handling to take advantage of ac_multirangecnt and have correct
IFF_ALLMULTI handling. From Brad.


# 1.78 22-Apr-2009 dlg

dont need to zero the tx pkt pool structure before initting it now that
pool_init does its job properly.


# 1.77 22-Apr-2009 dlg

replace arrays of dmamaps and mbuf pointers used to manage packets
on the tx rings (one mbuf ptr/dmamap array entry was created for
every tx descriptor slot at attach time) with a dynamically grown
list of mbuf pointers and dmamaps.

bnx used to have 512 dmamaps/mbuf pointers for the tx ring, now my
system is running with 8 under moderate load.

the big bonus from this is that the dmamap handling is greatly
simplified.

reyk@ likes this a lot


# 1.76 20-Apr-2009 dlg

when transmitting packets, put the dmamap we used for the packet into the
last descriptor slot in the ring. the tx completion code expects the dmamap
to be there so it can unload it.

ok reyk@


# 1.75 20-Apr-2009 reyk

fix dma map unmapping and unloading in the tx cleanup path.

ok dlg@


# 1.74 14-Apr-2009 kettenis

Don't free an mbuf that's still on the TX queue. While there sanitize the
function signature of bnx_tx_encap() such that people don't get weird ideas
like this again.

ok dlg@


# 1.73 09-Apr-2009 dlg

white space fixes


# 1.72 30-Mar-2009 dlg

switch to MCLGETI.

this conversion is the easiest ive done so far. the mbuf allocation wrapper
in the driver already had code to handle a failing cluster allocator as
part of a test harness, now we test that code all the time with MCLGETI.

ok kettenis@
tested by phessler@


Revision tags: OPENBSD_4_5_BASE
# 1.71 28-Nov-2008 brad

Eliminate the redundant bits of code for MTU and multicast handling
from the individual drivers now that ether_ioctl() handles this.

Shrinks the i386 kernels by..
RAMDISK - 2176 bytes
RAMDISKB - 1504 bytes
RAMDISKC - 736 bytes

Tested by naddy@/okan@/sthen@/brad@/todd@/jmc@ and lots of users.
Build tested on almost all archs by todd@/brad@

ok naddy@


# 1.70 09-Nov-2008 naddy

Introduce bpf_mtap_ether(), which for the benefit of bpf listeners
creates the VLAN encapsulation from the tag stored in the mbuf
header. Idea from FreeBSD, input from claudio@ and canacar@.

Switch all hardware VLAN enabled drivers to the new function.

ok claudio@


# 1.69 19-Oct-2008 brad

Re-add support for RX VLAN tag stripping.


# 1.68 16-Oct-2008 naddy

Switch the existing TX VLAN hardware support over to having the
tag in the header. Convert TX tagging in the drivers.

Help and ok brad@


# 1.67 16-Oct-2008 naddy

Convert RX tag stripping to storing the tag in the mbuf header and
enable RX tag stripping for re(4).

ok brad@


# 1.66 02-Oct-2008 brad

First step towards cleaning up the Ethernet driver ioctl handling.
Move calling ether_ioctl() from the top of the ioctl function, which
at the moment does absolutely nothing, to the default switch case.
Thus allowing drivers to define their own ioctl handlers and then
falling back on ether_ioctl(). The only functional change this results
in at the moment is having all Ethernet drivers returning the proper
errno of ENOTTY instead of EINVAL/ENXIO when encountering unknown
ioctl's.

Shrinks the i386 kernels by..
RAMDISK - 1024 bytes
RAMDISKB - 1120 bytes
RAMDISKC - 832 bytes

Tested by martin@/jsing@/todd@/brad@
Build tested on almost all archs by todd@/brad@

ok jsing@


# 1.65 10-Sep-2008 blambert

Convert timeout_add() calls using multiples of hz to timeout_add_sec()

Really just the low-hanging fruit of (hopefully) forthcoming timeout
conversions.

ok art@, krw@


Revision tags: OPENBSD_4_4_BASE
# 1.64 24-Jun-2008 brad

Fixed a problem that would cause errors (especially when in low memory
systems) because the RX chain was corrupted when an mbuf was mapped to
an unexpected number of buffers.

From davidch@FreeBSD


# 1.63 13-Jun-2008 brad

fix compilation with BNX_DEBUG.


# 1.62 13-Jun-2008 brad

Remove slack space for RX/TX chains since it only covers sloppy coding.

From davidch @ FreeBSD


# 1.61 08-Jun-2008 reyk

don't declare foo_driver_version[] strings and turn them into defines,
nothing uses them and it saves a few bytes in the kernel.

ok claudio@


# 1.60 29-May-2008 brad

- Add a debug message to mention when a 2.5Gb adapter is found.
- Change invalid PHY address debug message in bnx_miibus_write_reg()
from warn level to verbose.
- Add two new softc fields and store the shared and port hw config
data.

From FreeBSD

ok dlg@


# 1.59 23-May-2008 brad

Simplify the combination use of pci_mapreg_type()/pci_mapreg_map() as
suggested by dlg@ awhile ago.

ok dlg@


Revision tags: OPENBSD_4_3_BASE
# 1.58 28-Feb-2008 brad

Add initial bits for fiber support with the BCM5706/BCM5708 chipsets.

Tested with copper adapters by brad@, johan@ and Jung <moorang at gmail dot com>

ok dlg@


# 1.57 22-Feb-2008 kettenis

Avoid unaligned PCI config space access.

ok brad@


# 1.56 17-Feb-2008 brad

Remove the check for non-production bnx(4) chipsets. These chipsets are
not officially "supported" and could have errata which the driver does
not workaround but they should more or less work.

Tested by marco@ with a BCM5708 B0 chipset.

ok marco@ dlg@


# 1.55 25-Nov-2007 dlg

IF_Gbps(2.5) is wrong.

ok claudio@


# 1.54 28-Aug-2007 deraadt

unify firmware load failure messages; ok mglocker


Revision tags: OPENBSD_4_2_BASE
# 1.53 04-Jul-2007 krw

Revert r1.42 of if_bnx.c, "Enable IPv4 transmit TCP/UDP checksum
offload", and associated man page change. To use IPv4 transmit TCP/UDP
checksum offloading you must again define BNX_CSUM.

As requested by mbalmer@ via deraadt@ on suggestion of reyk@ in
response to PR #5437.


# 1.52 22-May-2007 reyk

Add the BCM5709 PCI device Id. It is disabled for now since we do not
support SerDes-based (1000base-SX fibre) bnx(4) devices yet. The
reason is simple - we do not have any fibre bnx(4) to test and port
the SerDes changes from the other bnx drivers.

From brad found in the Linux driver


# 1.51 22-May-2007 jasper

adress -> address

from brad
ok claudio@


# 1.50 22-May-2007 ray

Use BNX_PRINTF instead of printf with missing argument.

OK reyk@, earlier version OK tedu@, dlg@, and miod@.


# 1.49 21-May-2007 reyk

fix bnx vlan tagging in the rx path; do not attach the vlan tag twice
if the firmware has been told to keep it and copy the tag in network
byte order in the other case.

ok mcbride@ dlg@


Revision tags: OPENBSD_4_1_BASE
# 1.48 05-Mar-2007 reyk

remove jumbo frame support by replacing MEXTALLOC with MCLGET, and
simplify the VLAN code.

this will close PR 5356 (system panics under high load).

From claudio@ who is currently not around to commit this fix

tested and ok by mcbride@, reyk@, todd@, Paul Hirsch, and brad


# 1.47 03-Mar-2007 reyk

instead of establishing the interrupt in the mounthook, move it back
to the attach function and set a flag in the mounthook to start
accepting interrupts (there are possible problems with establishing
interrupts after the ioapics are enabled in i386 GENERIC.MP).

also suggested by kettenis
tested by mcbride, me, and some others
ok dlg@


# 1.46 03-Mar-2007 todd

Replacing some spaces with tabs and some typo fixes
from brad@


# 1.45 02-Mar-2007 reyk

oops, this is $OpenBSD$


# 1.44 02-Mar-2007 reyk

- remove the code to bring down the PHY in bnx_stop(), it's wrong
(ifm_data isn't updated) and lead to a panic in mii_phy_setmedia(),
or reading past the end mii_media_table[].
- make sure the dma_map matches the mbuf in the rx structures. We would
sync/unload the wrong map, leading to a DIAGNOSTIC panic, or eventually
leaking memory when bounce buffers are needed.

From NetBSD

ok marco@, brad@


# 1.43 30-Jan-2007 krw

Allow the bnx(4) driver to make use of all of the available hardware
multicast hash slots. The bnx(4) hardware supports 8 slots instead of
4 like the bge(4) hardware.

From Mike Karels via FreeBSD

Tested by Brad, biorn@ and Johan M:son Lindman


# 1.42 27-Jan-2007 krw

Enable transmit TCP/UDP checksum offload.

From Brad, tested by Brad, biorn@ and Johan M:son Lindman.


# 1.41 21-Jan-2007 mcbride

Remove bogus check for old firmware.

Identical fixes from myself and brad@, also reported by chefren@pi.net.


# 1.40 20-Jan-2007 dlg

move the interrupt establishment till after everything in the softc is
set up and allocated (which happens in a mountroothook). this prevents an
early call to the interrupt handler from causing a null deref when trying
to look into the unallocated regions.

found by mcbride when ciss and bnx were sharing an interrupt. mounting
root caused interrupts before the bnx was properly set up.

"commit your fix" mcbride@


# 1.39 19-Jan-2007 mcbride

bnx_init() takes a pointer to sc, not ifp.


# 1.38 10-Jan-2007 deraadt

change firmware byte order to be same on all architectures
THIS MEANS YOU NEED TO UPDATE YOUR FIRMWARE FILE BEFORE BOOTING WITH
A NEW KERNEL
tested by marco, biorn


# 1.37 24-Dec-2006 reyk

use the right size when loading the rx/tx descriptor bus dma maps.

from the NetBSD port

tested by bion@ and others from tech@
ok marco@ brad@


# 1.36 26-Nov-2006 brad

commented out entry for the BCM5709.


# 1.35 20-Nov-2006 brad

only try to do HW checksum offload for TCP and UDP.


# 1.34 20-Nov-2006 brad

Due to an incorrect macro, it appears that the driver has always been
accidentally truncating off the VLAN tag field in the TX descriptor. Fix
this by splitting up the vlan_tag and flags fields into separate fields,
and handling them appropriately.

From scottl@FreeBSD


# 1.33 19-Nov-2006 brad

In bnx_start, check the used_tx_bd count rather than the descriptors
mbuf pointer to see if the transmit ring is full. The mbuf pointer
is set only in the last descriptor of a multi-descriptor packet.
By relying on the mbuf pointers of the earlier descriptors, the
driver would sometimes overwrite a descriptor belonging to a
packet that wasn't completed yet. Also, tx_chain_prod wasn't
updated inside the loop, causing the wrong descriptor to be checked
after the first iteration. The upshot of all this was the loss of
some transmitted packets at medium to high packet rates.

In bnx_tx_encap, remove a couple of old statements that shuffled
around the tx_mbuf_map pointers. These now correspond 1-to-1 with
the transmit descriptors, and they are not supposed to be changed.

Correct a couple of inaccurate comments.

From jdp@FreeBSD


# 1.32 26-Oct-2006 brad

do the minimal initialization of the firmware so that ASF always
works.

From ambrisko@FreeBSD


# 1.31 25-Oct-2006 brad

replace a few more instances of hand rolled code with the
LIST_FOREACH macro.


# 1.30 22-Oct-2006 brad

now with the right revision of this diff which compiles. ok pedro, mglocker.

- Ensure that at least 16 TX descriptors are kept unused in the ring.
- Use more complete error handling for TX load problems.

From scottl@FreeBSD


# 1.29 21-Oct-2006 deraadt

does not compile


# 1.28 21-Oct-2006 brad

- Ensure that at least 16 TX descriptors are kept unused in the ring.
- Use more complete error handling for TX load problems.

From scottl@FreeBSD


# 1.27 19-Oct-2006 brad

make the exit label naming scheme match the current function names, removes
a FreeBSD-ism from the original driver.


# 1.26 19-Oct-2006 brad

Overhaul the transmit path:
- Eliminate the bnx_dmamap_arg structure.
- Refactor the loop that fills the buffer descriptor so that it can be done
with a single set of logic in a single loop instead of two sets of logic.
- Eliminate the need to cache and pass descriptor indexes between the start
loop and the encap function.
- Change the start loop to always check the ifnet sendq for more work.

From scottl@FreeBSD


# 1.25 14-Oct-2006 brad

- Simplify the arguments to bnx_tx_encap.
- Don't copy the bd_chain head pointers into temporary objects, they are
available globally.

From scottl@FreeBSD


# 1.24 04-Oct-2006 deraadt

Use loadfirmware(9) to get /etc/firmware/bnx instead of hard-coding a
gigantic firmware into the kernel; checked by brad


Revision tags: OPENBSD_4_0_BASE
# 1.23 25-Aug-2006 brad

don't need to clear if_timer during attach.


# 1.22 21-Aug-2006 deraadt

ramdisks do not have vlan, drop mbuf; ok brad


# 1.21 21-Aug-2006 brad

simplfy code a bit and fix comments, this is the MRU being set not the
MTU.


# 1.20 21-Aug-2006 brad

enable Jumbo support.


# 1.19 20-Aug-2006 brad

remove a comment.


# 1.18 20-Aug-2006 brad

cosmetic tweaks.


# 1.17 20-Aug-2006 brad

- replace IF_DEQUEUE/IF_PREPEND with IFQ_POLL/IFQ_DEQUEUE.
- enable RX checksum offload.
- remove some unused code.


# 1.16 19-Aug-2006 brad

set the capabilities VLAN MTU flag.


# 1.15 14-Aug-2006 marco

And some more KNF.


# 1.14 14-Aug-2006 marco

KNF


# 1.13 14-Aug-2006 marco

More KNF; no functional change.


# 1.12 14-Aug-2006 marco

First in a series of KNF. No functional change.


# 1.11 14-Aug-2006 brad

disable debugging.


# 1.10 14-Aug-2006 marco

Change bus_dmamap_create to use the appropriate values. This fixes the
issues brad was seeing. Help from jason.

ok brad.


# 1.9 13-Aug-2006 marco

Get rid of _HI & _LO macros altogether since they used a wrong idiom.
This was pointed out by mickey The driver now uses the same idiom as mpi.

help from miod, mickey and kettenis

ok brad


# 1.8 13-Aug-2006 brad

fix a typo, BNX_DRBUG -> BNX_DEBUG


# 1.7 10-Aug-2006 brad

unmap memory address space in bnx_release_resources().


# 1.6 10-Aug-2006 brad

cosmetic tweaking.


# 1.5 10-Aug-2006 brad

remove typedef's.


# 1.4 09-Aug-2006 marco

Reorder dmamap & dmamem to match man page.
Redo detection of _LO & _HI macro; help from miod and jordan.
ok beck brad


# 1.3 26-Jun-2006 brad

do not allow a Jumbo size MTU yet.


# 1.2 26-Jun-2006 brad

relocate the firmware per Theo's request.


# 1.1 26-Jun-2006 brad

Add a rough initial port of the bce driver from FreeBSD, which provides
support for the new line of Broadcom NetXtreme II Gigabit PCI-X and PCIe
controllers, though renamed to bnx. This is work in progress, there
are some known issues. With help from Reyk with the bus_dma code.

Thanks to David Christensen at Broadcom for the driver and for providing
some PCI-X and PCIe adapters.

ok deraadt@


# 1.130 12-Dec-2020 jan

Rename the macro MCLGETI to MCLGETL and removes the dead parameter ifp.

OK dlg@, bluhm@
No Opinion mpi@
Not against it claudio@


Revision tags: OPENBSD_6_8_BASE
# 1.129 10-Jul-2020 patrick

Change users of IFQ_SET_MAXLEN() and IFQ_IS_EMPTY() to use the "new" API.

ok dlg@ tobhe@


# 1.128 22-Jun-2020 dlg

use ifiq_input and use it's return value to apply backpressure to rxrs.

this is a step toward deprecating softclock based livelock detection.


# 1.127 17-May-2020 jsg

fix typo in a comment

from Delyan Raychev


Revision tags: OPENBSD_6_7_BASE
# 1.126 06-Dec-2019 dlg

enable the full use of jumbos and remove IFCAP_VLAN_MTU.

The chip can do 9008 byte packets (not including the ethernet
header), but only 9004 if you want to enable IFCAP_VLAN_MTU. by not
enabling IFCAP_VLAN_MTU, we let other protocols (eg, mpls or svlan)
use the extra bytes if they want.

using the extra bytes for the hardmtu instead of for IFCAP_VLAN_MTU
works a bit better with how aggr(4) is set up at the moment because
aggr does not pass IFCAP_VLAN_MTU through from its ports, which
means vlan(4) on aggr(4) cannot see the flag and use the extra
bytes.

this was figured out by hrvoje popovski in a discussion with pedro
caetano on the "issues configuring vlan on top of aggr device" on
misc@.
hrvoje also tested the diff and made sure the full use of jumbos
works for things like ping packets with DF set.
jmatthew skimmed the diff and didnt see anything obviously wrong too


Revision tags: OPENBSD_6_3_BASE OPENBSD_6_4_BASE OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.125 10-Mar-2018 sthen

raise bnx(4)'s rxring lwm to 16, ok deraadt

(I've had this diff locally for a long time on port build machines to
avoid NFS stalls.)


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE
# 1.124 24-Jan-2017 dlg

add support for multiple transmit ifqueues per network interface.

an ifq to transmit a packet is picked by the current traffic
conditioner (ie, priq or hfsc) by providing an index into an array
of ifqs. by default interfaces get a single ifq but can ask for
more using if_attach_queues().

the vast majority of our drivers still think there's a 1:1 mapping
between interfaces and transmit queues, so their if_start routines
take an ifnet pointer instead of a pointer to the ifqueue struct.
instead of changing all the drivers in the tree, drivers can opt
into using an if_qstart routine and setting the IFXF_MPSAFE flag.
the stack provides a compatability wrapper from the new if_qstart
handler to the previous if_start handlers if IFXF_MPSAFE isnt set.

enabling hfsc on an interface configures it to transmit everything
through the first ifq. any other ifqs are left configured as priq,
but unused, when hfsc is enabled.

getting this in now so everyone can kick the tyres.

ok mpi@ visa@ (who provided some tweaks for cnmac).


# 1.123 22-Jan-2017 dlg

move counting if_opackets next to counting if_obytes in if_enqueue.

this means packets are consistently counted in one place, unlike the
many and various ways that drivers thought they should do it.

ok mpi@ deraadt@


Revision tags: OPENBSD_6_0_BASE
# 1.122 05-May-2016 jmatthew

r1.10 of if_bnx.c effectively removed the limit on the number of segments in
the tx dma maps, apparently to allow heavily fragmented packets to be sent.

The tx ring accounting in bnx_start assumed that the longest fragment chain
we'd see was BNX_MAX_SEGMENTS, so sending a heavily fragmented packet when the
ring was already full could cause it to overflow.

In the 10 years since r1.10, we've started defragmenting packets if they
won't fit in the dma map, so we can limit the maps to BNX_MAX_SEGMENTS again.
While we're here, ensure there's always at least one slot on the tx ring free,
for consistency between drivers.

Fixes packet corruption seen by otto@
ok mpi@ dlg@


# 1.121 13-Apr-2016 mpi

G/C IFQ_SET_READY().


Revision tags: OPENBSD_5_9_BASE
# 1.120 11-Dec-2015 mpi

branches: 1.120.2;
Replace mountroothook_establish(9) by config_mountroot(9) a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


# 1.119 10-Dec-2015 dlg

mark bnx_start as mpsafe.

tweak it to use ifq_restart so ifq_clr_oactive is serialised with start.

ok jmatthew@


# 1.118 05-Dec-2015 jmatthew

Make the bnx interrupt handler mpsafe, and perform rx and tx completion
outside the kernel lock.

Remove tx descriptor lists (essentially backing out if_bnx.c r1.77),
add an interrupt barrier in bnx_stop, check the rx ring state before receiving
packets, adjust the tx counter with atomic operations, and rework bnx_start
to check for ring space before dequeueing and drop the packet if bnx_encap
fails.

tested on BCM5708 by me and on BCM5709 by Hrvoje Popovski
ok dlg@


# 1.117 25-Nov-2015 dlg

replace IFF_OACTIVE manipulation with mpsafe operations.

there are two things shared between the network stack and drivers
in the send path: the send queue and the IFF_OACTIVE flag. the send
queue is now protected by a mutex. this diff makes the oactive
functionality mpsafe too.

IFF_OACTIVE is part of if_flags. there are two problems with that.
firstly, if_flags is a short and we dont have any MI atomic operations
to manipulate a short. secondly, while we could make the IFF_OACTIVE
operates mpsafe, all changes to other flags would have to be made
safe at the same time, otherwise a read-modify-write cycle on their
updates could clobber the oactive change.

instead, this moves the oactive mark into struct ifqueue and provides
an API for changing it. there's ifq_set_oactive, ifq_clr_oactive,
and ifq_is_oactive. these are modelled on ifsq_set_oactive,
ifsq_clr_oactive, and ifsq_is_oactive in dragonflybsd.

this diff includes changes to all the drivers manipulating IFF_OACTIVE
to now use the ifsq_{set,clr_is}_oactive API too.

ok kettenis@ mpi@ jmatthew@ deraadt@


# 1.116 20-Nov-2015 dlg

shuffle struct ifqueue so in flight mbufs are protected by a mutex.

the code is refactored so the IFQ macros call newly implemented ifq
functions. the ifq code is split so each discipline (priq and hfsc
in our case) is an opaque set of operations that the common ifq
code can call. the common code does the locking, accounting (ifq_len
manipulation), and freeing of the mbuf if the disciplines enqueue
function rejects it. theyre kind of like bufqs in the block layer
with their fifo and nscan disciplines.

the new api also supports atomic switching of disciplines at runtime.
the hfsc setup in pf_ioctl.c has been tweaked to build a complete
hfsc_if structure which it attaches to the send queue in a single
operation, rather than attaching to the interface up front and
building up a list of queues.

the send queue is now mutexed, which raises the expectation that
packets can be enqueued or purged on one cpu while another cpu is
dequeueing them in a driver for transmission. a lot of drivers use
IFQ_POLL to peek at an mbuf and attempt to fit it on the ring before
committing to it with a later IFQ_DEQUEUE operation. if the mbuf
gets freed in between the POLL and DEQUEUE operations, fireworks
will ensue.

to avoid this, the ifq api introduces ifq_deq_begin, ifq_deq_rollback,
and ifq_deq_commit. ifq_deq_begin allows a driver to take the ifq
mutex and get a reference to the mbuf they wish to try and tx. if
there's space, they can ifq_deq_commit it to remove the mbuf and
release the mutex. if there's no space, ifq_deq_rollback simply
releases the mutex. this api was developed to make updating the
drivers using IFQ_POLL easy, instead of having to do significant
semantic changes to avoid POLL that we cannot test on all the
hardware.

the common code has been tested pretty hard, and all the driver
modifications are straightforward except for de(4). if that breaks
it can be dealt with later.

ok mpi@ jmatthew@


# 1.115 25-Oct-2015 mpi

arp_ifinit() is no longer needed.


# 1.114 10-Sep-2015 deraadt

sizes for free(); ok sthen


# 1.113 04-Sep-2015 kettenis

The bnx_tx_pool gets used from interrupt context, so drop the explicit
backend allocoter here without passing PR_WAITOK to pool_init(9).

ok mikeb@


Revision tags: OPENBSD_5_8_BASE
# 1.112 24-Jul-2015 dlg

if we free the mbuf in the rx path, clear the pointer to it so we dont
try and queue it for the stack and cause a use after free.

found by maxime villard and brainy


# 1.111 24-Jun-2015 mpi

Increment if_ipackets in if_input().

Note that pseudo-drivers not using if_input() are not affected by this
conversion.

ok mikeb@, kettenis@, claudio@, dlg@


# 1.110 10-Mar-2015 mpi

Convert to if_input().

Tested and ok sthen@, ok dlg@


Revision tags: OPENBSD_5_7_BASE
# 1.109 27-Jan-2015 dlg

remove the second void * argument on tasks.

when workqs were introduced, we provided a second argument so you
could pass a thing and some context to work on it in. there were
very few things that took advantage of the second argument, so when
i introduced pools i suggested removing it. since tasks were meant
to replace workqs, it was requested that we keep the second argument
to make porting from workqs to tasks easier.

now that workqs are gone, i had a look at the use of the second
argument again and found only one good use of it (vdsp(4) on sparc64
if you're interested) and a tiny handful of questionable uses. the
vast majority of tasks only used a single argument. i have since
modified all tasks that used two args to only use one, so now we
can remove the second argument.

so this is a mechanical change. all tasks only passed NULL as their
second argument, so we can just remove it.

ok krw@


# 1.108 22-Dec-2014 tedu

unifdef INET


Revision tags: OPENBSD_5_6_BASE
# 1.107 18-Jul-2014 dlg

implement EFBIG handling for heavily fragmented packets on the tx path.

ok claudio@


# 1.106 12-Jul-2014 tedu

add a size argument to free. will be used soon, but for now default to 0.
after discussions with beck deraadt kettenis.


# 1.105 09-Jul-2014 dlg

dont try to be smart about avoiding the use of too many descriptors
when filling the rx ring. trust the hwm.

problem found by sthen@


# 1.104 08-Jul-2014 dlg

cut things that relied on mclgeti for rx ring accounting/restriction over
to using if_rxr.

cut the reporting systat did over to the rxr ioctl.

tested as much as i can on alpha, amd64, and sparc64.
mpi@ has run it on macppc.
ok mpi@


Revision tags: OPENBSD_5_5_BASE
# 1.103 30-Oct-2013 dlg

replace the workq bits to supply new tx pkt descriptors with a task.

tested locally on a dell poweredge 2950


# 1.102 23-Oct-2013 brad

Enable TX checksum offload.

ok naddy@ sthen@


Revision tags: OPENBSD_5_4_BASE
# 1.101 28-Mar-2013 brad

Let mii_attach() know where the PHY is located instead of scanning
for it since we know where it will be anyway and remove the code
from the MII bus read/write functions to force reading/writing
from the predetermined location. Copied from bge(4) and this is
what the upstream FreeBSD bce(4) driver has done once FreBSD
gained a mii_attach().

ok dlg@ sthen@


Revision tags: OPENBSD_5_3_BASE
# 1.100 13-Jan-2013 brad

Enable flow control support with 5708S/5709S adapters.

ok dlg@


# 1.99 10-Dec-2012 mikeb

Under some circumstances (currently only reproducible with IPsec)
bnx can be left w/o clusters on the receive ring and will stall.
To prevent that schedule a timeout if refill fails. Bug was
reported by jj@, fix tested by me, ok dlg


# 1.98 05-Dec-2012 deraadt

Remove excessive sys/cdefs.h inclusion
ok guenther millert kettenis


Revision tags: OPENBSD_5_2_BASE
# 1.97 05-Jul-2012 phessler

Add flow control to bnx(4)

Tested on 5706, 5708, 5709, 5716 chipsets.

From Brad

OK phessler@, sthen@, mikeb@,


# 1.96 14-May-2012 mikeb

fixup "couldn't establish interrupt" error printf; from brad, ok phessler


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE
# 1.95 22-Jun-2011 tedu

kill a few more casts that aren't helpful. ok krw miod


# 1.94 18-Apr-2011 dlg

ido not disable interrupts in the isr and then enable them again
when leaving. when you're handling an interrupt it is masked.
whacking the chip is work for no gain.

modify the interrupt handler so it only processes the rings once
rather than looping over them until it runs out of work to do

looping in the isr is bad for several reasons:

firstly, the chip does interrupt mitigation so you have a
decent/predictable amount of work to do in the isr. your first loop
will do that chunk of work (ie, it pulls off 50ish packets), and
then the successive looping aggressively pull one or two packets
off the rx ring. these extra loops work against the benefit that
interrupt mitigation provides.

bus space reads are slow. we should avoid doing them where possible
(but we should always do them when necessary).

doing the loop 5 times per isr works against the mclgeti semantics.
it knows a nic is busy and therefore needs more rx descriptors by
watching to see when the nic uses all of its descriptors between
interrupts. if we're aggressively pulling packets off by looping
in the isr then we're skewing this check.

ok deraadt@


# 1.93 13-Apr-2011 dlg

to quote from the gospel of bus_dma.9:

Synchronization operations are expressed from the perspective of the host
RAM, e.g., a device -> memory operation is a READ and a memory -> device
operation is a WRITE.

the status block that the isr reads is written to by the device.
the chip writes to memory, it is therefore a READ.

this also adds the preread sync when the map is set up and the postread
sync when the map is torn down for better symmetry. there are probably
more issues like this in the code, but this is a start.

discovered while discussing another diff with claudio@


# 1.92 05-Apr-2011 henning

mechanic rename M_{TCP|UDP}V4_CSUM_OUT -> M_{TCP|UDP}_CSUM_OUT
ok claudio krw


# 1.91 03-Apr-2011 jasper

use nitems(); no binary change for drivers that are compiled on amd64.

ok claudio@


Revision tags: OPENBSD_4_9_BASE
# 1.90 20-Sep-2010 deraadt

Stop doing shutdown hooks in network drivers where possible. We already
take all interfaces down, via their xxstop routines. Claudio and I have
verified that none of the shutdown hooks do much extra beyond what xxstop
was already doing; it is largely a pile of junk.
ok claudio, some early comments by sthen; also read by matthew, jsg


Revision tags: OPENBSD_4_8_BASE
# 1.89 03-Aug-2010 jsg

Correct use of logical and where binary and was intended.
Spotted by lint, but mirrors a similiar change in the
original FreeBSD code from over a year ago.

ok deraadt@


# 1.88 24-May-2010 sthen

Support fibre PHY on BCM5709S. From FreeBSD via Brad.
Tested by Brad on: BCM5706, BCM5708C
Tested by me on: BCM5716 (BCM5709 PHY)


# 1.87 19-May-2010 oga

BUS_DMA_ZERO instead of alloc, map, bzero.

ok krw@


Revision tags: OPENBSD_4_7_BASE
# 1.86 23-Nov-2009 claudio

bnx(4) is a bit special. The chip itself is capable of swapping endianess
so there is no need for htoleXX calls. The only thing needed is the correct
layout of the DMA-ed structures. Additionally it uses PAGE_SIZE but assumed
that it is always 4k. Fix the macros that failed to respect that so that it
works on 8k PAGE_SIZE systems. This makes bnx(4) work on sparc64.
Tested on amd64 by dlg@. OK dlg@, deraadt@


# 1.85 09-Nov-2009 dlg

Link state change interrupt was not generated due to a missing bit in
the MAC event register.

fix from atte dot peltomaki at iki dot fi
tested by me on 5708 and 5709


# 1.84 13-Aug-2009 jasper

- consistify cfdriver for the ethernet drivers (0 -> NULL)

ok dlg@


# 1.83 09-Aug-2009 deraadt

MCLGETI() will now allocate a mbuf header if it is not provided, thus
reducing the amount of splnet/splx dancing required.. especially in the
worst case (of m_cldrop)
ok dlg kettenis damien


# 1.82 06-Aug-2009 sthen

Add device id for BCM5716S, tidy whitespace. From Brad.


Revision tags: OPENBSD_4_6_BASE
# 1.81 03-Jul-2009 dlg

this is a rather large change to add support for the BCM5709.

the 5709s use a the b09 firmwares, which is different to the b06 used by
all the other chips supported by bnx. the majority of the diff comes from
special handling for some indirect reads and writes, and because it needs
more host memory to operate with.

ive tried to keep the cosmetic changes to a minimum.

"go for it" deraadt@


# 1.80 03-Jul-2009 dlg

newer bnx chips use a separate firmware to the "old" ones. this updates
the b06 firmware for the older chips, and adds the b09 firmware. there are
three variants of the rv2p code thats loaded onto the chips, so this has
been split out into separate firmware files as well.

the driver has been updated to handle the split firmwares, and to easily
allow loading of the different versions. this change only supports the
loading of the firmwares for the currently supported chips.

after this change you must build the new firmwares and install them as well
as your new kernel.

"go to it" deraadt@


# 1.79 20-Jun-2009 naddy

Rewrite the interface flag handling case code and update the receive
filter handling to take advantage of ac_multirangecnt and have correct
IFF_ALLMULTI handling. From Brad.


# 1.78 22-Apr-2009 dlg

dont need to zero the tx pkt pool structure before initting it now that
pool_init does its job properly.


# 1.77 22-Apr-2009 dlg

replace arrays of dmamaps and mbuf pointers used to manage packets
on the tx rings (one mbuf ptr/dmamap array entry was created for
every tx descriptor slot at attach time) with a dynamically grown
list of mbuf pointers and dmamaps.

bnx used to have 512 dmamaps/mbuf pointers for the tx ring, now my
system is running with 8 under moderate load.

the big bonus from this is that the dmamap handling is greatly
simplified.

reyk@ likes this a lot


# 1.76 20-Apr-2009 dlg

when transmitting packets, put the dmamap we used for the packet into the
last descriptor slot in the ring. the tx completion code expects the dmamap
to be there so it can unload it.

ok reyk@


# 1.75 20-Apr-2009 reyk

fix dma map unmapping and unloading in the tx cleanup path.

ok dlg@


# 1.74 14-Apr-2009 kettenis

Don't free an mbuf that's still on the TX queue. While there sanitize the
function signature of bnx_tx_encap() such that people don't get weird ideas
like this again.

ok dlg@


# 1.73 09-Apr-2009 dlg

white space fixes


# 1.72 30-Mar-2009 dlg

switch to MCLGETI.

this conversion is the easiest ive done so far. the mbuf allocation wrapper
in the driver already had code to handle a failing cluster allocator as
part of a test harness, now we test that code all the time with MCLGETI.

ok kettenis@
tested by phessler@


Revision tags: OPENBSD_4_5_BASE
# 1.71 28-Nov-2008 brad

Eliminate the redundant bits of code for MTU and multicast handling
from the individual drivers now that ether_ioctl() handles this.

Shrinks the i386 kernels by..
RAMDISK - 2176 bytes
RAMDISKB - 1504 bytes
RAMDISKC - 736 bytes

Tested by naddy@/okan@/sthen@/brad@/todd@/jmc@ and lots of users.
Build tested on almost all archs by todd@/brad@

ok naddy@


# 1.70 09-Nov-2008 naddy

Introduce bpf_mtap_ether(), which for the benefit of bpf listeners
creates the VLAN encapsulation from the tag stored in the mbuf
header. Idea from FreeBSD, input from claudio@ and canacar@.

Switch all hardware VLAN enabled drivers to the new function.

ok claudio@


# 1.69 19-Oct-2008 brad

Re-add support for RX VLAN tag stripping.


# 1.68 16-Oct-2008 naddy

Switch the existing TX VLAN hardware support over to having the
tag in the header. Convert TX tagging in the drivers.

Help and ok brad@


# 1.67 16-Oct-2008 naddy

Convert RX tag stripping to storing the tag in the mbuf header and
enable RX tag stripping for re(4).

ok brad@


# 1.66 02-Oct-2008 brad

First step towards cleaning up the Ethernet driver ioctl handling.
Move calling ether_ioctl() from the top of the ioctl function, which
at the moment does absolutely nothing, to the default switch case.
Thus allowing drivers to define their own ioctl handlers and then
falling back on ether_ioctl(). The only functional change this results
in at the moment is having all Ethernet drivers returning the proper
errno of ENOTTY instead of EINVAL/ENXIO when encountering unknown
ioctl's.

Shrinks the i386 kernels by..
RAMDISK - 1024 bytes
RAMDISKB - 1120 bytes
RAMDISKC - 832 bytes

Tested by martin@/jsing@/todd@/brad@
Build tested on almost all archs by todd@/brad@

ok jsing@


# 1.65 10-Sep-2008 blambert

Convert timeout_add() calls using multiples of hz to timeout_add_sec()

Really just the low-hanging fruit of (hopefully) forthcoming timeout
conversions.

ok art@, krw@


Revision tags: OPENBSD_4_4_BASE
# 1.64 24-Jun-2008 brad

Fixed a problem that would cause errors (especially when in low memory
systems) because the RX chain was corrupted when an mbuf was mapped to
an unexpected number of buffers.

From davidch@FreeBSD


# 1.63 13-Jun-2008 brad

fix compilation with BNX_DEBUG.


# 1.62 13-Jun-2008 brad

Remove slack space for RX/TX chains since it only covers sloppy coding.

From davidch @ FreeBSD


# 1.61 08-Jun-2008 reyk

don't declare foo_driver_version[] strings and turn them into defines,
nothing uses them and it saves a few bytes in the kernel.

ok claudio@


# 1.60 29-May-2008 brad

- Add a debug message to mention when a 2.5Gb adapter is found.
- Change invalid PHY address debug message in bnx_miibus_write_reg()
from warn level to verbose.
- Add two new softc fields and store the shared and port hw config
data.

From FreeBSD

ok dlg@


# 1.59 23-May-2008 brad

Simplify the combination use of pci_mapreg_type()/pci_mapreg_map() as
suggested by dlg@ awhile ago.

ok dlg@


Revision tags: OPENBSD_4_3_BASE
# 1.58 28-Feb-2008 brad

Add initial bits for fiber support with the BCM5706/BCM5708 chipsets.

Tested with copper adapters by brad@, johan@ and Jung <moorang at gmail dot com>

ok dlg@


# 1.57 22-Feb-2008 kettenis

Avoid unaligned PCI config space access.

ok brad@


# 1.56 17-Feb-2008 brad

Remove the check for non-production bnx(4) chipsets. These chipsets are
not officially "supported" and could have errata which the driver does
not workaround but they should more or less work.

Tested by marco@ with a BCM5708 B0 chipset.

ok marco@ dlg@


# 1.55 25-Nov-2007 dlg

IF_Gbps(2.5) is wrong.

ok claudio@


# 1.54 28-Aug-2007 deraadt

unify firmware load failure messages; ok mglocker


Revision tags: OPENBSD_4_2_BASE
# 1.53 04-Jul-2007 krw

Revert r1.42 of if_bnx.c, "Enable IPv4 transmit TCP/UDP checksum
offload", and associated man page change. To use IPv4 transmit TCP/UDP
checksum offloading you must again define BNX_CSUM.

As requested by mbalmer@ via deraadt@ on suggestion of reyk@ in
response to PR #5437.


# 1.52 22-May-2007 reyk

Add the BCM5709 PCI device Id. It is disabled for now since we do not
support SerDes-based (1000base-SX fibre) bnx(4) devices yet. The
reason is simple - we do not have any fibre bnx(4) to test and port
the SerDes changes from the other bnx drivers.

From brad found in the Linux driver


# 1.51 22-May-2007 jasper

adress -> address

from brad
ok claudio@


# 1.50 22-May-2007 ray

Use BNX_PRINTF instead of printf with missing argument.

OK reyk@, earlier version OK tedu@, dlg@, and miod@.


# 1.49 21-May-2007 reyk

fix bnx vlan tagging in the rx path; do not attach the vlan tag twice
if the firmware has been told to keep it and copy the tag in network
byte order in the other case.

ok mcbride@ dlg@


Revision tags: OPENBSD_4_1_BASE
# 1.48 05-Mar-2007 reyk

remove jumbo frame support by replacing MEXTALLOC with MCLGET, and
simplify the VLAN code.

this will close PR 5356 (system panics under high load).

From claudio@ who is currently not around to commit this fix

tested and ok by mcbride@, reyk@, todd@, Paul Hirsch, and brad


# 1.47 03-Mar-2007 reyk

instead of establishing the interrupt in the mounthook, move it back
to the attach function and set a flag in the mounthook to start
accepting interrupts (there are possible problems with establishing
interrupts after the ioapics are enabled in i386 GENERIC.MP).

also suggested by kettenis
tested by mcbride, me, and some others
ok dlg@


# 1.46 03-Mar-2007 todd

Replacing some spaces with tabs and some typo fixes
from brad@


# 1.45 02-Mar-2007 reyk

oops, this is $OpenBSD$


# 1.44 02-Mar-2007 reyk

- remove the code to bring down the PHY in bnx_stop(), it's wrong
(ifm_data isn't updated) and lead to a panic in mii_phy_setmedia(),
or reading past the end mii_media_table[].
- make sure the dma_map matches the mbuf in the rx structures. We would
sync/unload the wrong map, leading to a DIAGNOSTIC panic, or eventually
leaking memory when bounce buffers are needed.

From NetBSD

ok marco@, brad@


# 1.43 30-Jan-2007 krw

Allow the bnx(4) driver to make use of all of the available hardware
multicast hash slots. The bnx(4) hardware supports 8 slots instead of
4 like the bge(4) hardware.

From Mike Karels via FreeBSD

Tested by Brad, biorn@ and Johan M:son Lindman


# 1.42 27-Jan-2007 krw

Enable transmit TCP/UDP checksum offload.

From Brad, tested by Brad, biorn@ and Johan M:son Lindman.


# 1.41 21-Jan-2007 mcbride

Remove bogus check for old firmware.

Identical fixes from myself and brad@, also reported by chefren@pi.net.


# 1.40 20-Jan-2007 dlg

move the interrupt establishment till after everything in the softc is
set up and allocated (which happens in a mountroothook). this prevents an
early call to the interrupt handler from causing a null deref when trying
to look into the unallocated regions.

found by mcbride when ciss and bnx were sharing an interrupt. mounting
root caused interrupts before the bnx was properly set up.

"commit your fix" mcbride@


# 1.39 19-Jan-2007 mcbride

bnx_init() takes a pointer to sc, not ifp.


# 1.38 10-Jan-2007 deraadt

change firmware byte order to be same on all architectures
THIS MEANS YOU NEED TO UPDATE YOUR FIRMWARE FILE BEFORE BOOTING WITH
A NEW KERNEL
tested by marco, biorn


# 1.37 24-Dec-2006 reyk

use the right size when loading the rx/tx descriptor bus dma maps.

from the NetBSD port

tested by bion@ and others from tech@
ok marco@ brad@


# 1.36 26-Nov-2006 brad

commented out entry for the BCM5709.


# 1.35 20-Nov-2006 brad

only try to do HW checksum offload for TCP and UDP.


# 1.34 20-Nov-2006 brad

Due to an incorrect macro, it appears that the driver has always been
accidentally truncating off the VLAN tag field in the TX descriptor. Fix
this by splitting up the vlan_tag and flags fields into separate fields,
and handling them appropriately.

From scottl@FreeBSD


# 1.33 19-Nov-2006 brad

In bnx_start, check the used_tx_bd count rather than the descriptors
mbuf pointer to see if the transmit ring is full. The mbuf pointer
is set only in the last descriptor of a multi-descriptor packet.
By relying on the mbuf pointers of the earlier descriptors, the
driver would sometimes overwrite a descriptor belonging to a
packet that wasn't completed yet. Also, tx_chain_prod wasn't
updated inside the loop, causing the wrong descriptor to be checked
after the first iteration. The upshot of all this was the loss of
some transmitted packets at medium to high packet rates.

In bnx_tx_encap, remove a couple of old statements that shuffled
around the tx_mbuf_map pointers. These now correspond 1-to-1 with
the transmit descriptors, and they are not supposed to be changed.

Correct a couple of inaccurate comments.

From jdp@FreeBSD


# 1.32 26-Oct-2006 brad

do the minimal initialization of the firmware so that ASF always
works.

From ambrisko@FreeBSD


# 1.31 25-Oct-2006 brad

replace a few more instances of hand rolled code with the
LIST_FOREACH macro.


# 1.30 22-Oct-2006 brad

now with the right revision of this diff which compiles. ok pedro, mglocker.

- Ensure that at least 16 TX descriptors are kept unused in the ring.
- Use more complete error handling for TX load problems.

From scottl@FreeBSD


# 1.29 21-Oct-2006 deraadt

does not compile


# 1.28 21-Oct-2006 brad

- Ensure that at least 16 TX descriptors are kept unused in the ring.
- Use more complete error handling for TX load problems.

From scottl@FreeBSD


# 1.27 19-Oct-2006 brad

make the exit label naming scheme match the current function names, removes
a FreeBSD-ism from the original driver.


# 1.26 19-Oct-2006 brad

Overhaul the transmit path:
- Eliminate the bnx_dmamap_arg structure.
- Refactor the loop that fills the buffer descriptor so that it can be done
with a single set of logic in a single loop instead of two sets of logic.
- Eliminate the need to cache and pass descriptor indexes between the start
loop and the encap function.
- Change the start loop to always check the ifnet sendq for more work.

From scottl@FreeBSD


# 1.25 14-Oct-2006 brad

- Simplify the arguments to bnx_tx_encap.
- Don't copy the bd_chain head pointers into temporary objects, they are
available globally.

From scottl@FreeBSD


# 1.24 04-Oct-2006 deraadt

Use loadfirmware(9) to get /etc/firmware/bnx instead of hard-coding a
gigantic firmware into the kernel; checked by brad


Revision tags: OPENBSD_4_0_BASE
# 1.23 25-Aug-2006 brad

don't need to clear if_timer during attach.


# 1.22 21-Aug-2006 deraadt

ramdisks do not have vlan, drop mbuf; ok brad


# 1.21 21-Aug-2006 brad

simplfy code a bit and fix comments, this is the MRU being set not the
MTU.


# 1.20 21-Aug-2006 brad

enable Jumbo support.


# 1.19 20-Aug-2006 brad

remove a comment.


# 1.18 20-Aug-2006 brad

cosmetic tweaks.


# 1.17 20-Aug-2006 brad

- replace IF_DEQUEUE/IF_PREPEND with IFQ_POLL/IFQ_DEQUEUE.
- enable RX checksum offload.
- remove some unused code.


# 1.16 19-Aug-2006 brad

set the capabilities VLAN MTU flag.


# 1.15 14-Aug-2006 marco

And some more KNF.


# 1.14 14-Aug-2006 marco

KNF


# 1.13 14-Aug-2006 marco

More KNF; no functional change.


# 1.12 14-Aug-2006 marco

First in a series of KNF. No functional change.


# 1.11 14-Aug-2006 brad

disable debugging.


# 1.10 14-Aug-2006 marco

Change bus_dmamap_create to use the appropriate values. This fixes the
issues brad was seeing. Help from jason.

ok brad.


# 1.9 13-Aug-2006 marco

Get rid of _HI & _LO macros altogether since they used a wrong idiom.
This was pointed out by mickey The driver now uses the same idiom as mpi.

help from miod, mickey and kettenis

ok brad


# 1.8 13-Aug-2006 brad

fix a typo, BNX_DRBUG -> BNX_DEBUG


# 1.7 10-Aug-2006 brad

unmap memory address space in bnx_release_resources().


# 1.6 10-Aug-2006 brad

cosmetic tweaking.


# 1.5 10-Aug-2006 brad

remove typedef's.


# 1.4 09-Aug-2006 marco

Reorder dmamap & dmamem to match man page.
Redo detection of _LO & _HI macro; help from miod and jordan.
ok beck brad


# 1.3 26-Jun-2006 brad

do not allow a Jumbo size MTU yet.


# 1.2 26-Jun-2006 brad

relocate the firmware per Theo's request.


# 1.1 26-Jun-2006 brad

Add a rough initial port of the bce driver from FreeBSD, which provides
support for the new line of Broadcom NetXtreme II Gigabit PCI-X and PCIe
controllers, though renamed to bnx. This is work in progress, there
are some known issues. With help from Reyk with the bus_dma code.

Thanks to David Christensen at Broadcom for the driver and for providing
some PCI-X and PCIe adapters.

ok deraadt@


# 1.129 10-Jul-2020 patrick

Change users of IFQ_SET_MAXLEN() and IFQ_IS_EMPTY() to use the "new" API.

ok dlg@ tobhe@


# 1.128 22-Jun-2020 dlg

use ifiq_input and use it's return value to apply backpressure to rxrs.

this is a step toward deprecating softclock based livelock detection.


# 1.127 17-May-2020 jsg

fix typo in a comment

from Delyan Raychev


Revision tags: OPENBSD_6_7_BASE
# 1.126 06-Dec-2019 dlg

enable the full use of jumbos and remove IFCAP_VLAN_MTU.

The chip can do 9008 byte packets (not including the ethernet
header), but only 9004 if you want to enable IFCAP_VLAN_MTU. by not
enabling IFCAP_VLAN_MTU, we let other protocols (eg, mpls or svlan)
use the extra bytes if they want.

using the extra bytes for the hardmtu instead of for IFCAP_VLAN_MTU
works a bit better with how aggr(4) is set up at the moment because
aggr does not pass IFCAP_VLAN_MTU through from its ports, which
means vlan(4) on aggr(4) cannot see the flag and use the extra
bytes.

this was figured out by hrvoje popovski in a discussion with pedro
caetano on the "issues configuring vlan on top of aggr device" on
misc@.
hrvoje also tested the diff and made sure the full use of jumbos
works for things like ping packets with DF set.
jmatthew skimmed the diff and didnt see anything obviously wrong too


Revision tags: OPENBSD_6_3_BASE OPENBSD_6_4_BASE OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.125 10-Mar-2018 sthen

raise bnx(4)'s rxring lwm to 16, ok deraadt

(I've had this diff locally for a long time on port build machines to
avoid NFS stalls.)


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE
# 1.124 24-Jan-2017 dlg

add support for multiple transmit ifqueues per network interface.

an ifq to transmit a packet is picked by the current traffic
conditioner (ie, priq or hfsc) by providing an index into an array
of ifqs. by default interfaces get a single ifq but can ask for
more using if_attach_queues().

the vast majority of our drivers still think there's a 1:1 mapping
between interfaces and transmit queues, so their if_start routines
take an ifnet pointer instead of a pointer to the ifqueue struct.
instead of changing all the drivers in the tree, drivers can opt
into using an if_qstart routine and setting the IFXF_MPSAFE flag.
the stack provides a compatability wrapper from the new if_qstart
handler to the previous if_start handlers if IFXF_MPSAFE isnt set.

enabling hfsc on an interface configures it to transmit everything
through the first ifq. any other ifqs are left configured as priq,
but unused, when hfsc is enabled.

getting this in now so everyone can kick the tyres.

ok mpi@ visa@ (who provided some tweaks for cnmac).


# 1.123 22-Jan-2017 dlg

move counting if_opackets next to counting if_obytes in if_enqueue.

this means packets are consistently counted in one place, unlike the
many and various ways that drivers thought they should do it.

ok mpi@ deraadt@


Revision tags: OPENBSD_6_0_BASE
# 1.122 05-May-2016 jmatthew

r1.10 of if_bnx.c effectively removed the limit on the number of segments in
the tx dma maps, apparently to allow heavily fragmented packets to be sent.

The tx ring accounting in bnx_start assumed that the longest fragment chain
we'd see was BNX_MAX_SEGMENTS, so sending a heavily fragmented packet when the
ring was already full could cause it to overflow.

In the 10 years since r1.10, we've started defragmenting packets if they
won't fit in the dma map, so we can limit the maps to BNX_MAX_SEGMENTS again.
While we're here, ensure there's always at least one slot on the tx ring free,
for consistency between drivers.

Fixes packet corruption seen by otto@
ok mpi@ dlg@


# 1.121 13-Apr-2016 mpi

G/C IFQ_SET_READY().


Revision tags: OPENBSD_5_9_BASE
# 1.120 11-Dec-2015 mpi

branches: 1.120.2;
Replace mountroothook_establish(9) by config_mountroot(9) a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


# 1.119 10-Dec-2015 dlg

mark bnx_start as mpsafe.

tweak it to use ifq_restart so ifq_clr_oactive is serialised with start.

ok jmatthew@


# 1.118 05-Dec-2015 jmatthew

Make the bnx interrupt handler mpsafe, and perform rx and tx completion
outside the kernel lock.

Remove tx descriptor lists (essentially backing out if_bnx.c r1.77),
add an interrupt barrier in bnx_stop, check the rx ring state before receiving
packets, adjust the tx counter with atomic operations, and rework bnx_start
to check for ring space before dequeueing and drop the packet if bnx_encap
fails.

tested on BCM5708 by me and on BCM5709 by Hrvoje Popovski
ok dlg@


# 1.117 25-Nov-2015 dlg

replace IFF_OACTIVE manipulation with mpsafe operations.

there are two things shared between the network stack and drivers
in the send path: the send queue and the IFF_OACTIVE flag. the send
queue is now protected by a mutex. this diff makes the oactive
functionality mpsafe too.

IFF_OACTIVE is part of if_flags. there are two problems with that.
firstly, if_flags is a short and we dont have any MI atomic operations
to manipulate a short. secondly, while we could make the IFF_OACTIVE
operates mpsafe, all changes to other flags would have to be made
safe at the same time, otherwise a read-modify-write cycle on their
updates could clobber the oactive change.

instead, this moves the oactive mark into struct ifqueue and provides
an API for changing it. there's ifq_set_oactive, ifq_clr_oactive,
and ifq_is_oactive. these are modelled on ifsq_set_oactive,
ifsq_clr_oactive, and ifsq_is_oactive in dragonflybsd.

this diff includes changes to all the drivers manipulating IFF_OACTIVE
to now use the ifsq_{set,clr_is}_oactive API too.

ok kettenis@ mpi@ jmatthew@ deraadt@


# 1.116 20-Nov-2015 dlg

shuffle struct ifqueue so in flight mbufs are protected by a mutex.

the code is refactored so the IFQ macros call newly implemented ifq
functions. the ifq code is split so each discipline (priq and hfsc
in our case) is an opaque set of operations that the common ifq
code can call. the common code does the locking, accounting (ifq_len
manipulation), and freeing of the mbuf if the disciplines enqueue
function rejects it. theyre kind of like bufqs in the block layer
with their fifo and nscan disciplines.

the new api also supports atomic switching of disciplines at runtime.
the hfsc setup in pf_ioctl.c has been tweaked to build a complete
hfsc_if structure which it attaches to the send queue in a single
operation, rather than attaching to the interface up front and
building up a list of queues.

the send queue is now mutexed, which raises the expectation that
packets can be enqueued or purged on one cpu while another cpu is
dequeueing them in a driver for transmission. a lot of drivers use
IFQ_POLL to peek at an mbuf and attempt to fit it on the ring before
committing to it with a later IFQ_DEQUEUE operation. if the mbuf
gets freed in between the POLL and DEQUEUE operations, fireworks
will ensue.

to avoid this, the ifq api introduces ifq_deq_begin, ifq_deq_rollback,
and ifq_deq_commit. ifq_deq_begin allows a driver to take the ifq
mutex and get a reference to the mbuf they wish to try and tx. if
there's space, they can ifq_deq_commit it to remove the mbuf and
release the mutex. if there's no space, ifq_deq_rollback simply
releases the mutex. this api was developed to make updating the
drivers using IFQ_POLL easy, instead of having to do significant
semantic changes to avoid POLL that we cannot test on all the
hardware.

the common code has been tested pretty hard, and all the driver
modifications are straightforward except for de(4). if that breaks
it can be dealt with later.

ok mpi@ jmatthew@


# 1.115 25-Oct-2015 mpi

arp_ifinit() is no longer needed.


# 1.114 10-Sep-2015 deraadt

sizes for free(); ok sthen


# 1.113 04-Sep-2015 kettenis

The bnx_tx_pool gets used from interrupt context, so drop the explicit
backend allocoter here without passing PR_WAITOK to pool_init(9).

ok mikeb@


Revision tags: OPENBSD_5_8_BASE
# 1.112 24-Jul-2015 dlg

if we free the mbuf in the rx path, clear the pointer to it so we dont
try and queue it for the stack and cause a use after free.

found by maxime villard and brainy


# 1.111 24-Jun-2015 mpi

Increment if_ipackets in if_input().

Note that pseudo-drivers not using if_input() are not affected by this
conversion.

ok mikeb@, kettenis@, claudio@, dlg@


# 1.110 10-Mar-2015 mpi

Convert to if_input().

Tested and ok sthen@, ok dlg@


Revision tags: OPENBSD_5_7_BASE
# 1.109 27-Jan-2015 dlg

remove the second void * argument on tasks.

when workqs were introduced, we provided a second argument so you
could pass a thing and some context to work on it in. there were
very few things that took advantage of the second argument, so when
i introduced pools i suggested removing it. since tasks were meant
to replace workqs, it was requested that we keep the second argument
to make porting from workqs to tasks easier.

now that workqs are gone, i had a look at the use of the second
argument again and found only one good use of it (vdsp(4) on sparc64
if you're interested) and a tiny handful of questionable uses. the
vast majority of tasks only used a single argument. i have since
modified all tasks that used two args to only use one, so now we
can remove the second argument.

so this is a mechanical change. all tasks only passed NULL as their
second argument, so we can just remove it.

ok krw@


# 1.108 22-Dec-2014 tedu

unifdef INET


Revision tags: OPENBSD_5_6_BASE
# 1.107 18-Jul-2014 dlg

implement EFBIG handling for heavily fragmented packets on the tx path.

ok claudio@


# 1.106 12-Jul-2014 tedu

add a size argument to free. will be used soon, but for now default to 0.
after discussions with beck deraadt kettenis.


# 1.105 09-Jul-2014 dlg

dont try to be smart about avoiding the use of too many descriptors
when filling the rx ring. trust the hwm.

problem found by sthen@


# 1.104 08-Jul-2014 dlg

cut things that relied on mclgeti for rx ring accounting/restriction over
to using if_rxr.

cut the reporting systat did over to the rxr ioctl.

tested as much as i can on alpha, amd64, and sparc64.
mpi@ has run it on macppc.
ok mpi@


Revision tags: OPENBSD_5_5_BASE
# 1.103 30-Oct-2013 dlg

replace the workq bits to supply new tx pkt descriptors with a task.

tested locally on a dell poweredge 2950


# 1.102 23-Oct-2013 brad

Enable TX checksum offload.

ok naddy@ sthen@


Revision tags: OPENBSD_5_4_BASE
# 1.101 28-Mar-2013 brad

Let mii_attach() know where the PHY is located instead of scanning
for it since we know where it will be anyway and remove the code
from the MII bus read/write functions to force reading/writing
from the predetermined location. Copied from bge(4) and this is
what the upstream FreeBSD bce(4) driver has done once FreBSD
gained a mii_attach().

ok dlg@ sthen@


Revision tags: OPENBSD_5_3_BASE
# 1.100 13-Jan-2013 brad

Enable flow control support with 5708S/5709S adapters.

ok dlg@


# 1.99 10-Dec-2012 mikeb

Under some circumstances (currently only reproducible with IPsec)
bnx can be left w/o clusters on the receive ring and will stall.
To prevent that schedule a timeout if refill fails. Bug was
reported by jj@, fix tested by me, ok dlg


# 1.98 05-Dec-2012 deraadt

Remove excessive sys/cdefs.h inclusion
ok guenther millert kettenis


Revision tags: OPENBSD_5_2_BASE
# 1.97 05-Jul-2012 phessler

Add flow control to bnx(4)

Tested on 5706, 5708, 5709, 5716 chipsets.

From Brad

OK phessler@, sthen@, mikeb@,


# 1.96 14-May-2012 mikeb

fixup "couldn't establish interrupt" error printf; from brad, ok phessler


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE
# 1.95 22-Jun-2011 tedu

kill a few more casts that aren't helpful. ok krw miod


# 1.94 18-Apr-2011 dlg

ido not disable interrupts in the isr and then enable them again
when leaving. when you're handling an interrupt it is masked.
whacking the chip is work for no gain.

modify the interrupt handler so it only processes the rings once
rather than looping over them until it runs out of work to do

looping in the isr is bad for several reasons:

firstly, the chip does interrupt mitigation so you have a
decent/predictable amount of work to do in the isr. your first loop
will do that chunk of work (ie, it pulls off 50ish packets), and
then the successive looping aggressively pull one or two packets
off the rx ring. these extra loops work against the benefit that
interrupt mitigation provides.

bus space reads are slow. we should avoid doing them where possible
(but we should always do them when necessary).

doing the loop 5 times per isr works against the mclgeti semantics.
it knows a nic is busy and therefore needs more rx descriptors by
watching to see when the nic uses all of its descriptors between
interrupts. if we're aggressively pulling packets off by looping
in the isr then we're skewing this check.

ok deraadt@


# 1.93 13-Apr-2011 dlg

to quote from the gospel of bus_dma.9:

Synchronization operations are expressed from the perspective of the host
RAM, e.g., a device -> memory operation is a READ and a memory -> device
operation is a WRITE.

the status block that the isr reads is written to by the device.
the chip writes to memory, it is therefore a READ.

this also adds the preread sync when the map is set up and the postread
sync when the map is torn down for better symmetry. there are probably
more issues like this in the code, but this is a start.

discovered while discussing another diff with claudio@


# 1.92 05-Apr-2011 henning

mechanic rename M_{TCP|UDP}V4_CSUM_OUT -> M_{TCP|UDP}_CSUM_OUT
ok claudio krw


# 1.91 03-Apr-2011 jasper

use nitems(); no binary change for drivers that are compiled on amd64.

ok claudio@


Revision tags: OPENBSD_4_9_BASE
# 1.90 20-Sep-2010 deraadt

Stop doing shutdown hooks in network drivers where possible. We already
take all interfaces down, via their xxstop routines. Claudio and I have
verified that none of the shutdown hooks do much extra beyond what xxstop
was already doing; it is largely a pile of junk.
ok claudio, some early comments by sthen; also read by matthew, jsg


Revision tags: OPENBSD_4_8_BASE
# 1.89 03-Aug-2010 jsg

Correct use of logical and where binary and was intended.
Spotted by lint, but mirrors a similiar change in the
original FreeBSD code from over a year ago.

ok deraadt@


# 1.88 24-May-2010 sthen

Support fibre PHY on BCM5709S. From FreeBSD via Brad.
Tested by Brad on: BCM5706, BCM5708C
Tested by me on: BCM5716 (BCM5709 PHY)


# 1.87 19-May-2010 oga

BUS_DMA_ZERO instead of alloc, map, bzero.

ok krw@


Revision tags: OPENBSD_4_7_BASE
# 1.86 23-Nov-2009 claudio

bnx(4) is a bit special. The chip itself is capable of swapping endianess
so there is no need for htoleXX calls. The only thing needed is the correct
layout of the DMA-ed structures. Additionally it uses PAGE_SIZE but assumed
that it is always 4k. Fix the macros that failed to respect that so that it
works on 8k PAGE_SIZE systems. This makes bnx(4) work on sparc64.
Tested on amd64 by dlg@. OK dlg@, deraadt@


# 1.85 09-Nov-2009 dlg

Link state change interrupt was not generated due to a missing bit in
the MAC event register.

fix from atte dot peltomaki at iki dot fi
tested by me on 5708 and 5709


# 1.84 13-Aug-2009 jasper

- consistify cfdriver for the ethernet drivers (0 -> NULL)

ok dlg@


# 1.83 09-Aug-2009 deraadt

MCLGETI() will now allocate a mbuf header if it is not provided, thus
reducing the amount of splnet/splx dancing required.. especially in the
worst case (of m_cldrop)
ok dlg kettenis damien


# 1.82 06-Aug-2009 sthen

Add device id for BCM5716S, tidy whitespace. From Brad.


Revision tags: OPENBSD_4_6_BASE
# 1.81 03-Jul-2009 dlg

this is a rather large change to add support for the BCM5709.

the 5709s use a the b09 firmwares, which is different to the b06 used by
all the other chips supported by bnx. the majority of the diff comes from
special handling for some indirect reads and writes, and because it needs
more host memory to operate with.

ive tried to keep the cosmetic changes to a minimum.

"go for it" deraadt@


# 1.80 03-Jul-2009 dlg

newer bnx chips use a separate firmware to the "old" ones. this updates
the b06 firmware for the older chips, and adds the b09 firmware. there are
three variants of the rv2p code thats loaded onto the chips, so this has
been split out into separate firmware files as well.

the driver has been updated to handle the split firmwares, and to easily
allow loading of the different versions. this change only supports the
loading of the firmwares for the currently supported chips.

after this change you must build the new firmwares and install them as well
as your new kernel.

"go to it" deraadt@


# 1.79 20-Jun-2009 naddy

Rewrite the interface flag handling case code and update the receive
filter handling to take advantage of ac_multirangecnt and have correct
IFF_ALLMULTI handling. From Brad.


# 1.78 22-Apr-2009 dlg

dont need to zero the tx pkt pool structure before initting it now that
pool_init does its job properly.


# 1.77 22-Apr-2009 dlg

replace arrays of dmamaps and mbuf pointers used to manage packets
on the tx rings (one mbuf ptr/dmamap array entry was created for
every tx descriptor slot at attach time) with a dynamically grown
list of mbuf pointers and dmamaps.

bnx used to have 512 dmamaps/mbuf pointers for the tx ring, now my
system is running with 8 under moderate load.

the big bonus from this is that the dmamap handling is greatly
simplified.

reyk@ likes this a lot


# 1.76 20-Apr-2009 dlg

when transmitting packets, put the dmamap we used for the packet into the
last descriptor slot in the ring. the tx completion code expects the dmamap
to be there so it can unload it.

ok reyk@


# 1.75 20-Apr-2009 reyk

fix dma map unmapping and unloading in the tx cleanup path.

ok dlg@


# 1.74 14-Apr-2009 kettenis

Don't free an mbuf that's still on the TX queue. While there sanitize the
function signature of bnx_tx_encap() such that people don't get weird ideas
like this again.

ok dlg@


# 1.73 09-Apr-2009 dlg

white space fixes


# 1.72 30-Mar-2009 dlg

switch to MCLGETI.

this conversion is the easiest ive done so far. the mbuf allocation wrapper
in the driver already had code to handle a failing cluster allocator as
part of a test harness, now we test that code all the time with MCLGETI.

ok kettenis@
tested by phessler@


Revision tags: OPENBSD_4_5_BASE
# 1.71 28-Nov-2008 brad

Eliminate the redundant bits of code for MTU and multicast handling
from the individual drivers now that ether_ioctl() handles this.

Shrinks the i386 kernels by..
RAMDISK - 2176 bytes
RAMDISKB - 1504 bytes
RAMDISKC - 736 bytes

Tested by naddy@/okan@/sthen@/brad@/todd@/jmc@ and lots of users.
Build tested on almost all archs by todd@/brad@

ok naddy@


# 1.70 09-Nov-2008 naddy

Introduce bpf_mtap_ether(), which for the benefit of bpf listeners
creates the VLAN encapsulation from the tag stored in the mbuf
header. Idea from FreeBSD, input from claudio@ and canacar@.

Switch all hardware VLAN enabled drivers to the new function.

ok claudio@


# 1.69 19-Oct-2008 brad

Re-add support for RX VLAN tag stripping.


# 1.68 16-Oct-2008 naddy

Switch the existing TX VLAN hardware support over to having the
tag in the header. Convert TX tagging in the drivers.

Help and ok brad@


# 1.67 16-Oct-2008 naddy

Convert RX tag stripping to storing the tag in the mbuf header and
enable RX tag stripping for re(4).

ok brad@


# 1.66 02-Oct-2008 brad

First step towards cleaning up the Ethernet driver ioctl handling.
Move calling ether_ioctl() from the top of the ioctl function, which
at the moment does absolutely nothing, to the default switch case.
Thus allowing drivers to define their own ioctl handlers and then
falling back on ether_ioctl(). The only functional change this results
in at the moment is having all Ethernet drivers returning the proper
errno of ENOTTY instead of EINVAL/ENXIO when encountering unknown
ioctl's.

Shrinks the i386 kernels by..
RAMDISK - 1024 bytes
RAMDISKB - 1120 bytes
RAMDISKC - 832 bytes

Tested by martin@/jsing@/todd@/brad@
Build tested on almost all archs by todd@/brad@

ok jsing@


# 1.65 10-Sep-2008 blambert

Convert timeout_add() calls using multiples of hz to timeout_add_sec()

Really just the low-hanging fruit of (hopefully) forthcoming timeout
conversions.

ok art@, krw@


Revision tags: OPENBSD_4_4_BASE
# 1.64 24-Jun-2008 brad

Fixed a problem that would cause errors (especially when in low memory
systems) because the RX chain was corrupted when an mbuf was mapped to
an unexpected number of buffers.

From davidch@FreeBSD


# 1.63 13-Jun-2008 brad

fix compilation with BNX_DEBUG.


# 1.62 13-Jun-2008 brad

Remove slack space for RX/TX chains since it only covers sloppy coding.

From davidch @ FreeBSD


# 1.61 08-Jun-2008 reyk

don't declare foo_driver_version[] strings and turn them into defines,
nothing uses them and it saves a few bytes in the kernel.

ok claudio@


# 1.60 29-May-2008 brad

- Add a debug message to mention when a 2.5Gb adapter is found.
- Change invalid PHY address debug message in bnx_miibus_write_reg()
from warn level to verbose.
- Add two new softc fields and store the shared and port hw config
data.

From FreeBSD

ok dlg@


# 1.59 23-May-2008 brad

Simplify the combination use of pci_mapreg_type()/pci_mapreg_map() as
suggested by dlg@ awhile ago.

ok dlg@


Revision tags: OPENBSD_4_3_BASE
# 1.58 28-Feb-2008 brad

Add initial bits for fiber support with the BCM5706/BCM5708 chipsets.

Tested with copper adapters by brad@, johan@ and Jung <moorang at gmail dot com>

ok dlg@


# 1.57 22-Feb-2008 kettenis

Avoid unaligned PCI config space access.

ok brad@


# 1.56 17-Feb-2008 brad

Remove the check for non-production bnx(4) chipsets. These chipsets are
not officially "supported" and could have errata which the driver does
not workaround but they should more or less work.

Tested by marco@ with a BCM5708 B0 chipset.

ok marco@ dlg@


# 1.55 25-Nov-2007 dlg

IF_Gbps(2.5) is wrong.

ok claudio@


# 1.54 28-Aug-2007 deraadt

unify firmware load failure messages; ok mglocker


Revision tags: OPENBSD_4_2_BASE
# 1.53 04-Jul-2007 krw

Revert r1.42 of if_bnx.c, "Enable IPv4 transmit TCP/UDP checksum
offload", and associated man page change. To use IPv4 transmit TCP/UDP
checksum offloading you must again define BNX_CSUM.

As requested by mbalmer@ via deraadt@ on suggestion of reyk@ in
response to PR #5437.


# 1.52 22-May-2007 reyk

Add the BCM5709 PCI device Id. It is disabled for now since we do not
support SerDes-based (1000base-SX fibre) bnx(4) devices yet. The
reason is simple - we do not have any fibre bnx(4) to test and port
the SerDes changes from the other bnx drivers.

From brad found in the Linux driver


# 1.51 22-May-2007 jasper

adress -> address

from brad
ok claudio@


# 1.50 22-May-2007 ray

Use BNX_PRINTF instead of printf with missing argument.

OK reyk@, earlier version OK tedu@, dlg@, and miod@.


# 1.49 21-May-2007 reyk

fix bnx vlan tagging in the rx path; do not attach the vlan tag twice
if the firmware has been told to keep it and copy the tag in network
byte order in the other case.

ok mcbride@ dlg@


Revision tags: OPENBSD_4_1_BASE
# 1.48 05-Mar-2007 reyk

remove jumbo frame support by replacing MEXTALLOC with MCLGET, and
simplify the VLAN code.

this will close PR 5356 (system panics under high load).

From claudio@ who is currently not around to commit this fix

tested and ok by mcbride@, reyk@, todd@, Paul Hirsch, and brad


# 1.47 03-Mar-2007 reyk

instead of establishing the interrupt in the mounthook, move it back
to the attach function and set a flag in the mounthook to start
accepting interrupts (there are possible problems with establishing
interrupts after the ioapics are enabled in i386 GENERIC.MP).

also suggested by kettenis
tested by mcbride, me, and some others
ok dlg@


# 1.46 03-Mar-2007 todd

Replacing some spaces with tabs and some typo fixes
from brad@


# 1.45 02-Mar-2007 reyk

oops, this is $OpenBSD$


# 1.44 02-Mar-2007 reyk

- remove the code to bring down the PHY in bnx_stop(), it's wrong
(ifm_data isn't updated) and lead to a panic in mii_phy_setmedia(),
or reading past the end mii_media_table[].
- make sure the dma_map matches the mbuf in the rx structures. We would
sync/unload the wrong map, leading to a DIAGNOSTIC panic, or eventually
leaking memory when bounce buffers are needed.

From NetBSD

ok marco@, brad@


# 1.43 30-Jan-2007 krw

Allow the bnx(4) driver to make use of all of the available hardware
multicast hash slots. The bnx(4) hardware supports 8 slots instead of
4 like the bge(4) hardware.

From Mike Karels via FreeBSD

Tested by Brad, biorn@ and Johan M:son Lindman


# 1.42 27-Jan-2007 krw

Enable transmit TCP/UDP checksum offload.

From Brad, tested by Brad, biorn@ and Johan M:son Lindman.


# 1.41 21-Jan-2007 mcbride

Remove bogus check for old firmware.

Identical fixes from myself and brad@, also reported by chefren@pi.net.


# 1.40 20-Jan-2007 dlg

move the interrupt establishment till after everything in the softc is
set up and allocated (which happens in a mountroothook). this prevents an
early call to the interrupt handler from causing a null deref when trying
to look into the unallocated regions.

found by mcbride when ciss and bnx were sharing an interrupt. mounting
root caused interrupts before the bnx was properly set up.

"commit your fix" mcbride@


# 1.39 19-Jan-2007 mcbride

bnx_init() takes a pointer to sc, not ifp.


# 1.38 10-Jan-2007 deraadt

change firmware byte order to be same on all architectures
THIS MEANS YOU NEED TO UPDATE YOUR FIRMWARE FILE BEFORE BOOTING WITH
A NEW KERNEL
tested by marco, biorn


# 1.37 24-Dec-2006 reyk

use the right size when loading the rx/tx descriptor bus dma maps.

from the NetBSD port

tested by bion@ and others from tech@
ok marco@ brad@


# 1.36 26-Nov-2006 brad

commented out entry for the BCM5709.


# 1.35 20-Nov-2006 brad

only try to do HW checksum offload for TCP and UDP.


# 1.34 20-Nov-2006 brad

Due to an incorrect macro, it appears that the driver has always been
accidentally truncating off the VLAN tag field in the TX descriptor. Fix
this by splitting up the vlan_tag and flags fields into separate fields,
and handling them appropriately.

From scottl@FreeBSD


# 1.33 19-Nov-2006 brad

In bnx_start, check the used_tx_bd count rather than the descriptors
mbuf pointer to see if the transmit ring is full. The mbuf pointer
is set only in the last descriptor of a multi-descriptor packet.
By relying on the mbuf pointers of the earlier descriptors, the
driver would sometimes overwrite a descriptor belonging to a
packet that wasn't completed yet. Also, tx_chain_prod wasn't
updated inside the loop, causing the wrong descriptor to be checked
after the first iteration. The upshot of all this was the loss of
some transmitted packets at medium to high packet rates.

In bnx_tx_encap, remove a couple of old statements that shuffled
around the tx_mbuf_map pointers. These now correspond 1-to-1 with
the transmit descriptors, and they are not supposed to be changed.

Correct a couple of inaccurate comments.

From jdp@FreeBSD


# 1.32 26-Oct-2006 brad

do the minimal initialization of the firmware so that ASF always
works.

From ambrisko@FreeBSD


# 1.31 25-Oct-2006 brad

replace a few more instances of hand rolled code with the
LIST_FOREACH macro.


# 1.30 22-Oct-2006 brad

now with the right revision of this diff which compiles. ok pedro, mglocker.

- Ensure that at least 16 TX descriptors are kept unused in the ring.
- Use more complete error handling for TX load problems.

From scottl@FreeBSD


# 1.29 21-Oct-2006 deraadt

does not compile


# 1.28 21-Oct-2006 brad

- Ensure that at least 16 TX descriptors are kept unused in the ring.
- Use more complete error handling for TX load problems.

From scottl@FreeBSD


# 1.27 19-Oct-2006 brad

make the exit label naming scheme match the current function names, removes
a FreeBSD-ism from the original driver.


# 1.26 19-Oct-2006 brad

Overhaul the transmit path:
- Eliminate the bnx_dmamap_arg structure.
- Refactor the loop that fills the buffer descriptor so that it can be done
with a single set of logic in a single loop instead of two sets of logic.
- Eliminate the need to cache and pass descriptor indexes between the start
loop and the encap function.
- Change the start loop to always check the ifnet sendq for more work.

From scottl@FreeBSD


# 1.25 14-Oct-2006 brad

- Simplify the arguments to bnx_tx_encap.
- Don't copy the bd_chain head pointers into temporary objects, they are
available globally.

From scottl@FreeBSD


# 1.24 04-Oct-2006 deraadt

Use loadfirmware(9) to get /etc/firmware/bnx instead of hard-coding a
gigantic firmware into the kernel; checked by brad


Revision tags: OPENBSD_4_0_BASE
# 1.23 25-Aug-2006 brad

don't need to clear if_timer during attach.


# 1.22 21-Aug-2006 deraadt

ramdisks do not have vlan, drop mbuf; ok brad


# 1.21 21-Aug-2006 brad

simplfy code a bit and fix comments, this is the MRU being set not the
MTU.


# 1.20 21-Aug-2006 brad

enable Jumbo support.


# 1.19 20-Aug-2006 brad

remove a comment.


# 1.18 20-Aug-2006 brad

cosmetic tweaks.


# 1.17 20-Aug-2006 brad

- replace IF_DEQUEUE/IF_PREPEND with IFQ_POLL/IFQ_DEQUEUE.
- enable RX checksum offload.
- remove some unused code.


# 1.16 19-Aug-2006 brad

set the capabilities VLAN MTU flag.


# 1.15 14-Aug-2006 marco

And some more KNF.


# 1.14 14-Aug-2006 marco

KNF


# 1.13 14-Aug-2006 marco

More KNF; no functional change.


# 1.12 14-Aug-2006 marco

First in a series of KNF. No functional change.


# 1.11 14-Aug-2006 brad

disable debugging.


# 1.10 14-Aug-2006 marco

Change bus_dmamap_create to use the appropriate values. This fixes the
issues brad was seeing. Help from jason.

ok brad.


# 1.9 13-Aug-2006 marco

Get rid of _HI & _LO macros altogether since they used a wrong idiom.
This was pointed out by mickey The driver now uses the same idiom as mpi.

help from miod, mickey and kettenis

ok brad


# 1.8 13-Aug-2006 brad

fix a typo, BNX_DRBUG -> BNX_DEBUG


# 1.7 10-Aug-2006 brad

unmap memory address space in bnx_release_resources().


# 1.6 10-Aug-2006 brad

cosmetic tweaking.


# 1.5 10-Aug-2006 brad

remove typedef's.


# 1.4 09-Aug-2006 marco

Reorder dmamap & dmamem to match man page.
Redo detection of _LO & _HI macro; help from miod and jordan.
ok beck brad


# 1.3 26-Jun-2006 brad

do not allow a Jumbo size MTU yet.


# 1.2 26-Jun-2006 brad

relocate the firmware per Theo's request.


# 1.1 26-Jun-2006 brad

Add a rough initial port of the bce driver from FreeBSD, which provides
support for the new line of Broadcom NetXtreme II Gigabit PCI-X and PCIe
controllers, though renamed to bnx. This is work in progress, there
are some known issues. With help from Reyk with the bus_dma code.

Thanks to David Christensen at Broadcom for the driver and for providing
some PCI-X and PCIe adapters.

ok deraadt@


# 1.128 22-Jun-2020 dlg

use ifiq_input and use it's return value to apply backpressure to rxrs.

this is a step toward deprecating softclock based livelock detection.


# 1.127 17-May-2020 jsg

fix typo in a comment

from Delyan Raychev


Revision tags: OPENBSD_6_7_BASE
# 1.126 06-Dec-2019 dlg

enable the full use of jumbos and remove IFCAP_VLAN_MTU.

The chip can do 9008 byte packets (not including the ethernet
header), but only 9004 if you want to enable IFCAP_VLAN_MTU. by not
enabling IFCAP_VLAN_MTU, we let other protocols (eg, mpls or svlan)
use the extra bytes if they want.

using the extra bytes for the hardmtu instead of for IFCAP_VLAN_MTU
works a bit better with how aggr(4) is set up at the moment because
aggr does not pass IFCAP_VLAN_MTU through from its ports, which
means vlan(4) on aggr(4) cannot see the flag and use the extra
bytes.

this was figured out by hrvoje popovski in a discussion with pedro
caetano on the "issues configuring vlan on top of aggr device" on
misc@.
hrvoje also tested the diff and made sure the full use of jumbos
works for things like ping packets with DF set.
jmatthew skimmed the diff and didnt see anything obviously wrong too


Revision tags: OPENBSD_6_3_BASE OPENBSD_6_4_BASE OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.125 10-Mar-2018 sthen

raise bnx(4)'s rxring lwm to 16, ok deraadt

(I've had this diff locally for a long time on port build machines to
avoid NFS stalls.)


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE
# 1.124 24-Jan-2017 dlg

add support for multiple transmit ifqueues per network interface.

an ifq to transmit a packet is picked by the current traffic
conditioner (ie, priq or hfsc) by providing an index into an array
of ifqs. by default interfaces get a single ifq but can ask for
more using if_attach_queues().

the vast majority of our drivers still think there's a 1:1 mapping
between interfaces and transmit queues, so their if_start routines
take an ifnet pointer instead of a pointer to the ifqueue struct.
instead of changing all the drivers in the tree, drivers can opt
into using an if_qstart routine and setting the IFXF_MPSAFE flag.
the stack provides a compatability wrapper from the new if_qstart
handler to the previous if_start handlers if IFXF_MPSAFE isnt set.

enabling hfsc on an interface configures it to transmit everything
through the first ifq. any other ifqs are left configured as priq,
but unused, when hfsc is enabled.

getting this in now so everyone can kick the tyres.

ok mpi@ visa@ (who provided some tweaks for cnmac).


# 1.123 22-Jan-2017 dlg

move counting if_opackets next to counting if_obytes in if_enqueue.

this means packets are consistently counted in one place, unlike the
many and various ways that drivers thought they should do it.

ok mpi@ deraadt@


Revision tags: OPENBSD_6_0_BASE
# 1.122 05-May-2016 jmatthew

r1.10 of if_bnx.c effectively removed the limit on the number of segments in
the tx dma maps, apparently to allow heavily fragmented packets to be sent.

The tx ring accounting in bnx_start assumed that the longest fragment chain
we'd see was BNX_MAX_SEGMENTS, so sending a heavily fragmented packet when the
ring was already full could cause it to overflow.

In the 10 years since r1.10, we've started defragmenting packets if they
won't fit in the dma map, so we can limit the maps to BNX_MAX_SEGMENTS again.
While we're here, ensure there's always at least one slot on the tx ring free,
for consistency between drivers.

Fixes packet corruption seen by otto@
ok mpi@ dlg@


# 1.121 13-Apr-2016 mpi

G/C IFQ_SET_READY().


Revision tags: OPENBSD_5_9_BASE
# 1.120 11-Dec-2015 mpi

branches: 1.120.2;
Replace mountroothook_establish(9) by config_mountroot(9) a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


# 1.119 10-Dec-2015 dlg

mark bnx_start as mpsafe.

tweak it to use ifq_restart so ifq_clr_oactive is serialised with start.

ok jmatthew@


# 1.118 05-Dec-2015 jmatthew

Make the bnx interrupt handler mpsafe, and perform rx and tx completion
outside the kernel lock.

Remove tx descriptor lists (essentially backing out if_bnx.c r1.77),
add an interrupt barrier in bnx_stop, check the rx ring state before receiving
packets, adjust the tx counter with atomic operations, and rework bnx_start
to check for ring space before dequeueing and drop the packet if bnx_encap
fails.

tested on BCM5708 by me and on BCM5709 by Hrvoje Popovski
ok dlg@


# 1.117 25-Nov-2015 dlg

replace IFF_OACTIVE manipulation with mpsafe operations.

there are two things shared between the network stack and drivers
in the send path: the send queue and the IFF_OACTIVE flag. the send
queue is now protected by a mutex. this diff makes the oactive
functionality mpsafe too.

IFF_OACTIVE is part of if_flags. there are two problems with that.
firstly, if_flags is a short and we dont have any MI atomic operations
to manipulate a short. secondly, while we could make the IFF_OACTIVE
operates mpsafe, all changes to other flags would have to be made
safe at the same time, otherwise a read-modify-write cycle on their
updates could clobber the oactive change.

instead, this moves the oactive mark into struct ifqueue and provides
an API for changing it. there's ifq_set_oactive, ifq_clr_oactive,
and ifq_is_oactive. these are modelled on ifsq_set_oactive,
ifsq_clr_oactive, and ifsq_is_oactive in dragonflybsd.

this diff includes changes to all the drivers manipulating IFF_OACTIVE
to now use the ifsq_{set,clr_is}_oactive API too.

ok kettenis@ mpi@ jmatthew@ deraadt@


# 1.116 20-Nov-2015 dlg

shuffle struct ifqueue so in flight mbufs are protected by a mutex.

the code is refactored so the IFQ macros call newly implemented ifq
functions. the ifq code is split so each discipline (priq and hfsc
in our case) is an opaque set of operations that the common ifq
code can call. the common code does the locking, accounting (ifq_len
manipulation), and freeing of the mbuf if the disciplines enqueue
function rejects it. theyre kind of like bufqs in the block layer
with their fifo and nscan disciplines.

the new api also supports atomic switching of disciplines at runtime.
the hfsc setup in pf_ioctl.c has been tweaked to build a complete
hfsc_if structure which it attaches to the send queue in a single
operation, rather than attaching to the interface up front and
building up a list of queues.

the send queue is now mutexed, which raises the expectation that
packets can be enqueued or purged on one cpu while another cpu is
dequeueing them in a driver for transmission. a lot of drivers use
IFQ_POLL to peek at an mbuf and attempt to fit it on the ring before
committing to it with a later IFQ_DEQUEUE operation. if the mbuf
gets freed in between the POLL and DEQUEUE operations, fireworks
will ensue.

to avoid this, the ifq api introduces ifq_deq_begin, ifq_deq_rollback,
and ifq_deq_commit. ifq_deq_begin allows a driver to take the ifq
mutex and get a reference to the mbuf they wish to try and tx. if
there's space, they can ifq_deq_commit it to remove the mbuf and
release the mutex. if there's no space, ifq_deq_rollback simply
releases the mutex. this api was developed to make updating the
drivers using IFQ_POLL easy, instead of having to do significant
semantic changes to avoid POLL that we cannot test on all the
hardware.

the common code has been tested pretty hard, and all the driver
modifications are straightforward except for de(4). if that breaks
it can be dealt with later.

ok mpi@ jmatthew@


# 1.115 25-Oct-2015 mpi

arp_ifinit() is no longer needed.


# 1.114 10-Sep-2015 deraadt

sizes for free(); ok sthen


# 1.113 04-Sep-2015 kettenis

The bnx_tx_pool gets used from interrupt context, so drop the explicit
backend allocoter here without passing PR_WAITOK to pool_init(9).

ok mikeb@


Revision tags: OPENBSD_5_8_BASE
# 1.112 24-Jul-2015 dlg

if we free the mbuf in the rx path, clear the pointer to it so we dont
try and queue it for the stack and cause a use after free.

found by maxime villard and brainy


# 1.111 24-Jun-2015 mpi

Increment if_ipackets in if_input().

Note that pseudo-drivers not using if_input() are not affected by this
conversion.

ok mikeb@, kettenis@, claudio@, dlg@


# 1.110 10-Mar-2015 mpi

Convert to if_input().

Tested and ok sthen@, ok dlg@


Revision tags: OPENBSD_5_7_BASE
# 1.109 27-Jan-2015 dlg

remove the second void * argument on tasks.

when workqs were introduced, we provided a second argument so you
could pass a thing and some context to work on it in. there were
very few things that took advantage of the second argument, so when
i introduced pools i suggested removing it. since tasks were meant
to replace workqs, it was requested that we keep the second argument
to make porting from workqs to tasks easier.

now that workqs are gone, i had a look at the use of the second
argument again and found only one good use of it (vdsp(4) on sparc64
if you're interested) and a tiny handful of questionable uses. the
vast majority of tasks only used a single argument. i have since
modified all tasks that used two args to only use one, so now we
can remove the second argument.

so this is a mechanical change. all tasks only passed NULL as their
second argument, so we can just remove it.

ok krw@


# 1.108 22-Dec-2014 tedu

unifdef INET


Revision tags: OPENBSD_5_6_BASE
# 1.107 18-Jul-2014 dlg

implement EFBIG handling for heavily fragmented packets on the tx path.

ok claudio@


# 1.106 12-Jul-2014 tedu

add a size argument to free. will be used soon, but for now default to 0.
after discussions with beck deraadt kettenis.


# 1.105 09-Jul-2014 dlg

dont try to be smart about avoiding the use of too many descriptors
when filling the rx ring. trust the hwm.

problem found by sthen@


# 1.104 08-Jul-2014 dlg

cut things that relied on mclgeti for rx ring accounting/restriction over
to using if_rxr.

cut the reporting systat did over to the rxr ioctl.

tested as much as i can on alpha, amd64, and sparc64.
mpi@ has run it on macppc.
ok mpi@


Revision tags: OPENBSD_5_5_BASE
# 1.103 30-Oct-2013 dlg

replace the workq bits to supply new tx pkt descriptors with a task.

tested locally on a dell poweredge 2950


# 1.102 23-Oct-2013 brad

Enable TX checksum offload.

ok naddy@ sthen@


Revision tags: OPENBSD_5_4_BASE
# 1.101 28-Mar-2013 brad

Let mii_attach() know where the PHY is located instead of scanning
for it since we know where it will be anyway and remove the code
from the MII bus read/write functions to force reading/writing
from the predetermined location. Copied from bge(4) and this is
what the upstream FreeBSD bce(4) driver has done once FreBSD
gained a mii_attach().

ok dlg@ sthen@


Revision tags: OPENBSD_5_3_BASE
# 1.100 13-Jan-2013 brad

Enable flow control support with 5708S/5709S adapters.

ok dlg@


# 1.99 10-Dec-2012 mikeb

Under some circumstances (currently only reproducible with IPsec)
bnx can be left w/o clusters on the receive ring and will stall.
To prevent that schedule a timeout if refill fails. Bug was
reported by jj@, fix tested by me, ok dlg


# 1.98 05-Dec-2012 deraadt

Remove excessive sys/cdefs.h inclusion
ok guenther millert kettenis


Revision tags: OPENBSD_5_2_BASE
# 1.97 05-Jul-2012 phessler

Add flow control to bnx(4)

Tested on 5706, 5708, 5709, 5716 chipsets.

From Brad

OK phessler@, sthen@, mikeb@,


# 1.96 14-May-2012 mikeb

fixup "couldn't establish interrupt" error printf; from brad, ok phessler


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE
# 1.95 22-Jun-2011 tedu

kill a few more casts that aren't helpful. ok krw miod


# 1.94 18-Apr-2011 dlg

ido not disable interrupts in the isr and then enable them again
when leaving. when you're handling an interrupt it is masked.
whacking the chip is work for no gain.

modify the interrupt handler so it only processes the rings once
rather than looping over them until it runs out of work to do

looping in the isr is bad for several reasons:

firstly, the chip does interrupt mitigation so you have a
decent/predictable amount of work to do in the isr. your first loop
will do that chunk of work (ie, it pulls off 50ish packets), and
then the successive looping aggressively pull one or two packets
off the rx ring. these extra loops work against the benefit that
interrupt mitigation provides.

bus space reads are slow. we should avoid doing them where possible
(but we should always do them when necessary).

doing the loop 5 times per isr works against the mclgeti semantics.
it knows a nic is busy and therefore needs more rx descriptors by
watching to see when the nic uses all of its descriptors between
interrupts. if we're aggressively pulling packets off by looping
in the isr then we're skewing this check.

ok deraadt@


# 1.93 13-Apr-2011 dlg

to quote from the gospel of bus_dma.9:

Synchronization operations are expressed from the perspective of the host
RAM, e.g., a device -> memory operation is a READ and a memory -> device
operation is a WRITE.

the status block that the isr reads is written to by the device.
the chip writes to memory, it is therefore a READ.

this also adds the preread sync when the map is set up and the postread
sync when the map is torn down for better symmetry. there are probably
more issues like this in the code, but this is a start.

discovered while discussing another diff with claudio@


# 1.92 05-Apr-2011 henning

mechanic rename M_{TCP|UDP}V4_CSUM_OUT -> M_{TCP|UDP}_CSUM_OUT
ok claudio krw


# 1.91 03-Apr-2011 jasper

use nitems(); no binary change for drivers that are compiled on amd64.

ok claudio@


Revision tags: OPENBSD_4_9_BASE
# 1.90 20-Sep-2010 deraadt

Stop doing shutdown hooks in network drivers where possible. We already
take all interfaces down, via their xxstop routines. Claudio and I have
verified that none of the shutdown hooks do much extra beyond what xxstop
was already doing; it is largely a pile of junk.
ok claudio, some early comments by sthen; also read by matthew, jsg


Revision tags: OPENBSD_4_8_BASE
# 1.89 03-Aug-2010 jsg

Correct use of logical and where binary and was intended.
Spotted by lint, but mirrors a similiar change in the
original FreeBSD code from over a year ago.

ok deraadt@


# 1.88 24-May-2010 sthen

Support fibre PHY on BCM5709S. From FreeBSD via Brad.
Tested by Brad on: BCM5706, BCM5708C
Tested by me on: BCM5716 (BCM5709 PHY)


# 1.87 19-May-2010 oga

BUS_DMA_ZERO instead of alloc, map, bzero.

ok krw@


Revision tags: OPENBSD_4_7_BASE
# 1.86 23-Nov-2009 claudio

bnx(4) is a bit special. The chip itself is capable of swapping endianess
so there is no need for htoleXX calls. The only thing needed is the correct
layout of the DMA-ed structures. Additionally it uses PAGE_SIZE but assumed
that it is always 4k. Fix the macros that failed to respect that so that it
works on 8k PAGE_SIZE systems. This makes bnx(4) work on sparc64.
Tested on amd64 by dlg@. OK dlg@, deraadt@


# 1.85 09-Nov-2009 dlg

Link state change interrupt was not generated due to a missing bit in
the MAC event register.

fix from atte dot peltomaki at iki dot fi
tested by me on 5708 and 5709


# 1.84 13-Aug-2009 jasper

- consistify cfdriver for the ethernet drivers (0 -> NULL)

ok dlg@


# 1.83 09-Aug-2009 deraadt

MCLGETI() will now allocate a mbuf header if it is not provided, thus
reducing the amount of splnet/splx dancing required.. especially in the
worst case (of m_cldrop)
ok dlg kettenis damien


# 1.82 06-Aug-2009 sthen

Add device id for BCM5716S, tidy whitespace. From Brad.


Revision tags: OPENBSD_4_6_BASE
# 1.81 03-Jul-2009 dlg

this is a rather large change to add support for the BCM5709.

the 5709s use a the b09 firmwares, which is different to the b06 used by
all the other chips supported by bnx. the majority of the diff comes from
special handling for some indirect reads and writes, and because it needs
more host memory to operate with.

ive tried to keep the cosmetic changes to a minimum.

"go for it" deraadt@


# 1.80 03-Jul-2009 dlg

newer bnx chips use a separate firmware to the "old" ones. this updates
the b06 firmware for the older chips, and adds the b09 firmware. there are
three variants of the rv2p code thats loaded onto the chips, so this has
been split out into separate firmware files as well.

the driver has been updated to handle the split firmwares, and to easily
allow loading of the different versions. this change only supports the
loading of the firmwares for the currently supported chips.

after this change you must build the new firmwares and install them as well
as your new kernel.

"go to it" deraadt@


# 1.79 20-Jun-2009 naddy

Rewrite the interface flag handling case code and update the receive
filter handling to take advantage of ac_multirangecnt and have correct
IFF_ALLMULTI handling. From Brad.


# 1.78 22-Apr-2009 dlg

dont need to zero the tx pkt pool structure before initting it now that
pool_init does its job properly.


# 1.77 22-Apr-2009 dlg

replace arrays of dmamaps and mbuf pointers used to manage packets
on the tx rings (one mbuf ptr/dmamap array entry was created for
every tx descriptor slot at attach time) with a dynamically grown
list of mbuf pointers and dmamaps.

bnx used to have 512 dmamaps/mbuf pointers for the tx ring, now my
system is running with 8 under moderate load.

the big bonus from this is that the dmamap handling is greatly
simplified.

reyk@ likes this a lot


# 1.76 20-Apr-2009 dlg

when transmitting packets, put the dmamap we used for the packet into the
last descriptor slot in the ring. the tx completion code expects the dmamap
to be there so it can unload it.

ok reyk@


# 1.75 20-Apr-2009 reyk

fix dma map unmapping and unloading in the tx cleanup path.

ok dlg@


# 1.74 14-Apr-2009 kettenis

Don't free an mbuf that's still on the TX queue. While there sanitize the
function signature of bnx_tx_encap() such that people don't get weird ideas
like this again.

ok dlg@


# 1.73 09-Apr-2009 dlg

white space fixes


# 1.72 30-Mar-2009 dlg

switch to MCLGETI.

this conversion is the easiest ive done so far. the mbuf allocation wrapper
in the driver already had code to handle a failing cluster allocator as
part of a test harness, now we test that code all the time with MCLGETI.

ok kettenis@
tested by phessler@


Revision tags: OPENBSD_4_5_BASE
# 1.71 28-Nov-2008 brad

Eliminate the redundant bits of code for MTU and multicast handling
from the individual drivers now that ether_ioctl() handles this.

Shrinks the i386 kernels by..
RAMDISK - 2176 bytes
RAMDISKB - 1504 bytes
RAMDISKC - 736 bytes

Tested by naddy@/okan@/sthen@/brad@/todd@/jmc@ and lots of users.
Build tested on almost all archs by todd@/brad@

ok naddy@


# 1.70 09-Nov-2008 naddy

Introduce bpf_mtap_ether(), which for the benefit of bpf listeners
creates the VLAN encapsulation from the tag stored in the mbuf
header. Idea from FreeBSD, input from claudio@ and canacar@.

Switch all hardware VLAN enabled drivers to the new function.

ok claudio@


# 1.69 19-Oct-2008 brad

Re-add support for RX VLAN tag stripping.


# 1.68 16-Oct-2008 naddy

Switch the existing TX VLAN hardware support over to having the
tag in the header. Convert TX tagging in the drivers.

Help and ok brad@


# 1.67 16-Oct-2008 naddy

Convert RX tag stripping to storing the tag in the mbuf header and
enable RX tag stripping for re(4).

ok brad@


# 1.66 02-Oct-2008 brad

First step towards cleaning up the Ethernet driver ioctl handling.
Move calling ether_ioctl() from the top of the ioctl function, which
at the moment does absolutely nothing, to the default switch case.
Thus allowing drivers to define their own ioctl handlers and then
falling back on ether_ioctl(). The only functional change this results
in at the moment is having all Ethernet drivers returning the proper
errno of ENOTTY instead of EINVAL/ENXIO when encountering unknown
ioctl's.

Shrinks the i386 kernels by..
RAMDISK - 1024 bytes
RAMDISKB - 1120 bytes
RAMDISKC - 832 bytes

Tested by martin@/jsing@/todd@/brad@
Build tested on almost all archs by todd@/brad@

ok jsing@


# 1.65 10-Sep-2008 blambert

Convert timeout_add() calls using multiples of hz to timeout_add_sec()

Really just the low-hanging fruit of (hopefully) forthcoming timeout
conversions.

ok art@, krw@


Revision tags: OPENBSD_4_4_BASE
# 1.64 24-Jun-2008 brad

Fixed a problem that would cause errors (especially when in low memory
systems) because the RX chain was corrupted when an mbuf was mapped to
an unexpected number of buffers.

From davidch@FreeBSD


# 1.63 13-Jun-2008 brad

fix compilation with BNX_DEBUG.


# 1.62 13-Jun-2008 brad

Remove slack space for RX/TX chains since it only covers sloppy coding.

From davidch @ FreeBSD


# 1.61 08-Jun-2008 reyk

don't declare foo_driver_version[] strings and turn them into defines,
nothing uses them and it saves a few bytes in the kernel.

ok claudio@


# 1.60 29-May-2008 brad

- Add a debug message to mention when a 2.5Gb adapter is found.
- Change invalid PHY address debug message in bnx_miibus_write_reg()
from warn level to verbose.
- Add two new softc fields and store the shared and port hw config
data.

From FreeBSD

ok dlg@


# 1.59 23-May-2008 brad

Simplify the combination use of pci_mapreg_type()/pci_mapreg_map() as
suggested by dlg@ awhile ago.

ok dlg@


Revision tags: OPENBSD_4_3_BASE
# 1.58 28-Feb-2008 brad

Add initial bits for fiber support with the BCM5706/BCM5708 chipsets.

Tested with copper adapters by brad@, johan@ and Jung <moorang at gmail dot com>

ok dlg@


# 1.57 22-Feb-2008 kettenis

Avoid unaligned PCI config space access.

ok brad@


# 1.56 17-Feb-2008 brad

Remove the check for non-production bnx(4) chipsets. These chipsets are
not officially "supported" and could have errata which the driver does
not workaround but they should more or less work.

Tested by marco@ with a BCM5708 B0 chipset.

ok marco@ dlg@


# 1.55 25-Nov-2007 dlg

IF_Gbps(2.5) is wrong.

ok claudio@


# 1.54 28-Aug-2007 deraadt

unify firmware load failure messages; ok mglocker


Revision tags: OPENBSD_4_2_BASE
# 1.53 04-Jul-2007 krw

Revert r1.42 of if_bnx.c, "Enable IPv4 transmit TCP/UDP checksum
offload", and associated man page change. To use IPv4 transmit TCP/UDP
checksum offloading you must again define BNX_CSUM.

As requested by mbalmer@ via deraadt@ on suggestion of reyk@ in
response to PR #5437.


# 1.52 22-May-2007 reyk

Add the BCM5709 PCI device Id. It is disabled for now since we do not
support SerDes-based (1000base-SX fibre) bnx(4) devices yet. The
reason is simple - we do not have any fibre bnx(4) to test and port
the SerDes changes from the other bnx drivers.

From brad found in the Linux driver


# 1.51 22-May-2007 jasper

adress -> address

from brad
ok claudio@


# 1.50 22-May-2007 ray

Use BNX_PRINTF instead of printf with missing argument.

OK reyk@, earlier version OK tedu@, dlg@, and miod@.


# 1.49 21-May-2007 reyk

fix bnx vlan tagging in the rx path; do not attach the vlan tag twice
if the firmware has been told to keep it and copy the tag in network
byte order in the other case.

ok mcbride@ dlg@


Revision tags: OPENBSD_4_1_BASE
# 1.48 05-Mar-2007 reyk

remove jumbo frame support by replacing MEXTALLOC with MCLGET, and
simplify the VLAN code.

this will close PR 5356 (system panics under high load).

From claudio@ who is currently not around to commit this fix

tested and ok by mcbride@, reyk@, todd@, Paul Hirsch, and brad


# 1.47 03-Mar-2007 reyk

instead of establishing the interrupt in the mounthook, move it back
to the attach function and set a flag in the mounthook to start
accepting interrupts (there are possible problems with establishing
interrupts after the ioapics are enabled in i386 GENERIC.MP).

also suggested by kettenis
tested by mcbride, me, and some others
ok dlg@


# 1.46 03-Mar-2007 todd

Replacing some spaces with tabs and some typo fixes
from brad@


# 1.45 02-Mar-2007 reyk

oops, this is $OpenBSD$


# 1.44 02-Mar-2007 reyk

- remove the code to bring down the PHY in bnx_stop(), it's wrong
(ifm_data isn't updated) and lead to a panic in mii_phy_setmedia(),
or reading past the end mii_media_table[].
- make sure the dma_map matches the mbuf in the rx structures. We would
sync/unload the wrong map, leading to a DIAGNOSTIC panic, or eventually
leaking memory when bounce buffers are needed.

From NetBSD

ok marco@, brad@


# 1.43 30-Jan-2007 krw

Allow the bnx(4) driver to make use of all of the available hardware
multicast hash slots. The bnx(4) hardware supports 8 slots instead of
4 like the bge(4) hardware.

From Mike Karels via FreeBSD

Tested by Brad, biorn@ and Johan M:son Lindman


# 1.42 27-Jan-2007 krw

Enable transmit TCP/UDP checksum offload.

From Brad, tested by Brad, biorn@ and Johan M:son Lindman.


# 1.41 21-Jan-2007 mcbride

Remove bogus check for old firmware.

Identical fixes from myself and brad@, also reported by chefren@pi.net.


# 1.40 20-Jan-2007 dlg

move the interrupt establishment till after everything in the softc is
set up and allocated (which happens in a mountroothook). this prevents an
early call to the interrupt handler from causing a null deref when trying
to look into the unallocated regions.

found by mcbride when ciss and bnx were sharing an interrupt. mounting
root caused interrupts before the bnx was properly set up.

"commit your fix" mcbride@


# 1.39 19-Jan-2007 mcbride

bnx_init() takes a pointer to sc, not ifp.


# 1.38 10-Jan-2007 deraadt

change firmware byte order to be same on all architectures
THIS MEANS YOU NEED TO UPDATE YOUR FIRMWARE FILE BEFORE BOOTING WITH
A NEW KERNEL
tested by marco, biorn


# 1.37 24-Dec-2006 reyk

use the right size when loading the rx/tx descriptor bus dma maps.

from the NetBSD port

tested by bion@ and others from tech@
ok marco@ brad@


# 1.36 26-Nov-2006 brad

commented out entry for the BCM5709.


# 1.35 20-Nov-2006 brad

only try to do HW checksum offload for TCP and UDP.


# 1.34 20-Nov-2006 brad

Due to an incorrect macro, it appears that the driver has always been
accidentally truncating off the VLAN tag field in the TX descriptor. Fix
this by splitting up the vlan_tag and flags fields into separate fields,
and handling them appropriately.

From scottl@FreeBSD


# 1.33 19-Nov-2006 brad

In bnx_start, check the used_tx_bd count rather than the descriptors
mbuf pointer to see if the transmit ring is full. The mbuf pointer
is set only in the last descriptor of a multi-descriptor packet.
By relying on the mbuf pointers of the earlier descriptors, the
driver would sometimes overwrite a descriptor belonging to a
packet that wasn't completed yet. Also, tx_chain_prod wasn't
updated inside the loop, causing the wrong descriptor to be checked
after the first iteration. The upshot of all this was the loss of
some transmitted packets at medium to high packet rates.

In bnx_tx_encap, remove a couple of old statements that shuffled
around the tx_mbuf_map pointers. These now correspond 1-to-1 with
the transmit descriptors, and they are not supposed to be changed.

Correct a couple of inaccurate comments.

From jdp@FreeBSD


# 1.32 26-Oct-2006 brad

do the minimal initialization of the firmware so that ASF always
works.

From ambrisko@FreeBSD


# 1.31 25-Oct-2006 brad

replace a few more instances of hand rolled code with the
LIST_FOREACH macro.


# 1.30 22-Oct-2006 brad

now with the right revision of this diff which compiles. ok pedro, mglocker.

- Ensure that at least 16 TX descriptors are kept unused in the ring.
- Use more complete error handling for TX load problems.

From scottl@FreeBSD


# 1.29 21-Oct-2006 deraadt

does not compile


# 1.28 21-Oct-2006 brad

- Ensure that at least 16 TX descriptors are kept unused in the ring.
- Use more complete error handling for TX load problems.

From scottl@FreeBSD


# 1.27 19-Oct-2006 brad

make the exit label naming scheme match the current function names, removes
a FreeBSD-ism from the original driver.


# 1.26 19-Oct-2006 brad

Overhaul the transmit path:
- Eliminate the bnx_dmamap_arg structure.
- Refactor the loop that fills the buffer descriptor so that it can be done
with a single set of logic in a single loop instead of two sets of logic.
- Eliminate the need to cache and pass descriptor indexes between the start
loop and the encap function.
- Change the start loop to always check the ifnet sendq for more work.

From scottl@FreeBSD


# 1.25 14-Oct-2006 brad

- Simplify the arguments to bnx_tx_encap.
- Don't copy the bd_chain head pointers into temporary objects, they are
available globally.

From scottl@FreeBSD


# 1.24 04-Oct-2006 deraadt

Use loadfirmware(9) to get /etc/firmware/bnx instead of hard-coding a
gigantic firmware into the kernel; checked by brad


Revision tags: OPENBSD_4_0_BASE
# 1.23 25-Aug-2006 brad

don't need to clear if_timer during attach.


# 1.22 21-Aug-2006 deraadt

ramdisks do not have vlan, drop mbuf; ok brad


# 1.21 21-Aug-2006 brad

simplfy code a bit and fix comments, this is the MRU being set not the
MTU.


# 1.20 21-Aug-2006 brad

enable Jumbo support.


# 1.19 20-Aug-2006 brad

remove a comment.


# 1.18 20-Aug-2006 brad

cosmetic tweaks.


# 1.17 20-Aug-2006 brad

- replace IF_DEQUEUE/IF_PREPEND with IFQ_POLL/IFQ_DEQUEUE.
- enable RX checksum offload.
- remove some unused code.


# 1.16 19-Aug-2006 brad

set the capabilities VLAN MTU flag.


# 1.15 14-Aug-2006 marco

And some more KNF.


# 1.14 14-Aug-2006 marco

KNF


# 1.13 14-Aug-2006 marco

More KNF; no functional change.


# 1.12 14-Aug-2006 marco

First in a series of KNF. No functional change.


# 1.11 14-Aug-2006 brad

disable debugging.


# 1.10 14-Aug-2006 marco

Change bus_dmamap_create to use the appropriate values. This fixes the
issues brad was seeing. Help from jason.

ok brad.


# 1.9 13-Aug-2006 marco

Get rid of _HI & _LO macros altogether since they used a wrong idiom.
This was pointed out by mickey The driver now uses the same idiom as mpi.

help from miod, mickey and kettenis

ok brad


# 1.8 13-Aug-2006 brad

fix a typo, BNX_DRBUG -> BNX_DEBUG


# 1.7 10-Aug-2006 brad

unmap memory address space in bnx_release_resources().


# 1.6 10-Aug-2006 brad

cosmetic tweaking.


# 1.5 10-Aug-2006 brad

remove typedef's.


# 1.4 09-Aug-2006 marco

Reorder dmamap & dmamem to match man page.
Redo detection of _LO & _HI macro; help from miod and jordan.
ok beck brad


# 1.3 26-Jun-2006 brad

do not allow a Jumbo size MTU yet.


# 1.2 26-Jun-2006 brad

relocate the firmware per Theo's request.


# 1.1 26-Jun-2006 brad

Add a rough initial port of the bce driver from FreeBSD, which provides
support for the new line of Broadcom NetXtreme II Gigabit PCI-X and PCIe
controllers, though renamed to bnx. This is work in progress, there
are some known issues. With help from Reyk with the bus_dma code.

Thanks to David Christensen at Broadcom for the driver and for providing
some PCI-X and PCIe adapters.

ok deraadt@


# 1.127 17-May-2020 jsg

fix typo in a comment

from Delyan Raychev


Revision tags: OPENBSD_6_7_BASE
# 1.126 06-Dec-2019 dlg

enable the full use of jumbos and remove IFCAP_VLAN_MTU.

The chip can do 9008 byte packets (not including the ethernet
header), but only 9004 if you want to enable IFCAP_VLAN_MTU. by not
enabling IFCAP_VLAN_MTU, we let other protocols (eg, mpls or svlan)
use the extra bytes if they want.

using the extra bytes for the hardmtu instead of for IFCAP_VLAN_MTU
works a bit better with how aggr(4) is set up at the moment because
aggr does not pass IFCAP_VLAN_MTU through from its ports, which
means vlan(4) on aggr(4) cannot see the flag and use the extra
bytes.

this was figured out by hrvoje popovski in a discussion with pedro
caetano on the "issues configuring vlan on top of aggr device" on
misc@.
hrvoje also tested the diff and made sure the full use of jumbos
works for things like ping packets with DF set.
jmatthew skimmed the diff and didnt see anything obviously wrong too


Revision tags: OPENBSD_6_3_BASE OPENBSD_6_4_BASE OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.125 10-Mar-2018 sthen

raise bnx(4)'s rxring lwm to 16, ok deraadt

(I've had this diff locally for a long time on port build machines to
avoid NFS stalls.)


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE
# 1.124 24-Jan-2017 dlg

add support for multiple transmit ifqueues per network interface.

an ifq to transmit a packet is picked by the current traffic
conditioner (ie, priq or hfsc) by providing an index into an array
of ifqs. by default interfaces get a single ifq but can ask for
more using if_attach_queues().

the vast majority of our drivers still think there's a 1:1 mapping
between interfaces and transmit queues, so their if_start routines
take an ifnet pointer instead of a pointer to the ifqueue struct.
instead of changing all the drivers in the tree, drivers can opt
into using an if_qstart routine and setting the IFXF_MPSAFE flag.
the stack provides a compatability wrapper from the new if_qstart
handler to the previous if_start handlers if IFXF_MPSAFE isnt set.

enabling hfsc on an interface configures it to transmit everything
through the first ifq. any other ifqs are left configured as priq,
but unused, when hfsc is enabled.

getting this in now so everyone can kick the tyres.

ok mpi@ visa@ (who provided some tweaks for cnmac).


# 1.123 22-Jan-2017 dlg

move counting if_opackets next to counting if_obytes in if_enqueue.

this means packets are consistently counted in one place, unlike the
many and various ways that drivers thought they should do it.

ok mpi@ deraadt@


Revision tags: OPENBSD_6_0_BASE
# 1.122 05-May-2016 jmatthew

r1.10 of if_bnx.c effectively removed the limit on the number of segments in
the tx dma maps, apparently to allow heavily fragmented packets to be sent.

The tx ring accounting in bnx_start assumed that the longest fragment chain
we'd see was BNX_MAX_SEGMENTS, so sending a heavily fragmented packet when the
ring was already full could cause it to overflow.

In the 10 years since r1.10, we've started defragmenting packets if they
won't fit in the dma map, so we can limit the maps to BNX_MAX_SEGMENTS again.
While we're here, ensure there's always at least one slot on the tx ring free,
for consistency between drivers.

Fixes packet corruption seen by otto@
ok mpi@ dlg@


# 1.121 13-Apr-2016 mpi

G/C IFQ_SET_READY().


Revision tags: OPENBSD_5_9_BASE
# 1.120 11-Dec-2015 mpi

branches: 1.120.2;
Replace mountroothook_establish(9) by config_mountroot(9) a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


# 1.119 10-Dec-2015 dlg

mark bnx_start as mpsafe.

tweak it to use ifq_restart so ifq_clr_oactive is serialised with start.

ok jmatthew@


# 1.118 05-Dec-2015 jmatthew

Make the bnx interrupt handler mpsafe, and perform rx and tx completion
outside the kernel lock.

Remove tx descriptor lists (essentially backing out if_bnx.c r1.77),
add an interrupt barrier in bnx_stop, check the rx ring state before receiving
packets, adjust the tx counter with atomic operations, and rework bnx_start
to check for ring space before dequeueing and drop the packet if bnx_encap
fails.

tested on BCM5708 by me and on BCM5709 by Hrvoje Popovski
ok dlg@


# 1.117 25-Nov-2015 dlg

replace IFF_OACTIVE manipulation with mpsafe operations.

there are two things shared between the network stack and drivers
in the send path: the send queue and the IFF_OACTIVE flag. the send
queue is now protected by a mutex. this diff makes the oactive
functionality mpsafe too.

IFF_OACTIVE is part of if_flags. there are two problems with that.
firstly, if_flags is a short and we dont have any MI atomic operations
to manipulate a short. secondly, while we could make the IFF_OACTIVE
operates mpsafe, all changes to other flags would have to be made
safe at the same time, otherwise a read-modify-write cycle on their
updates could clobber the oactive change.

instead, this moves the oactive mark into struct ifqueue and provides
an API for changing it. there's ifq_set_oactive, ifq_clr_oactive,
and ifq_is_oactive. these are modelled on ifsq_set_oactive,
ifsq_clr_oactive, and ifsq_is_oactive in dragonflybsd.

this diff includes changes to all the drivers manipulating IFF_OACTIVE
to now use the ifsq_{set,clr_is}_oactive API too.

ok kettenis@ mpi@ jmatthew@ deraadt@


# 1.116 20-Nov-2015 dlg

shuffle struct ifqueue so in flight mbufs are protected by a mutex.

the code is refactored so the IFQ macros call newly implemented ifq
functions. the ifq code is split so each discipline (priq and hfsc
in our case) is an opaque set of operations that the common ifq
code can call. the common code does the locking, accounting (ifq_len
manipulation), and freeing of the mbuf if the disciplines enqueue
function rejects it. theyre kind of like bufqs in the block layer
with their fifo and nscan disciplines.

the new api also supports atomic switching of disciplines at runtime.
the hfsc setup in pf_ioctl.c has been tweaked to build a complete
hfsc_if structure which it attaches to the send queue in a single
operation, rather than attaching to the interface up front and
building up a list of queues.

the send queue is now mutexed, which raises the expectation that
packets can be enqueued or purged on one cpu while another cpu is
dequeueing them in a driver for transmission. a lot of drivers use
IFQ_POLL to peek at an mbuf and attempt to fit it on the ring before
committing to it with a later IFQ_DEQUEUE operation. if the mbuf
gets freed in between the POLL and DEQUEUE operations, fireworks
will ensue.

to avoid this, the ifq api introduces ifq_deq_begin, ifq_deq_rollback,
and ifq_deq_commit. ifq_deq_begin allows a driver to take the ifq
mutex and get a reference to the mbuf they wish to try and tx. if
there's space, they can ifq_deq_commit it to remove the mbuf and
release the mutex. if there's no space, ifq_deq_rollback simply
releases the mutex. this api was developed to make updating the
drivers using IFQ_POLL easy, instead of having to do significant
semantic changes to avoid POLL that we cannot test on all the
hardware.

the common code has been tested pretty hard, and all the driver
modifications are straightforward except for de(4). if that breaks
it can be dealt with later.

ok mpi@ jmatthew@


# 1.115 25-Oct-2015 mpi

arp_ifinit() is no longer needed.


# 1.114 10-Sep-2015 deraadt

sizes for free(); ok sthen


# 1.113 04-Sep-2015 kettenis

The bnx_tx_pool gets used from interrupt context, so drop the explicit
backend allocoter here without passing PR_WAITOK to pool_init(9).

ok mikeb@


Revision tags: OPENBSD_5_8_BASE
# 1.112 24-Jul-2015 dlg

if we free the mbuf in the rx path, clear the pointer to it so we dont
try and queue it for the stack and cause a use after free.

found by maxime villard and brainy


# 1.111 24-Jun-2015 mpi

Increment if_ipackets in if_input().

Note that pseudo-drivers not using if_input() are not affected by this
conversion.

ok mikeb@, kettenis@, claudio@, dlg@


# 1.110 10-Mar-2015 mpi

Convert to if_input().

Tested and ok sthen@, ok dlg@


Revision tags: OPENBSD_5_7_BASE
# 1.109 27-Jan-2015 dlg

remove the second void * argument on tasks.

when workqs were introduced, we provided a second argument so you
could pass a thing and some context to work on it in. there were
very few things that took advantage of the second argument, so when
i introduced pools i suggested removing it. since tasks were meant
to replace workqs, it was requested that we keep the second argument
to make porting from workqs to tasks easier.

now that workqs are gone, i had a look at the use of the second
argument again and found only one good use of it (vdsp(4) on sparc64
if you're interested) and a tiny handful of questionable uses. the
vast majority of tasks only used a single argument. i have since
modified all tasks that used two args to only use one, so now we
can remove the second argument.

so this is a mechanical change. all tasks only passed NULL as their
second argument, so we can just remove it.

ok krw@


# 1.108 22-Dec-2014 tedu

unifdef INET


Revision tags: OPENBSD_5_6_BASE
# 1.107 18-Jul-2014 dlg

implement EFBIG handling for heavily fragmented packets on the tx path.

ok claudio@


# 1.106 12-Jul-2014 tedu

add a size argument to free. will be used soon, but for now default to 0.
after discussions with beck deraadt kettenis.


# 1.105 09-Jul-2014 dlg

dont try to be smart about avoiding the use of too many descriptors
when filling the rx ring. trust the hwm.

problem found by sthen@


# 1.104 08-Jul-2014 dlg

cut things that relied on mclgeti for rx ring accounting/restriction over
to using if_rxr.

cut the reporting systat did over to the rxr ioctl.

tested as much as i can on alpha, amd64, and sparc64.
mpi@ has run it on macppc.
ok mpi@


Revision tags: OPENBSD_5_5_BASE
# 1.103 30-Oct-2013 dlg

replace the workq bits to supply new tx pkt descriptors with a task.

tested locally on a dell poweredge 2950


# 1.102 23-Oct-2013 brad

Enable TX checksum offload.

ok naddy@ sthen@


Revision tags: OPENBSD_5_4_BASE
# 1.101 28-Mar-2013 brad

Let mii_attach() know where the PHY is located instead of scanning
for it since we know where it will be anyway and remove the code
from the MII bus read/write functions to force reading/writing
from the predetermined location. Copied from bge(4) and this is
what the upstream FreeBSD bce(4) driver has done once FreBSD
gained a mii_attach().

ok dlg@ sthen@


Revision tags: OPENBSD_5_3_BASE
# 1.100 13-Jan-2013 brad

Enable flow control support with 5708S/5709S adapters.

ok dlg@


# 1.99 10-Dec-2012 mikeb

Under some circumstances (currently only reproducible with IPsec)
bnx can be left w/o clusters on the receive ring and will stall.
To prevent that schedule a timeout if refill fails. Bug was
reported by jj@, fix tested by me, ok dlg


# 1.98 05-Dec-2012 deraadt

Remove excessive sys/cdefs.h inclusion
ok guenther millert kettenis


Revision tags: OPENBSD_5_2_BASE
# 1.97 05-Jul-2012 phessler

Add flow control to bnx(4)

Tested on 5706, 5708, 5709, 5716 chipsets.

From Brad

OK phessler@, sthen@, mikeb@,


# 1.96 14-May-2012 mikeb

fixup "couldn't establish interrupt" error printf; from brad, ok phessler


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE
# 1.95 22-Jun-2011 tedu

kill a few more casts that aren't helpful. ok krw miod


# 1.94 18-Apr-2011 dlg

ido not disable interrupts in the isr and then enable them again
when leaving. when you're handling an interrupt it is masked.
whacking the chip is work for no gain.

modify the interrupt handler so it only processes the rings once
rather than looping over them until it runs out of work to do

looping in the isr is bad for several reasons:

firstly, the chip does interrupt mitigation so you have a
decent/predictable amount of work to do in the isr. your first loop
will do that chunk of work (ie, it pulls off 50ish packets), and
then the successive looping aggressively pull one or two packets
off the rx ring. these extra loops work against the benefit that
interrupt mitigation provides.

bus space reads are slow. we should avoid doing them where possible
(but we should always do them when necessary).

doing the loop 5 times per isr works against the mclgeti semantics.
it knows a nic is busy and therefore needs more rx descriptors by
watching to see when the nic uses all of its descriptors between
interrupts. if we're aggressively pulling packets off by looping
in the isr then we're skewing this check.

ok deraadt@


# 1.93 13-Apr-2011 dlg

to quote from the gospel of bus_dma.9:

Synchronization operations are expressed from the perspective of the host
RAM, e.g., a device -> memory operation is a READ and a memory -> device
operation is a WRITE.

the status block that the isr reads is written to by the device.
the chip writes to memory, it is therefore a READ.

this also adds the preread sync when the map is set up and the postread
sync when the map is torn down for better symmetry. there are probably
more issues like this in the code, but this is a start.

discovered while discussing another diff with claudio@


# 1.92 05-Apr-2011 henning

mechanic rename M_{TCP|UDP}V4_CSUM_OUT -> M_{TCP|UDP}_CSUM_OUT
ok claudio krw


# 1.91 03-Apr-2011 jasper

use nitems(); no binary change for drivers that are compiled on amd64.

ok claudio@


Revision tags: OPENBSD_4_9_BASE
# 1.90 20-Sep-2010 deraadt

Stop doing shutdown hooks in network drivers where possible. We already
take all interfaces down, via their xxstop routines. Claudio and I have
verified that none of the shutdown hooks do much extra beyond what xxstop
was already doing; it is largely a pile of junk.
ok claudio, some early comments by sthen; also read by matthew, jsg


Revision tags: OPENBSD_4_8_BASE
# 1.89 03-Aug-2010 jsg

Correct use of logical and where binary and was intended.
Spotted by lint, but mirrors a similiar change in the
original FreeBSD code from over a year ago.

ok deraadt@


# 1.88 24-May-2010 sthen

Support fibre PHY on BCM5709S. From FreeBSD via Brad.
Tested by Brad on: BCM5706, BCM5708C
Tested by me on: BCM5716 (BCM5709 PHY)


# 1.87 19-May-2010 oga

BUS_DMA_ZERO instead of alloc, map, bzero.

ok krw@


Revision tags: OPENBSD_4_7_BASE
# 1.86 23-Nov-2009 claudio

bnx(4) is a bit special. The chip itself is capable of swapping endianess
so there is no need for htoleXX calls. The only thing needed is the correct
layout of the DMA-ed structures. Additionally it uses PAGE_SIZE but assumed
that it is always 4k. Fix the macros that failed to respect that so that it
works on 8k PAGE_SIZE systems. This makes bnx(4) work on sparc64.
Tested on amd64 by dlg@. OK dlg@, deraadt@


# 1.85 09-Nov-2009 dlg

Link state change interrupt was not generated due to a missing bit in
the MAC event register.

fix from atte dot peltomaki at iki dot fi
tested by me on 5708 and 5709


# 1.84 13-Aug-2009 jasper

- consistify cfdriver for the ethernet drivers (0 -> NULL)

ok dlg@


# 1.83 09-Aug-2009 deraadt

MCLGETI() will now allocate a mbuf header if it is not provided, thus
reducing the amount of splnet/splx dancing required.. especially in the
worst case (of m_cldrop)
ok dlg kettenis damien


# 1.82 06-Aug-2009 sthen

Add device id for BCM5716S, tidy whitespace. From Brad.


Revision tags: OPENBSD_4_6_BASE
# 1.81 03-Jul-2009 dlg

this is a rather large change to add support for the BCM5709.

the 5709s use a the b09 firmwares, which is different to the b06 used by
all the other chips supported by bnx. the majority of the diff comes from
special handling for some indirect reads and writes, and because it needs
more host memory to operate with.

ive tried to keep the cosmetic changes to a minimum.

"go for it" deraadt@


# 1.80 03-Jul-2009 dlg

newer bnx chips use a separate firmware to the "old" ones. this updates
the b06 firmware for the older chips, and adds the b09 firmware. there are
three variants of the rv2p code thats loaded onto the chips, so this has
been split out into separate firmware files as well.

the driver has been updated to handle the split firmwares, and to easily
allow loading of the different versions. this change only supports the
loading of the firmwares for the currently supported chips.

after this change you must build the new firmwares and install them as well
as your new kernel.

"go to it" deraadt@


# 1.79 20-Jun-2009 naddy

Rewrite the interface flag handling case code and update the receive
filter handling to take advantage of ac_multirangecnt and have correct
IFF_ALLMULTI handling. From Brad.


# 1.78 22-Apr-2009 dlg

dont need to zero the tx pkt pool structure before initting it now that
pool_init does its job properly.


# 1.77 22-Apr-2009 dlg

replace arrays of dmamaps and mbuf pointers used to manage packets
on the tx rings (one mbuf ptr/dmamap array entry was created for
every tx descriptor slot at attach time) with a dynamically grown
list of mbuf pointers and dmamaps.

bnx used to have 512 dmamaps/mbuf pointers for the tx ring, now my
system is running with 8 under moderate load.

the big bonus from this is that the dmamap handling is greatly
simplified.

reyk@ likes this a lot


# 1.76 20-Apr-2009 dlg

when transmitting packets, put the dmamap we used for the packet into the
last descriptor slot in the ring. the tx completion code expects the dmamap
to be there so it can unload it.

ok reyk@


# 1.75 20-Apr-2009 reyk

fix dma map unmapping and unloading in the tx cleanup path.

ok dlg@


# 1.74 14-Apr-2009 kettenis

Don't free an mbuf that's still on the TX queue. While there sanitize the
function signature of bnx_tx_encap() such that people don't get weird ideas
like this again.

ok dlg@


# 1.73 09-Apr-2009 dlg

white space fixes


# 1.72 30-Mar-2009 dlg

switch to MCLGETI.

this conversion is the easiest ive done so far. the mbuf allocation wrapper
in the driver already had code to handle a failing cluster allocator as
part of a test harness, now we test that code all the time with MCLGETI.

ok kettenis@
tested by phessler@


Revision tags: OPENBSD_4_5_BASE
# 1.71 28-Nov-2008 brad

Eliminate the redundant bits of code for MTU and multicast handling
from the individual drivers now that ether_ioctl() handles this.

Shrinks the i386 kernels by..
RAMDISK - 2176 bytes
RAMDISKB - 1504 bytes
RAMDISKC - 736 bytes

Tested by naddy@/okan@/sthen@/brad@/todd@/jmc@ and lots of users.
Build tested on almost all archs by todd@/brad@

ok naddy@


# 1.70 09-Nov-2008 naddy

Introduce bpf_mtap_ether(), which for the benefit of bpf listeners
creates the VLAN encapsulation from the tag stored in the mbuf
header. Idea from FreeBSD, input from claudio@ and canacar@.

Switch all hardware VLAN enabled drivers to the new function.

ok claudio@


# 1.69 19-Oct-2008 brad

Re-add support for RX VLAN tag stripping.


# 1.68 16-Oct-2008 naddy

Switch the existing TX VLAN hardware support over to having the
tag in the header. Convert TX tagging in the drivers.

Help and ok brad@


# 1.67 16-Oct-2008 naddy

Convert RX tag stripping to storing the tag in the mbuf header and
enable RX tag stripping for re(4).

ok brad@


# 1.66 02-Oct-2008 brad

First step towards cleaning up the Ethernet driver ioctl handling.
Move calling ether_ioctl() from the top of the ioctl function, which
at the moment does absolutely nothing, to the default switch case.
Thus allowing drivers to define their own ioctl handlers and then
falling back on ether_ioctl(). The only functional change this results
in at the moment is having all Ethernet drivers returning the proper
errno of ENOTTY instead of EINVAL/ENXIO when encountering unknown
ioctl's.

Shrinks the i386 kernels by..
RAMDISK - 1024 bytes
RAMDISKB - 1120 bytes
RAMDISKC - 832 bytes

Tested by martin@/jsing@/todd@/brad@
Build tested on almost all archs by todd@/brad@

ok jsing@


# 1.65 10-Sep-2008 blambert

Convert timeout_add() calls using multiples of hz to timeout_add_sec()

Really just the low-hanging fruit of (hopefully) forthcoming timeout
conversions.

ok art@, krw@


Revision tags: OPENBSD_4_4_BASE
# 1.64 24-Jun-2008 brad

Fixed a problem that would cause errors (especially when in low memory
systems) because the RX chain was corrupted when an mbuf was mapped to
an unexpected number of buffers.

From davidch@FreeBSD


# 1.63 13-Jun-2008 brad

fix compilation with BNX_DEBUG.


# 1.62 13-Jun-2008 brad

Remove slack space for RX/TX chains since it only covers sloppy coding.

From davidch @ FreeBSD


# 1.61 08-Jun-2008 reyk

don't declare foo_driver_version[] strings and turn them into defines,
nothing uses them and it saves a few bytes in the kernel.

ok claudio@


# 1.60 29-May-2008 brad

- Add a debug message to mention when a 2.5Gb adapter is found.
- Change invalid PHY address debug message in bnx_miibus_write_reg()
from warn level to verbose.
- Add two new softc fields and store the shared and port hw config
data.

From FreeBSD

ok dlg@


# 1.59 23-May-2008 brad

Simplify the combination use of pci_mapreg_type()/pci_mapreg_map() as
suggested by dlg@ awhile ago.

ok dlg@


Revision tags: OPENBSD_4_3_BASE
# 1.58 28-Feb-2008 brad

Add initial bits for fiber support with the BCM5706/BCM5708 chipsets.

Tested with copper adapters by brad@, johan@ and Jung <moorang at gmail dot com>

ok dlg@


# 1.57 22-Feb-2008 kettenis

Avoid unaligned PCI config space access.

ok brad@


# 1.56 17-Feb-2008 brad

Remove the check for non-production bnx(4) chipsets. These chipsets are
not officially "supported" and could have errata which the driver does
not workaround but they should more or less work.

Tested by marco@ with a BCM5708 B0 chipset.

ok marco@ dlg@


# 1.55 25-Nov-2007 dlg

IF_Gbps(2.5) is wrong.

ok claudio@


# 1.54 28-Aug-2007 deraadt

unify firmware load failure messages; ok mglocker


Revision tags: OPENBSD_4_2_BASE
# 1.53 04-Jul-2007 krw

Revert r1.42 of if_bnx.c, "Enable IPv4 transmit TCP/UDP checksum
offload", and associated man page change. To use IPv4 transmit TCP/UDP
checksum offloading you must again define BNX_CSUM.

As requested by mbalmer@ via deraadt@ on suggestion of reyk@ in
response to PR #5437.


# 1.52 22-May-2007 reyk

Add the BCM5709 PCI device Id. It is disabled for now since we do not
support SerDes-based (1000base-SX fibre) bnx(4) devices yet. The
reason is simple - we do not have any fibre bnx(4) to test and port
the SerDes changes from the other bnx drivers.

From brad found in the Linux driver


# 1.51 22-May-2007 jasper

adress -> address

from brad
ok claudio@


# 1.50 22-May-2007 ray

Use BNX_PRINTF instead of printf with missing argument.

OK reyk@, earlier version OK tedu@, dlg@, and miod@.


# 1.49 21-May-2007 reyk

fix bnx vlan tagging in the rx path; do not attach the vlan tag twice
if the firmware has been told to keep it and copy the tag in network
byte order in the other case.

ok mcbride@ dlg@


Revision tags: OPENBSD_4_1_BASE
# 1.48 05-Mar-2007 reyk

remove jumbo frame support by replacing MEXTALLOC with MCLGET, and
simplify the VLAN code.

this will close PR 5356 (system panics under high load).

From claudio@ who is currently not around to commit this fix

tested and ok by mcbride@, reyk@, todd@, Paul Hirsch, and brad


# 1.47 03-Mar-2007 reyk

instead of establishing the interrupt in the mounthook, move it back
to the attach function and set a flag in the mounthook to start
accepting interrupts (there are possible problems with establishing
interrupts after the ioapics are enabled in i386 GENERIC.MP).

also suggested by kettenis
tested by mcbride, me, and some others
ok dlg@


# 1.46 03-Mar-2007 todd

Replacing some spaces with tabs and some typo fixes
from brad@


# 1.45 02-Mar-2007 reyk

oops, this is $OpenBSD$


# 1.44 02-Mar-2007 reyk

- remove the code to bring down the PHY in bnx_stop(), it's wrong
(ifm_data isn't updated) and lead to a panic in mii_phy_setmedia(),
or reading past the end mii_media_table[].
- make sure the dma_map matches the mbuf in the rx structures. We would
sync/unload the wrong map, leading to a DIAGNOSTIC panic, or eventually
leaking memory when bounce buffers are needed.

From NetBSD

ok marco@, brad@


# 1.43 30-Jan-2007 krw

Allow the bnx(4) driver to make use of all of the available hardware
multicast hash slots. The bnx(4) hardware supports 8 slots instead of
4 like the bge(4) hardware.

From Mike Karels via FreeBSD

Tested by Brad, biorn@ and Johan M:son Lindman


# 1.42 27-Jan-2007 krw

Enable transmit TCP/UDP checksum offload.

From Brad, tested by Brad, biorn@ and Johan M:son Lindman.


# 1.41 21-Jan-2007 mcbride

Remove bogus check for old firmware.

Identical fixes from myself and brad@, also reported by chefren@pi.net.


# 1.40 20-Jan-2007 dlg

move the interrupt establishment till after everything in the softc is
set up and allocated (which happens in a mountroothook). this prevents an
early call to the interrupt handler from causing a null deref when trying
to look into the unallocated regions.

found by mcbride when ciss and bnx were sharing an interrupt. mounting
root caused interrupts before the bnx was properly set up.

"commit your fix" mcbride@


# 1.39 19-Jan-2007 mcbride

bnx_init() takes a pointer to sc, not ifp.


# 1.38 10-Jan-2007 deraadt

change firmware byte order to be same on all architectures
THIS MEANS YOU NEED TO UPDATE YOUR FIRMWARE FILE BEFORE BOOTING WITH
A NEW KERNEL
tested by marco, biorn


# 1.37 24-Dec-2006 reyk

use the right size when loading the rx/tx descriptor bus dma maps.

from the NetBSD port

tested by bion@ and others from tech@
ok marco@ brad@


# 1.36 26-Nov-2006 brad

commented out entry for the BCM5709.


# 1.35 20-Nov-2006 brad

only try to do HW checksum offload for TCP and UDP.


# 1.34 20-Nov-2006 brad

Due to an incorrect macro, it appears that the driver has always been
accidentally truncating off the VLAN tag field in the TX descriptor. Fix
this by splitting up the vlan_tag and flags fields into separate fields,
and handling them appropriately.

From scottl@FreeBSD


# 1.33 19-Nov-2006 brad

In bnx_start, check the used_tx_bd count rather than the descriptors
mbuf pointer to see if the transmit ring is full. The mbuf pointer
is set only in the last descriptor of a multi-descriptor packet.
By relying on the mbuf pointers of the earlier descriptors, the
driver would sometimes overwrite a descriptor belonging to a
packet that wasn't completed yet. Also, tx_chain_prod wasn't
updated inside the loop, causing the wrong descriptor to be checked
after the first iteration. The upshot of all this was the loss of
some transmitted packets at medium to high packet rates.

In bnx_tx_encap, remove a couple of old statements that shuffled
around the tx_mbuf_map pointers. These now correspond 1-to-1 with
the transmit descriptors, and they are not supposed to be changed.

Correct a couple of inaccurate comments.

From jdp@FreeBSD


# 1.32 26-Oct-2006 brad

do the minimal initialization of the firmware so that ASF always
works.

From ambrisko@FreeBSD


# 1.31 25-Oct-2006 brad

replace a few more instances of hand rolled code with the
LIST_FOREACH macro.


# 1.30 22-Oct-2006 brad

now with the right revision of this diff which compiles. ok pedro, mglocker.

- Ensure that at least 16 TX descriptors are kept unused in the ring.
- Use more complete error handling for TX load problems.

From scottl@FreeBSD


# 1.29 21-Oct-2006 deraadt

does not compile


# 1.28 21-Oct-2006 brad

- Ensure that at least 16 TX descriptors are kept unused in the ring.
- Use more complete error handling for TX load problems.

From scottl@FreeBSD


# 1.27 19-Oct-2006 brad

make the exit label naming scheme match the current function names, removes
a FreeBSD-ism from the original driver.


# 1.26 19-Oct-2006 brad

Overhaul the transmit path:
- Eliminate the bnx_dmamap_arg structure.
- Refactor the loop that fills the buffer descriptor so that it can be done
with a single set of logic in a single loop instead of two sets of logic.
- Eliminate the need to cache and pass descriptor indexes between the start
loop and the encap function.
- Change the start loop to always check the ifnet sendq for more work.

From scottl@FreeBSD


# 1.25 14-Oct-2006 brad

- Simplify the arguments to bnx_tx_encap.
- Don't copy the bd_chain head pointers into temporary objects, they are
available globally.

From scottl@FreeBSD


# 1.24 04-Oct-2006 deraadt

Use loadfirmware(9) to get /etc/firmware/bnx instead of hard-coding a
gigantic firmware into the kernel; checked by brad


Revision tags: OPENBSD_4_0_BASE
# 1.23 25-Aug-2006 brad

don't need to clear if_timer during attach.


# 1.22 21-Aug-2006 deraadt

ramdisks do not have vlan, drop mbuf; ok brad


# 1.21 21-Aug-2006 brad

simplfy code a bit and fix comments, this is the MRU being set not the
MTU.


# 1.20 21-Aug-2006 brad

enable Jumbo support.


# 1.19 20-Aug-2006 brad

remove a comment.


# 1.18 20-Aug-2006 brad

cosmetic tweaks.


# 1.17 20-Aug-2006 brad

- replace IF_DEQUEUE/IF_PREPEND with IFQ_POLL/IFQ_DEQUEUE.
- enable RX checksum offload.
- remove some unused code.


# 1.16 19-Aug-2006 brad

set the capabilities VLAN MTU flag.


# 1.15 14-Aug-2006 marco

And some more KNF.


# 1.14 14-Aug-2006 marco

KNF


# 1.13 14-Aug-2006 marco

More KNF; no functional change.


# 1.12 14-Aug-2006 marco

First in a series of KNF. No functional change.


# 1.11 14-Aug-2006 brad

disable debugging.


# 1.10 14-Aug-2006 marco

Change bus_dmamap_create to use the appropriate values. This fixes the
issues brad was seeing. Help from jason.

ok brad.


# 1.9 13-Aug-2006 marco

Get rid of _HI & _LO macros altogether since they used a wrong idiom.
This was pointed out by mickey The driver now uses the same idiom as mpi.

help from miod, mickey and kettenis

ok brad


# 1.8 13-Aug-2006 brad

fix a typo, BNX_DRBUG -> BNX_DEBUG


# 1.7 10-Aug-2006 brad

unmap memory address space in bnx_release_resources().


# 1.6 10-Aug-2006 brad

cosmetic tweaking.


# 1.5 10-Aug-2006 brad

remove typedef's.


# 1.4 09-Aug-2006 marco

Reorder dmamap & dmamem to match man page.
Redo detection of _LO & _HI macro; help from miod and jordan.
ok beck brad


# 1.3 26-Jun-2006 brad

do not allow a Jumbo size MTU yet.


# 1.2 26-Jun-2006 brad

relocate the firmware per Theo's request.


# 1.1 26-Jun-2006 brad

Add a rough initial port of the bce driver from FreeBSD, which provides
support for the new line of Broadcom NetXtreme II Gigabit PCI-X and PCIe
controllers, though renamed to bnx. This is work in progress, there
are some known issues. With help from Reyk with the bus_dma code.

Thanks to David Christensen at Broadcom for the driver and for providing
some PCI-X and PCIe adapters.

ok deraadt@


# 1.126 06-Dec-2019 dlg

enable the full use of jumbos and remove IFCAP_VLAN_MTU.

The chip can do 9008 byte packets (not including the ethernet
header), but only 9004 if you want to enable IFCAP_VLAN_MTU. by not
enabling IFCAP_VLAN_MTU, we let other protocols (eg, mpls or svlan)
use the extra bytes if they want.

using the extra bytes for the hardmtu instead of for IFCAP_VLAN_MTU
works a bit better with how aggr(4) is set up at the moment because
aggr does not pass IFCAP_VLAN_MTU through from its ports, which
means vlan(4) on aggr(4) cannot see the flag and use the extra
bytes.

this was figured out by hrvoje popovski in a discussion with pedro
caetano on the "issues configuring vlan on top of aggr device" on
misc@.
hrvoje also tested the diff and made sure the full use of jumbos
works for things like ping packets with DF set.
jmatthew skimmed the diff and didnt see anything obviously wrong too


Revision tags: OPENBSD_6_3_BASE OPENBSD_6_4_BASE OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.125 10-Mar-2018 sthen

raise bnx(4)'s rxring lwm to 16, ok deraadt

(I've had this diff locally for a long time on port build machines to
avoid NFS stalls.)


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE
# 1.124 24-Jan-2017 dlg

add support for multiple transmit ifqueues per network interface.

an ifq to transmit a packet is picked by the current traffic
conditioner (ie, priq or hfsc) by providing an index into an array
of ifqs. by default interfaces get a single ifq but can ask for
more using if_attach_queues().

the vast majority of our drivers still think there's a 1:1 mapping
between interfaces and transmit queues, so their if_start routines
take an ifnet pointer instead of a pointer to the ifqueue struct.
instead of changing all the drivers in the tree, drivers can opt
into using an if_qstart routine and setting the IFXF_MPSAFE flag.
the stack provides a compatability wrapper from the new if_qstart
handler to the previous if_start handlers if IFXF_MPSAFE isnt set.

enabling hfsc on an interface configures it to transmit everything
through the first ifq. any other ifqs are left configured as priq,
but unused, when hfsc is enabled.

getting this in now so everyone can kick the tyres.

ok mpi@ visa@ (who provided some tweaks for cnmac).


# 1.123 22-Jan-2017 dlg

move counting if_opackets next to counting if_obytes in if_enqueue.

this means packets are consistently counted in one place, unlike the
many and various ways that drivers thought they should do it.

ok mpi@ deraadt@


Revision tags: OPENBSD_6_0_BASE
# 1.122 05-May-2016 jmatthew

r1.10 of if_bnx.c effectively removed the limit on the number of segments in
the tx dma maps, apparently to allow heavily fragmented packets to be sent.

The tx ring accounting in bnx_start assumed that the longest fragment chain
we'd see was BNX_MAX_SEGMENTS, so sending a heavily fragmented packet when the
ring was already full could cause it to overflow.

In the 10 years since r1.10, we've started defragmenting packets if they
won't fit in the dma map, so we can limit the maps to BNX_MAX_SEGMENTS again.
While we're here, ensure there's always at least one slot on the tx ring free,
for consistency between drivers.

Fixes packet corruption seen by otto@
ok mpi@ dlg@


# 1.121 13-Apr-2016 mpi

G/C IFQ_SET_READY().


Revision tags: OPENBSD_5_9_BASE
# 1.120 11-Dec-2015 mpi

branches: 1.120.2;
Replace mountroothook_establish(9) by config_mountroot(9) a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


# 1.119 10-Dec-2015 dlg

mark bnx_start as mpsafe.

tweak it to use ifq_restart so ifq_clr_oactive is serialised with start.

ok jmatthew@


# 1.118 05-Dec-2015 jmatthew

Make the bnx interrupt handler mpsafe, and perform rx and tx completion
outside the kernel lock.

Remove tx descriptor lists (essentially backing out if_bnx.c r1.77),
add an interrupt barrier in bnx_stop, check the rx ring state before receiving
packets, adjust the tx counter with atomic operations, and rework bnx_start
to check for ring space before dequeueing and drop the packet if bnx_encap
fails.

tested on BCM5708 by me and on BCM5709 by Hrvoje Popovski
ok dlg@


# 1.117 25-Nov-2015 dlg

replace IFF_OACTIVE manipulation with mpsafe operations.

there are two things shared between the network stack and drivers
in the send path: the send queue and the IFF_OACTIVE flag. the send
queue is now protected by a mutex. this diff makes the oactive
functionality mpsafe too.

IFF_OACTIVE is part of if_flags. there are two problems with that.
firstly, if_flags is a short and we dont have any MI atomic operations
to manipulate a short. secondly, while we could make the IFF_OACTIVE
operates mpsafe, all changes to other flags would have to be made
safe at the same time, otherwise a read-modify-write cycle on their
updates could clobber the oactive change.

instead, this moves the oactive mark into struct ifqueue and provides
an API for changing it. there's ifq_set_oactive, ifq_clr_oactive,
and ifq_is_oactive. these are modelled on ifsq_set_oactive,
ifsq_clr_oactive, and ifsq_is_oactive in dragonflybsd.

this diff includes changes to all the drivers manipulating IFF_OACTIVE
to now use the ifsq_{set,clr_is}_oactive API too.

ok kettenis@ mpi@ jmatthew@ deraadt@


# 1.116 20-Nov-2015 dlg

shuffle struct ifqueue so in flight mbufs are protected by a mutex.

the code is refactored so the IFQ macros call newly implemented ifq
functions. the ifq code is split so each discipline (priq and hfsc
in our case) is an opaque set of operations that the common ifq
code can call. the common code does the locking, accounting (ifq_len
manipulation), and freeing of the mbuf if the disciplines enqueue
function rejects it. theyre kind of like bufqs in the block layer
with their fifo and nscan disciplines.

the new api also supports atomic switching of disciplines at runtime.
the hfsc setup in pf_ioctl.c has been tweaked to build a complete
hfsc_if structure which it attaches to the send queue in a single
operation, rather than attaching to the interface up front and
building up a list of queues.

the send queue is now mutexed, which raises the expectation that
packets can be enqueued or purged on one cpu while another cpu is
dequeueing them in a driver for transmission. a lot of drivers use
IFQ_POLL to peek at an mbuf and attempt to fit it on the ring before
committing to it with a later IFQ_DEQUEUE operation. if the mbuf
gets freed in between the POLL and DEQUEUE operations, fireworks
will ensue.

to avoid this, the ifq api introduces ifq_deq_begin, ifq_deq_rollback,
and ifq_deq_commit. ifq_deq_begin allows a driver to take the ifq
mutex and get a reference to the mbuf they wish to try and tx. if
there's space, they can ifq_deq_commit it to remove the mbuf and
release the mutex. if there's no space, ifq_deq_rollback simply
releases the mutex. this api was developed to make updating the
drivers using IFQ_POLL easy, instead of having to do significant
semantic changes to avoid POLL that we cannot test on all the
hardware.

the common code has been tested pretty hard, and all the driver
modifications are straightforward except for de(4). if that breaks
it can be dealt with later.

ok mpi@ jmatthew@


# 1.115 25-Oct-2015 mpi

arp_ifinit() is no longer needed.


# 1.114 10-Sep-2015 deraadt

sizes for free(); ok sthen


# 1.113 04-Sep-2015 kettenis

The bnx_tx_pool gets used from interrupt context, so drop the explicit
backend allocoter here without passing PR_WAITOK to pool_init(9).

ok mikeb@


Revision tags: OPENBSD_5_8_BASE
# 1.112 24-Jul-2015 dlg

if we free the mbuf in the rx path, clear the pointer to it so we dont
try and queue it for the stack and cause a use after free.

found by maxime villard and brainy


# 1.111 24-Jun-2015 mpi

Increment if_ipackets in if_input().

Note that pseudo-drivers not using if_input() are not affected by this
conversion.

ok mikeb@, kettenis@, claudio@, dlg@


# 1.110 10-Mar-2015 mpi

Convert to if_input().

Tested and ok sthen@, ok dlg@


Revision tags: OPENBSD_5_7_BASE
# 1.109 27-Jan-2015 dlg

remove the second void * argument on tasks.

when workqs were introduced, we provided a second argument so you
could pass a thing and some context to work on it in. there were
very few things that took advantage of the second argument, so when
i introduced pools i suggested removing it. since tasks were meant
to replace workqs, it was requested that we keep the second argument
to make porting from workqs to tasks easier.

now that workqs are gone, i had a look at the use of the second
argument again and found only one good use of it (vdsp(4) on sparc64
if you're interested) and a tiny handful of questionable uses. the
vast majority of tasks only used a single argument. i have since
modified all tasks that used two args to only use one, so now we
can remove the second argument.

so this is a mechanical change. all tasks only passed NULL as their
second argument, so we can just remove it.

ok krw@


# 1.108 22-Dec-2014 tedu

unifdef INET


Revision tags: OPENBSD_5_6_BASE
# 1.107 18-Jul-2014 dlg

implement EFBIG handling for heavily fragmented packets on the tx path.

ok claudio@


# 1.106 12-Jul-2014 tedu

add a size argument to free. will be used soon, but for now default to 0.
after discussions with beck deraadt kettenis.


# 1.105 09-Jul-2014 dlg

dont try to be smart about avoiding the use of too many descriptors
when filling the rx ring. trust the hwm.

problem found by sthen@


# 1.104 08-Jul-2014 dlg

cut things that relied on mclgeti for rx ring accounting/restriction over
to using if_rxr.

cut the reporting systat did over to the rxr ioctl.

tested as much as i can on alpha, amd64, and sparc64.
mpi@ has run it on macppc.
ok mpi@


Revision tags: OPENBSD_5_5_BASE
# 1.103 30-Oct-2013 dlg

replace the workq bits to supply new tx pkt descriptors with a task.

tested locally on a dell poweredge 2950


# 1.102 23-Oct-2013 brad

Enable TX checksum offload.

ok naddy@ sthen@


Revision tags: OPENBSD_5_4_BASE
# 1.101 28-Mar-2013 brad

Let mii_attach() know where the PHY is located instead of scanning
for it since we know where it will be anyway and remove the code
from the MII bus read/write functions to force reading/writing
from the predetermined location. Copied from bge(4) and this is
what the upstream FreeBSD bce(4) driver has done once FreBSD
gained a mii_attach().

ok dlg@ sthen@


Revision tags: OPENBSD_5_3_BASE
# 1.100 13-Jan-2013 brad

Enable flow control support with 5708S/5709S adapters.

ok dlg@


# 1.99 10-Dec-2012 mikeb

Under some circumstances (currently only reproducible with IPsec)
bnx can be left w/o clusters on the receive ring and will stall.
To prevent that schedule a timeout if refill fails. Bug was
reported by jj@, fix tested by me, ok dlg


# 1.98 05-Dec-2012 deraadt

Remove excessive sys/cdefs.h inclusion
ok guenther millert kettenis


Revision tags: OPENBSD_5_2_BASE
# 1.97 05-Jul-2012 phessler

Add flow control to bnx(4)

Tested on 5706, 5708, 5709, 5716 chipsets.

From Brad

OK phessler@, sthen@, mikeb@,


# 1.96 14-May-2012 mikeb

fixup "couldn't establish interrupt" error printf; from brad, ok phessler


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE
# 1.95 22-Jun-2011 tedu

kill a few more casts that aren't helpful. ok krw miod


# 1.94 18-Apr-2011 dlg

ido not disable interrupts in the isr and then enable them again
when leaving. when you're handling an interrupt it is masked.
whacking the chip is work for no gain.

modify the interrupt handler so it only processes the rings once
rather than looping over them until it runs out of work to do

looping in the isr is bad for several reasons:

firstly, the chip does interrupt mitigation so you have a
decent/predictable amount of work to do in the isr. your first loop
will do that chunk of work (ie, it pulls off 50ish packets), and
then the successive looping aggressively pull one or two packets
off the rx ring. these extra loops work against the benefit that
interrupt mitigation provides.

bus space reads are slow. we should avoid doing them where possible
(but we should always do them when necessary).

doing the loop 5 times per isr works against the mclgeti semantics.
it knows a nic is busy and therefore needs more rx descriptors by
watching to see when the nic uses all of its descriptors between
interrupts. if we're aggressively pulling packets off by looping
in the isr then we're skewing this check.

ok deraadt@


# 1.93 13-Apr-2011 dlg

to quote from the gospel of bus_dma.9:

Synchronization operations are expressed from the perspective of the host
RAM, e.g., a device -> memory operation is a READ and a memory -> device
operation is a WRITE.

the status block that the isr reads is written to by the device.
the chip writes to memory, it is therefore a READ.

this also adds the preread sync when the map is set up and the postread
sync when the map is torn down for better symmetry. there are probably
more issues like this in the code, but this is a start.

discovered while discussing another diff with claudio@


# 1.92 05-Apr-2011 henning

mechanic rename M_{TCP|UDP}V4_CSUM_OUT -> M_{TCP|UDP}_CSUM_OUT
ok claudio krw


# 1.91 03-Apr-2011 jasper

use nitems(); no binary change for drivers that are compiled on amd64.

ok claudio@


Revision tags: OPENBSD_4_9_BASE
# 1.90 20-Sep-2010 deraadt

Stop doing shutdown hooks in network drivers where possible. We already
take all interfaces down, via their xxstop routines. Claudio and I have
verified that none of the shutdown hooks do much extra beyond what xxstop
was already doing; it is largely a pile of junk.
ok claudio, some early comments by sthen; also read by matthew, jsg


Revision tags: OPENBSD_4_8_BASE
# 1.89 03-Aug-2010 jsg

Correct use of logical and where binary and was intended.
Spotted by lint, but mirrors a similiar change in the
original FreeBSD code from over a year ago.

ok deraadt@


# 1.88 24-May-2010 sthen

Support fibre PHY on BCM5709S. From FreeBSD via Brad.
Tested by Brad on: BCM5706, BCM5708C
Tested by me on: BCM5716 (BCM5709 PHY)


# 1.87 19-May-2010 oga

BUS_DMA_ZERO instead of alloc, map, bzero.

ok krw@


Revision tags: OPENBSD_4_7_BASE
# 1.86 23-Nov-2009 claudio

bnx(4) is a bit special. The chip itself is capable of swapping endianess
so there is no need for htoleXX calls. The only thing needed is the correct
layout of the DMA-ed structures. Additionally it uses PAGE_SIZE but assumed
that it is always 4k. Fix the macros that failed to respect that so that it
works on 8k PAGE_SIZE systems. This makes bnx(4) work on sparc64.
Tested on amd64 by dlg@. OK dlg@, deraadt@


# 1.85 09-Nov-2009 dlg

Link state change interrupt was not generated due to a missing bit in
the MAC event register.

fix from atte dot peltomaki at iki dot fi
tested by me on 5708 and 5709


# 1.84 13-Aug-2009 jasper

- consistify cfdriver for the ethernet drivers (0 -> NULL)

ok dlg@


# 1.83 09-Aug-2009 deraadt

MCLGETI() will now allocate a mbuf header if it is not provided, thus
reducing the amount of splnet/splx dancing required.. especially in the
worst case (of m_cldrop)
ok dlg kettenis damien


# 1.82 06-Aug-2009 sthen

Add device id for BCM5716S, tidy whitespace. From Brad.


Revision tags: OPENBSD_4_6_BASE
# 1.81 03-Jul-2009 dlg

this is a rather large change to add support for the BCM5709.

the 5709s use a the b09 firmwares, which is different to the b06 used by
all the other chips supported by bnx. the majority of the diff comes from
special handling for some indirect reads and writes, and because it needs
more host memory to operate with.

ive tried to keep the cosmetic changes to a minimum.

"go for it" deraadt@


# 1.80 03-Jul-2009 dlg

newer bnx chips use a separate firmware to the "old" ones. this updates
the b06 firmware for the older chips, and adds the b09 firmware. there are
three variants of the rv2p code thats loaded onto the chips, so this has
been split out into separate firmware files as well.

the driver has been updated to handle the split firmwares, and to easily
allow loading of the different versions. this change only supports the
loading of the firmwares for the currently supported chips.

after this change you must build the new firmwares and install them as well
as your new kernel.

"go to it" deraadt@


# 1.79 20-Jun-2009 naddy

Rewrite the interface flag handling case code and update the receive
filter handling to take advantage of ac_multirangecnt and have correct
IFF_ALLMULTI handling. From Brad.


# 1.78 22-Apr-2009 dlg

dont need to zero the tx pkt pool structure before initting it now that
pool_init does its job properly.


# 1.77 22-Apr-2009 dlg

replace arrays of dmamaps and mbuf pointers used to manage packets
on the tx rings (one mbuf ptr/dmamap array entry was created for
every tx descriptor slot at attach time) with a dynamically grown
list of mbuf pointers and dmamaps.

bnx used to have 512 dmamaps/mbuf pointers for the tx ring, now my
system is running with 8 under moderate load.

the big bonus from this is that the dmamap handling is greatly
simplified.

reyk@ likes this a lot


# 1.76 20-Apr-2009 dlg

when transmitting packets, put the dmamap we used for the packet into the
last descriptor slot in the ring. the tx completion code expects the dmamap
to be there so it can unload it.

ok reyk@


# 1.75 20-Apr-2009 reyk

fix dma map unmapping and unloading in the tx cleanup path.

ok dlg@


# 1.74 14-Apr-2009 kettenis

Don't free an mbuf that's still on the TX queue. While there sanitize the
function signature of bnx_tx_encap() such that people don't get weird ideas
like this again.

ok dlg@


# 1.73 09-Apr-2009 dlg

white space fixes


# 1.72 30-Mar-2009 dlg

switch to MCLGETI.

this conversion is the easiest ive done so far. the mbuf allocation wrapper
in the driver already had code to handle a failing cluster allocator as
part of a test harness, now we test that code all the time with MCLGETI.

ok kettenis@
tested by phessler@


Revision tags: OPENBSD_4_5_BASE
# 1.71 28-Nov-2008 brad

Eliminate the redundant bits of code for MTU and multicast handling
from the individual drivers now that ether_ioctl() handles this.

Shrinks the i386 kernels by..
RAMDISK - 2176 bytes
RAMDISKB - 1504 bytes
RAMDISKC - 736 bytes

Tested by naddy@/okan@/sthen@/brad@/todd@/jmc@ and lots of users.
Build tested on almost all archs by todd@/brad@

ok naddy@


# 1.70 09-Nov-2008 naddy

Introduce bpf_mtap_ether(), which for the benefit of bpf listeners
creates the VLAN encapsulation from the tag stored in the mbuf
header. Idea from FreeBSD, input from claudio@ and canacar@.

Switch all hardware VLAN enabled drivers to the new function.

ok claudio@


# 1.69 19-Oct-2008 brad

Re-add support for RX VLAN tag stripping.


# 1.68 16-Oct-2008 naddy

Switch the existing TX VLAN hardware support over to having the
tag in the header. Convert TX tagging in the drivers.

Help and ok brad@


# 1.67 16-Oct-2008 naddy

Convert RX tag stripping to storing the tag in the mbuf header and
enable RX tag stripping for re(4).

ok brad@


# 1.66 02-Oct-2008 brad

First step towards cleaning up the Ethernet driver ioctl handling.
Move calling ether_ioctl() from the top of the ioctl function, which
at the moment does absolutely nothing, to the default switch case.
Thus allowing drivers to define their own ioctl handlers and then
falling back on ether_ioctl(). The only functional change this results
in at the moment is having all Ethernet drivers returning the proper
errno of ENOTTY instead of EINVAL/ENXIO when encountering unknown
ioctl's.

Shrinks the i386 kernels by..
RAMDISK - 1024 bytes
RAMDISKB - 1120 bytes
RAMDISKC - 832 bytes

Tested by martin@/jsing@/todd@/brad@
Build tested on almost all archs by todd@/brad@

ok jsing@


# 1.65 10-Sep-2008 blambert

Convert timeout_add() calls using multiples of hz to timeout_add_sec()

Really just the low-hanging fruit of (hopefully) forthcoming timeout
conversions.

ok art@, krw@


Revision tags: OPENBSD_4_4_BASE
# 1.64 24-Jun-2008 brad

Fixed a problem that would cause errors (especially when in low memory
systems) because the RX chain was corrupted when an mbuf was mapped to
an unexpected number of buffers.

From davidch@FreeBSD


# 1.63 13-Jun-2008 brad

fix compilation with BNX_DEBUG.


# 1.62 13-Jun-2008 brad

Remove slack space for RX/TX chains since it only covers sloppy coding.

From davidch @ FreeBSD


# 1.61 08-Jun-2008 reyk

don't declare foo_driver_version[] strings and turn them into defines,
nothing uses them and it saves a few bytes in the kernel.

ok claudio@


# 1.60 29-May-2008 brad

- Add a debug message to mention when a 2.5Gb adapter is found.
- Change invalid PHY address debug message in bnx_miibus_write_reg()
from warn level to verbose.
- Add two new softc fields and store the shared and port hw config
data.

From FreeBSD

ok dlg@


# 1.59 23-May-2008 brad

Simplify the combination use of pci_mapreg_type()/pci_mapreg_map() as
suggested by dlg@ awhile ago.

ok dlg@


Revision tags: OPENBSD_4_3_BASE
# 1.58 28-Feb-2008 brad

Add initial bits for fiber support with the BCM5706/BCM5708 chipsets.

Tested with copper adapters by brad@, johan@ and Jung <moorang at gmail dot com>

ok dlg@


# 1.57 22-Feb-2008 kettenis

Avoid unaligned PCI config space access.

ok brad@


# 1.56 17-Feb-2008 brad

Remove the check for non-production bnx(4) chipsets. These chipsets are
not officially "supported" and could have errata which the driver does
not workaround but they should more or less work.

Tested by marco@ with a BCM5708 B0 chipset.

ok marco@ dlg@


# 1.55 25-Nov-2007 dlg

IF_Gbps(2.5) is wrong.

ok claudio@


# 1.54 28-Aug-2007 deraadt

unify firmware load failure messages; ok mglocker


Revision tags: OPENBSD_4_2_BASE
# 1.53 04-Jul-2007 krw

Revert r1.42 of if_bnx.c, "Enable IPv4 transmit TCP/UDP checksum
offload", and associated man page change. To use IPv4 transmit TCP/UDP
checksum offloading you must again define BNX_CSUM.

As requested by mbalmer@ via deraadt@ on suggestion of reyk@ in
response to PR #5437.


# 1.52 22-May-2007 reyk

Add the BCM5709 PCI device Id. It is disabled for now since we do not
support SerDes-based (1000base-SX fibre) bnx(4) devices yet. The
reason is simple - we do not have any fibre bnx(4) to test and port
the SerDes changes from the other bnx drivers.

From brad found in the Linux driver


# 1.51 22-May-2007 jasper

adress -> address

from brad
ok claudio@


# 1.50 22-May-2007 ray

Use BNX_PRINTF instead of printf with missing argument.

OK reyk@, earlier version OK tedu@, dlg@, and miod@.


# 1.49 21-May-2007 reyk

fix bnx vlan tagging in the rx path; do not attach the vlan tag twice
if the firmware has been told to keep it and copy the tag in network
byte order in the other case.

ok mcbride@ dlg@


Revision tags: OPENBSD_4_1_BASE
# 1.48 05-Mar-2007 reyk

remove jumbo frame support by replacing MEXTALLOC with MCLGET, and
simplify the VLAN code.

this will close PR 5356 (system panics under high load).

From claudio@ who is currently not around to commit this fix

tested and ok by mcbride@, reyk@, todd@, Paul Hirsch, and brad


# 1.47 03-Mar-2007 reyk

instead of establishing the interrupt in the mounthook, move it back
to the attach function and set a flag in the mounthook to start
accepting interrupts (there are possible problems with establishing
interrupts after the ioapics are enabled in i386 GENERIC.MP).

also suggested by kettenis
tested by mcbride, me, and some others
ok dlg@


# 1.46 03-Mar-2007 todd

Replacing some spaces with tabs and some typo fixes
from brad@


# 1.45 02-Mar-2007 reyk

oops, this is $OpenBSD$


# 1.44 02-Mar-2007 reyk

- remove the code to bring down the PHY in bnx_stop(), it's wrong
(ifm_data isn't updated) and lead to a panic in mii_phy_setmedia(),
or reading past the end mii_media_table[].
- make sure the dma_map matches the mbuf in the rx structures. We would
sync/unload the wrong map, leading to a DIAGNOSTIC panic, or eventually
leaking memory when bounce buffers are needed.

From NetBSD

ok marco@, brad@


# 1.43 30-Jan-2007 krw

Allow the bnx(4) driver to make use of all of the available hardware
multicast hash slots. The bnx(4) hardware supports 8 slots instead of
4 like the bge(4) hardware.

From Mike Karels via FreeBSD

Tested by Brad, biorn@ and Johan M:son Lindman


# 1.42 27-Jan-2007 krw

Enable transmit TCP/UDP checksum offload.

From Brad, tested by Brad, biorn@ and Johan M:son Lindman.


# 1.41 21-Jan-2007 mcbride

Remove bogus check for old firmware.

Identical fixes from myself and brad@, also reported by chefren@pi.net.


# 1.40 20-Jan-2007 dlg

move the interrupt establishment till after everything in the softc is
set up and allocated (which happens in a mountroothook). this prevents an
early call to the interrupt handler from causing a null deref when trying
to look into the unallocated regions.

found by mcbride when ciss and bnx were sharing an interrupt. mounting
root caused interrupts before the bnx was properly set up.

"commit your fix" mcbride@


# 1.39 19-Jan-2007 mcbride

bnx_init() takes a pointer to sc, not ifp.


# 1.38 10-Jan-2007 deraadt

change firmware byte order to be same on all architectures
THIS MEANS YOU NEED TO UPDATE YOUR FIRMWARE FILE BEFORE BOOTING WITH
A NEW KERNEL
tested by marco, biorn


# 1.37 24-Dec-2006 reyk

use the right size when loading the rx/tx descriptor bus dma maps.

from the NetBSD port

tested by bion@ and others from tech@
ok marco@ brad@


# 1.36 26-Nov-2006 brad

commented out entry for the BCM5709.


# 1.35 20-Nov-2006 brad

only try to do HW checksum offload for TCP and UDP.


# 1.34 20-Nov-2006 brad

Due to an incorrect macro, it appears that the driver has always been
accidentally truncating off the VLAN tag field in the TX descriptor. Fix
this by splitting up the vlan_tag and flags fields into separate fields,
and handling them appropriately.

From scottl@FreeBSD


# 1.33 19-Nov-2006 brad

In bnx_start, check the used_tx_bd count rather than the descriptors
mbuf pointer to see if the transmit ring is full. The mbuf pointer
is set only in the last descriptor of a multi-descriptor packet.
By relying on the mbuf pointers of the earlier descriptors, the
driver would sometimes overwrite a descriptor belonging to a
packet that wasn't completed yet. Also, tx_chain_prod wasn't
updated inside the loop, causing the wrong descriptor to be checked
after the first iteration. The upshot of all this was the loss of
some transmitted packets at medium to high packet rates.

In bnx_tx_encap, remove a couple of old statements that shuffled
around the tx_mbuf_map pointers. These now correspond 1-to-1 with
the transmit descriptors, and they are not supposed to be changed.

Correct a couple of inaccurate comments.

From jdp@FreeBSD


# 1.32 26-Oct-2006 brad

do the minimal initialization of the firmware so that ASF always
works.

From ambrisko@FreeBSD


# 1.31 25-Oct-2006 brad

replace a few more instances of hand rolled code with the
LIST_FOREACH macro.


# 1.30 22-Oct-2006 brad

now with the right revision of this diff which compiles. ok pedro, mglocker.

- Ensure that at least 16 TX descriptors are kept unused in the ring.
- Use more complete error handling for TX load problems.

From scottl@FreeBSD


# 1.29 21-Oct-2006 deraadt

does not compile


# 1.28 21-Oct-2006 brad

- Ensure that at least 16 TX descriptors are kept unused in the ring.
- Use more complete error handling for TX load problems.

From scottl@FreeBSD


# 1.27 19-Oct-2006 brad

make the exit label naming scheme match the current function names, removes
a FreeBSD-ism from the original driver.


# 1.26 19-Oct-2006 brad

Overhaul the transmit path:
- Eliminate the bnx_dmamap_arg structure.
- Refactor the loop that fills the buffer descriptor so that it can be done
with a single set of logic in a single loop instead of two sets of logic.
- Eliminate the need to cache and pass descriptor indexes between the start
loop and the encap function.
- Change the start loop to always check the ifnet sendq for more work.

From scottl@FreeBSD


# 1.25 14-Oct-2006 brad

- Simplify the arguments to bnx_tx_encap.
- Don't copy the bd_chain head pointers into temporary objects, they are
available globally.

From scottl@FreeBSD


# 1.24 04-Oct-2006 deraadt

Use loadfirmware(9) to get /etc/firmware/bnx instead of hard-coding a
gigantic firmware into the kernel; checked by brad


Revision tags: OPENBSD_4_0_BASE
# 1.23 25-Aug-2006 brad

don't need to clear if_timer during attach.


# 1.22 21-Aug-2006 deraadt

ramdisks do not have vlan, drop mbuf; ok brad


# 1.21 21-Aug-2006 brad

simplfy code a bit and fix comments, this is the MRU being set not the
MTU.


# 1.20 21-Aug-2006 brad

enable Jumbo support.


# 1.19 20-Aug-2006 brad

remove a comment.


# 1.18 20-Aug-2006 brad

cosmetic tweaks.


# 1.17 20-Aug-2006 brad

- replace IF_DEQUEUE/IF_PREPEND with IFQ_POLL/IFQ_DEQUEUE.
- enable RX checksum offload.
- remove some unused code.


# 1.16 19-Aug-2006 brad

set the capabilities VLAN MTU flag.


# 1.15 14-Aug-2006 marco

And some more KNF.


# 1.14 14-Aug-2006 marco

KNF


# 1.13 14-Aug-2006 marco

More KNF; no functional change.


# 1.12 14-Aug-2006 marco

First in a series of KNF. No functional change.


# 1.11 14-Aug-2006 brad

disable debugging.


# 1.10 14-Aug-2006 marco

Change bus_dmamap_create to use the appropriate values. This fixes the
issues brad was seeing. Help from jason.

ok brad.


# 1.9 13-Aug-2006 marco

Get rid of _HI & _LO macros altogether since they used a wrong idiom.
This was pointed out by mickey The driver now uses the same idiom as mpi.

help from miod, mickey and kettenis

ok brad


# 1.8 13-Aug-2006 brad

fix a typo, BNX_DRBUG -> BNX_DEBUG


# 1.7 10-Aug-2006 brad

unmap memory address space in bnx_release_resources().


# 1.6 10-Aug-2006 brad

cosmetic tweaking.


# 1.5 10-Aug-2006 brad

remove typedef's.


# 1.4 09-Aug-2006 marco

Reorder dmamap & dmamem to match man page.
Redo detection of _LO & _HI macro; help from miod and jordan.
ok beck brad


# 1.3 26-Jun-2006 brad

do not allow a Jumbo size MTU yet.


# 1.2 26-Jun-2006 brad

relocate the firmware per Theo's request.


# 1.1 26-Jun-2006 brad

Add a rough initial port of the bce driver from FreeBSD, which provides
support for the new line of Broadcom NetXtreme II Gigabit PCI-X and PCIe
controllers, though renamed to bnx. This is work in progress, there
are some known issues. With help from Reyk with the bus_dma code.

Thanks to David Christensen at Broadcom for the driver and for providing
some PCI-X and PCIe adapters.

ok deraadt@


Revision tags: OPENBSD_6_3_BASE
# 1.125 10-Mar-2018 sthen

raise bnx(4)'s rxring lwm to 16, ok deraadt

(I've had this diff locally for a long time on port build machines to
avoid NFS stalls.)


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE
# 1.124 24-Jan-2017 dlg

add support for multiple transmit ifqueues per network interface.

an ifq to transmit a packet is picked by the current traffic
conditioner (ie, priq or hfsc) by providing an index into an array
of ifqs. by default interfaces get a single ifq but can ask for
more using if_attach_queues().

the vast majority of our drivers still think there's a 1:1 mapping
between interfaces and transmit queues, so their if_start routines
take an ifnet pointer instead of a pointer to the ifqueue struct.
instead of changing all the drivers in the tree, drivers can opt
into using an if_qstart routine and setting the IFXF_MPSAFE flag.
the stack provides a compatability wrapper from the new if_qstart
handler to the previous if_start handlers if IFXF_MPSAFE isnt set.

enabling hfsc on an interface configures it to transmit everything
through the first ifq. any other ifqs are left configured as priq,
but unused, when hfsc is enabled.

getting this in now so everyone can kick the tyres.

ok mpi@ visa@ (who provided some tweaks for cnmac).


# 1.123 22-Jan-2017 dlg

move counting if_opackets next to counting if_obytes in if_enqueue.

this means packets are consistently counted in one place, unlike the
many and various ways that drivers thought they should do it.

ok mpi@ deraadt@


Revision tags: OPENBSD_6_0_BASE
# 1.122 05-May-2016 jmatthew

r1.10 of if_bnx.c effectively removed the limit on the number of segments in
the tx dma maps, apparently to allow heavily fragmented packets to be sent.

The tx ring accounting in bnx_start assumed that the longest fragment chain
we'd see was BNX_MAX_SEGMENTS, so sending a heavily fragmented packet when the
ring was already full could cause it to overflow.

In the 10 years since r1.10, we've started defragmenting packets if they
won't fit in the dma map, so we can limit the maps to BNX_MAX_SEGMENTS again.
While we're here, ensure there's always at least one slot on the tx ring free,
for consistency between drivers.

Fixes packet corruption seen by otto@
ok mpi@ dlg@


# 1.121 13-Apr-2016 mpi

G/C IFQ_SET_READY().


Revision tags: OPENBSD_5_9_BASE
# 1.120 11-Dec-2015 mpi

branches: 1.120.2;
Replace mountroothook_establish(9) by config_mountroot(9) a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


# 1.119 10-Dec-2015 dlg

mark bnx_start as mpsafe.

tweak it to use ifq_restart so ifq_clr_oactive is serialised with start.

ok jmatthew@


# 1.118 05-Dec-2015 jmatthew

Make the bnx interrupt handler mpsafe, and perform rx and tx completion
outside the kernel lock.

Remove tx descriptor lists (essentially backing out if_bnx.c r1.77),
add an interrupt barrier in bnx_stop, check the rx ring state before receiving
packets, adjust the tx counter with atomic operations, and rework bnx_start
to check for ring space before dequeueing and drop the packet if bnx_encap
fails.

tested on BCM5708 by me and on BCM5709 by Hrvoje Popovski
ok dlg@


# 1.117 25-Nov-2015 dlg

replace IFF_OACTIVE manipulation with mpsafe operations.

there are two things shared between the network stack and drivers
in the send path: the send queue and the IFF_OACTIVE flag. the send
queue is now protected by a mutex. this diff makes the oactive
functionality mpsafe too.

IFF_OACTIVE is part of if_flags. there are two problems with that.
firstly, if_flags is a short and we dont have any MI atomic operations
to manipulate a short. secondly, while we could make the IFF_OACTIVE
operates mpsafe, all changes to other flags would have to be made
safe at the same time, otherwise a read-modify-write cycle on their
updates could clobber the oactive change.

instead, this moves the oactive mark into struct ifqueue and provides
an API for changing it. there's ifq_set_oactive, ifq_clr_oactive,
and ifq_is_oactive. these are modelled on ifsq_set_oactive,
ifsq_clr_oactive, and ifsq_is_oactive in dragonflybsd.

this diff includes changes to all the drivers manipulating IFF_OACTIVE
to now use the ifsq_{set,clr_is}_oactive API too.

ok kettenis@ mpi@ jmatthew@ deraadt@


# 1.116 20-Nov-2015 dlg

shuffle struct ifqueue so in flight mbufs are protected by a mutex.

the code is refactored so the IFQ macros call newly implemented ifq
functions. the ifq code is split so each discipline (priq and hfsc
in our case) is an opaque set of operations that the common ifq
code can call. the common code does the locking, accounting (ifq_len
manipulation), and freeing of the mbuf if the disciplines enqueue
function rejects it. theyre kind of like bufqs in the block layer
with their fifo and nscan disciplines.

the new api also supports atomic switching of disciplines at runtime.
the hfsc setup in pf_ioctl.c has been tweaked to build a complete
hfsc_if structure which it attaches to the send queue in a single
operation, rather than attaching to the interface up front and
building up a list of queues.

the send queue is now mutexed, which raises the expectation that
packets can be enqueued or purged on one cpu while another cpu is
dequeueing them in a driver for transmission. a lot of drivers use
IFQ_POLL to peek at an mbuf and attempt to fit it on the ring before
committing to it with a later IFQ_DEQUEUE operation. if the mbuf
gets freed in between the POLL and DEQUEUE operations, fireworks
will ensue.

to avoid this, the ifq api introduces ifq_deq_begin, ifq_deq_rollback,
and ifq_deq_commit. ifq_deq_begin allows a driver to take the ifq
mutex and get a reference to the mbuf they wish to try and tx. if
there's space, they can ifq_deq_commit it to remove the mbuf and
release the mutex. if there's no space, ifq_deq_rollback simply
releases the mutex. this api was developed to make updating the
drivers using IFQ_POLL easy, instead of having to do significant
semantic changes to avoid POLL that we cannot test on all the
hardware.

the common code has been tested pretty hard, and all the driver
modifications are straightforward except for de(4). if that breaks
it can be dealt with later.

ok mpi@ jmatthew@


# 1.115 25-Oct-2015 mpi

arp_ifinit() is no longer needed.


# 1.114 10-Sep-2015 deraadt

sizes for free(); ok sthen


# 1.113 04-Sep-2015 kettenis

The bnx_tx_pool gets used from interrupt context, so drop the explicit
backend allocoter here without passing PR_WAITOK to pool_init(9).

ok mikeb@


Revision tags: OPENBSD_5_8_BASE
# 1.112 24-Jul-2015 dlg

if we free the mbuf in the rx path, clear the pointer to it so we dont
try and queue it for the stack and cause a use after free.

found by maxime villard and brainy


# 1.111 24-Jun-2015 mpi

Increment if_ipackets in if_input().

Note that pseudo-drivers not using if_input() are not affected by this
conversion.

ok mikeb@, kettenis@, claudio@, dlg@


# 1.110 10-Mar-2015 mpi

Convert to if_input().

Tested and ok sthen@, ok dlg@


Revision tags: OPENBSD_5_7_BASE
# 1.109 27-Jan-2015 dlg

remove the second void * argument on tasks.

when workqs were introduced, we provided a second argument so you
could pass a thing and some context to work on it in. there were
very few things that took advantage of the second argument, so when
i introduced pools i suggested removing it. since tasks were meant
to replace workqs, it was requested that we keep the second argument
to make porting from workqs to tasks easier.

now that workqs are gone, i had a look at the use of the second
argument again and found only one good use of it (vdsp(4) on sparc64
if you're interested) and a tiny handful of questionable uses. the
vast majority of tasks only used a single argument. i have since
modified all tasks that used two args to only use one, so now we
can remove the second argument.

so this is a mechanical change. all tasks only passed NULL as their
second argument, so we can just remove it.

ok krw@


# 1.108 22-Dec-2014 tedu

unifdef INET


Revision tags: OPENBSD_5_6_BASE
# 1.107 18-Jul-2014 dlg

implement EFBIG handling for heavily fragmented packets on the tx path.

ok claudio@


# 1.106 12-Jul-2014 tedu

add a size argument to free. will be used soon, but for now default to 0.
after discussions with beck deraadt kettenis.


# 1.105 09-Jul-2014 dlg

dont try to be smart about avoiding the use of too many descriptors
when filling the rx ring. trust the hwm.

problem found by sthen@


# 1.104 08-Jul-2014 dlg

cut things that relied on mclgeti for rx ring accounting/restriction over
to using if_rxr.

cut the reporting systat did over to the rxr ioctl.

tested as much as i can on alpha, amd64, and sparc64.
mpi@ has run it on macppc.
ok mpi@


Revision tags: OPENBSD_5_5_BASE
# 1.103 30-Oct-2013 dlg

replace the workq bits to supply new tx pkt descriptors with a task.

tested locally on a dell poweredge 2950


# 1.102 23-Oct-2013 brad

Enable TX checksum offload.

ok naddy@ sthen@


Revision tags: OPENBSD_5_4_BASE
# 1.101 28-Mar-2013 brad

Let mii_attach() know where the PHY is located instead of scanning
for it since we know where it will be anyway and remove the code
from the MII bus read/write functions to force reading/writing
from the predetermined location. Copied from bge(4) and this is
what the upstream FreeBSD bce(4) driver has done once FreBSD
gained a mii_attach().

ok dlg@ sthen@


Revision tags: OPENBSD_5_3_BASE
# 1.100 13-Jan-2013 brad

Enable flow control support with 5708S/5709S adapters.

ok dlg@


# 1.99 10-Dec-2012 mikeb

Under some circumstances (currently only reproducible with IPsec)
bnx can be left w/o clusters on the receive ring and will stall.
To prevent that schedule a timeout if refill fails. Bug was
reported by jj@, fix tested by me, ok dlg


# 1.98 05-Dec-2012 deraadt

Remove excessive sys/cdefs.h inclusion
ok guenther millert kettenis


Revision tags: OPENBSD_5_2_BASE
# 1.97 05-Jul-2012 phessler

Add flow control to bnx(4)

Tested on 5706, 5708, 5709, 5716 chipsets.

From Brad

OK phessler@, sthen@, mikeb@,


# 1.96 14-May-2012 mikeb

fixup "couldn't establish interrupt" error printf; from brad, ok phessler


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE
# 1.95 22-Jun-2011 tedu

kill a few more casts that aren't helpful. ok krw miod


# 1.94 18-Apr-2011 dlg

ido not disable interrupts in the isr and then enable them again
when leaving. when you're handling an interrupt it is masked.
whacking the chip is work for no gain.

modify the interrupt handler so it only processes the rings once
rather than looping over them until it runs out of work to do

looping in the isr is bad for several reasons:

firstly, the chip does interrupt mitigation so you have a
decent/predictable amount of work to do in the isr. your first loop
will do that chunk of work (ie, it pulls off 50ish packets), and
then the successive looping aggressively pull one or two packets
off the rx ring. these extra loops work against the benefit that
interrupt mitigation provides.

bus space reads are slow. we should avoid doing them where possible
(but we should always do them when necessary).

doing the loop 5 times per isr works against the mclgeti semantics.
it knows a nic is busy and therefore needs more rx descriptors by
watching to see when the nic uses all of its descriptors between
interrupts. if we're aggressively pulling packets off by looping
in the isr then we're skewing this check.

ok deraadt@


# 1.93 13-Apr-2011 dlg

to quote from the gospel of bus_dma.9:

Synchronization operations are expressed from the perspective of the host
RAM, e.g., a device -> memory operation is a READ and a memory -> device
operation is a WRITE.

the status block that the isr reads is written to by the device.
the chip writes to memory, it is therefore a READ.

this also adds the preread sync when the map is set up and the postread
sync when the map is torn down for better symmetry. there are probably
more issues like this in the code, but this is a start.

discovered while discussing another diff with claudio@


# 1.92 05-Apr-2011 henning

mechanic rename M_{TCP|UDP}V4_CSUM_OUT -> M_{TCP|UDP}_CSUM_OUT
ok claudio krw


# 1.91 03-Apr-2011 jasper

use nitems(); no binary change for drivers that are compiled on amd64.

ok claudio@


Revision tags: OPENBSD_4_9_BASE
# 1.90 20-Sep-2010 deraadt

Stop doing shutdown hooks in network drivers where possible. We already
take all interfaces down, via their xxstop routines. Claudio and I have
verified that none of the shutdown hooks do much extra beyond what xxstop
was already doing; it is largely a pile of junk.
ok claudio, some early comments by sthen; also read by matthew, jsg


Revision tags: OPENBSD_4_8_BASE
# 1.89 03-Aug-2010 jsg

Correct use of logical and where binary and was intended.
Spotted by lint, but mirrors a similiar change in the
original FreeBSD code from over a year ago.

ok deraadt@


# 1.88 24-May-2010 sthen

Support fibre PHY on BCM5709S. From FreeBSD via Brad.
Tested by Brad on: BCM5706, BCM5708C
Tested by me on: BCM5716 (BCM5709 PHY)


# 1.87 19-May-2010 oga

BUS_DMA_ZERO instead of alloc, map, bzero.

ok krw@


Revision tags: OPENBSD_4_7_BASE
# 1.86 23-Nov-2009 claudio

bnx(4) is a bit special. The chip itself is capable of swapping endianess
so there is no need for htoleXX calls. The only thing needed is the correct
layout of the DMA-ed structures. Additionally it uses PAGE_SIZE but assumed
that it is always 4k. Fix the macros that failed to respect that so that it
works on 8k PAGE_SIZE systems. This makes bnx(4) work on sparc64.
Tested on amd64 by dlg@. OK dlg@, deraadt@


# 1.85 09-Nov-2009 dlg

Link state change interrupt was not generated due to a missing bit in
the MAC event register.

fix from atte dot peltomaki at iki dot fi
tested by me on 5708 and 5709


# 1.84 13-Aug-2009 jasper

- consistify cfdriver for the ethernet drivers (0 -> NULL)

ok dlg@


# 1.83 09-Aug-2009 deraadt

MCLGETI() will now allocate a mbuf header if it is not provided, thus
reducing the amount of splnet/splx dancing required.. especially in the
worst case (of m_cldrop)
ok dlg kettenis damien


# 1.82 06-Aug-2009 sthen

Add device id for BCM5716S, tidy whitespace. From Brad.


Revision tags: OPENBSD_4_6_BASE
# 1.81 03-Jul-2009 dlg

this is a rather large change to add support for the BCM5709.

the 5709s use a the b09 firmwares, which is different to the b06 used by
all the other chips supported by bnx. the majority of the diff comes from
special handling for some indirect reads and writes, and because it needs
more host memory to operate with.

ive tried to keep the cosmetic changes to a minimum.

"go for it" deraadt@


# 1.80 03-Jul-2009 dlg

newer bnx chips use a separate firmware to the "old" ones. this updates
the b06 firmware for the older chips, and adds the b09 firmware. there are
three variants of the rv2p code thats loaded onto the chips, so this has
been split out into separate firmware files as well.

the driver has been updated to handle the split firmwares, and to easily
allow loading of the different versions. this change only supports the
loading of the firmwares for the currently supported chips.

after this change you must build the new firmwares and install them as well
as your new kernel.

"go to it" deraadt@


# 1.79 20-Jun-2009 naddy

Rewrite the interface flag handling case code and update the receive
filter handling to take advantage of ac_multirangecnt and have correct
IFF_ALLMULTI handling. From Brad.


# 1.78 22-Apr-2009 dlg

dont need to zero the tx pkt pool structure before initting it now that
pool_init does its job properly.


# 1.77 22-Apr-2009 dlg

replace arrays of dmamaps and mbuf pointers used to manage packets
on the tx rings (one mbuf ptr/dmamap array entry was created for
every tx descriptor slot at attach time) with a dynamically grown
list of mbuf pointers and dmamaps.

bnx used to have 512 dmamaps/mbuf pointers for the tx ring, now my
system is running with 8 under moderate load.

the big bonus from this is that the dmamap handling is greatly
simplified.

reyk@ likes this a lot


# 1.76 20-Apr-2009 dlg

when transmitting packets, put the dmamap we used for the packet into the
last descriptor slot in the ring. the tx completion code expects the dmamap
to be there so it can unload it.

ok reyk@


# 1.75 20-Apr-2009 reyk

fix dma map unmapping and unloading in the tx cleanup path.

ok dlg@


# 1.74 14-Apr-2009 kettenis

Don't free an mbuf that's still on the TX queue. While there sanitize the
function signature of bnx_tx_encap() such that people don't get weird ideas
like this again.

ok dlg@


# 1.73 09-Apr-2009 dlg

white space fixes


# 1.72 30-Mar-2009 dlg

switch to MCLGETI.

this conversion is the easiest ive done so far. the mbuf allocation wrapper
in the driver already had code to handle a failing cluster allocator as
part of a test harness, now we test that code all the time with MCLGETI.

ok kettenis@
tested by phessler@


Revision tags: OPENBSD_4_5_BASE
# 1.71 28-Nov-2008 brad

Eliminate the redundant bits of code for MTU and multicast handling
from the individual drivers now that ether_ioctl() handles this.

Shrinks the i386 kernels by..
RAMDISK - 2176 bytes
RAMDISKB - 1504 bytes
RAMDISKC - 736 bytes

Tested by naddy@/okan@/sthen@/brad@/todd@/jmc@ and lots of users.
Build tested on almost all archs by todd@/brad@

ok naddy@


# 1.70 09-Nov-2008 naddy

Introduce bpf_mtap_ether(), which for the benefit of bpf listeners
creates the VLAN encapsulation from the tag stored in the mbuf
header. Idea from FreeBSD, input from claudio@ and canacar@.

Switch all hardware VLAN enabled drivers to the new function.

ok claudio@


# 1.69 19-Oct-2008 brad

Re-add support for RX VLAN tag stripping.


# 1.68 16-Oct-2008 naddy

Switch the existing TX VLAN hardware support over to having the
tag in the header. Convert TX tagging in the drivers.

Help and ok brad@


# 1.67 16-Oct-2008 naddy

Convert RX tag stripping to storing the tag in the mbuf header and
enable RX tag stripping for re(4).

ok brad@


# 1.66 02-Oct-2008 brad

First step towards cleaning up the Ethernet driver ioctl handling.
Move calling ether_ioctl() from the top of the ioctl function, which
at the moment does absolutely nothing, to the default switch case.
Thus allowing drivers to define their own ioctl handlers and then
falling back on ether_ioctl(). The only functional change this results
in at the moment is having all Ethernet drivers returning the proper
errno of ENOTTY instead of EINVAL/ENXIO when encountering unknown
ioctl's.

Shrinks the i386 kernels by..
RAMDISK - 1024 bytes
RAMDISKB - 1120 bytes
RAMDISKC - 832 bytes

Tested by martin@/jsing@/todd@/brad@
Build tested on almost all archs by todd@/brad@

ok jsing@


# 1.65 10-Sep-2008 blambert

Convert timeout_add() calls using multiples of hz to timeout_add_sec()

Really just the low-hanging fruit of (hopefully) forthcoming timeout
conversions.

ok art@, krw@


Revision tags: OPENBSD_4_4_BASE
# 1.64 24-Jun-2008 brad

Fixed a problem that would cause errors (especially when in low memory
systems) because the RX chain was corrupted when an mbuf was mapped to
an unexpected number of buffers.

From davidch@FreeBSD


# 1.63 13-Jun-2008 brad

fix compilation with BNX_DEBUG.


# 1.62 13-Jun-2008 brad

Remove slack space for RX/TX chains since it only covers sloppy coding.

From davidch @ FreeBSD


# 1.61 08-Jun-2008 reyk

don't declare foo_driver_version[] strings and turn them into defines,
nothing uses them and it saves a few bytes in the kernel.

ok claudio@


# 1.60 29-May-2008 brad

- Add a debug message to mention when a 2.5Gb adapter is found.
- Change invalid PHY address debug message in bnx_miibus_write_reg()
from warn level to verbose.
- Add two new softc fields and store the shared and port hw config
data.

From FreeBSD

ok dlg@


# 1.59 23-May-2008 brad

Simplify the combination use of pci_mapreg_type()/pci_mapreg_map() as
suggested by dlg@ awhile ago.

ok dlg@


Revision tags: OPENBSD_4_3_BASE
# 1.58 28-Feb-2008 brad

Add initial bits for fiber support with the BCM5706/BCM5708 chipsets.

Tested with copper adapters by brad@, johan@ and Jung <moorang at gmail dot com>

ok dlg@


# 1.57 22-Feb-2008 kettenis

Avoid unaligned PCI config space access.

ok brad@


# 1.56 17-Feb-2008 brad

Remove the check for non-production bnx(4) chipsets. These chipsets are
not officially "supported" and could have errata which the driver does
not workaround but they should more or less work.

Tested by marco@ with a BCM5708 B0 chipset.

ok marco@ dlg@


# 1.55 25-Nov-2007 dlg

IF_Gbps(2.5) is wrong.

ok claudio@


# 1.54 28-Aug-2007 deraadt

unify firmware load failure messages; ok mglocker


Revision tags: OPENBSD_4_2_BASE
# 1.53 04-Jul-2007 krw

Revert r1.42 of if_bnx.c, "Enable IPv4 transmit TCP/UDP checksum
offload", and associated man page change. To use IPv4 transmit TCP/UDP
checksum offloading you must again define BNX_CSUM.

As requested by mbalmer@ via deraadt@ on suggestion of reyk@ in
response to PR #5437.


# 1.52 22-May-2007 reyk

Add the BCM5709 PCI device Id. It is disabled for now since we do not
support SerDes-based (1000base-SX fibre) bnx(4) devices yet. The
reason is simple - we do not have any fibre bnx(4) to test and port
the SerDes changes from the other bnx drivers.

From brad found in the Linux driver


# 1.51 22-May-2007 jasper

adress -> address

from brad
ok claudio@


# 1.50 22-May-2007 ray

Use BNX_PRINTF instead of printf with missing argument.

OK reyk@, earlier version OK tedu@, dlg@, and miod@.


# 1.49 21-May-2007 reyk

fix bnx vlan tagging in the rx path; do not attach the vlan tag twice
if the firmware has been told to keep it and copy the tag in network
byte order in the other case.

ok mcbride@ dlg@


Revision tags: OPENBSD_4_1_BASE
# 1.48 05-Mar-2007 reyk

remove jumbo frame support by replacing MEXTALLOC with MCLGET, and
simplify the VLAN code.

this will close PR 5356 (system panics under high load).

From claudio@ who is currently not around to commit this fix

tested and ok by mcbride@, reyk@, todd@, Paul Hirsch, and brad


# 1.47 03-Mar-2007 reyk

instead of establishing the interrupt in the mounthook, move it back
to the attach function and set a flag in the mounthook to start
accepting interrupts (there are possible problems with establishing
interrupts after the ioapics are enabled in i386 GENERIC.MP).

also suggested by kettenis
tested by mcbride, me, and some others
ok dlg@


# 1.46 03-Mar-2007 todd

Replacing some spaces with tabs and some typo fixes
from brad@


# 1.45 02-Mar-2007 reyk

oops, this is $OpenBSD$


# 1.44 02-Mar-2007 reyk

- remove the code to bring down the PHY in bnx_stop(), it's wrong
(ifm_data isn't updated) and lead to a panic in mii_phy_setmedia(),
or reading past the end mii_media_table[].
- make sure the dma_map matches the mbuf in the rx structures. We would
sync/unload the wrong map, leading to a DIAGNOSTIC panic, or eventually
leaking memory when bounce buffers are needed.

From NetBSD

ok marco@, brad@


# 1.43 30-Jan-2007 krw

Allow the bnx(4) driver to make use of all of the available hardware
multicast hash slots. The bnx(4) hardware supports 8 slots instead of
4 like the bge(4) hardware.

From Mike Karels via FreeBSD

Tested by Brad, biorn@ and Johan M:son Lindman


# 1.42 27-Jan-2007 krw

Enable transmit TCP/UDP checksum offload.

From Brad, tested by Brad, biorn@ and Johan M:son Lindman.


# 1.41 21-Jan-2007 mcbride

Remove bogus check for old firmware.

Identical fixes from myself and brad@, also reported by chefren@pi.net.


# 1.40 20-Jan-2007 dlg

move the interrupt establishment till after everything in the softc is
set up and allocated (which happens in a mountroothook). this prevents an
early call to the interrupt handler from causing a null deref when trying
to look into the unallocated regions.

found by mcbride when ciss and bnx were sharing an interrupt. mounting
root caused interrupts before the bnx was properly set up.

"commit your fix" mcbride@


# 1.39 19-Jan-2007 mcbride

bnx_init() takes a pointer to sc, not ifp.


# 1.38 10-Jan-2007 deraadt

change firmware byte order to be same on all architectures
THIS MEANS YOU NEED TO UPDATE YOUR FIRMWARE FILE BEFORE BOOTING WITH
A NEW KERNEL
tested by marco, biorn


# 1.37 24-Dec-2006 reyk

use the right size when loading the rx/tx descriptor bus dma maps.

from the NetBSD port

tested by bion@ and others from tech@
ok marco@ brad@


# 1.36 26-Nov-2006 brad

commented out entry for the BCM5709.


# 1.35 20-Nov-2006 brad

only try to do HW checksum offload for TCP and UDP.


# 1.34 20-Nov-2006 brad

Due to an incorrect macro, it appears that the driver has always been
accidentally truncating off the VLAN tag field in the TX descriptor. Fix
this by splitting up the vlan_tag and flags fields into separate fields,
and handling them appropriately.

From scottl@FreeBSD


# 1.33 19-Nov-2006 brad

In bnx_start, check the used_tx_bd count rather than the descriptors
mbuf pointer to see if the transmit ring is full. The mbuf pointer
is set only in the last descriptor of a multi-descriptor packet.
By relying on the mbuf pointers of the earlier descriptors, the
driver would sometimes overwrite a descriptor belonging to a
packet that wasn't completed yet. Also, tx_chain_prod wasn't
updated inside the loop, causing the wrong descriptor to be checked
after the first iteration. The upshot of all this was the loss of
some transmitted packets at medium to high packet rates.

In bnx_tx_encap, remove a couple of old statements that shuffled
around the tx_mbuf_map pointers. These now correspond 1-to-1 with
the transmit descriptors, and they are not supposed to be changed.

Correct a couple of inaccurate comments.

From jdp@FreeBSD


# 1.32 26-Oct-2006 brad

do the minimal initialization of the firmware so that ASF always
works.

From ambrisko@FreeBSD


# 1.31 25-Oct-2006 brad

replace a few more instances of hand rolled code with the
LIST_FOREACH macro.


# 1.30 22-Oct-2006 brad

now with the right revision of this diff which compiles. ok pedro, mglocker.

- Ensure that at least 16 TX descriptors are kept unused in the ring.
- Use more complete error handling for TX load problems.

From scottl@FreeBSD


# 1.29 21-Oct-2006 deraadt

does not compile


# 1.28 21-Oct-2006 brad

- Ensure that at least 16 TX descriptors are kept unused in the ring.
- Use more complete error handling for TX load problems.

From scottl@FreeBSD


# 1.27 19-Oct-2006 brad

make the exit label naming scheme match the current function names, removes
a FreeBSD-ism from the original driver.


# 1.26 19-Oct-2006 brad

Overhaul the transmit path:
- Eliminate the bnx_dmamap_arg structure.
- Refactor the loop that fills the buffer descriptor so that it can be done
with a single set of logic in a single loop instead of two sets of logic.
- Eliminate the need to cache and pass descriptor indexes between the start
loop and the encap function.
- Change the start loop to always check the ifnet sendq for more work.

From scottl@FreeBSD


# 1.25 14-Oct-2006 brad

- Simplify the arguments to bnx_tx_encap.
- Don't copy the bd_chain head pointers into temporary objects, they are
available globally.

From scottl@FreeBSD


# 1.24 04-Oct-2006 deraadt

Use loadfirmware(9) to get /etc/firmware/bnx instead of hard-coding a
gigantic firmware into the kernel; checked by brad


Revision tags: OPENBSD_4_0_BASE
# 1.23 25-Aug-2006 brad

don't need to clear if_timer during attach.


# 1.22 21-Aug-2006 deraadt

ramdisks do not have vlan, drop mbuf; ok brad


# 1.21 21-Aug-2006 brad

simplfy code a bit and fix comments, this is the MRU being set not the
MTU.


# 1.20 21-Aug-2006 brad

enable Jumbo support.


# 1.19 20-Aug-2006 brad

remove a comment.


# 1.18 20-Aug-2006 brad

cosmetic tweaks.


# 1.17 20-Aug-2006 brad

- replace IF_DEQUEUE/IF_PREPEND with IFQ_POLL/IFQ_DEQUEUE.
- enable RX checksum offload.
- remove some unused code.


# 1.16 19-Aug-2006 brad

set the capabilities VLAN MTU flag.


# 1.15 14-Aug-2006 marco

And some more KNF.


# 1.14 14-Aug-2006 marco

KNF


# 1.13 14-Aug-2006 marco

More KNF; no functional change.


# 1.12 14-Aug-2006 marco

First in a series of KNF. No functional change.


# 1.11 14-Aug-2006 brad

disable debugging.


# 1.10 14-Aug-2006 marco

Change bus_dmamap_create to use the appropriate values. This fixes the
issues brad was seeing. Help from jason.

ok brad.


# 1.9 13-Aug-2006 marco

Get rid of _HI & _LO macros altogether since they used a wrong idiom.
This was pointed out by mickey The driver now uses the same idiom as mpi.

help from miod, mickey and kettenis

ok brad


# 1.8 13-Aug-2006 brad

fix a typo, BNX_DRBUG -> BNX_DEBUG


# 1.7 10-Aug-2006 brad

unmap memory address space in bnx_release_resources().


# 1.6 10-Aug-2006 brad

cosmetic tweaking.


# 1.5 10-Aug-2006 brad

remove typedef's.


# 1.4 09-Aug-2006 marco

Reorder dmamap & dmamem to match man page.
Redo detection of _LO & _HI macro; help from miod and jordan.
ok beck brad


# 1.3 26-Jun-2006 brad

do not allow a Jumbo size MTU yet.


# 1.2 26-Jun-2006 brad

relocate the firmware per Theo's request.


# 1.1 26-Jun-2006 brad

Add a rough initial port of the bce driver from FreeBSD, which provides
support for the new line of Broadcom NetXtreme II Gigabit PCI-X and PCIe
controllers, though renamed to bnx. This is work in progress, there
are some known issues. With help from Reyk with the bus_dma code.

Thanks to David Christensen at Broadcom for the driver and for providing
some PCI-X and PCIe adapters.

ok deraadt@


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE
# 1.124 24-Jan-2017 dlg

add support for multiple transmit ifqueues per network interface.

an ifq to transmit a packet is picked by the current traffic
conditioner (ie, priq or hfsc) by providing an index into an array
of ifqs. by default interfaces get a single ifq but can ask for
more using if_attach_queues().

the vast majority of our drivers still think there's a 1:1 mapping
between interfaces and transmit queues, so their if_start routines
take an ifnet pointer instead of a pointer to the ifqueue struct.
instead of changing all the drivers in the tree, drivers can opt
into using an if_qstart routine and setting the IFXF_MPSAFE flag.
the stack provides a compatability wrapper from the new if_qstart
handler to the previous if_start handlers if IFXF_MPSAFE isnt set.

enabling hfsc on an interface configures it to transmit everything
through the first ifq. any other ifqs are left configured as priq,
but unused, when hfsc is enabled.

getting this in now so everyone can kick the tyres.

ok mpi@ visa@ (who provided some tweaks for cnmac).


# 1.123 22-Jan-2017 dlg

move counting if_opackets next to counting if_obytes in if_enqueue.

this means packets are consistently counted in one place, unlike the
many and various ways that drivers thought they should do it.

ok mpi@ deraadt@


Revision tags: OPENBSD_6_0_BASE
# 1.122 05-May-2016 jmatthew

r1.10 of if_bnx.c effectively removed the limit on the number of segments in
the tx dma maps, apparently to allow heavily fragmented packets to be sent.

The tx ring accounting in bnx_start assumed that the longest fragment chain
we'd see was BNX_MAX_SEGMENTS, so sending a heavily fragmented packet when the
ring was already full could cause it to overflow.

In the 10 years since r1.10, we've started defragmenting packets if they
won't fit in the dma map, so we can limit the maps to BNX_MAX_SEGMENTS again.
While we're here, ensure there's always at least one slot on the tx ring free,
for consistency between drivers.

Fixes packet corruption seen by otto@
ok mpi@ dlg@


# 1.121 13-Apr-2016 mpi

G/C IFQ_SET_READY().


Revision tags: OPENBSD_5_9_BASE
# 1.120 11-Dec-2015 mpi

branches: 1.120.2;
Replace mountroothook_establish(9) by config_mountroot(9) a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


# 1.119 10-Dec-2015 dlg

mark bnx_start as mpsafe.

tweak it to use ifq_restart so ifq_clr_oactive is serialised with start.

ok jmatthew@


# 1.118 05-Dec-2015 jmatthew

Make the bnx interrupt handler mpsafe, and perform rx and tx completion
outside the kernel lock.

Remove tx descriptor lists (essentially backing out if_bnx.c r1.77),
add an interrupt barrier in bnx_stop, check the rx ring state before receiving
packets, adjust the tx counter with atomic operations, and rework bnx_start
to check for ring space before dequeueing and drop the packet if bnx_encap
fails.

tested on BCM5708 by me and on BCM5709 by Hrvoje Popovski
ok dlg@


# 1.117 25-Nov-2015 dlg

replace IFF_OACTIVE manipulation with mpsafe operations.

there are two things shared between the network stack and drivers
in the send path: the send queue and the IFF_OACTIVE flag. the send
queue is now protected by a mutex. this diff makes the oactive
functionality mpsafe too.

IFF_OACTIVE is part of if_flags. there are two problems with that.
firstly, if_flags is a short and we dont have any MI atomic operations
to manipulate a short. secondly, while we could make the IFF_OACTIVE
operates mpsafe, all changes to other flags would have to be made
safe at the same time, otherwise a read-modify-write cycle on their
updates could clobber the oactive change.

instead, this moves the oactive mark into struct ifqueue and provides
an API for changing it. there's ifq_set_oactive, ifq_clr_oactive,
and ifq_is_oactive. these are modelled on ifsq_set_oactive,
ifsq_clr_oactive, and ifsq_is_oactive in dragonflybsd.

this diff includes changes to all the drivers manipulating IFF_OACTIVE
to now use the ifsq_{set,clr_is}_oactive API too.

ok kettenis@ mpi@ jmatthew@ deraadt@


# 1.116 20-Nov-2015 dlg

shuffle struct ifqueue so in flight mbufs are protected by a mutex.

the code is refactored so the IFQ macros call newly implemented ifq
functions. the ifq code is split so each discipline (priq and hfsc
in our case) is an opaque set of operations that the common ifq
code can call. the common code does the locking, accounting (ifq_len
manipulation), and freeing of the mbuf if the disciplines enqueue
function rejects it. theyre kind of like bufqs in the block layer
with their fifo and nscan disciplines.

the new api also supports atomic switching of disciplines at runtime.
the hfsc setup in pf_ioctl.c has been tweaked to build a complete
hfsc_if structure which it attaches to the send queue in a single
operation, rather than attaching to the interface up front and
building up a list of queues.

the send queue is now mutexed, which raises the expectation that
packets can be enqueued or purged on one cpu while another cpu is
dequeueing them in a driver for transmission. a lot of drivers use
IFQ_POLL to peek at an mbuf and attempt to fit it on the ring before
committing to it with a later IFQ_DEQUEUE operation. if the mbuf
gets freed in between the POLL and DEQUEUE operations, fireworks
will ensue.

to avoid this, the ifq api introduces ifq_deq_begin, ifq_deq_rollback,
and ifq_deq_commit. ifq_deq_begin allows a driver to take the ifq
mutex and get a reference to the mbuf they wish to try and tx. if
there's space, they can ifq_deq_commit it to remove the mbuf and
release the mutex. if there's no space, ifq_deq_rollback simply
releases the mutex. this api was developed to make updating the
drivers using IFQ_POLL easy, instead of having to do significant
semantic changes to avoid POLL that we cannot test on all the
hardware.

the common code has been tested pretty hard, and all the driver
modifications are straightforward except for de(4). if that breaks
it can be dealt with later.

ok mpi@ jmatthew@


# 1.115 25-Oct-2015 mpi

arp_ifinit() is no longer needed.


# 1.114 10-Sep-2015 deraadt

sizes for free(); ok sthen


# 1.113 04-Sep-2015 kettenis

The bnx_tx_pool gets used from interrupt context, so drop the explicit
backend allocoter here without passing PR_WAITOK to pool_init(9).

ok mikeb@


Revision tags: OPENBSD_5_8_BASE
# 1.112 24-Jul-2015 dlg

if we free the mbuf in the rx path, clear the pointer to it so we dont
try and queue it for the stack and cause a use after free.

found by maxime villard and brainy


# 1.111 24-Jun-2015 mpi

Increment if_ipackets in if_input().

Note that pseudo-drivers not using if_input() are not affected by this
conversion.

ok mikeb@, kettenis@, claudio@, dlg@


# 1.110 10-Mar-2015 mpi

Convert to if_input().

Tested and ok sthen@, ok dlg@


Revision tags: OPENBSD_5_7_BASE
# 1.109 27-Jan-2015 dlg

remove the second void * argument on tasks.

when workqs were introduced, we provided a second argument so you
could pass a thing and some context to work on it in. there were
very few things that took advantage of the second argument, so when
i introduced pools i suggested removing it. since tasks were meant
to replace workqs, it was requested that we keep the second argument
to make porting from workqs to tasks easier.

now that workqs are gone, i had a look at the use of the second
argument again and found only one good use of it (vdsp(4) on sparc64
if you're interested) and a tiny handful of questionable uses. the
vast majority of tasks only used a single argument. i have since
modified all tasks that used two args to only use one, so now we
can remove the second argument.

so this is a mechanical change. all tasks only passed NULL as their
second argument, so we can just remove it.

ok krw@


# 1.108 22-Dec-2014 tedu

unifdef INET


Revision tags: OPENBSD_5_6_BASE
# 1.107 18-Jul-2014 dlg

implement EFBIG handling for heavily fragmented packets on the tx path.

ok claudio@


# 1.106 12-Jul-2014 tedu

add a size argument to free. will be used soon, but for now default to 0.
after discussions with beck deraadt kettenis.


# 1.105 09-Jul-2014 dlg

dont try to be smart about avoiding the use of too many descriptors
when filling the rx ring. trust the hwm.

problem found by sthen@


# 1.104 08-Jul-2014 dlg

cut things that relied on mclgeti for rx ring accounting/restriction over
to using if_rxr.

cut the reporting systat did over to the rxr ioctl.

tested as much as i can on alpha, amd64, and sparc64.
mpi@ has run it on macppc.
ok mpi@


Revision tags: OPENBSD_5_5_BASE
# 1.103 30-Oct-2013 dlg

replace the workq bits to supply new tx pkt descriptors with a task.

tested locally on a dell poweredge 2950


# 1.102 23-Oct-2013 brad

Enable TX checksum offload.

ok naddy@ sthen@


Revision tags: OPENBSD_5_4_BASE
# 1.101 28-Mar-2013 brad

Let mii_attach() know where the PHY is located instead of scanning
for it since we know where it will be anyway and remove the code
from the MII bus read/write functions to force reading/writing
from the predetermined location. Copied from bge(4) and this is
what the upstream FreeBSD bce(4) driver has done once FreBSD
gained a mii_attach().

ok dlg@ sthen@


Revision tags: OPENBSD_5_3_BASE
# 1.100 13-Jan-2013 brad

Enable flow control support with 5708S/5709S adapters.

ok dlg@


# 1.99 10-Dec-2012 mikeb

Under some circumstances (currently only reproducible with IPsec)
bnx can be left w/o clusters on the receive ring and will stall.
To prevent that schedule a timeout if refill fails. Bug was
reported by jj@, fix tested by me, ok dlg


# 1.98 05-Dec-2012 deraadt

Remove excessive sys/cdefs.h inclusion
ok guenther millert kettenis


Revision tags: OPENBSD_5_2_BASE
# 1.97 05-Jul-2012 phessler

Add flow control to bnx(4)

Tested on 5706, 5708, 5709, 5716 chipsets.

From Brad

OK phessler@, sthen@, mikeb@,


# 1.96 14-May-2012 mikeb

fixup "couldn't establish interrupt" error printf; from brad, ok phessler


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE
# 1.95 22-Jun-2011 tedu

kill a few more casts that aren't helpful. ok krw miod


# 1.94 18-Apr-2011 dlg

ido not disable interrupts in the isr and then enable them again
when leaving. when you're handling an interrupt it is masked.
whacking the chip is work for no gain.

modify the interrupt handler so it only processes the rings once
rather than looping over them until it runs out of work to do

looping in the isr is bad for several reasons:

firstly, the chip does interrupt mitigation so you have a
decent/predictable amount of work to do in the isr. your first loop
will do that chunk of work (ie, it pulls off 50ish packets), and
then the successive looping aggressively pull one or two packets
off the rx ring. these extra loops work against the benefit that
interrupt mitigation provides.

bus space reads are slow. we should avoid doing them where possible
(but we should always do them when necessary).

doing the loop 5 times per isr works against the mclgeti semantics.
it knows a nic is busy and therefore needs more rx descriptors by
watching to see when the nic uses all of its descriptors between
interrupts. if we're aggressively pulling packets off by looping
in the isr then we're skewing this check.

ok deraadt@


# 1.93 13-Apr-2011 dlg

to quote from the gospel of bus_dma.9:

Synchronization operations are expressed from the perspective of the host
RAM, e.g., a device -> memory operation is a READ and a memory -> device
operation is a WRITE.

the status block that the isr reads is written to by the device.
the chip writes to memory, it is therefore a READ.

this also adds the preread sync when the map is set up and the postread
sync when the map is torn down for better symmetry. there are probably
more issues like this in the code, but this is a start.

discovered while discussing another diff with claudio@


# 1.92 05-Apr-2011 henning

mechanic rename M_{TCP|UDP}V4_CSUM_OUT -> M_{TCP|UDP}_CSUM_OUT
ok claudio krw


# 1.91 03-Apr-2011 jasper

use nitems(); no binary change for drivers that are compiled on amd64.

ok claudio@


Revision tags: OPENBSD_4_9_BASE
# 1.90 20-Sep-2010 deraadt

Stop doing shutdown hooks in network drivers where possible. We already
take all interfaces down, via their xxstop routines. Claudio and I have
verified that none of the shutdown hooks do much extra beyond what xxstop
was already doing; it is largely a pile of junk.
ok claudio, some early comments by sthen; also read by matthew, jsg


Revision tags: OPENBSD_4_8_BASE
# 1.89 03-Aug-2010 jsg

Correct use of logical and where binary and was intended.
Spotted by lint, but mirrors a similiar change in the
original FreeBSD code from over a year ago.

ok deraadt@


# 1.88 24-May-2010 sthen

Support fibre PHY on BCM5709S. From FreeBSD via Brad.
Tested by Brad on: BCM5706, BCM5708C
Tested by me on: BCM5716 (BCM5709 PHY)


# 1.87 19-May-2010 oga

BUS_DMA_ZERO instead of alloc, map, bzero.

ok krw@


Revision tags: OPENBSD_4_7_BASE
# 1.86 23-Nov-2009 claudio

bnx(4) is a bit special. The chip itself is capable of swapping endianess
so there is no need for htoleXX calls. The only thing needed is the correct
layout of the DMA-ed structures. Additionally it uses PAGE_SIZE but assumed
that it is always 4k. Fix the macros that failed to respect that so that it
works on 8k PAGE_SIZE systems. This makes bnx(4) work on sparc64.
Tested on amd64 by dlg@. OK dlg@, deraadt@


# 1.85 09-Nov-2009 dlg

Link state change interrupt was not generated due to a missing bit in
the MAC event register.

fix from atte dot peltomaki at iki dot fi
tested by me on 5708 and 5709


# 1.84 13-Aug-2009 jasper

- consistify cfdriver for the ethernet drivers (0 -> NULL)

ok dlg@


# 1.83 09-Aug-2009 deraadt

MCLGETI() will now allocate a mbuf header if it is not provided, thus
reducing the amount of splnet/splx dancing required.. especially in the
worst case (of m_cldrop)
ok dlg kettenis damien


# 1.82 06-Aug-2009 sthen

Add device id for BCM5716S, tidy whitespace. From Brad.


Revision tags: OPENBSD_4_6_BASE
# 1.81 03-Jul-2009 dlg

this is a rather large change to add support for the BCM5709.

the 5709s use a the b09 firmwares, which is different to the b06 used by
all the other chips supported by bnx. the majority of the diff comes from
special handling for some indirect reads and writes, and because it needs
more host memory to operate with.

ive tried to keep the cosmetic changes to a minimum.

"go for it" deraadt@


# 1.80 03-Jul-2009 dlg

newer bnx chips use a separate firmware to the "old" ones. this updates
the b06 firmware for the older chips, and adds the b09 firmware. there are
three variants of the rv2p code thats loaded onto the chips, so this has
been split out into separate firmware files as well.

the driver has been updated to handle the split firmwares, and to easily
allow loading of the different versions. this change only supports the
loading of the firmwares for the currently supported chips.

after this change you must build the new firmwares and install them as well
as your new kernel.

"go to it" deraadt@


# 1.79 20-Jun-2009 naddy

Rewrite the interface flag handling case code and update the receive
filter handling to take advantage of ac_multirangecnt and have correct
IFF_ALLMULTI handling. From Brad.


# 1.78 22-Apr-2009 dlg

dont need to zero the tx pkt pool structure before initting it now that
pool_init does its job properly.


# 1.77 22-Apr-2009 dlg

replace arrays of dmamaps and mbuf pointers used to manage packets
on the tx rings (one mbuf ptr/dmamap array entry was created for
every tx descriptor slot at attach time) with a dynamically grown
list of mbuf pointers and dmamaps.

bnx used to have 512 dmamaps/mbuf pointers for the tx ring, now my
system is running with 8 under moderate load.

the big bonus from this is that the dmamap handling is greatly
simplified.

reyk@ likes this a lot


# 1.76 20-Apr-2009 dlg

when transmitting packets, put the dmamap we used for the packet into the
last descriptor slot in the ring. the tx completion code expects the dmamap
to be there so it can unload it.

ok reyk@


# 1.75 20-Apr-2009 reyk

fix dma map unmapping and unloading in the tx cleanup path.

ok dlg@


# 1.74 14-Apr-2009 kettenis

Don't free an mbuf that's still on the TX queue. While there sanitize the
function signature of bnx_tx_encap() such that people don't get weird ideas
like this again.

ok dlg@


# 1.73 09-Apr-2009 dlg

white space fixes


# 1.72 30-Mar-2009 dlg

switch to MCLGETI.

this conversion is the easiest ive done so far. the mbuf allocation wrapper
in the driver already had code to handle a failing cluster allocator as
part of a test harness, now we test that code all the time with MCLGETI.

ok kettenis@
tested by phessler@


Revision tags: OPENBSD_4_5_BASE
# 1.71 28-Nov-2008 brad

Eliminate the redundant bits of code for MTU and multicast handling
from the individual drivers now that ether_ioctl() handles this.

Shrinks the i386 kernels by..
RAMDISK - 2176 bytes
RAMDISKB - 1504 bytes
RAMDISKC - 736 bytes

Tested by naddy@/okan@/sthen@/brad@/todd@/jmc@ and lots of users.
Build tested on almost all archs by todd@/brad@

ok naddy@


# 1.70 09-Nov-2008 naddy

Introduce bpf_mtap_ether(), which for the benefit of bpf listeners
creates the VLAN encapsulation from the tag stored in the mbuf
header. Idea from FreeBSD, input from claudio@ and canacar@.

Switch all hardware VLAN enabled drivers to the new function.

ok claudio@


# 1.69 19-Oct-2008 brad

Re-add support for RX VLAN tag stripping.


# 1.68 16-Oct-2008 naddy

Switch the existing TX VLAN hardware support over to having the
tag in the header. Convert TX tagging in the drivers.

Help and ok brad@


# 1.67 16-Oct-2008 naddy

Convert RX tag stripping to storing the tag in the mbuf header and
enable RX tag stripping for re(4).

ok brad@


# 1.66 02-Oct-2008 brad

First step towards cleaning up the Ethernet driver ioctl handling.
Move calling ether_ioctl() from the top of the ioctl function, which
at the moment does absolutely nothing, to the default switch case.
Thus allowing drivers to define their own ioctl handlers and then
falling back on ether_ioctl(). The only functional change this results
in at the moment is having all Ethernet drivers returning the proper
errno of ENOTTY instead of EINVAL/ENXIO when encountering unknown
ioctl's.

Shrinks the i386 kernels by..
RAMDISK - 1024 bytes
RAMDISKB - 1120 bytes
RAMDISKC - 832 bytes

Tested by martin@/jsing@/todd@/brad@
Build tested on almost all archs by todd@/brad@

ok jsing@


# 1.65 10-Sep-2008 blambert

Convert timeout_add() calls using multiples of hz to timeout_add_sec()

Really just the low-hanging fruit of (hopefully) forthcoming timeout
conversions.

ok art@, krw@


Revision tags: OPENBSD_4_4_BASE
# 1.64 24-Jun-2008 brad

Fixed a problem that would cause errors (especially when in low memory
systems) because the RX chain was corrupted when an mbuf was mapped to
an unexpected number of buffers.

From davidch@FreeBSD


# 1.63 13-Jun-2008 brad

fix compilation with BNX_DEBUG.


# 1.62 13-Jun-2008 brad

Remove slack space for RX/TX chains since it only covers sloppy coding.

From davidch @ FreeBSD


# 1.61 08-Jun-2008 reyk

don't declare foo_driver_version[] strings and turn them into defines,
nothing uses them and it saves a few bytes in the kernel.

ok claudio@


# 1.60 29-May-2008 brad

- Add a debug message to mention when a 2.5Gb adapter is found.
- Change invalid PHY address debug message in bnx_miibus_write_reg()
from warn level to verbose.
- Add two new softc fields and store the shared and port hw config
data.

From FreeBSD

ok dlg@


# 1.59 23-May-2008 brad

Simplify the combination use of pci_mapreg_type()/pci_mapreg_map() as
suggested by dlg@ awhile ago.

ok dlg@


Revision tags: OPENBSD_4_3_BASE
# 1.58 28-Feb-2008 brad

Add initial bits for fiber support with the BCM5706/BCM5708 chipsets.

Tested with copper adapters by brad@, johan@ and Jung <moorang at gmail dot com>

ok dlg@


# 1.57 22-Feb-2008 kettenis

Avoid unaligned PCI config space access.

ok brad@


# 1.56 17-Feb-2008 brad

Remove the check for non-production bnx(4) chipsets. These chipsets are
not officially "supported" and could have errata which the driver does
not workaround but they should more or less work.

Tested by marco@ with a BCM5708 B0 chipset.

ok marco@ dlg@


# 1.55 25-Nov-2007 dlg

IF_Gbps(2.5) is wrong.

ok claudio@


# 1.54 28-Aug-2007 deraadt

unify firmware load failure messages; ok mglocker


Revision tags: OPENBSD_4_2_BASE
# 1.53 04-Jul-2007 krw

Revert r1.42 of if_bnx.c, "Enable IPv4 transmit TCP/UDP checksum
offload", and associated man page change. To use IPv4 transmit TCP/UDP
checksum offloading you must again define BNX_CSUM.

As requested by mbalmer@ via deraadt@ on suggestion of reyk@ in
response to PR #5437.


# 1.52 22-May-2007 reyk

Add the BCM5709 PCI device Id. It is disabled for now since we do not
support SerDes-based (1000base-SX fibre) bnx(4) devices yet. The
reason is simple - we do not have any fibre bnx(4) to test and port
the SerDes changes from the other bnx drivers.

From brad found in the Linux driver


# 1.51 22-May-2007 jasper

adress -> address

from brad
ok claudio@


# 1.50 22-May-2007 ray

Use BNX_PRINTF instead of printf with missing argument.

OK reyk@, earlier version OK tedu@, dlg@, and miod@.


# 1.49 21-May-2007 reyk

fix bnx vlan tagging in the rx path; do not attach the vlan tag twice
if the firmware has been told to keep it and copy the tag in network
byte order in the other case.

ok mcbride@ dlg@


Revision tags: OPENBSD_4_1_BASE
# 1.48 05-Mar-2007 reyk

remove jumbo frame support by replacing MEXTALLOC with MCLGET, and
simplify the VLAN code.

this will close PR 5356 (system panics under high load).

From claudio@ who is currently not around to commit this fix

tested and ok by mcbride@, reyk@, todd@, Paul Hirsch, and brad


# 1.47 03-Mar-2007 reyk

instead of establishing the interrupt in the mounthook, move it back
to the attach function and set a flag in the mounthook to start
accepting interrupts (there are possible problems with establishing
interrupts after the ioapics are enabled in i386 GENERIC.MP).

also suggested by kettenis
tested by mcbride, me, and some others
ok dlg@


# 1.46 03-Mar-2007 todd

Replacing some spaces with tabs and some typo fixes
from brad@


# 1.45 02-Mar-2007 reyk

oops, this is $OpenBSD$


# 1.44 02-Mar-2007 reyk

- remove the code to bring down the PHY in bnx_stop(), it's wrong
(ifm_data isn't updated) and lead to a panic in mii_phy_setmedia(),
or reading past the end mii_media_table[].
- make sure the dma_map matches the mbuf in the rx structures. We would
sync/unload the wrong map, leading to a DIAGNOSTIC panic, or eventually
leaking memory when bounce buffers are needed.

From NetBSD

ok marco@, brad@


# 1.43 30-Jan-2007 krw

Allow the bnx(4) driver to make use of all of the available hardware
multicast hash slots. The bnx(4) hardware supports 8 slots instead of
4 like the bge(4) hardware.

From Mike Karels via FreeBSD

Tested by Brad, biorn@ and Johan M:son Lindman


# 1.42 27-Jan-2007 krw

Enable transmit TCP/UDP checksum offload.

From Brad, tested by Brad, biorn@ and Johan M:son Lindman.


# 1.41 21-Jan-2007 mcbride

Remove bogus check for old firmware.

Identical fixes from myself and brad@, also reported by chefren@pi.net.


# 1.40 20-Jan-2007 dlg

move the interrupt establishment till after everything in the softc is
set up and allocated (which happens in a mountroothook). this prevents an
early call to the interrupt handler from causing a null deref when trying
to look into the unallocated regions.

found by mcbride when ciss and bnx were sharing an interrupt. mounting
root caused interrupts before the bnx was properly set up.

"commit your fix" mcbride@


# 1.39 19-Jan-2007 mcbride

bnx_init() takes a pointer to sc, not ifp.


# 1.38 10-Jan-2007 deraadt

change firmware byte order to be same on all architectures
THIS MEANS YOU NEED TO UPDATE YOUR FIRMWARE FILE BEFORE BOOTING WITH
A NEW KERNEL
tested by marco, biorn


# 1.37 24-Dec-2006 reyk

use the right size when loading the rx/tx descriptor bus dma maps.

from the NetBSD port

tested by bion@ and others from tech@
ok marco@ brad@


# 1.36 26-Nov-2006 brad

commented out entry for the BCM5709.


# 1.35 20-Nov-2006 brad

only try to do HW checksum offload for TCP and UDP.


# 1.34 20-Nov-2006 brad

Due to an incorrect macro, it appears that the driver has always been
accidentally truncating off the VLAN tag field in the TX descriptor. Fix
this by splitting up the vlan_tag and flags fields into separate fields,
and handling them appropriately.

From scottl@FreeBSD


# 1.33 19-Nov-2006 brad

In bnx_start, check the used_tx_bd count rather than the descriptors
mbuf pointer to see if the transmit ring is full. The mbuf pointer
is set only in the last descriptor of a multi-descriptor packet.
By relying on the mbuf pointers of the earlier descriptors, the
driver would sometimes overwrite a descriptor belonging to a
packet that wasn't completed yet. Also, tx_chain_prod wasn't
updated inside the loop, causing the wrong descriptor to be checked
after the first iteration. The upshot of all this was the loss of
some transmitted packets at medium to high packet rates.

In bnx_tx_encap, remove a couple of old statements that shuffled
around the tx_mbuf_map pointers. These now correspond 1-to-1 with
the transmit descriptors, and they are not supposed to be changed.

Correct a couple of inaccurate comments.

From jdp@FreeBSD


# 1.32 26-Oct-2006 brad

do the minimal initialization of the firmware so that ASF always
works.

From ambrisko@FreeBSD


# 1.31 25-Oct-2006 brad

replace a few more instances of hand rolled code with the
LIST_FOREACH macro.


# 1.30 22-Oct-2006 brad

now with the right revision of this diff which compiles. ok pedro, mglocker.

- Ensure that at least 16 TX descriptors are kept unused in the ring.
- Use more complete error handling for TX load problems.

From scottl@FreeBSD


# 1.29 21-Oct-2006 deraadt

does not compile


# 1.28 21-Oct-2006 brad

- Ensure that at least 16 TX descriptors are kept unused in the ring.
- Use more complete error handling for TX load problems.

From scottl@FreeBSD


# 1.27 19-Oct-2006 brad

make the exit label naming scheme match the current function names, removes
a FreeBSD-ism from the original driver.


# 1.26 19-Oct-2006 brad

Overhaul the transmit path:
- Eliminate the bnx_dmamap_arg structure.
- Refactor the loop that fills the buffer descriptor so that it can be done
with a single set of logic in a single loop instead of two sets of logic.
- Eliminate the need to cache and pass descriptor indexes between the start
loop and the encap function.
- Change the start loop to always check the ifnet sendq for more work.

From scottl@FreeBSD


# 1.25 14-Oct-2006 brad

- Simplify the arguments to bnx_tx_encap.
- Don't copy the bd_chain head pointers into temporary objects, they are
available globally.

From scottl@FreeBSD


# 1.24 04-Oct-2006 deraadt

Use loadfirmware(9) to get /etc/firmware/bnx instead of hard-coding a
gigantic firmware into the kernel; checked by brad


Revision tags: OPENBSD_4_0_BASE
# 1.23 25-Aug-2006 brad

don't need to clear if_timer during attach.


# 1.22 21-Aug-2006 deraadt

ramdisks do not have vlan, drop mbuf; ok brad


# 1.21 21-Aug-2006 brad

simplfy code a bit and fix comments, this is the MRU being set not the
MTU.


# 1.20 21-Aug-2006 brad

enable Jumbo support.


# 1.19 20-Aug-2006 brad

remove a comment.


# 1.18 20-Aug-2006 brad

cosmetic tweaks.


# 1.17 20-Aug-2006 brad

- replace IF_DEQUEUE/IF_PREPEND with IFQ_POLL/IFQ_DEQUEUE.
- enable RX checksum offload.
- remove some unused code.


# 1.16 19-Aug-2006 brad

set the capabilities VLAN MTU flag.


# 1.15 14-Aug-2006 marco

And some more KNF.


# 1.14 14-Aug-2006 marco

KNF


# 1.13 14-Aug-2006 marco

More KNF; no functional change.


# 1.12 14-Aug-2006 marco

First in a series of KNF. No functional change.


# 1.11 14-Aug-2006 brad

disable debugging.


# 1.10 14-Aug-2006 marco

Change bus_dmamap_create to use the appropriate values. This fixes the
issues brad was seeing. Help from jason.

ok brad.


# 1.9 13-Aug-2006 marco

Get rid of _HI & _LO macros altogether since they used a wrong idiom.
This was pointed out by mickey The driver now uses the same idiom as mpi.

help from miod, mickey and kettenis

ok brad


# 1.8 13-Aug-2006 brad

fix a typo, BNX_DRBUG -> BNX_DEBUG


# 1.7 10-Aug-2006 brad

unmap memory address space in bnx_release_resources().


# 1.6 10-Aug-2006 brad

cosmetic tweaking.


# 1.5 10-Aug-2006 brad

remove typedef's.


# 1.4 09-Aug-2006 marco

Reorder dmamap & dmamem to match man page.
Redo detection of _LO & _HI macro; help from miod and jordan.
ok beck brad


# 1.3 26-Jun-2006 brad

do not allow a Jumbo size MTU yet.


# 1.2 26-Jun-2006 brad

relocate the firmware per Theo's request.


# 1.1 26-Jun-2006 brad

Add a rough initial port of the bce driver from FreeBSD, which provides
support for the new line of Broadcom NetXtreme II Gigabit PCI-X and PCIe
controllers, though renamed to bnx. This is work in progress, there
are some known issues. With help from Reyk with the bus_dma code.

Thanks to David Christensen at Broadcom for the driver and for providing
some PCI-X and PCIe adapters.

ok deraadt@