History log of /openbsd-current/sys/dev/pci/if_bwfm_pci.c
Revision Date Author Comments
# 1.76 24-May-2024 jsg

remove unneeded includes; ok miod@


Revision tags: OPENBSD_7_3_BASE OPENBSD_7_4_BASE OPENBSD_7_5_BASE
# 1.75 30-Dec-2022 kettenis

Add chip name for new revision of the BCM4378.

ok patrick@


# 1.74 10-Nov-2022 kettenis

We need to turn a few more things on in the resume path. This makes it
possible to ifconfig down the interface, suspend/resume and ifconfig up the
interface again afterwards in most cases. Suspend/resume with the interface
up is still busted.

ok patrick@, stsp@


# 1.73 08-Nov-2022 kettenis

Implement alternative mailbox handling mechanism required by newer firmware.

ok patrick@


# 1.72 23-Oct-2022 kettenis

Bump tsleep timeout. For some reason the first attempt to load the firmware
sometimes fails. This happens more often on M2 laptops that also need to
load the touchpad firmware. Smells like we have some sort of thundering herd
at mountroot time which makes this take more time.

ok patrick@


Revision tags: OPENBSD_7_1_BASE OPENBSD_7_2_BASE
# 1.71 21-Mar-2022 kettenis

Reduce dmesg spam by not printing the "Apple" firmware name.

ok patrick@


# 1.70 11-Mar-2022 mpi

Constify struct cfattach.


# 1.69 06-Mar-2022 kettenis

Look for firmware for Apple Silicon devices in /etc/firmware/apple-bwfm.

ok deraadt@


# 1.68 04-Mar-2022 kettenis

Add support for the BCM4387. The firmware for this variant uses a new scan
command, which is indicated by the "scan_ver" firmware variable.

ok patrick@


# 1.67 02-Mar-2022 kettenis

The firmware for the bwfm(4) variants in Apple Silicon Macs has variants
for different module types, module vendors and module revisions. Make
our driver use the same naming scheme as Asahi Linux.

ok patrick@


# 1.66 01-Jan-2022 patrick

Use correct defines for random seed magic/length.

Spotted by Andreas Schnebinger


# 1.65 31-Dec-2021 patrick

Newer Apple firmware on chipsets without a hardware RNG requires the host to
provide a buffer of random bytes to the device on initialization.


# 1.64 27-Dec-2021 patrick

Not only BCM4378, but all PCIe core revisions >= 64 need to be accessed
using the new sets of registers.


# 1.63 27-Dec-2021 patrick

Map the chip ids used on Apple M1 Pro/Max and Apple T2 Macs to firmware
names.


# 1.62 27-Dec-2021 patrick

Support reading OTP information from a few more chips, necessary to learn
firmware names on Apple M1 Pro/Max and Apple T2 Macs.


# 1.61 27-Dec-2021 patrick

Send TxCap and WiFi calibration blobs to the chip.


# 1.60 27-Dec-2021 patrick

Switch module codename retrieval to use the newly proposed device tree
bindings.


# 1.59 27-Dec-2021 patrick

Bump rxpost and rxcomplete ring size to 1024 for newer chips.


# 1.58 20-Dec-2021 patrick

bus_dmamem_unmap() should not be called from interrupt context, so free
and close flowrings using bwfm_do_async().

Reported by and ok kettenis@


# 1.57 23-Oct-2021 kettenis

Make sure we have enough space to add padding and final token to the nvram
data. Also add the MAC address to the nvram data when there is a
"local-mac-address" property in the device tree. This makes bwfm(4) work
with the firmware/nvram/clm_blob files provided with MacOS on the Apple
M1 Macs.

ok patrick@


Revision tags: OPENBSD_7_0_BASE
# 1.56 31-Aug-2021 patrick

Implement suspend/resume for bwfm(4) with PCIe backend. We try to send the
device into D3 and do a hot-resume if possible. Otherwise we need to clean
up the resources to allow complete HW re-initialization to take place.


# 1.55 31-Aug-2021 patrick

Properly deallocate some more structures upon detach, and make sure we're
not considered initialized anymore.


# 1.54 31-Aug-2021 patrick

Initialize some struct variables to make sure that upon reinit, caused by
a suspend/resume cycle, the values are set to a sane default.


# 1.53 31-Aug-2021 patrick

Initialize ring read/write pointers to make sure that upon reinit, caused
by a suspend/resume cycle, the pointers are set to a sane default.


# 1.52 22-Jun-2021 patrick

bwfm(4) on PCI isn't really MPSAFE, and I'm not sure how this flag
even got there in the first place. I've been wondering why I have
seen a bit of mbuf corruption here and there since I put the bwfm(4)
M.2 PCIe card into my arm64 machine. Well, duh.


Revision tags: OPENBSD_6_9_BASE
# 1.51 26-Feb-2021 patrick

Read and parse OTP on the BCM4378. There are quite a few firmware and
nvram files used for the different Apple devices. The device tree and
the OTP hold the information which of those we will have to use. For
now this information will simply be printed, but depending on how we
choose to do the firmware distribution we could use it for loadfirmware().


# 1.50 26-Feb-2021 patrick

Attach to BCM4378.


# 1.49 26-Feb-2021 patrick

Add support for BCM4378 as implemented on the Apple M1. This chip seems
to use a different set of PCIE2REG registers. Accessing the "old" ones
even leads to faults. There are two surprises though. One is that it
seems that the interrupt status register always returns 0, and the other
one is that we receive the interrupts way too early, but both can be
worked around for now.


# 1.48 26-Feb-2021 patrick

Increase the amount of RX buffers given to the bwfm(4) chip. We have seen
this already on previous chips, which only started giving us packets when
handing over at least 128 of them. Apparently some now require 256, which
seems to get the Apple M1's WiFi going.


# 1.47 26-Feb-2021 patrick

Increase the buffer size for the ioctl response buffers to the same as
used in the wifi firmware to ensure responses can be received.


# 1.46 26-Feb-2021 patrick

Indicate hostready signal to inform the firmware that the rings have been
initialized.


# 1.45 26-Feb-2021 patrick

Refactor bwfm(4) firmware loading. The PCIe backend will need to be able
to load the CLM blob like the SDIO backend already does. Additionally it
is also helpful for the PCIe backend to try a file named after the device
tree compatible. Thus refactor the SDIO code and make it available for
both SDIO and PCIe.


# 1.44 26-Feb-2021 patrick

Fix prio2fifo mapping table.


# 1.43 25-Feb-2021 patrick

The firmware replaces the last 32 bits of RAM with a shared DRAM address.
While the for-loop checks that this value has changed since we wrote to
it, the timeout-condition checked for non-zero, which is wrong. This
means that we didn't realize the firmware wasn't started. While there,
make sure the shared DRAM address is inside the chip's address space.
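
The check described above can be sketched as follows. This is a minimal
illustration, not the driver's code: the function name fw_started() and the
parameters sentinel, ram_base and ram_size are hypothetical. It assumes the
host wrote a known value to the last word of RAM and read it back once the
polling loop ended. Testing only for non-zero would wrongly accept an
unchanged non-zero sentinel.

```c
#include <stdint.h>

/*
 * Hypothetical helper: decide whether the firmware has started.
 * sentinel - value the host wrote to the last 32 bits of RAM
 * readback - value read from that location after polling ended
 * ram_base, ram_size - assumed bounds of the chip's address space
 * Returns 1 if the firmware replaced the sentinel with a plausible
 * shared address, 0 otherwise.
 */
static int
fw_started(uint32_t sentinel, uint32_t readback, uint32_t ram_base,
    uint32_t ram_size)
{
	/* Compute the end in 64 bits so base + size cannot wrap. */
	uint64_t end = (uint64_t)ram_base + ram_size;

	/* Unchanged value: the firmware never wrote its address. */
	if (readback == sentinel)
		return 0;

	/* Changed, but pointing outside the chip's address space. */
	if (readback < ram_base || (uint64_t)readback >= end)
		return 0;

	return 1;
}
```

Comparing against the written value rather than against zero is what lets
the timeout case be detected.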


# 1.42 25-Feb-2021 patrick

Some newer chips have two D11/802.11 cores, and we need to reset both at
the same time.


# 1.41 25-Feb-2021 patrick

Support for version 7 of the bwfm(4) PCIe interface. The size of the items
on the rx/tx complete rings has increased slightly to accommodate possible
new features.


# 1.40 25-Feb-2021 dlg

we don't have to cast to caddr_t when calling m_copydata anymore.

the first cut of this diff was made with coccinelle using this spatch:

@rule@
type caddr_t;
expression m, off, len, cp;
@@
-m_copydata(m, off, len, (caddr_t)cp)
+m_copydata(m, off, len, cp)

i had to fix its opinionated idea of formatting by hand though, so
i'm not sure it was worth it.

ok deraadt@ bluhm@


# 1.39 31-Jan-2021 patrick

Add basic support for BCM4378 as found on the Apple M1 SoCs. There's a
little bit more to do though before it can be enabled.


# 1.38 12-Dec-2020 jan

Rename the macro MCLGETI to MCLGETL and remove the dead parameter ifp.

OK dlg@, bluhm@
No Opinion mpi@
Not against it claudio@


Revision tags: OPENBSD_6_8_BASE
# 1.37 22-Jun-2020 dlg

use ifiq_input and use its return value to apply backpressure to rxrs.

this is a step toward deprecating softclock based livelock detection.


Revision tags: OPENBSD_6_7_BASE
# 1.36 07-Mar-2020 patrick

Use snprintf(9) to create the names for the firmware and NVRAM files. This
reduces the amount of duplicated lines per chip, and allows us to ship per-
board files in the future.

Based on a diff from jsg@
ok kurt@


# 1.35 06-Mar-2020 patrick

Process the NVRAM in bwfm(4) itself. So far we have relied on some
external tool to pre-process the NVRAM, even though it's simple to
do ourselves. This allows easier firmware distribution, especially
since on some x86 machines the NVRAM is stored in an EFI variable.


# 1.34 25-Feb-2020 patrick

Make bwfm(4) call if_input() only once per interrupt.

This reduces drops caused by the ifq pressure drop mechanism and hence
increases throughput.

ok tobhe@


# 1.33 15-Jan-2020 patrick

Sprinkle splnet() around the ringbuffer accesses, otherwise the
task and interrupt try to concurrently submit messages on the
control ring.


# 1.32 15-Jan-2020 patrick

Some PCIe firmwares drop TX packets when the pktid is 0. Add
an offset to make sure they start from 1.


# 1.31 15-Jan-2020 patrick

Fix off-by-one in ringbuffer code. When we insert items faster than
the hardware is processing them, the write index can catch up to the
read index. We must make sure that our write index stays smaller
than the hardware's read index, thus the difference between both has
to be bigger than 1.

ok tobhe@
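
The rule described in the 1.31 entry can be illustrated with a generic ring
that always keeps one slot empty, so that the write index can never catch up
to the read index. This is a sketch under assumptions, not the driver's
implementation: struct ring, ring_space() and ring_reserve() are hypothetical
names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical ring descriptor; not the driver's actual structure. */
struct ring {
	uint16_t size;	/* number of slots */
	uint16_t w_idx;	/* next slot the host writes */
	uint16_t r_idx;	/* next slot the device reads */
};

/*
 * Slots the host may still fill. One slot is always left empty, so
 * w_idx == r_idx only ever means "ring empty", never "ring full".
 */
static uint16_t
ring_space(const struct ring *r)
{
	assert(r->size > 1 && r->w_idx < r->size && r->r_idx < r->size);
	return (uint16_t)((r->r_idx + r->size - r->w_idx - 1) % r->size);
}

/* Reserve one slot; returns its index, or -1 if the ring is full. */
static int
ring_reserve(struct ring *r)
{
	int slot;

	if (ring_space(r) == 0)
		return -1;
	slot = r->w_idx;
	r->w_idx = (uint16_t)((r->w_idx + 1) % r->size);
	return slot;
}
```

With four slots, three reservations succeed and the fourth is refused until
the reader advances.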


# 1.30 09-Jan-2020 mpi

Convert sleeps of 1sec or more to tsleep_nsec(9).

ok bluhm@


Revision tags: OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.29 07-Feb-2019 patrick

Consistently use m_freem(9). This fixes possible leaks in a few
error cases.


# 1.28 17-Jan-2019 mlarkin

Enable bwfm(4) in RAMDISK_CD

ok deraadt


Revision tags: OPENBSD_6_4_BASE
# 1.27 20-Aug-2018 patrick

Attach bwfm(4) to Broadcom BCM4371.

ok kettenis@


# 1.26 25-Jul-2018 patrick

Implement a MSGBUF control packet mechanism based on the command
request ids. So far we were only able to have one command in flight
at a time and race conditions could easily lead to unexpected
behaviour. With this rework we send and enqueue a control packet
command and wait for replies to happen. Thus we can have multiple
control packets in flight and a reply with the correct id will wake
us up.


# 1.25 06-Jul-2018 patrick

Add bus_dmamap_sync(9) calls to bwfm(4) so that we make sure the data
is synced properly before the CPU or the WiFi chip access the supplied
memory. Makes PCIe-connected bwfm(4) work on ARM-based machines.


# 1.24 05-Jul-2018 patrick

Cast physical addresses to 64-bits so we can shift them by 32-bit on
32-bit platforms without the compiler complaining. In the end the
value will turn out as 0 anyway. Allows enabling bwfm(4) on 32-bit
platforms.

ok stsp@
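
The cast described in the 1.24 entry can be sketched like this. The helper
names addr_lo() and addr_hi() are hypothetical and only illustrate the idea:
once the address is widened to 64 bits, shifting right by 32 is well defined
even when the platform's physical address type is only 32 bits wide, and the
high half simply comes out as 0.

```c
#include <stdint.h>

/* Hypothetical helpers: split a DMA address into two 32-bit halves. */

/* Low 32 bits of the address. */
static uint32_t
addr_lo(uint64_t pa)
{
	return (uint32_t)(pa & 0xffffffffULL);
}

/* High 32 bits; 0 for any address that fits in 32 bits. */
static uint32_t
addr_hi(uint64_t pa)
{
	return (uint32_t)(pa >> 32);
}
```

A caller on a 32-bit platform would pass its address as
addr_hi((uint64_t)pa), which avoids shifting a 32-bit value by its full
width.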


# 1.23 07-Jun-2018 patrick

Attach bwfm(4) to the Broadcom 4356 found in the GPD Pocket.

Tested by mlarkin@


# 1.22 07-Jun-2018 patrick

Some PCIe-based bwfm(4) chips also require that we supply an NVRAM
binary. In case we have an (optional) NVRAM binary, copy it to the
end of the chip's memory.

Tested by mlarkin@ on his GPD Pocket.


# 1.21 23-May-2018 patrick

Implement a separate initialization stage so that we can still use
and initialize bwfm(4) later in the case that the firmware was not
available on bootup and was only later installed.

ok stsp@


# 1.20 23-May-2018 patrick

Map the second bwfm(4) BAR first. The bwfm(4) PCIe devices have two
BARs, where the second one is much larger than the first. Both need
to be properly aligned in the given extent. Since the first one is
smaller, it will "unalign" the next free space and thus create a gap
so that the second BAR cannot be properly aligned in the given space.
By mapping the second BAR first, it will automatically have proper
alignment. The first BAR, which has fewer alignment requirements,
fits well after the initial allocation. Fixes bwfm(4) on APU 1.

Debugged and solved by kettenis@


# 1.19 16-May-2018 patrick

Implement a BCDC control packet mechanism based on the command request
ids. So far we were only able to have one command in flight at a time
and race conditions could easily lead to unexpected behaviour,
especially combined with a slow bus and timeouts. With this rework we send or
enqueue a control packet command and wait for replies to happen. Thus
we can have multiple control packets in flight and a reply with the
correct id will wake us up.


Revision tags: OPENBSD_6_3_BASE
# 1.18 08-Feb-2018 patrick

Move bwfm(4) from ifq begin/commit/rollback semantics to the newer
ifq dequeue semantics. This basically means we need to check for
available space before dequeuing a packet. As soon as we dequeue
a packet we commit to it. On the PCIe backend this check can not
be done easily since the flowring depends on the packet contents and
we cannot take a peek. When there is no flowring we cache the mbuf
and send it out as soon as the flowring has opened up. Then the ifq can
be restarted and traffic can flow. Typically we run out of
packet ids, which can be checked without consulting the packet. The
flowring probably never becomes full as the bwfm(4) firmware takes
the packets off the ring without actually sending them out.

Discussed with dlg@


# 1.17 07-Feb-2018 patrick

Move parsing the BCDC header on RX into a protocol specific RX
function so it can be shared with the SDIO attachment driver.


# 1.16 11-Jan-2018 patrick

The PCI bwfm(4) chips have no TX rings in the traditional sense, as on
the actual rings we only share messages. Sending a TX packet means
putting a message on the ring which contains a pktid (which for us maps
to an mbuf) and the physical address of the mbuf. On jcs@'s macbook he
seems to run out of TX pktids pretty quickly during a speedtest. This
would mean that there are 2048 TX packets in flight that we either want
to send out or that have not been "acked" by the firmware yet. Either
way, recover from that situation when we hit that arbitrary limit by
restarting the queue after we free'd a packet from the TX pktid list.

Tested by jcs@


# 1.15 10-Jan-2018 jcs

Attach bwfm to the Broadcom 4350 found in the 2017 MacBook.

Easily handles >150Mbps transfers through a 5GHz AP.

ok patrick

(Committed via bwfm0, of course)


# 1.14 10-Jan-2018 patrick

Add firmware names for the two revisions of the Broadcom 4350 as seen
on a MacBook 12-inch (2017).

Tested by and with jcs@


# 1.13 10-Jan-2018 patrick

Don't reset the internal memory core on chips other than the Broadcom
43602, as it's only necessary on that specific chip.

Found the hard way by jcs@ on a MacBook 12-inch (2017)


# 1.12 10-Jan-2018 patrick

Move line for readability.


# 1.11 08-Jan-2018 patrick

In AP mode multicast packets share the flowrings with broadcast
packets.


# 1.10 08-Jan-2018 patrick

The bwfm(4) TX ring expects the ethernet header as part of the TX info
struct. The data length is the length of the frame without the header.
In the previous version m_adj(9) was used, but since that was changed we
need to decrease the length ourselves.


# 1.9 08-Jan-2018 patrick

Guard the debug printf function behind BWFM_DEBUG as well. Also only
print the firmware's dmesg(8) if we're running with a higher debug
mode.

Prompted by Michael W. Bombardieri


# 1.8 08-Jan-2018 patrick

Delete flowrings when we take the interface down or change its
settings.


# 1.7 07-Jan-2018 patrick

Create multiple transmit flowrings in station mode, four in total, based
on TOS values. In AP mode create multiple flowrings per connected node.


# 1.6 05-Jan-2018 patrick

To send out packets we need to create a flowring. Acting as station,
we typically have about four flowrings per priority. As access point
we apparently need one, or four considering the priorities, flowrings
per client. For now let's start with a single TX flowring. To setup
a flowring we need to send a create request and can only start sending
packets as soon as we are told that the ring is created. With this we
can now do actual network traffic.


# 1.5 03-Jan-2018 patrick

Since the PCI attachment code already uses mbufs for RX packets, we can
push the mbuf allocation down into the USB attachment code and now pass
an mbuf to the bwfm(4) receive function.


# 1.4 03-Jan-2018 patrick

Add size for free(9) in the bwfm(4) PCI attachment code.

From Michael W. Bombardieri


# 1.3 01-Jan-2018 patrick

For whatever reason the firmware needs more RX buffers available than
we typically use, which unfortunately creates a bigger memory
footprint. With this the receive path can be made to work.


# 1.2 01-Jan-2018 patrick

Put the code that prints the firmware's debug console into a function
so we can read and print the messages printed by the firmware when we
are debugging the driver.


# 1.1 24-Dec-2017 patrick

Add a PCI attachment driver for bwfm(4). It's not finished, but it's
already past the point where development can occur out of the tree.
With this I can successfully scan for access points and tell the chip
to attach to an SSID. RX path should work as well, but since I forgot
to bring the antenna with me to my parents, the reception is a bit
horrible in the metal enclosure.

There are a few reasons this driver is rather big. First we set up the
ARM Cores, uploading the firmware and kicking it off. Then we need to
read all needed information from the registers. Once that is done we
have to set up countless buffers. There are 2 TX rings and 3 RX rings,
plus N TX rings for the actual data that is yet to be implemented.

Merry Christmas!

ok kettenis@


# 1.75 30-Dec-2022 kettenis

Add chip name for new revision of the BCM4378.

ok patrick@


# 1.74 10-Nov-2022 kettenis

We need to turn a few more things on in the resume path. This makes it
possible to ifconfig down the interface suspend/resume and ifconfig up the
interface again afterwards in most cases. Suspend/resume with the interface
up is still busted.

ok patrick@, stsp@


# 1.73 08-Nov-2022 kettenis

Implement alternative mailbox handling mechanism required by newer firmware.

ok patrick@


# 1.72 23-Oct-2022 kettenis

Bump tsleep timeout. For some reason the first attempt to load the firmware
sometimes fails. This happens more often on M2 laptops that also need to
load the touchpad firmware. Smells like we have some sort of thundering herd
at mountroot time which makes this take more time.

ok patrick@


Revision tags: OPENBSD_7_1_BASE OPENBSD_7_2_BASE
# 1.71 21-Mar-2022 kettenis

Reduce dmesg spam by nor printing the "Apple" firmware name.

ok patrick@


# 1.70 11-Mar-2022 mpi

Constify struct cfattach.


# 1.69 06-Mar-2022 kettenis

Look for firmware for Apple Silicon devices in /etc/firmware/apple-bwfm.

ok deraadt@


# 1.68 04-Mar-2022 kettenis

Add support for the BCM4387. The firmware for this variant uses a new scan
command, which is indicated by the "scan_ver" firmware variable.

ok patrick@


# 1.67 02-Mar-2022 kettenis

The firmware for the bwfm(4) variants in Apple Silicon Macs has variants
for different module types, module vendors and module revisions. Make
our driver use the same naming scheme as Asahi Linux.

ok patrick@


# 1.66 01-Jan-2022 patrick

Use correct defines for random seed magic/length.

Spotted by Andreas Schnebinger


# 1.65 31-Dec-2021 patrick

Newer Apple firmware on chipsets without a hardware RNG require the host to
provide a buffer of random bytes to the device on initialization.


# 1.64 27-Dec-2021 patrick

Not only BCM4378, but all PCIe core revisions >= 64 need to be accessed
using the new sets of registers.


# 1.63 27-Dec-2021 patrick

Map the chip ids used on Apple M1 Pro/Max and Apple T2 Macs to firmware
names.


# 1.62 27-Dec-2021 patrick

Support reading OTP information from a few more chips, necessary to learn
firmare names on Apple M1 Pro/Max and Apple T2 Macs.


# 1.61 27-Dec-2021 patrick

Send TxCap and WiFi calibration blobs to the chip.


# 1.60 27-Dec-2021 patrick

Switch module codename retrieval to use the newly proposed device tree
bindings.


# 1.59 27-Dec-2021 patrick

Bump rxpost and rxcomplete ring size to 1024 for newer chips.


# 1.58 20-Dec-2021 patrick

bus_dmamem_unmap() should not be called from interrupt context, so free
and close flowrings using bwfm_do_async().

Reported by and ok kettenis@


# 1.57 23-Oct-2021 kettenis

Make sure we have enough space to add padding and final token to the nvram
data. Also add the MAC address to the nvram data when there is a
"local-mac-address" property in the device tree. This makes bwfm(4) work
with the firmware/nvram/clm_blob files provided with MacOS on the Apple
M1 Macs.

ok patrick@


Revision tags: OPENBSD_7_0_BASE
# 1.56 31-Aug-2021 patrick

Implement suspend/resume for bwfm(4) with PCIe backend. We try to send the
device into D3 and do a hot-resume if possible. Otherwise we need to clean
up the resources to allow complete HW re-initialization to take place.


# 1.55 31-Aug-2021 patrick

Properly deallocate some more structures upon detach, and make sure we're
not considered initialized anymore.


# 1.54 31-Aug-2021 patrick

Initialize some struct variables to make sure that upon reinit, caused by
a suspend/resume cycle, the values are set to a sane default.


# 1.53 31-Aug-2021 patrick

Initialize ring read/write pointers to make sure that upon reinit, caused
by a suspend/resume cycle, the pointers are set to a sane default.


# 1.52 22-Jun-2021 patrick

bwfm(4) on PCI isn't really MPSAFE, and I'm not sure how this flag
even got there in the first place. I've been wondering why I have
seen a bit of mbuf corruption here and there since I put the bwfm(4)
M.2 PCIe card into my arm64 machine. Well, duh.


Revision tags: OPENBSD_6_9_BASE
# 1.51 26-Feb-2021 patrick

Read and parse OTP on the BCM4378. There are quite a few firmware and
nvram files used for the different Apple devices. The device tree and
the OTP hold the information which of those we will have to use. For
now this information will simply be printed, but depending on how we
choose to do the firmare distribution we could use it for loadfirmware().


# 1.50 26-Feb-2021 patrick

Attach to BCM4378.


# 1.49 26-Feb-2021 patrick

Add support for BCM4378 as implemented on the Apple M1. This chip seems
to use a different set of PCIE2REG registers. Accessing the "old" ones
even leads to faults. There are two surprises though. One is that it
seems that the interrupt status register always returns 0, and the other
one is that we receive the interrupts way too early, but both can be
worked around for now.


# 1.48 26-Feb-2021 patrick

Increase the amount of RX buffers given to the bwfm(4) chip. We haave seen
this already on previous chips, which only started giving us packets when
handing over at least 128 of them. Apparently some now require 256, which
seems to get the Apple M1's WiFi going.


# 1.47 26-Feb-2021 patrick

Increase the buffer size for the ioctl response buffers to the same as
used in the wifi firmware to ensure responses can be received.


# 1.46 26-Feb-2021 patrick

Indicate hostready signal to inform the firmware that the rings have been
initialized.


# 1.45 26-Feb-2021 patrick

Refactor bwfm(4) firmware loading. The PCIe backend will need to be able
to load the CLM blob like the SDIO backend already does. Additionally it
is also helpful for the PCIe backend to try a file named after the device
tree compatible. Thus refactor the SDIO code and make it available for
both SDIO and PCIe.


# 1.44 26-Feb-2021 patrick

Fix prio2fifo mapping table.


# 1.43 25-Feb-2021 patrick

The firmware replaces the last 32-bit on RAM with a shared DRAM address.
While the for-loop checks that thie value has changed since we wrote to
it, the timeout-condition checked for non-zero, which is wrong. This
means that we didn't realize the firmware wasn't started. While there,
make sure the shared DRAM address is inside the chip's address space.


# 1.42 25-Feb-2021 patrick

Some newer chips have two D11/802.11 cores, and we need to reset both at
the same time.


# 1.41 25-Feb-2021 patrick

Support for version 7 of the bwfm(4) PCIe interface. The size of the items
on the rx/tx complete rings has increased slightly to accomodate possible
new features.


# 1.40 25-Feb-2021 dlg

we don't have to cast to caddr_t when calling m_copydata anymore.

the first cut of this diff was made with coccinelle using this spatch:

@rule@
type caddr_t;
expression m, off, len, cp;
@@
-m_copydata(m, off, len, (caddr_t)cp)
+m_copydata(m, off, len, cp)

i had fix it's opinionated idea of formatting by hand though, so
i'm not sure it was worth it.

ok deraadt@ bluhm@


# 1.39 31-Jan-2021 patrick

Add basic support for BCM4378 as found on the Apple M1 SoCs. There's a
little bit more to do though before it can be enabled.


# 1.38 12-Dec-2020 jan

Rename the macro MCLGETI to MCLGETL and removes the dead parameter ifp.

OK dlg@, bluhm@
No Opinion mpi@
Not against it claudio@


Revision tags: OPENBSD_6_8_BASE
# 1.37 22-Jun-2020 dlg

use ifiq_input and use it's return value to apply backpressure to rxrs.

this is a step toward deprecating softclock based livelock detection.


Revision tags: OPENBSD_6_7_BASE
# 1.36 07-Mar-2020 patrick

Use snprintf(9) to create the names for the firmware and NVRAM files. This
reduces the amount of duplicated lines per chip, and allows us to ship per-
board files in the future.

Based on a diff from jsg@
ok kurt@


# 1.35 06-Mar-2020 patrick

Process the NVRAM in bwfm(4) itself. So far we have relied on some
external tool to pre-process the NVRAM, even though it's simple to
do ourselves. This allows easier firmware distribution, especially
since on some x86 machines the NVRAM is stored in an EFI variable.


# 1.34 25-Feb-2020 patrick

Make bwfm(4) call if_input() only once per interrupt.

This reduces drops caused by the ifq pressure drop mechanism and hence
increases throughput.

ok tobhe@


# 1.33 15-Jan-2020 patrick

Sprinkle splnet() around the ringbuffer accesses, otherwise the
task and interrupt try to concurrently submit messages on the
control ring.


# 1.32 15-Jan-2020 patrick

Some PCIe firmwares drop TX packets when the pktid is 0. Add
an offset to make sure they start from 1.


# 1.31 15-Jan-2020 patrick

Fix off-by-one in ringbuffer code. When we insert items faster than
the hardware is processing them, the write index can catch up to the
read index. We must make sure that our write index stays smaller
than the hardware's read index, thus the difference between both has
to be bigger than 1.

ok tobhe@


# 1.30 09-Jan-2020 mpi

Convert sleeps of 1sec or more to tsleep_nsec(9).

ok bluhm@


Revision tags: OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.29 07-Feb-2019 patrick

Consistently use m_freem(9). This fixes possible leaks in a few
error cases.


# 1.28 17-Jan-2019 mlarkin

Enable bwfm(4) in RAMDISK_CD

ok deraadt


Revision tags: OPENBSD_6_4_BASE
# 1.27 20-Aug-2018 patrick

Attach bwfm(4) to Broadcom BCM4371.

ok kettenis@


# 1.26 25-Jul-2018 patrick

Implement a MSGBUF control packet mechanism based on the command
request ids. So far we were only able to have one command in flight
at a time and race conditions could easily lead to unexpected
behaviour. With this rework we send and enqueue a control packet
command and wait for replies to happen. Thus we can have multiple
control packets in flight and a reply with the correct id will wake
us up.


# 1.25 06-Jul-2018 patrick

Add bus_dmamap_sync(9) calls to bwfm(4) so that we make sure the data
is synced properly before the CPU or the WiFi chip access the supplied
memory. Makes PCIe-connected bwfm(4) work on ARM-based machines.


# 1.24 05-Jul-2018 patrick

Cast physical addresses to 64-bits so we can shift them by 32-bit on
32-bit platforms without the compiler complaining. In the end the
value will turn out as 0 anyway. Allows enabling bwfm(4) on 32-bit
platforms.

ok stsp@


# 1.23 07-Jun-2018 patrick

Attach bwfm(4) to the Broadcom 4356 found in the GPD Pocket.

Tested by mlarkin@


# 1.22 07-Jun-2018 patrick

Some PCIe-based bwfm(4) chips also require that we supply an NVRAM
binary. In case we have an (optional) NVRAM binary, copy it to the
end of the chip's memory.

Tested by mlarkin@ on his GPD Pocket.


# 1.21 23-May-2018 patrick

Implement a separate initialization stage so that we can still use
and initialize bwfm(4) later in the case that the firmware was not
available on bootup and was only later installed.

ok stsp@


# 1.20 23-May-2018 patrick

Map the second bwfm(4) BAR first. The bwfm(4) PCIe devices have two
BARs, where the second one is much larger than the first. Both need
to be properly aligned in the given extent. Since the first one is
smaller, it will "unalign" the next free space and thus create a gap
so that the second BAR cannot be properly aligned in the given space.
By mapping the second BAR first, it will automatically have proper
alignment. The first BAR, which has fewer alignment requirements,
fits well after the initial allocation. Fixes bwfm(4) on APU 1.

Debugged and solved by kettenis@


# 1.19 16-May-2018 patrick

Implement a BCDC control packet mechanism based on the command request
ids. So far we were only able to have one command in flight at a time
and race conditions could easily lead to unexpected behaviour, especia-
lly combined with a slow bus and timeouts. With this rework we send or
enqueue a control packet command and wait for replies to happen. Thus
we can have multiple control packets in flight and a reply with the
correct id will wake us up.


Revision tags: OPENBSD_6_3_BASE
# 1.18 08-Feb-2018 patrick

Move bwfm(4) from ifq begin/commit/rollback semantics to the newer
ifq dequeue semantics. This basically means we need to check for
available space before dequeuing a packet. As soon as we dequeue
a packet we commit to it. On the PCIe backend this check can not
be done easily since the flowring depends on the packet contents and
we cannot take a peek. When there is no flowring we cache the mbuf
and send it out as soon as the flowring opened up. Then the ifq can
be restarted and traffic can flow. Typically we usually run out of
packet ids, which can be checked without consulting the packet. The
flowring probably never becomes full as the bwfm(4) firmware takes
the packets off the ring without actually sending them out.

Discussed with dlg@


# 1.17 07-Feb-2018 patrick

Move parsing the BCDC header on RX into a protocol specific RX
function so it can be shared with the SDIO attachment driver.


# 1.16 11-Jan-2018 patrick

The PCI bwfm(4) chips have no TX rings in the traditional sense, as on
the actual rings we only share messages. Sending a TX packet means
putting a message on the ring which contains a pktid (which for us maps
to an mbuf) and the physical address of the mbuf. On jcs@'s macbook he
seems to run out of TX pktids pretty quickly during a speedtest. This
would mean that there are 2048 TX packets in flight that we either want
to send out or that have not been "acked" by the firmware yet. Either
way, recover from that situation when we hit that arbitrary limit by
restarting the queue after we free'd a packet from the TX pktid list.

Tested by jcs@


# 1.15 10-Jan-2018 jcs

Attach bwfm to the Broadcom 4350 found in the 2017 MacBook.

Easily handles >150Mbps transfers through a 5Ghz AP.

ok patrick

(Committed via bwfm0, of course)


# 1.14 10-Jan-2018 patrick

Add firmware names for the two revisions of the Broadcom 4350 as seen
on a MacBook 12-inch (2017).

Tested by and with jcs@


# 1.13 10-Jan-2018 patrick

Don't reset the internal memory core on chips other than the Broadcom
43602, as it's only necessary on that specific chip.

Found the hard way by jcs@ on a MacBook 12-inch (2017)


# 1.12 10-Jan-2018 patrick

Move line for readability.


# 1.11 08-Jan-2018 patrick

In AP mode multicast packets share the flowrings with broadcast
packets.


# 1.10 08-Jan-2018 patrick

The bwfm(4) TX ring expects the ethernet header as part of the TX info
struct. The data length is the length of the frame without the header.
In the previous version m_adj(9) is used, but since that was changed we
need to decrease the length ourselves.


# 1.9 08-Jan-2018 patrick

Guard the debug printf function behind BWFM_DEBUG as well. Also only
print the firmware's dmesg(8) if we're running with a higher debug
mode.

Prompted by Michael W. Bombardieri


# 1.8 08-Jan-2018 patrick

Delete flowrings when we take the interface down or change its
settings.


# 1.7 07-Jan-2018 patrick

Create multiple transmit flowrings in station mode, four in total, based
on TOS values. In AP mode create multiple flowrings per connected node.


# 1.6 05-Jan-2018 patrick

To send out packets we need to create a flowring. Acting as station,
we typically have about four flowrings per priority. As access point
we apparently need one, or four considering the priorities, flowrings
per client. For now let's start with a single TX flowring. To set up
a flowring we need to send a create request and can only start sending
packets as soon as we are told that the ring is created. With this we
can now do actual network traffic.
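
The create handshake can be pictured as a small state machine. This is only a
sketch under assumed names (ring_state, flowring_create and friends are
hypothetical); the actual request and response messages are not modeled.

```c
#include <stdbool.h>

enum ring_state { RING_CLOSED, RING_OPENING, RING_OPEN };

/* Hypothetical flowring bookkeeping. */
struct flowring {
	enum ring_state	state;
	int		queued;		/* packets accepted so far */
};

/* Host side: a create request was sent; now wait for the firmware. */
void
flowring_create(struct flowring *r)
{
	if (r->state == RING_CLOSED)
		r->state = RING_OPENING;
}

/* The firmware's create response arrived. */
void
flowring_create_done(struct flowring *r, bool ok)
{
	if (r->state == RING_OPENING)
		r->state = ok ? RING_OPEN : RING_CLOSED;
}

/* Packets may only be queued once the ring is reported as created. */
bool
flowring_tx(struct flowring *r)
{
	if (r->state != RING_OPEN)
		return false;
	r->queued++;
	return true;
}
```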


# 1.5 03-Jan-2018 patrick

Since the PCI attachment code already uses mbufs for RX packets, we can
push the mbuf allocation down into the USB attachment code and now pass
an mbuf to the bwfm(4) receive function.


# 1.4 03-Jan-2018 patrick

Add size for free(9) in the bwfm(4) PCI attachment code.

From Michael W. Bombardieri


# 1.3 01-Jan-2018 patrick

For whatever reason the firmware needs more RX buffers available than
we typically use, which unfortunately creates a bigger memory footprint.
With this the receive path can be made to work.


# 1.2 01-Jan-2018 patrick

Put the code that prints the firmware's debug console into a function
so we can read and print the messages printed by the firmware when we
are debugging the driver.


# 1.1 24-Dec-2017 patrick

Add a PCI attachment driver for bwfm(4). It's not finished, but it's
already past the point where development can occur out of the tree.
With this I can successfully scan for access points and tell the chip
to attach to an SSID. RX path should work as well, but since I forgot
to bring the antenna with me to my parents, the reception is a bit
horrible in the metal enclosure.

There are a few reasons this driver is rather big. First we set up the
ARM Cores, uploading the firmware and kicking it off. Then we need to
read all needed information from the registers. Once that is done we
have to set up countless buffers. There are 2 TX rings and 3 RX rings,
plus N TX rings for the actual data that is yet to be implemented.

Merry Christmas!

ok kettenis@


# 1.72 23-Oct-2022 kettenis

Bump tsleep timeout. For some reason the first attempt to load the firmware
sometimes fails. This happens more often on M2 laptops that also need to
load the touchpad firmware. Smells like we have some sort of thundering herd
at mountroot time which makes this take more time.

ok patrick@


Revision tags: OPENBSD_7_1_BASE OPENBSD_7_2_BASE
# 1.71 21-Mar-2022 kettenis

Reduce dmesg spam by nor printing the "Apple" firmware name.

ok patrick@


# 1.70 11-Mar-2022 mpi

Constify struct cfattach.


# 1.69 06-Mar-2022 kettenis

Look for firmware for Apple Silicon devices in /etc/firmware/apple-bwfm.

ok deraadt@


# 1.68 04-Mar-2022 kettenis

Add support for the BCM4387. The firmware for this variant uses a new scan
command, which is indicated by the "scan_ver" firmware variable.

ok patrick@


# 1.67 02-Mar-2022 kettenis

The firmware for the bwfm(4) variants in Apple Silicon Macs has variants
for different module types, module vendors and module revisions. Make
our driver use the same naming scheme as Asahi Linux.

ok patrick@


# 1.66 01-Jan-2022 patrick

Use correct defines for random seed magic/length.

Spotted by Andreas Schnebinger


# 1.65 31-Dec-2021 patrick

Newer Apple firmware on chipsets without a hardware RNG require the host to
provide a buffer of random bytes to the device on initialization.


# 1.64 27-Dec-2021 patrick

Not only BCM4378, but all PCIe core revisions >= 64 need to be accessed
using the new sets of registers.


# 1.63 27-Dec-2021 patrick

Map the chip ids used on Apple M1 Pro/Max and Apple T2 Macs to firmware
names.


# 1.62 27-Dec-2021 patrick

Support reading OTP information from a few more chips, necessary to learn
firmare names on Apple M1 Pro/Max and Apple T2 Macs.


# 1.61 27-Dec-2021 patrick

Send TxCap and WiFi calibration blobs to the chip.


# 1.60 27-Dec-2021 patrick

Switch module codename retrieval to use the newly proposed device tree
bindings.


# 1.59 27-Dec-2021 patrick

Bump rxpost and rxcomplete ring size to 1024 for newer chips.


# 1.58 20-Dec-2021 patrick

bus_dmamem_unmap() should not be called from interrupt context, so free
and close flowrings using bwfm_do_async().

Reported by and ok kettenis@


# 1.57 23-Oct-2021 kettenis

Make sure we have enough space to add padding and final token to the nvram
data. Also add the MAC address to the nvram data when there is a
"local-mac-address" property in the device tree. This makes bwfm(4) work
with the firmware/nvram/clm_blob files provided with MacOS on the Apple
M1 Macs.

ok patrick@


Revision tags: OPENBSD_7_0_BASE
# 1.56 31-Aug-2021 patrick

Implement suspend/resume for bwfm(4) with PCIe backend. We try to send the
device into D3 and do a hot-resume if possible. Otherwise we need to clean
up the resources to allow complete HW re-initialization to take place.


# 1.55 31-Aug-2021 patrick

Properly deallocate some more structures upon detach, and make sure we're
not considered initialized anymore.


# 1.54 31-Aug-2021 patrick

Initialize some struct variables to make sure that upon reinit, caused by
a suspend/resume cycle, the values are set to a sane default.


# 1.53 31-Aug-2021 patrick

Initialize ring read/write pointers to make sure that upon reinit, caused
by a suspend/resume cycle, the pointers are set to a sane default.


# 1.52 22-Jun-2021 patrick

bwfm(4) on PCI isn't really MPSAFE, and I'm not sure how this flag
even got there in the first place. I've been wondering why I have
seen a bit of mbuf corruption here and there since I put the bwfm(4)
M.2 PCIe card into my arm64 machine. Well, duh.


Revision tags: OPENBSD_6_9_BASE
# 1.51 26-Feb-2021 patrick

Read and parse OTP on the BCM4378. There are quite a few firmware and
nvram files used for the different Apple devices. The device tree and
the OTP hold the information which of those we will have to use. For
now this information will simply be printed, but depending on how we
choose to do the firmare distribution we could use it for loadfirmware().


# 1.50 26-Feb-2021 patrick

Attach to BCM4378.


# 1.49 26-Feb-2021 patrick

Add support for BCM4378 as implemented on the Apple M1. This chip seems
to use a different set of PCIE2REG registers. Accessing the "old" ones
even leads to faults. There are two surprises though. One is that it
seems that the interrupt status register always returns 0, and the other
one is that we receive the interrupts way too early, but both can be
worked around for now.


# 1.48 26-Feb-2021 patrick

Increase the amount of RX buffers given to the bwfm(4) chip. We haave seen
this already on previous chips, which only started giving us packets when
handing over at least 128 of them. Apparently some now require 256, which
seems to get the Apple M1's WiFi going.


# 1.47 26-Feb-2021 patrick

Increase the buffer size for the ioctl response buffers to the same as
used in the wifi firmware to ensure responses can be received.


# 1.46 26-Feb-2021 patrick

Indicate hostready signal to inform the firmware that the rings have been
initialized.


# 1.45 26-Feb-2021 patrick

Refactor bwfm(4) firmware loading. The PCIe backend will need to be able
to load the CLM blob like the SDIO backend already does. Additionally it
is also helpful for the PCIe backend to try a file named after the device
tree compatible. Thus refactor the SDIO code and make it available for
both SDIO and PCIe.


# 1.44 26-Feb-2021 patrick

Fix prio2fifo mapping table.


# 1.43 25-Feb-2021 patrick

The firmware replaces the last 32-bit on RAM with a shared DRAM address.
While the for-loop checks that thie value has changed since we wrote to
it, the timeout-condition checked for non-zero, which is wrong. This
means that we didn't realize the firmware wasn't started. While there,
make sure the shared DRAM address is inside the chip's address space.


# 1.42 25-Feb-2021 patrick

Some newer chips have two D11/802.11 cores, and we need to reset both at
the same time.


# 1.41 25-Feb-2021 patrick

Support for version 7 of the bwfm(4) PCIe interface. The size of the items
on the rx/tx complete rings has increased slightly to accomodate possible
new features.


# 1.40 25-Feb-2021 dlg

we don't have to cast to caddr_t when calling m_copydata anymore.

the first cut of this diff was made with coccinelle using this spatch:

@rule@
type caddr_t;
expression m, off, len, cp;
@@
-m_copydata(m, off, len, (caddr_t)cp)
+m_copydata(m, off, len, cp)

i had fix it's opinionated idea of formatting by hand though, so
i'm not sure it was worth it.

ok deraadt@ bluhm@


# 1.39 31-Jan-2021 patrick

Add basic support for BCM4378 as found on the Apple M1 SoCs. There's a
little bit more to do though before it can be enabled.


# 1.38 12-Dec-2020 jan

Rename the macro MCLGETI to MCLGETL and removes the dead parameter ifp.

OK dlg@, bluhm@
No Opinion mpi@
Not against it claudio@


Revision tags: OPENBSD_6_8_BASE
# 1.37 22-Jun-2020 dlg

use ifiq_input and use it's return value to apply backpressure to rxrs.

this is a step toward deprecating softclock based livelock detection.


Revision tags: OPENBSD_6_7_BASE
# 1.36 07-Mar-2020 patrick

Use snprintf(9) to create the names for the firmware and NVRAM files. This
reduces the amount of duplicated lines per chip, and allows us to ship per-
board files in the future.

Based on a diff from jsg@
ok kurt@


# 1.35 06-Mar-2020 patrick

Process the NVRAM in bwfm(4) itself. So far we have relied on some
external tool to pre-process the NVRAM, even though it's simple to
do ourselves. This allows easier firmware distribution, especially
since on some x86 machines the NVRAM is stored in an EFI variable.


# 1.34 25-Feb-2020 patrick

Make bwfm(4) call if_input() only once per interrupt.

This reduces drops caused by the ifq pressure drop mechanism and hence
increases throughput.

ok tobhe@


# 1.33 15-Jan-2020 patrick

Sprinkle splnet() around the ringbuffer accesses, otherwise the
task and interrupt try to concurrently submit messages on the
control ring.


# 1.32 15-Jan-2020 patrick

Some PCIe firmwares drop TX packets when the pktid is 0. Add
an offset to make sure they start from 1.


# 1.31 15-Jan-2020 patrick

Fix off-by-one in ringbuffer code. When we insert items faster than
the hardware is processing them, the write index can catch up to the
read index. We must make sure that our write index stays smaller
than the hardware's read index, thus the difference between both has
to be bigger than 1.

ok tobhe@


# 1.30 09-Jan-2020 mpi

Convert sleeps of 1sec or more to tsleep_nsec(9).

ok bluhm@


Revision tags: OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.29 07-Feb-2019 patrick

Consistently use m_freem(9). This fixes possible leaks in a few
error cases.


# 1.28 17-Jan-2019 mlarkin

Enable bwfm(4) in RAMDISK_CD

ok deraadt


Revision tags: OPENBSD_6_4_BASE
# 1.27 20-Aug-2018 patrick

Attach bwfm(4) to Broadcom BCM4371.

ok kettenis@


# 1.26 25-Jul-2018 patrick

Implement a MSGBUF control packet mechanism based on the command
request ids. So far we were only able to have one command in flight
at a time and race conditions could easily lead to unexpected
behaviour. With this rework we send and enqueue a control packet
command and wait for replies to happen. Thus we can have multiple
control packets in flight and a reply with the correct id will wake
us up.


# 1.25 06-Jul-2018 patrick

Add bus_dmamap_sync(9) calls to bwfm(4) so that we make sure the data
is synced properly before the CPU or the WiFi chip access the supplied
memory. Makes PCIe-connected bwfm(4) work on ARM-based machines.


# 1.24 05-Jul-2018 patrick

Cast physical addresses to 64-bits so we can shift them by 32-bit on
32-bit platforms without the compiler complaining. In the end the
value will turn out as 0 anyway. Allows enabling bwfm(4) on 32-bit
platforms.

ok stsp@


# 1.23 07-Jun-2018 patrick

Attach bwfm(4) to the Broadcom 4356 found in the GPD Pocket.

Tested by mlarkin@


# 1.22 07-Jun-2018 patrick

Some PCIe-based bwfm(4) chips also require that we supply an NVRAM
binary. In case we have an (optional) NVRAM binary, copy it to the
end of the chip's memory.

Tested by mlarkin@ on his GPD Pocket.


# 1.21 23-May-2018 patrick

Implement a separate initialization stage so that we can still use
and initialize bwfm(4) later in the case that the firmware was not
available on bootup and was only later installed.

ok stsp@


# 1.20 23-May-2018 patrick

Map the second bwfm(4) BAR first. The bwfm(4) PCIe devices have two
BARs, where the second one is much larger than the first. Both need
to be properly aligned in the given extent. Since the first one is
smaller, it will "unalign" the next free space and thus create a gap
so that the second BAR cannot be properly aligned in the given space.
By mapping the second BAR first, it will automatically have proper
alignment. The first BAR, which has fewer alignment requirements,
fits well after the initial allocation. Fixes bwfm(4) on APU 1.

Debugged and solved by kettenis@


# 1.19 16-May-2018 patrick

Implement a BCDC control packet mechanism based on the command request
ids. So far we were only able to have one command in flight at a time
and race conditions could easily lead to unexpected behaviour, especia-
lly combined with a slow bus and timeouts. With this rework we send or
enqueue a control packet command and wait for replies to happen. Thus
we can have multiple control packets in flight and a reply with the
correct id will wake us up.


Revision tags: OPENBSD_6_3_BASE
# 1.18 08-Feb-2018 patrick

Move bwfm(4) from ifq begin/commit/rollback semantics to the newer
ifq dequeue semantics. This basically means we need to check for
available space before dequeuing a packet. As soon as we dequeue
a packet we commit to it. On the PCIe backend this check can not
be done easily since the flowring depends on the packet contents and
we cannot take a peek. When there is no flowring we cache the mbuf
and send it out as soon as the flowring opened up. Then the ifq can
be restarted and traffic can flow. Typically we usually run out of
packet ids, which can be checked without consulting the packet. The
flowring probably never becomes full as the bwfm(4) firmware takes
the packets off the ring without actually sending them out.

Discussed with dlg@


# 1.17 07-Feb-2018 patrick

Move parsing the BCDC header on RX into a protocol specific RX
function so it can be shared with the SDIO attachment driver.


# 1.16 11-Jan-2018 patrick

The PCI bwfm(4) chips have no TX rings in the traditional sense, as on
the actual rings we only share messages. Sending a TX packet means
putting a message on the ring which contains a pktid (which for us maps
to an mbuf) and the physical address of the mbuf. On jcs@'s MacBook he
seems to run out of TX pktids pretty quickly during a speedtest. This
would mean that there are 2048 TX packets in flight that we either want
to send out or that have not been "acked" by the firmware yet. Either
way, recover from that situation when we hit that arbitrary limit by
restarting the queue after we freed a packet from the TX pktid list.

Tested by jcs@


# 1.15 10-Jan-2018 jcs

Attach bwfm to the Broadcom 4350 found in the 2017 MacBook.

Easily handles >150Mbps transfers through a 5GHz AP.

ok patrick

(Committed via bwfm0, of course)


# 1.14 10-Jan-2018 patrick

Add firmware names for the two revisions of the Broadcom 4350 as seen
on a MacBook 12-inch (2017).

Tested by and with jcs@


# 1.13 10-Jan-2018 patrick

Don't reset the internal memory core on chips other than the Broadcom
43602, as it's only necessary on that specific chip.

Found the hard way by jcs@ on a MacBook 12-inch (2017)


# 1.12 10-Jan-2018 patrick

Move line for readability.


# 1.11 08-Jan-2018 patrick

In AP mode multicast packets share the flowrings with broadcast
packets.


# 1.10 08-Jan-2018 patrick

The bwfm(4) TX ring expects the ethernet header as part of the TX info
struct. The data length is the length of the frame without the header.
In the previous version m_adj(9) was used, but since that was changed we
need to decrease the length ourselves.


# 1.9 08-Jan-2018 patrick

Guard the debug printf function behind BWFM_DEBUG as well. Also only
print the firmware's dmesg(8) if we're running with a higher debug
mode.

Prompted by Michael W. Bombardieri


# 1.8 08-Jan-2018 patrick

Delete flowrings when we take the interface down or change its
settings.


# 1.7 07-Jan-2018 patrick

Create multiple transmit flowrings in station mode, four in total, based
on TOS values. In AP mode create multiple flowrings per connected node.


# 1.6 05-Jan-2018 patrick

To send out packets we need to create a flowring. Acting as station,
we typically have about four flowrings per priority. As access point
we apparently need one, or four considering the priorities, flowrings
per client. For now let's start with a single TX flowring. To set up
a flowring we need to send a create request and can only start sending
packets as soon as we are told that the ring is created. With this we
can now do actual network traffic.


# 1.5 03-Jan-2018 patrick

Since the PCI attachment code already uses mbufs for RX packets, we can
push the mbuf allocation down into the USB attachment code and now pass
an mbuf to the bwfm(4) receive function.


# 1.4 03-Jan-2018 patrick

Add size for free(9) in the bwfm(4) PCI attachment code.

From Michael W. Bombardieri


# 1.3 01-Jan-2018 patrick

For whatever reason the firmware needs more RX buffers available than
we typically use, which unfortunately creates a bigger memory
footprint. With this the receive path can be made to work.


# 1.2 01-Jan-2018 patrick

Put the code that prints the firmware's debug console into a function
so we can read and print the messages printed by the firmware when we
are debugging the driver.


# 1.1 24-Dec-2017 patrick

Add a PCI attachment driver for bwfm(4). It's not finished, but it's
already past the point where development can occur out of the tree.
With this I can successfully scan for access points and tell the chip
to attach to an SSID. RX path should work as well, but since I forgot
to bring the antenna with me to my parents, the reception is a bit
horrible in the metal enclosure.

There are a few reasons this driver is rather big. First we set up the
ARM Cores, uploading the firmware and kicking it off. Then we need to
read all needed information from the registers. Once that is done we
have to set up countless buffers. There are 2 TX rings and 3 RX rings,
plus N TX rings for the actual data that is yet to be implemented.

Merry Christmas!

ok kettenis@


# 1.70 11-Mar-2022 mpi

Constify struct cfattach.


# 1.69 06-Mar-2022 kettenis

Look for firmware for Apple Silicon devices in /etc/firmware/apple-bwfm.

ok deraadt@


# 1.68 04-Mar-2022 kettenis

Add support for the BCM4387. The firmware for this variant uses a new scan
command, which is indicated by the "scan_ver" firmware variable.

ok patrick@


# 1.67 02-Mar-2022 kettenis

The firmware for the bwfm(4) variants in Apple Silicon Macs has variants
for different module types, module vendors and module revisions. Make
our driver use the same naming scheme as Asahi Linux.

ok patrick@


# 1.66 01-Jan-2022 patrick

Use correct defines for random seed magic/length.

Spotted by Andreas Schnebinger


# 1.65 31-Dec-2021 patrick

Newer Apple firmware on chipsets without a hardware RNG require the host to
provide a buffer of random bytes to the device on initialization.


# 1.64 27-Dec-2021 patrick

Not only BCM4378, but all PCIe core revisions >= 64 need to be accessed
using the new sets of registers.


# 1.63 27-Dec-2021 patrick

Map the chip ids used on Apple M1 Pro/Max and Apple T2 Macs to firmware
names.


# 1.62 27-Dec-2021 patrick

Support reading OTP information from a few more chips, necessary to learn
firmare names on Apple M1 Pro/Max and Apple T2 Macs.


# 1.61 27-Dec-2021 patrick

Send TxCap and WiFi calibration blobs to the chip.


# 1.60 27-Dec-2021 patrick

Switch module codename retrieval to use the newly proposed device tree
bindings.


# 1.59 27-Dec-2021 patrick

Bump rxpost and rxcomplete ring size to 1024 for newer chips.


# 1.58 20-Dec-2021 patrick

bus_dmamem_unmap() should not be called from interrupt context, so free
and close flowrings using bwfm_do_async().

Reported by and ok kettenis@


# 1.57 23-Oct-2021 kettenis

Make sure we have enough space to add padding and final token to the nvram
data. Also add the MAC address to the nvram data when there is a
"local-mac-address" property in the device tree. This makes bwfm(4) work
with the firmware/nvram/clm_blob files provided with MacOS on the Apple
M1 Macs.

ok patrick@


Revision tags: OPENBSD_7_0_BASE
# 1.56 31-Aug-2021 patrick

Implement suspend/resume for bwfm(4) with PCIe backend. We try to send the
device into D3 and do a hot-resume if possible. Otherwise we need to clean
up the resources to allow complete HW re-initialization to take place.


# 1.55 31-Aug-2021 patrick

Properly deallocate some more structures upon detach, and make sure we're
not considered initialized anymore.


# 1.54 31-Aug-2021 patrick

Initialize some struct variables to make sure that upon reinit, caused by
a suspend/resume cycle, the values are set to a sane default.


# 1.53 31-Aug-2021 patrick

Initialize ring read/write pointers to make sure that upon reinit, caused
by a suspend/resume cycle, the pointers are set to a sane default.


# 1.52 22-Jun-2021 patrick

bwfm(4) on PCI isn't really MPSAFE, and I'm not sure how this flag
even got there in the first place. I've been wondering why I have
seen a bit of mbuf corruption here and there since I put the bwfm(4)
M.2 PCIe card into my arm64 machine. Well, duh.


Revision tags: OPENBSD_6_9_BASE
# 1.51 26-Feb-2021 patrick

Read and parse OTP on the BCM4378. There are quite a few firmware and
nvram files used for the different Apple devices. The device tree and
the OTP hold the information which of those we will have to use. For
now this information will simply be printed, but depending on how we
choose to do the firmare distribution we could use it for loadfirmware().


# 1.50 26-Feb-2021 patrick

Attach to BCM4378.


# 1.49 26-Feb-2021 patrick

Add support for BCM4378 as implemented on the Apple M1. This chip seems
to use a different set of PCIE2REG registers. Accessing the "old" ones
even leads to faults. There are two surprises though. One is that it
seems that the interrupt status register always returns 0, and the other
one is that we receive the interrupts way too early, but both can be
worked around for now.


# 1.48 26-Feb-2021 patrick

Increase the amount of RX buffers given to the bwfm(4) chip. We haave seen
this already on previous chips, which only started giving us packets when
handing over at least 128 of them. Apparently some now require 256, which
seems to get the Apple M1's WiFi going.


# 1.47 26-Feb-2021 patrick

Increase the buffer size for the ioctl response buffers to the same as
used in the wifi firmware to ensure responses can be received.


# 1.46 26-Feb-2021 patrick

Indicate hostready signal to inform the firmware that the rings have been
initialized.


# 1.45 26-Feb-2021 patrick

Refactor bwfm(4) firmware loading. The PCIe backend will need to be able
to load the CLM blob like the SDIO backend already does. Additionally it
is also helpful for the PCIe backend to try a file named after the device
tree compatible. Thus refactor the SDIO code and make it available for
both SDIO and PCIe.


# 1.44 26-Feb-2021 patrick

Fix prio2fifo mapping table.


# 1.43 25-Feb-2021 patrick

The firmware replaces the last 32-bit on RAM with a shared DRAM address.
While the for-loop checks that thie value has changed since we wrote to
it, the timeout-condition checked for non-zero, which is wrong. This
means that we didn't realize the firmware wasn't started. While there,
make sure the shared DRAM address is inside the chip's address space.


# 1.42 25-Feb-2021 patrick

Some newer chips have two D11/802.11 cores, and we need to reset both at
the same time.


# 1.41 25-Feb-2021 patrick

Support for version 7 of the bwfm(4) PCIe interface. The size of the items
on the rx/tx complete rings has increased slightly to accomodate possible
new features.


# 1.40 25-Feb-2021 dlg

we don't have to cast to caddr_t when calling m_copydata anymore.

the first cut of this diff was made with coccinelle using this spatch:

@rule@
type caddr_t;
expression m, off, len, cp;
@@
-m_copydata(m, off, len, (caddr_t)cp)
+m_copydata(m, off, len, cp)

i had fix it's opinionated idea of formatting by hand though, so
i'm not sure it was worth it.

ok deraadt@ bluhm@


# 1.39 31-Jan-2021 patrick

Add basic support for BCM4378 as found on the Apple M1 SoCs. There's a
little bit more to do though before it can be enabled.


# 1.38 12-Dec-2020 jan

Rename the macro MCLGETI to MCLGETL and removes the dead parameter ifp.

OK dlg@, bluhm@
No Opinion mpi@
Not against it claudio@


Revision tags: OPENBSD_6_8_BASE
# 1.37 22-Jun-2020 dlg

use ifiq_input and use it's return value to apply backpressure to rxrs.

this is a step toward deprecating softclock based livelock detection.


Revision tags: OPENBSD_6_7_BASE
# 1.36 07-Mar-2020 patrick

Use snprintf(9) to create the names for the firmware and NVRAM files. This
reduces the amount of duplicated lines per chip, and allows us to ship per-
board files in the future.

Based on a diff from jsg@
ok kurt@


# 1.35 06-Mar-2020 patrick

Process the NVRAM in bwfm(4) itself. So far we have relied on some
external tool to pre-process the NVRAM, even though it's simple to
do ourselves. This allows easier firmware distribution, especially
since on some x86 machines the NVRAM is stored in an EFI variable.


# 1.34 25-Feb-2020 patrick

Make bwfm(4) call if_input() only once per interrupt.

This reduces drops caused by the ifq pressure drop mechanism and hence
increases throughput.

ok tobhe@


# 1.33 15-Jan-2020 patrick

Sprinkle splnet() around the ringbuffer accesses, otherwise the
task and interrupt try to concurrently submit messages on the
control ring.


# 1.32 15-Jan-2020 patrick

Some PCIe firmwares drop TX packets when the pktid is 0. Add
an offset to make sure they start from 1.


# 1.31 15-Jan-2020 patrick

Fix off-by-one in ringbuffer code. When we insert items faster than
the hardware is processing them, the write index can catch up to the
read index. We must make sure that our write index stays smaller
than the hardware's read index, thus the difference between both has
to be bigger than 1.

ok tobhe@


# 1.30 09-Jan-2020 mpi

Convert sleeps of 1sec or more to tsleep_nsec(9).

ok bluhm@


Revision tags: OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.29 07-Feb-2019 patrick

Consistently use m_freem(9). This fixes possible leaks in a few
error cases.


# 1.28 17-Jan-2019 mlarkin

Enable bwfm(4) in RAMDISK_CD

ok deraadt


Revision tags: OPENBSD_6_4_BASE
# 1.27 20-Aug-2018 patrick

Attach bwfm(4) to Broadcom BCM4371.

ok kettenis@


# 1.26 25-Jul-2018 patrick

Implement a MSGBUF control packet mechanism based on the command
request ids. So far we were only able to have one command in flight
at a time and race conditions could easily lead to unexpected
behaviour. With this rework we send and enqueue a control packet
command and wait for replies to happen. Thus we can have multiple
control packets in flight and a reply with the correct id will wake
us up.


# 1.25 06-Jul-2018 patrick

Add bus_dmamap_sync(9) calls to bwfm(4) so that we make sure the data
is synced properly before the CPU or the WiFi chip access the supplied
memory. Makes PCIe-connected bwfm(4) work on ARM-based machines.


# 1.24 05-Jul-2018 patrick

Cast physical addresses to 64-bits so we can shift them by 32-bit on
32-bit platforms without the compiler complaining. In the end the
value will turn out as 0 anyway. Allows enabling bwfm(4) on 32-bit
platforms.

ok stsp@


# 1.23 07-Jun-2018 patrick

Attach bwfm(4) to the Broadcom 4356 found in the GPD Pocket.

Tested by mlarkin@


# 1.22 07-Jun-2018 patrick

Some PCIe-based bwfm(4) chips also require that we supply an NVRAM
binary. In case we have an (optional) NVRAM binary, copy it to the
end of the chip's memory.

Tested by mlarkin@ on his GPD Pocket.


# 1.21 23-May-2018 patrick

Implement a separate initialization stage so that we can still use
and initialize bwfm(4) later in the case that the firmware was not
available on bootup and was only later installed.

ok stsp@


# 1.20 23-May-2018 patrick

Map the second bwfm(4) BAR first. The bwfm(4) PCIe devices have two
BARs, where the second one is much larger than the first. Both need
to be properly aligned in the given extent. Since the first one is
smaller, it will "unalign" the next free space and thus create a gap
so that the second BAR cannot be properly aligned in the given space.
By mapping the second BAR first, it will automatically have proper
alignment. The first BAR, which has fewer alignment requirements,
fits well after the initial allocation. Fixes bwfm(4) on APU 1.

Debugged and solved by kettenis@


# 1.19 16-May-2018 patrick

Implement a BCDC control packet mechanism based on the command request
ids. So far we were only able to have one command in flight at a time
and race conditions could easily lead to unexpected behaviour, especia-
lly combined with a slow bus and timeouts. With this rework we send or
enqueue a control packet command and wait for replies to happen. Thus
we can have multiple control packets in flight and a reply with the
correct id will wake us up.


Revision tags: OPENBSD_6_3_BASE
# 1.18 08-Feb-2018 patrick

Move bwfm(4) from ifq begin/commit/rollback semantics to the newer
ifq dequeue semantics. This basically means we need to check for
available space before dequeuing a packet. As soon as we dequeue
a packet we commit to it. On the PCIe backend this check can not
be done easily since the flowring depends on the packet contents and
we cannot take a peek. When there is no flowring we cache the mbuf
and send it out as soon as the flowring opened up. Then the ifq can
be restarted and traffic can flow. Typically we usually run out of
packet ids, which can be checked without consulting the packet. The
flowring probably never becomes full as the bwfm(4) firmware takes
the packets off the ring without actually sending them out.

Discussed with dlg@


# 1.17 07-Feb-2018 patrick

Move parsing the BCDC header on RX into a protocol specific RX
function so it can be shared with the SDIO attachment driver.


# 1.16 11-Jan-2018 patrick

The PCI bwfm(4) chips have no TX rings in the traditional sense, as on
the actual rings we only share messages. Sending a TX packet means
putting a message on the ring which contains a pktid (which for us maps
to an mbuf) and the physical address of the mbuf. jcs@'s MacBook
seems to run out of TX pktids pretty quickly during a speedtest. This
would mean that there are 2048 TX packets in flight that we either want
to send out or that have not been "acked" by the firmware yet. Either
way, recover from that situation when we hit that arbitrary limit by
restarting the queue after we have freed a packet from the TX pktid list.

Tested by jcs@


# 1.15 10-Jan-2018 jcs

Attach bwfm to the Broadcom 4350 found in the 2017 MacBook.

Easily handles >150Mbps transfers through a 5GHz AP.

ok patrick

(Committed via bwfm0, of course)


# 1.14 10-Jan-2018 patrick

Add firmware names for the two revisions of the Broadcom 4350 as seen
on a MacBook 12-inch (2017).

Tested by and with jcs@


# 1.13 10-Jan-2018 patrick

Don't reset the internal memory core on chips other than the Broadcom
43602, as it's only necessary on that specific chip.

Found the hard way by jcs@ on a MacBook 12-inch (2017)


# 1.12 10-Jan-2018 patrick

Move line for readability.


# 1.11 08-Jan-2018 patrick

In AP mode multicast packets share the flowrings with broadcast
packets.


# 1.10 08-Jan-2018 patrick

The bwfm(4) TX ring expects the ethernet header as part of the TX info
struct. The data length is the length of the frame without the header.
In the previous version m_adj(9) was used, but since that was changed we
need to decrease the length ourselves.


# 1.9 08-Jan-2018 patrick

Guard the debug printf function behind BWFM_DEBUG as well. Also only
print the firmware's dmesg(8) if we're running with a higher debug
mode.

Prompted by Michael W. Bombardieri


# 1.8 08-Jan-2018 patrick

Delete flowrings when we take the interface down or change its
settings.


# 1.7 07-Jan-2018 patrick

Create multiple transmit flowrings in station mode, four in total, based
on TOS values. In AP mode create multiple flowrings per connected node.


# 1.6 05-Jan-2018 patrick

To send out packets we need to create a flowring. Acting as station,
we typically have about four flowrings per priority. As access point
we apparently need one, or four considering the priorities, flowrings
per client. For now let's start with a single TX flowring. To set up
a flowring we need to send a create request and can only start sending
packets as soon as we are told that the ring is created. With this we
can now do actual network traffic.


# 1.5 03-Jan-2018 patrick

Since the PCI attachment code already uses mbufs for RX packets, we can
push the mbuf allocation down into the USB attachment code and now pass
an mbuf to the bwfm(4) receive function.


# 1.4 03-Jan-2018 patrick

Add size for free(9) in the bwfm(4) PCI attachment code.

From Michael W. Bombardieri


# 1.3 01-Jan-2018 patrick

For whatever reason the firmware needs more RX buffers available than
we typically use, which unfortunately creates a bigger memory footprint.
With this the receive path can be made to work.


# 1.2 01-Jan-2018 patrick

Put the code that prints the firmware's debug console into a function
so we can read and print the messages printed by the firmware when we
are debugging the driver.


# 1.1 24-Dec-2017 patrick

Add a PCI attachment driver for bwfm(4). It's not finished, but it's
already past the point where development can occur out of the tree.
With this I can successfully scan for access points and tell the chip
to attach to an SSID. RX path should work as well, but since I forgot
to bring the antenna with me to my parents, the reception is a bit
horrible in the metal enclosure.

There are a few reasons this driver is rather big. First we set up the
ARM Cores, uploading the firmware and kicking it off. Then we need to
read all needed information from the registers. Once that is done we
have to set up countless buffers. There are 2 TX rings and 3 RX rings,
plus N TX rings for the actual data that is yet to be implemented.

Merry Christmas!

ok kettenis@


# 1.64 27-Dec-2021 patrick

Not only BCM4378, but all PCIe core revisions >= 64 need to be accessed
using the new sets of registers.


# 1.63 27-Dec-2021 patrick

Map the chip ids used on Apple M1 Pro/Max and Apple T2 Macs to firmware
names.


# 1.62 27-Dec-2021 patrick

Support reading OTP information from a few more chips, necessary to learn
firmware names on Apple M1 Pro/Max and Apple T2 Macs.


# 1.61 27-Dec-2021 patrick

Send TxCap and WiFi calibration blobs to the chip.


# 1.60 27-Dec-2021 patrick

Switch module codename retrieval to use the newly proposed device tree
bindings.


# 1.59 27-Dec-2021 patrick

Bump rxpost and rxcomplete ring size to 1024 for newer chips.


# 1.58 20-Dec-2021 patrick

bus_dmamem_unmap() should not be called from interrupt context, so free
and close flowrings using bwfm_do_async().

Reported by and ok kettenis@


# 1.57 23-Oct-2021 kettenis

Make sure we have enough space to add padding and final token to the nvram
data. Also add the MAC address to the nvram data when there is a
"local-mac-address" property in the device tree. This makes bwfm(4) work
with the firmware/nvram/clm_blob files provided with MacOS on the Apple
M1 Macs.

ok patrick@


Revision tags: OPENBSD_7_0_BASE
# 1.56 31-Aug-2021 patrick

Implement suspend/resume for bwfm(4) with PCIe backend. We try to send the
device into D3 and do a hot-resume if possible. Otherwise we need to clean
up the resources to allow complete HW re-initialization to take place.


# 1.55 31-Aug-2021 patrick

Properly deallocate some more structures upon detach, and make sure we're
not considered initialized anymore.


# 1.54 31-Aug-2021 patrick

Initialize some struct variables to make sure that upon reinit, caused by
a suspend/resume cycle, the values are set to a sane default.


# 1.53 31-Aug-2021 patrick

Initialize ring read/write pointers to make sure that upon reinit, caused
by a suspend/resume cycle, the pointers are set to a sane default.


# 1.52 22-Jun-2021 patrick

bwfm(4) on PCI isn't really MPSAFE, and I'm not sure how this flag
even got there in the first place. I've been wondering why I have
seen a bit of mbuf corruption here and there since I put the bwfm(4)
M.2 PCIe card into my arm64 machine. Well, duh.


Revision tags: OPENBSD_6_9_BASE
# 1.51 26-Feb-2021 patrick

Read and parse OTP on the BCM4378. There are quite a few firmware and
nvram files used for the different Apple devices. The device tree and
the OTP hold the information which of those we will have to use. For
now this information will simply be printed, but depending on how we
choose to do the firmware distribution we could use it for loadfirmware().


# 1.50 26-Feb-2021 patrick

Attach to BCM4378.


# 1.49 26-Feb-2021 patrick

Add support for BCM4378 as implemented on the Apple M1. This chip seems
to use a different set of PCIE2REG registers. Accessing the "old" ones
even leads to faults. There are two surprises though. One is that it
seems that the interrupt status register always returns 0, and the other
one is that we receive the interrupts way too early, but both can be
worked around for now.


# 1.48 26-Feb-2021 patrick

Increase the amount of RX buffers given to the bwfm(4) chip. We have seen
this already on previous chips, which only started giving us packets when
handing over at least 128 of them. Apparently some now require 256, which
seems to get the Apple M1's WiFi going.


# 1.47 26-Feb-2021 patrick

Increase the buffer size for the ioctl response buffers to the same as
used in the wifi firmware to ensure responses can be received.


# 1.46 26-Feb-2021 patrick

Indicate hostready signal to inform the firmware that the rings have been
initialized.


# 1.45 26-Feb-2021 patrick

Refactor bwfm(4) firmware loading. The PCIe backend will need to be able
to load the CLM blob like the SDIO backend already does. Additionally it
is also helpful for the PCIe backend to try a file named after the device
tree compatible. Thus refactor the SDIO code and make it available for
both SDIO and PCIe.


# 1.44 26-Feb-2021 patrick

Fix prio2fifo mapping table.


# 1.43 25-Feb-2021 patrick

The firmware replaces the last 32 bits of RAM with a shared DRAM address.
While the for-loop checks that this value has changed since we wrote to
it, the timeout-condition checked for non-zero, which is wrong. This
means that we didn't realize the firmware wasn't started. While there,
make sure the shared DRAM address is inside the chip's address space.
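
The corrected check can be sketched as follows. This is an
illustration, not the driver's code: the toy_* names, the callback
used to read the last word of RAM, and the fake reader are all
assumptions made so the example is self-contained.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical accessor: reads the last 32-bit word of device RAM. */
typedef uint32_t (*toy_read_fn)(void *);

/*
 * Poll until the last word of RAM differs from `written`, the value
 * the host stored there before starting the firmware. Returns 0 and
 * stores the shared address on success, -1 on timeout or when the
 * address lies outside [ram_base, ram_base + ram_size).
 */
static int
toy_wait_shared(toy_read_fn rd, void *arg, uint32_t written, int tries,
    uint32_t ram_base, uint32_t ram_size, uint32_t *shared)
{
	uint32_t v = written;
	int i;

	assert(tries > 0);
	for (i = 0; i < tries; i++) {
		v = rd(arg);
		if (v != written)
			break;
		/* a driver would delay or sleep between polls */
	}
	/* The timeout test mirrors the loop: "unchanged", not "zero". */
	if (v == written)
		return -1;
	/* Reject addresses outside the chip's RAM window. */
	if (v < ram_base || v - ram_base >= ram_size)
		return -1;
	*shared = v;
	return 0;
}

/* Test double: returns `before` for `after_n` reads, then `after`. */
struct toy_fake {
	uint32_t before, after;
	int	 after_n, reads;
};

static uint32_t
toy_fake_read(void *arg)
{
	struct toy_fake *f = arg;

	return (f->reads++ < f->after_n) ? f->before : f->after;
}
```

A check for "value is zero" after the loop would miss the case where
the word still holds the non-zero value written by the host, which is
the situation the commit message describes.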


# 1.42 25-Feb-2021 patrick

Some newer chips have two D11/802.11 cores, and we need to reset both at
the same time.


# 1.41 25-Feb-2021 patrick

Support for version 7 of the bwfm(4) PCIe interface. The size of the items
on the rx/tx complete rings has increased slightly to accommodate possible
new features.


# 1.40 25-Feb-2021 dlg

we don't have to cast to caddr_t when calling m_copydata anymore.

the first cut of this diff was made with coccinelle using this spatch:

@rule@
type caddr_t;
expression m, off, len, cp;
@@
-m_copydata(m, off, len, (caddr_t)cp)
+m_copydata(m, off, len, cp)

i had to fix its opinionated idea of formatting by hand though, so
i'm not sure it was worth it.

ok deraadt@ bluhm@


# 1.39 31-Jan-2021 patrick

Add basic support for BCM4378 as found on the Apple M1 SoCs. There's a
little bit more to do though before it can be enabled.


# 1.38 12-Dec-2020 jan

Rename the macro MCLGETI to MCLGETL and remove the dead parameter ifp.

OK dlg@, bluhm@
No Opinion mpi@
Not against it claudio@


Revision tags: OPENBSD_6_8_BASE
# 1.37 22-Jun-2020 dlg

use ifiq_input and use its return value to apply backpressure to rxrs.

this is a step toward deprecating softclock based livelock detection.


Revision tags: OPENBSD_6_7_BASE
# 1.36 07-Mar-2020 patrick

Use snprintf(9) to create the names for the firmware and NVRAM files. This
reduces the amount of duplicated lines per chip, and allows us to ship
per-board files in the future.

Based on a diff from jsg@
ok kurt@
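
As a rough illustration of building the names from one format string
instead of one literal per chip, here is a userland sketch. The helper
name toy_fwname and the "brcmfmac%s-pcie%s" pattern are assumptions
for this example and may not match the strings used in the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Build a file name from a chip name and a suffix such as ".bin" or
 * ".nvram". Returns 0 on success, -1 if the result would not fit.
 */
static int
toy_fwname(char *buf, size_t len, const char *chip, const char *suffix)
{
	int n;

	assert(buf != NULL && len > 0);
	n = snprintf(buf, len, "brcmfmac%s-pcie%s", chip, suffix);
	/* snprintf returns the untruncated length; detect truncation. */
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Checking the return value against the buffer size matters here, since
a silently truncated name would just look like a missing file.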


# 1.35 06-Mar-2020 patrick

Process the NVRAM in bwfm(4) itself. So far we have relied on some
external tool to pre-process the NVRAM, even though it's simple to
do ourselves. This allows easier firmware distribution, especially
since on some x86 machines the NVRAM is stored in an EFI variable.


# 1.34 25-Feb-2020 patrick

Make bwfm(4) call if_input() only once per interrupt.

This reduces drops caused by the ifq pressure drop mechanism and hence
increases throughput.

ok tobhe@


# 1.33 15-Jan-2020 patrick

Sprinkle splnet() around the ringbuffer accesses, otherwise the
task and interrupt try to concurrently submit messages on the
control ring.


# 1.32 15-Jan-2020 patrick

Some PCIe firmwares drop TX packets when the pktid is 0. Add
an offset to make sure they start from 1.


# 1.31 15-Jan-2020 patrick

Fix off-by-one in ringbuffer code. When we insert items faster than
the hardware is processing them, the write index can catch up to the
read index. We must make sure that our write index stays smaller
than the hardware's read index, thus the difference between both has
to be bigger than 1.

ok tobhe@
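
One common way to keep the write index from catching up to the read
index is to always leave one slot unused. The sketch below shows that
technique with hypothetical toy_* helpers; it is not claimed to be the
exact arithmetic used in the driver.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of items the producer may still write into a ring of `nitem`
 * slots. One slot is always left unused, so wptr == rptr can only
 * mean "empty" and never "full".
 */
static uint32_t
toy_ring_space(uint32_t wptr, uint32_t rptr, uint32_t nitem)
{
	assert(nitem > 1 && wptr < nitem && rptr < nitem);
	return (rptr + nitem - wptr - 1) % nitem;
}

/* Advance the write index by one; returns -1 when the ring is full. */
static int
toy_ring_put(uint32_t *wptr, uint32_t rptr, uint32_t nitem)
{
	if (toy_ring_space(*wptr, rptr, nitem) == 0)
		return -1;
	*wptr = (*wptr + 1) % nitem;
	return 0;
}

/* Count how many items fit when the consumer never advances. */
static uint32_t
toy_ring_fill(uint32_t nitem)
{
	uint32_t wptr = 0, n = 0;

	while (toy_ring_put(&wptr, 0, nitem) == 0)
		n++;
	return n;
}
```

With this rule a ring of N slots holds at most N - 1 items, which is
the price for being able to tell a full ring from an empty one by
looking at the two indices alone.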


# 1.30 09-Jan-2020 mpi

Convert sleeps of 1sec or more to tsleep_nsec(9).

ok bluhm@


Revision tags: OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.29 07-Feb-2019 patrick

Consistently use m_freem(9). This fixes possible leaks in a few
error cases.


# 1.28 17-Jan-2019 mlarkin

Enable bwfm(4) in RAMDISK_CD

ok deraadt


Revision tags: OPENBSD_6_4_BASE
# 1.27 20-Aug-2018 patrick

Attach bwfm(4) to Broadcom BCM4371.

ok kettenis@


# 1.26 25-Jul-2018 patrick

Implement a MSGBUF control packet mechanism based on the command
request ids. So far we were only able to have one command in flight
at a time and race conditions could easily lead to unexpected
behaviour. With this rework we send and enqueue a control packet
command and wait for replies to happen. Thus we can have multiple
control packets in flight and a reply with the correct id will wake
us up.


# 1.25 06-Jul-2018 patrick

Add bus_dmamap_sync(9) calls to bwfm(4) so that we make sure the data
is synced properly before the CPU or the WiFi chip access the supplied
memory. Makes PCIe-connected bwfm(4) work on ARM-based machines.


# 1.24 05-Jul-2018 patrick

Cast physical addresses to 64 bits so we can shift them by 32 bits on
32-bit platforms without the compiler complaining. In the end the
value will turn out as 0 anyway. Allows enabling bwfm(4) on 32-bit
platforms.

ok stsp@
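
The cast can be illustrated in isolation. In C, shifting a 32-bit
value by 32 is undefined, so the address is widened first. The
toy_paddr32_t type and the helper below are stand-ins invented for
this sketch, not names from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a physical address type that is only 32 bits wide. */
typedef uint32_t toy_paddr32_t;

/*
 * Split an address into the low and high 32-bit words that a
 * descriptor with separate fields would carry.
 */
static void
toy_addr_split(uint64_t pa, uint32_t *lo, uint32_t *hi)
{
	assert(lo != NULL && hi != NULL);
	*lo = (uint32_t)(pa & 0xffffffffU);
	*hi = (uint32_t)(pa >> 32);	/* well-defined on a 64-bit value */
}
```

Called as toy_addr_split((uint64_t)pa, &lo, &hi) with a 32-bit pa, the
high word comes out as 0, as the commit message notes.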


# 1.23 07-Jun-2018 patrick

Attach bwfm(4) to the Broadcom 4356 found in the GPD Pocket.

Tested by mlarkin@


# 1.22 07-Jun-2018 patrick

Some PCIe-based bwfm(4) chips also require that we supply an NVRAM
binary. In case we have an (optional) NVRAM binary, copy it to the
end of the chip's memory.

Tested by mlarkin@ on his GPD Pocket.


# 1.21 23-May-2018 patrick

Implement a separate initialization stage so that we can still use
and initialize bwfm(4) later in the case that the firmware was not
available on bootup and was only later installed.

ok stsp@


# 1.68 04-Mar-2022 kettenis

Add support for the BCM4387. The firmware for this variant uses a new scan
command, which is indicated by the "scan_ver" firmware variable.

ok patrick@


# 1.67 02-Mar-2022 kettenis

The firmware for the bwfm(4) variants in Apple Silicon Macs has variants
for different module types, module vendors and module revisions. Make
our driver use the same naming scheme as Asahi Linux.

ok patrick@


# 1.66 01-Jan-2022 patrick

Use correct defines for random seed magic/length.

Spotted by Andreas Schnebinger


# 1.65 31-Dec-2021 patrick

Newer Apple firmware on chipsets without a hardware RNG require the host to
provide a buffer of random bytes to the device on initialization.


# 1.64 27-Dec-2021 patrick

Not only BCM4378, but all PCIe core revisions >= 64 need to be accessed
using the new sets of registers.


# 1.63 27-Dec-2021 patrick

Map the chip ids used on Apple M1 Pro/Max and Apple T2 Macs to firmware
names.


# 1.62 27-Dec-2021 patrick

Support reading OTP information from a few more chips, necessary to learn
firmare names on Apple M1 Pro/Max and Apple T2 Macs.


# 1.61 27-Dec-2021 patrick

Send TxCap and WiFi calibration blobs to the chip.


# 1.60 27-Dec-2021 patrick

Switch module codename retrieval to use the newly proposed device tree
bindings.


# 1.59 27-Dec-2021 patrick

Bump rxpost and rxcomplete ring size to 1024 for newer chips.


# 1.58 20-Dec-2021 patrick

bus_dmamem_unmap() should not be called from interrupt context, so free
and close flowrings using bwfm_do_async().

Reported by and ok kettenis@


# 1.57 23-Oct-2021 kettenis

Make sure we have enough space to add padding and final token to the nvram
data. Also add the MAC address to the nvram data when there is a
"local-mac-address" property in the device tree. This makes bwfm(4) work
with the firmware/nvram/clm_blob files provided with MacOS on the Apple
M1 Macs.

ok patrick@


Revision tags: OPENBSD_7_0_BASE
# 1.56 31-Aug-2021 patrick

Implement suspend/resume for bwfm(4) with PCIe backend. We try to send the
device into D3 and do a hot-resume if possible. Otherwise we need to clean
up the resources to allow complete HW re-initialization to take place.


# 1.55 31-Aug-2021 patrick

Properly deallocate some more structures upon detach, and make sure we're
not considered initialized anymore.


# 1.54 31-Aug-2021 patrick

Initialize some struct variables to make sure that upon reinit, caused by
a suspend/resume cycle, the values are set to a sane default.


# 1.53 31-Aug-2021 patrick

Initialize ring read/write pointers to make sure that upon reinit, caused
by a suspend/resume cycle, the pointers are set to a sane default.


# 1.52 22-Jun-2021 patrick

bwfm(4) on PCI isn't really MPSAFE, and I'm not sure how this flag
even got there in the first place. I've been wondering why I have
seen a bit of mbuf corruption here and there since I put the bwfm(4)
M.2 PCIe card into my arm64 machine. Well, duh.


Revision tags: OPENBSD_6_9_BASE
# 1.51 26-Feb-2021 patrick

Read and parse OTP on the BCM4378. There are quite a few firmware and
nvram files used for the different Apple devices. The device tree and
the OTP hold the information which of those we will have to use. For
now this information will simply be printed, but depending on how we
choose to do the firmare distribution we could use it for loadfirmware().


# 1.50 26-Feb-2021 patrick

Attach to BCM4378.


# 1.49 26-Feb-2021 patrick

Add support for BCM4378 as implemented on the Apple M1. This chip seems
to use a different set of PCIE2REG registers. Accessing the "old" ones
even leads to faults. There are two surprises though. One is that it
seems that the interrupt status register always returns 0, and the other
one is that we receive the interrupts way too early, but both can be
worked around for now.


# 1.48 26-Feb-2021 patrick

Increase the amount of RX buffers given to the bwfm(4) chip. We haave seen
this already on previous chips, which only started giving us packets when
handing over at least 128 of them. Apparently some now require 256, which
seems to get the Apple M1's WiFi going.


# 1.47 26-Feb-2021 patrick

Increase the buffer size for the ioctl response buffers to the same as
used in the wifi firmware to ensure responses can be received.


# 1.46 26-Feb-2021 patrick

Indicate hostready signal to inform the firmware that the rings have been
initialized.


# 1.45 26-Feb-2021 patrick

Refactor bwfm(4) firmware loading. The PCIe backend will need to be able
to load the CLM blob like the SDIO backend already does. Additionally it
is also helpful for the PCIe backend to try a file named after the device
tree compatible. Thus refactor the SDIO code and make it available for
both SDIO and PCIe.


# 1.44 26-Feb-2021 patrick

Fix prio2fifo mapping table.


# 1.43 25-Feb-2021 patrick

The firmware replaces the last 32-bit on RAM with a shared DRAM address.
While the for-loop checks that thie value has changed since we wrote to
it, the timeout-condition checked for non-zero, which is wrong. This
means that we didn't realize the firmware wasn't started. While there,
make sure the shared DRAM address is inside the chip's address space.


# 1.42 25-Feb-2021 patrick

Some newer chips have two D11/802.11 cores, and we need to reset both at
the same time.


# 1.41 25-Feb-2021 patrick

Support for version 7 of the bwfm(4) PCIe interface. The size of the items
on the rx/tx complete rings has increased slightly to accomodate possible
new features.


# 1.40 25-Feb-2021 dlg

we don't have to cast to caddr_t when calling m_copydata anymore.

the first cut of this diff was made with coccinelle using this spatch:

@rule@
type caddr_t;
expression m, off, len, cp;
@@
-m_copydata(m, off, len, (caddr_t)cp)
+m_copydata(m, off, len, cp)

i had fix it's opinionated idea of formatting by hand though, so
i'm not sure it was worth it.

ok deraadt@ bluhm@


# 1.39 31-Jan-2021 patrick

Add basic support for BCM4378 as found on the Apple M1 SoCs. There's a
little bit more to do though before it can be enabled.


# 1.38 12-Dec-2020 jan

Rename the macro MCLGETI to MCLGETL and removes the dead parameter ifp.

OK dlg@, bluhm@
No Opinion mpi@
Not against it claudio@


Revision tags: OPENBSD_6_8_BASE
# 1.37 22-Jun-2020 dlg

use ifiq_input and use it's return value to apply backpressure to rxrs.

this is a step toward deprecating softclock based livelock detection.


Revision tags: OPENBSD_6_7_BASE
# 1.36 07-Mar-2020 patrick

Use snprintf(9) to create the names for the firmware and NVRAM files. This
reduces the amount of duplicated lines per chip, and allows us to ship per-
board files in the future.

Based on a diff from jsg@
ok kurt@


# 1.35 06-Mar-2020 patrick

Process the NVRAM in bwfm(4) itself. So far we have relied on some
external tool to pre-process the NVRAM, even though it's simple to
do ourselves. This allows easier firmware distribution, especially
since on some x86 machines the NVRAM is stored in an EFI variable.


# 1.34 25-Feb-2020 patrick

Make bwfm(4) call if_input() only once per interrupt.

This reduces drops caused by the ifq pressure drop mechanism and hence
increases throughput.

ok tobhe@


# 1.33 15-Jan-2020 patrick

Sprinkle splnet() around the ringbuffer accesses, otherwise the
task and interrupt try to concurrently submit messages on the
control ring.


# 1.32 15-Jan-2020 patrick

Some PCIe firmwares drop TX packets when the pktid is 0. Add
an offset to make sure they start from 1.


# 1.31 15-Jan-2020 patrick

Fix off-by-one in ringbuffer code. When we insert items faster than
the hardware is processing them, the write index can catch up to the
read index. We must make sure that our write index stays smaller
than the hardware's read index, thus the difference between both has
to be bigger than 1.

ok tobhe@


# 1.30 09-Jan-2020 mpi

Convert sleeps of 1sec or more to tsleep_nsec(9).

ok bluhm@


Revision tags: OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.29 07-Feb-2019 patrick

Consistently use m_freem(9). This fixes possible leaks in a few
error cases.


# 1.28 17-Jan-2019 mlarkin

Enable bwfm(4) in RAMDISK_CD

ok deraadt


Revision tags: OPENBSD_6_4_BASE
# 1.27 20-Aug-2018 patrick

Attach bwfm(4) to Broadcom BCM4371.

ok kettenis@


# 1.26 25-Jul-2018 patrick

Implement a MSGBUF control packet mechanism based on the command
request ids. So far we were only able to have one command in flight
at a time and race conditions could easily lead to unexpected
behaviour. With this rework we send and enqueue a control packet
command and wait for replies to happen. Thus we can have multiple
control packets in flight and a reply with the correct id will wake
us up.


# 1.25 06-Jul-2018 patrick

Add bus_dmamap_sync(9) calls to bwfm(4) so that we make sure the data
is synced properly before the CPU or the WiFi chip access the supplied
memory. Makes PCIe-connected bwfm(4) work on ARM-based machines.


# 1.24 05-Jul-2018 patrick

Cast physical addresses to 64-bits so we can shift them by 32-bit on
32-bit platforms without the compiler complaining. In the end the
value will turn out as 0 anyway. Allows enabling bwfm(4) on 32-bit
platforms.

ok stsp@


# 1.23 07-Jun-2018 patrick

Attach bwfm(4) to the Broadcom 4356 found in the GPD Pocket.

Tested by mlarkin@


# 1.22 07-Jun-2018 patrick

Some PCIe-based bwfm(4) chips also require that we supply an NVRAM
binary. In case we have an (optional) NVRAM binary, copy it to the
end of the chip's memory.

Tested by mlarkin@ on his GPD Pocket.


# 1.21 23-May-2018 patrick

Implement a separate initialization stage so that we can still use
and initialize bwfm(4) later in the case that the firmware was not
available on bootup and was only later installed.

ok stsp@


# 1.20 23-May-2018 patrick

Map the second bwfm(4) BAR first. The bwfm(4) PCIe devices have two
BARs, where the second one is much larger than the first. Both need
to be properly aligned in the given extent. Since the first one is
smaller, it will "unalign" the next free space and thus create a gap
so that the second BAR cannot be properly aligned in the given space.
By mapping the second BAR first, it will automatically have proper
alignment. The first BAR, which has fewer alignment requirements,
fits well after the initial allocation. Fixes bwfm(4) on APU 1.

Debugged and solved by kettenis@


# 1.19 16-May-2018 patrick

Implement a BCDC control packet mechanism based on the command request
ids. So far we were only able to have one command in flight at a time
and race conditions could easily lead to unexpected behaviour, especia-
lly combined with a slow bus and timeouts. With this rework we send or
enqueue a control packet command and wait for replies to happen. Thus
we can have multiple control packets in flight and a reply with the
correct id will wake us up.


Revision tags: OPENBSD_6_3_BASE
# 1.18 08-Feb-2018 patrick

Move bwfm(4) from ifq begin/commit/rollback semantics to the newer
ifq dequeue semantics. This basically means we need to check for
available space before dequeuing a packet. As soon as we dequeue
a packet we commit to it. On the PCIe backend this check can not
be done easily since the flowring depends on the packet contents and
we cannot take a peek. When there is no flowring we cache the mbuf
and send it out as soon as the flowring opened up. Then the ifq can
be restarted and traffic can flow. Typically we usually run out of
packet ids, which can be checked without consulting the packet. The
flowring probably never becomes full as the bwfm(4) firmware takes
the packets off the ring without actually sending them out.

Discussed with dlg@


# 1.17 07-Feb-2018 patrick

Move parsing the BCDC header on RX into a protocol specific RX
function so it can be shared with the SDIO attachment driver.


# 1.16 11-Jan-2018 patrick

The PCI bwfm(4) chips have no TX rings in the traditional sense, as on
the actual rings we only share messages. Sending a TX packet means
putting a message on the ring which contains a pktid (which for us maps
to an mbuf) and the physical address of the mbuf. On jcs@'s MacBook he
seems to run out of TX pktids pretty quickly during a speedtest. This
would mean that there are 2048 TX packets in flight that we either want
to send out or that have not been "acked" by the firmware yet. Either
way, recover from that situation when we hit that arbitrary limit by
restarting the queue after we freed a packet from the TX pktid list.

Tested by jcs@
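
The recovery described above can be sketched with a small packet id
table. This is an illustration only: the names (pktid_table,
pktid_alloc, pktid_free) are hypothetical, the table holds 4 ids
instead of the 2048 mentioned in the commit message, and plain pointers
stand in for mbufs.

```c
#include <assert.h>
#include <stddef.h>

#define NPKTIDS	4	/* tiny table for illustration only */

struct pktid_table {
	void	*pkts[NPKTIDS];	/* pktid -> packet, NULL if free */
	int	 stalled;	/* set when we ran out of ids */
};

static void
pktid_init(struct pktid_table *t)
{
	int i;

	for (i = 0; i < NPKTIDS; i++)
		t->pkts[i] = NULL;
	t->stalled = 0;
}

/*
 * Reserve an id for pkt.  Returns the id, or -1 if all ids are in
 * flight, in which case the table is marked as stalled.
 */
static int
pktid_alloc(struct pktid_table *t, void *pkt)
{
	int i;

	assert(pkt != NULL);
	for (i = 0; i < NPKTIDS; i++) {
		if (t->pkts[i] == NULL) {
			t->pkts[i] = pkt;
			return i;
		}
	}
	t->stalled = 1;
	return -1;
}

/*
 * Release an id once the firmware has completed the packet.  Sets
 * *restart to 1 if the transmit queue was stalled and should be
 * restarted now that an id is available again.
 */
static void *
pktid_free(struct pktid_table *t, int id, int *restart)
{
	void *pkt;

	*restart = 0;
	if (id < 0 || id >= NPKTIDS || t->pkts[id] == NULL)
		return NULL;
	pkt = t->pkts[id];
	t->pkts[id] = NULL;
	if (t->stalled) {
		t->stalled = 0;
		*restart = 1;
	}
	return pkt;
}
```

The caller would stop dequeuing packets when pktid_alloc() fails and
restart its transmit queue when pktid_free() reports that the table was
stalled.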


# 1.15 10-Jan-2018 jcs

Attach bwfm to the Broadcom 4350 found in the 2017 MacBook.

Easily handles >150Mbps transfers through a 5GHz AP.

ok patrick

(Committed via bwfm0, of course)


# 1.14 10-Jan-2018 patrick

Add firmware names for the two revisions of the Broadcom 4350 as seen
on a MacBook 12-inch (2017).

Tested by and with jcs@


# 1.13 10-Jan-2018 patrick

Don't reset the internal memory core on chips other than the Broadcom
43602, as it's only necessary on that specific chip.

Found the hard way by jcs@ on a MacBook 12-inch (2017)


# 1.12 10-Jan-2018 patrick

Move line for readability.


# 1.11 08-Jan-2018 patrick

In AP mode multicast packets share the flowrings with broadcast
packets.


# 1.10 08-Jan-2018 patrick

The bwfm(4) TX ring expects the ethernet header as part of the TX info
struct. The data length is the length of the frame without the header.
In the previous version m_adj(9) was used, but since that was changed we
need to decrease the length ourselves.


# 1.9 08-Jan-2018 patrick

Guard the debug printf function behind BWFM_DEBUG as well. Also only
print the firmware's dmesg(8) if we're running with a higher debug
mode.

Prompted by Michael W. Bombardieri


# 1.8 08-Jan-2018 patrick

Delete flowrings when we take the interface down or change its
settings.


# 1.7 07-Jan-2018 patrick

Create multiple transmit flowrings in station mode, four in total, based
on TOS values. In AP mode create multiple flowrings per connected node.


# 1.6 05-Jan-2018 patrick

To send out packets we need to create a flowring. Acting as station,
we typically have about four flowrings per priority. As access point
we apparently need one, or four considering the priorities, flowrings
per client. For now let's start with a single TX flowring. To set up
a flowring we need to send a create request and can only start sending
packets as soon as we are told that the ring is created. With this we
can now do actual network traffic.


# 1.5 03-Jan-2018 patrick

Since the PCI attachment code already uses mbufs for RX packets, we can
push the mbuf allocation down into the USB attachment code and now pass
an mbuf to the bwfm(4) receive function.


# 1.4 03-Jan-2018 patrick

Add size for free(9) in the bwfm(4) PCI attachment code.

From Michael W. Bombardieri


# 1.3 01-Jan-2018 patrick

For whatever reason the firmware needs more RX buffers available than
we typically use, which unfortunately creates a bigger memory
footprint. With this the receive path can be made to work.


# 1.2 01-Jan-2018 patrick

Put the code that prints the firmware's debug console into a function
so we can read and print the messages printed by the firmware when we
are debugging the driver.


# 1.1 24-Dec-2017 patrick

Add a PCI attachment driver for bwfm(4). It's not finished, but it's
already past the point where development can occur out of the tree.
With this I can successfully scan for access points and tell the chip
to attach to an SSID. RX path should work as well, but since I forgot
to bring the antenna with me to my parents, the reception is a bit
horrible in the metal enclosure.

There are a few reasons this driver is rather big. First we set up the
ARM Cores, uploading the firmware and kicking it off. Then we need to
read all needed information from the registers. Once that is done we
have to set up countless buffers. There are 2 TX rings and 3 RX rings,
plus N TX rings for the actual data that is yet to be implemented.

Merry Christmas!

ok kettenis@


# 1.66 01-Jan-2022 patrick

Use correct defines for random seed magic/length.

Spotted by Andreas Schnebinger


# 1.65 31-Dec-2021 patrick

Newer Apple firmware on chipsets without a hardware RNG require the host to
provide a buffer of random bytes to the device on initialization.


# 1.64 27-Dec-2021 patrick

Not only BCM4378, but all PCIe core revisions >= 64 need to be accessed
using the new sets of registers.


# 1.63 27-Dec-2021 patrick

Map the chip ids used on Apple M1 Pro/Max and Apple T2 Macs to firmware
names.


# 1.62 27-Dec-2021 patrick

Support reading OTP information from a few more chips, necessary to learn
firmare names on Apple M1 Pro/Max and Apple T2 Macs.


# 1.61 27-Dec-2021 patrick

Send TxCap and WiFi calibration blobs to the chip.


# 1.60 27-Dec-2021 patrick

Switch module codename retrieval to use the newly proposed device tree
bindings.


# 1.59 27-Dec-2021 patrick

Bump rxpost and rxcomplete ring size to 1024 for newer chips.


# 1.58 20-Dec-2021 patrick

bus_dmamem_unmap() should not be called from interrupt context, so free
and close flowrings using bwfm_do_async().

Reported by and ok kettenis@


# 1.57 23-Oct-2021 kettenis

Make sure we have enough space to add padding and final token to the nvram
data. Also add the MAC address to the nvram data when there is a
"local-mac-address" property in the device tree. This makes bwfm(4) work
with the firmware/nvram/clm_blob files provided with MacOS on the Apple
M1 Macs.

ok patrick@


Revision tags: OPENBSD_7_0_BASE
# 1.56 31-Aug-2021 patrick

Implement suspend/resume for bwfm(4) with PCIe backend. We try to send the
device into D3 and do a hot-resume if possible. Otherwise we need to clean
up the resources to allow complete HW re-initialization to take place.


# 1.55 31-Aug-2021 patrick

Properly deallocate some more structures upon detach, and make sure we're
not considered initialized anymore.


# 1.54 31-Aug-2021 patrick

Initialize some struct variables to make sure that upon reinit, caused by
a suspend/resume cycle, the values are set to a sane default.


# 1.53 31-Aug-2021 patrick

Initialize ring read/write pointers to make sure that upon reinit, caused
by a suspend/resume cycle, the pointers are set to a sane default.


# 1.52 22-Jun-2021 patrick

bwfm(4) on PCI isn't really MPSAFE, and I'm not sure how this flag
even got there in the first place. I've been wondering why I have
seen a bit of mbuf corruption here and there since I put the bwfm(4)
M.2 PCIe card into my arm64 machine. Well, duh.


Revision tags: OPENBSD_6_9_BASE
# 1.51 26-Feb-2021 patrick

Read and parse OTP on the BCM4378. There are quite a few firmware and
nvram files used for the different Apple devices. The device tree and
the OTP hold the information which of those we will have to use. For
now this information will simply be printed, but depending on how we
choose to do the firmare distribution we could use it for loadfirmware().


# 1.50 26-Feb-2021 patrick

Attach to BCM4378.


# 1.49 26-Feb-2021 patrick

Add support for BCM4378 as implemented on the Apple M1. This chip seems
to use a different set of PCIE2REG registers. Accessing the "old" ones
even leads to faults. There are two surprises though. One is that it
seems that the interrupt status register always returns 0, and the other
one is that we receive the interrupts way too early, but both can be
worked around for now.


# 1.48 26-Feb-2021 patrick

Increase the amount of RX buffers given to the bwfm(4) chip. We haave seen
this already on previous chips, which only started giving us packets when
handing over at least 128 of them. Apparently some now require 256, which
seems to get the Apple M1's WiFi going.


# 1.47 26-Feb-2021 patrick

Increase the buffer size for the ioctl response buffers to the same as
used in the wifi firmware to ensure responses can be received.


# 1.46 26-Feb-2021 patrick

Indicate hostready signal to inform the firmware that the rings have been
initialized.


# 1.45 26-Feb-2021 patrick

Refactor bwfm(4) firmware loading. The PCIe backend will need to be able
to load the CLM blob like the SDIO backend already does. Additionally it
is also helpful for the PCIe backend to try a file named after the device
tree compatible. Thus refactor the SDIO code and make it available for
both SDIO and PCIe.


# 1.44 26-Feb-2021 patrick

Fix prio2fifo mapping table.


# 1.43 25-Feb-2021 patrick

The firmware replaces the last 32-bit on RAM with a shared DRAM address.
While the for-loop checks that thie value has changed since we wrote to
it, the timeout-condition checked for non-zero, which is wrong. This
means that we didn't realize the firmware wasn't started. While there,
make sure the shared DRAM address is inside the chip's address space.


# 1.42 25-Feb-2021 patrick

Some newer chips have two D11/802.11 cores, and we need to reset both at
the same time.


# 1.41 25-Feb-2021 patrick

Support for version 7 of the bwfm(4) PCIe interface. The size of the items
on the rx/tx complete rings has increased slightly to accomodate possible
new features.


# 1.40 25-Feb-2021 dlg

we don't have to cast to caddr_t when calling m_copydata anymore.

the first cut of this diff was made with coccinelle using this spatch:

@rule@
type caddr_t;
expression m, off, len, cp;
@@
-m_copydata(m, off, len, (caddr_t)cp)
+m_copydata(m, off, len, cp)

i had fix it's opinionated idea of formatting by hand though, so
i'm not sure it was worth it.

ok deraadt@ bluhm@


# 1.39 31-Jan-2021 patrick

Add basic support for BCM4378 as found on the Apple M1 SoCs. There's a
little bit more to do though before it can be enabled.


# 1.38 12-Dec-2020 jan

Rename the macro MCLGETI to MCLGETL and removes the dead parameter ifp.

OK dlg@, bluhm@
No Opinion mpi@
Not against it claudio@


Revision tags: OPENBSD_6_8_BASE
# 1.37 22-Jun-2020 dlg

use ifiq_input and use it's return value to apply backpressure to rxrs.

this is a step toward deprecating softclock based livelock detection.


Revision tags: OPENBSD_6_7_BASE
# 1.36 07-Mar-2020 patrick

Use snprintf(9) to create the names for the firmware and NVRAM files. This
reduces the amount of duplicated lines per chip, and allows us to ship per-
board files in the future.

Based on a diff from jsg@
ok kurt@


# 1.35 06-Mar-2020 patrick

Process the NVRAM in bwfm(4) itself. So far we have relied on some
external tool to pre-process the NVRAM, even though it's simple to
do ourselves. This allows easier firmware distribution, especially
since on some x86 machines the NVRAM is stored in an EFI variable.


# 1.34 25-Feb-2020 patrick

Make bwfm(4) call if_input() only once per interrupt.

This reduces drops caused by the ifq pressure drop mechanism and hence
increases throughput.

ok tobhe@


# 1.33 15-Jan-2020 patrick

Sprinkle splnet() around the ringbuffer accesses, otherwise the
task and interrupt try to concurrently submit messages on the
control ring.


# 1.32 15-Jan-2020 patrick

Some PCIe firmwares drop TX packets when the pktid is 0. Add
an offset to make sure they start from 1.


# 1.31 15-Jan-2020 patrick

Fix off-by-one in ringbuffer code. When we insert items faster than
the hardware is processing them, the write index can catch up to the
read index. We must make sure that our write index stays smaller
than the hardware's read index, thus the difference between both has
to be bigger than 1.

ok tobhe@


# 1.30 09-Jan-2020 mpi

Convert sleeps of 1sec or more to tsleep_nsec(9).

ok bluhm@


Revision tags: OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.29 07-Feb-2019 patrick

Consistently use m_freem(9). This fixes possible leaks in a few
error cases.


# 1.28 17-Jan-2019 mlarkin

Enable bwfm(4) in RAMDISK_CD

ok deraadt


Revision tags: OPENBSD_6_4_BASE
# 1.27 20-Aug-2018 patrick

Attach bwfm(4) to Broadcom BCM4371.

ok kettenis@


# 1.26 25-Jul-2018 patrick

Implement a MSGBUF control packet mechanism based on the command
request ids. So far we were only able to have one command in flight
at a time and race conditions could easily lead to unexpected
behaviour. With this rework we send and enqueue a control packet
command and wait for replies to happen. Thus we can have multiple
control packets in flight and a reply with the correct id will wake
us up.


# 1.25 06-Jul-2018 patrick

Add bus_dmamap_sync(9) calls to bwfm(4) so that we make sure the data
is synced properly before the CPU or the WiFi chip access the supplied
memory. Makes PCIe-connected bwfm(4) work on ARM-based machines.


# 1.24 05-Jul-2018 patrick

Cast physical addresses to 64-bits so we can shift them by 32-bit on
32-bit platforms without the compiler complaining. In the end the
value will turn out as 0 anyway. Allows enabling bwfm(4) on 32-bit
platforms.

ok stsp@


# 1.23 07-Jun-2018 patrick

Attach bwfm(4) to the Broadcom 4356 found in the GPD Pocket.

Tested by mlarkin@


# 1.22 07-Jun-2018 patrick

Some PCIe-based bwfm(4) chips also require that we supply an NVRAM
binary. In case we have an (optional) NVRAM binary, copy it to the
end of the chip's memory.

Tested by mlarkin@ on his GPD Pocket.


# 1.21 23-May-2018 patrick

Implement a separate initialization stage so that we can still use
and initialize bwfm(4) later in the case that the firmware was not
available on bootup and was only later installed.

ok stsp@


# 1.20 23-May-2018 patrick

Map the second bwfm(4) BAR first. The bwfm(4) PCIe devices have two
BARs, where the second one is much larger than the first. Both need
to be properly aligned in the given extent. Since the first one is
smaller, it will "unalign" the next free space and thus create a gap
so that the second BAR cannot be properly aligned in the given space.
By mapping the second BAR first, it will automatically have proper
alignment. The first BAR, which has fewer alignment requirements,
fits well after the initial allocation. Fixes bwfm(4) on APU 1.

Debugged and solved by kettenis@


# 1.19 16-May-2018 patrick

Implement a BCDC control packet mechanism based on the command request
ids. So far we were only able to have one command in flight at a time
and race conditions could easily lead to unexpected behaviour, especia-
lly combined with a slow bus and timeouts. With this rework we send or
enqueue a control packet command and wait for replies to happen. Thus
we can have multiple control packets in flight and a reply with the
correct id will wake us up.


Revision tags: OPENBSD_6_3_BASE
# 1.18 08-Feb-2018 patrick

Move bwfm(4) from ifq begin/commit/rollback semantics to the newer
ifq dequeue semantics. This basically means we need to check for
available space before dequeuing a packet. As soon as we dequeue
a packet we commit to it. On the PCIe backend this check can not
be done easily since the flowring depends on the packet contents and
we cannot take a peek. When there is no flowring we cache the mbuf
and send it out as soon as the flowring opened up. Then the ifq can
be restarted and traffic can flow. Typically we usually run out of
packet ids, which can be checked without consulting the packet. The
flowring probably never becomes full as the bwfm(4) firmware takes
the packets off the ring without actually sending them out.

Discussed with dlg@


# 1.17 07-Feb-2018 patrick

Move parsing the BCDC header on RX into a protocol specific RX
function so it can be shared with the SDIO attachment driver.


# 1.16 11-Jan-2018 patrick

The PCI bwfm(4) chips have no TX rings in the traditional sense, as on
the actual rings we only share messages. Sending a TX packet means
putting a message on the ring which contains a pktid (which for us maps
to an mbuf) and the physical address of the mbuf. On jcs@'s macbook he
seems to run out of TX pktids pretty quickly during a speedtest. This
would mean that there are 2048 TX packets in flight that we either want
to send out or that have not been "acked" by the firmware yet. Either
way, recover from that situation when we hit that arbitrary limit by
restarting the queue after we free'd a packet from the TX pktid list.

Tested by jcs@


# 1.15 10-Jan-2018 jcs

Attach bwfm to the Broadcom 4350 found in the 2017 MacBook.

Easily handles >150Mbps transfers through a 5Ghz AP.

ok patrick

(Committed via bwfm0, of course)


# 1.14 10-Jan-2018 patrick

Add firmware names for the two revisions of the Broadcom 4350 as seen
on a MacBook 12-inch (2017).

Tested by and with jcs@


# 1.13 10-Jan-2018 patrick

Don't reset the internal memory core on chips other than the Broadcom
43602, as it's only necessary on that specific chip.

Found the hard way by jcs@ on a MacBook 12-inch (2017)


# 1.12 10-Jan-2018 patrick

Move line for readability.


# 1.11 08-Jan-2018 patrick

In AP mode multicast packets share the flowrings with broadcast
packets.


# 1.10 08-Jan-2018 patrick

The bwfm(4) TX ring expects the ethernet header as part of the TX info
struct. The data length is the length of the frame without the header.
In the previous version m_adj(9) was used, but since that was changed we
need to decrease the length ourselves.


# 1.9 08-Jan-2018 patrick

Guard the debug printf function behind BWFM_DEBUG as well. Also only
print the firmware's dmesg(8) if we're running with a higher debug
mode.

Prompted by Michael W. Bombardieri


# 1.8 08-Jan-2018 patrick

Delete flowrings when we take the interface down or change its
settings.


# 1.7 07-Jan-2018 patrick

Create multiple transmit flowrings in station mode, four in total, based
on TOS values. In AP mode create multiple flowrings per connected node.


# 1.6 05-Jan-2018 patrick

To send out packets we need to create a flowring. Acting as station,
we typically have about four flowrings per priority. As access point
we apparently need one, or four considering the priorities, flowrings
per client. For now let's start with a single TX flowring. To set up
a flowring we need to send a create request and can only start sending
packets as soon as we are told that the ring is created. With this we
can now do actual network traffic.


# 1.5 03-Jan-2018 patrick

Since the PCI attachment code already uses mbufs for RX packets, we can
push the mbuf allocation down into the USB attachment code and now pass
an mbuf to the bwfm(4) receive function.


# 1.4 03-Jan-2018 patrick

Add size for free(9) in the bwfm(4) PCI attachment code.

From Michael W. Bombardieri


# 1.3 01-Jan-2018 patrick

For whatever reason the firmware needs more RX buffers available than
we typically use, which unfortunately creates a bigger memory
footprint. With this the receive path can be made to work.


# 1.2 01-Jan-2018 patrick

Put the code that prints the firmware's debug console into a function
so we can read and print the messages printed by the firmware when we
are debugging the driver.


# 1.1 24-Dec-2017 patrick

Add a PCI attachment driver for bwfm(4). It's not finished, but it's
already past the point where development can occur out of the tree.
With this I can successfully scan for access points and tell the chip
to attach to an SSID. RX path should work as well, but since I forgot
to bring the antenna with me to my parents, the reception is a bit
horrible in the metal enclosure.

There are a few reasons this driver is rather big. First we set up the
ARM Cores, uploading the firmware and kicking it off. Then we need to
read all needed information from the registers. Once that is done we
have to set up countless buffers. There are 2 TX rings and 3 RX rings,
plus N TX rings for the actual data that is yet to be implemented.

Merry Christmas!

ok kettenis@


# 1.65 31-Dec-2021 patrick

Newer Apple firmware on chipsets without a hardware RNG requires the host to
provide a buffer of random bytes to the device on initialization.


# 1.64 27-Dec-2021 patrick

Not only BCM4378, but all PCIe core revisions >= 64 need to be accessed
using the new sets of registers.


# 1.63 27-Dec-2021 patrick

Map the chip ids used on Apple M1 Pro/Max and Apple T2 Macs to firmware
names.


# 1.62 27-Dec-2021 patrick

Support reading OTP information from a few more chips, necessary to learn
firmware names on Apple M1 Pro/Max and Apple T2 Macs.


# 1.61 27-Dec-2021 patrick

Send TxCap and WiFi calibration blobs to the chip.


# 1.60 27-Dec-2021 patrick

Switch module codename retrieval to use the newly proposed device tree
bindings.


# 1.59 27-Dec-2021 patrick

Bump rxpost and rxcomplete ring size to 1024 for newer chips.


# 1.58 20-Dec-2021 patrick

bus_dmamem_unmap() should not be called from interrupt context, so free
and close flowrings using bwfm_do_async().

Reported by and ok kettenis@


# 1.57 23-Oct-2021 kettenis

Make sure we have enough space to add padding and final token to the nvram
data. Also add the MAC address to the nvram data when there is a
"local-mac-address" property in the device tree. This makes bwfm(4) work
with the firmware/nvram/clm_blob files provided with MacOS on the Apple
M1 Macs.

ok patrick@


Revision tags: OPENBSD_7_0_BASE
# 1.56 31-Aug-2021 patrick

Implement suspend/resume for bwfm(4) with PCIe backend. We try to send the
device into D3 and do a hot-resume if possible. Otherwise we need to clean
up the resources to allow complete HW re-initialization to take place.


# 1.55 31-Aug-2021 patrick

Properly deallocate some more structures upon detach, and make sure we're
not considered initialized anymore.


# 1.54 31-Aug-2021 patrick

Initialize some struct variables to make sure that upon reinit, caused by
a suspend/resume cycle, the values are set to a sane default.


# 1.53 31-Aug-2021 patrick

Initialize ring read/write pointers to make sure that upon reinit, caused
by a suspend/resume cycle, the pointers are set to a sane default.


# 1.52 22-Jun-2021 patrick

bwfm(4) on PCI isn't really MPSAFE, and I'm not sure how this flag
even got there in the first place. I've been wondering why I have
seen a bit of mbuf corruption here and there since I put the bwfm(4)
M.2 PCIe card into my arm64 machine. Well, duh.


Revision tags: OPENBSD_6_9_BASE
# 1.51 26-Feb-2021 patrick

Read and parse OTP on the BCM4378. There are quite a few firmware and
nvram files used for the different Apple devices. The device tree and
the OTP hold the information which of those we will have to use. For
now this information will simply be printed, but depending on how we
choose to do the firmware distribution we could use it for loadfirmware().


# 1.50 26-Feb-2021 patrick

Attach to BCM4378.


# 1.49 26-Feb-2021 patrick

Add support for BCM4378 as implemented on the Apple M1. This chip seems
to use a different set of PCIE2REG registers. Accessing the "old" ones
even leads to faults. There are two surprises though. One is that it
seems that the interrupt status register always returns 0, and the other
one is that we receive the interrupts way too early, but both can be
worked around for now.


# 1.48 26-Feb-2021 patrick

Increase the number of RX buffers given to the bwfm(4) chip. We have seen
this already on previous chips, which only started giving us packets when
handing over at least 128 of them. Apparently some now require 256, which
seems to get the Apple M1's WiFi going.


# 1.47 26-Feb-2021 patrick

Increase the buffer size for the ioctl response buffers to the same as
used in the wifi firmware to ensure responses can be received.


# 1.46 26-Feb-2021 patrick

Indicate hostready signal to inform the firmware that the rings have been
initialized.


# 1.45 26-Feb-2021 patrick

Refactor bwfm(4) firmware loading. The PCIe backend will need to be able
to load the CLM blob like the SDIO backend already does. Additionally it
is also helpful for the PCIe backend to try a file named after the device
tree compatible. Thus refactor the SDIO code and make it available for
both SDIO and PCIe.


# 1.44 26-Feb-2021 patrick

Fix prio2fifo mapping table.


# 1.43 25-Feb-2021 patrick

The firmware replaces the last 32 bits of RAM with a shared DRAM address.
While the for-loop checks that the value has changed since we wrote to
it, the timeout-condition checked for non-zero, which is wrong. This
means that we didn't realize the firmware wasn't started. While there,
make sure the shared DRAM address is inside the chip's address space.


# 1.42 25-Feb-2021 patrick

Some newer chips have two D11/802.11 cores, and we need to reset both at
the same time.


# 1.41 25-Feb-2021 patrick

Support for version 7 of the bwfm(4) PCIe interface. The size of the items
on the rx/tx complete rings has increased slightly to accommodate possible
new features.


# 1.40 25-Feb-2021 dlg

we don't have to cast to caddr_t when calling m_copydata anymore.

the first cut of this diff was made with coccinelle using this spatch:

@rule@
type caddr_t;
expression m, off, len, cp;
@@
-m_copydata(m, off, len, (caddr_t)cp)
+m_copydata(m, off, len, cp)

i had to fix its opinionated idea of formatting by hand though, so
i'm not sure it was worth it.

ok deraadt@ bluhm@


# 1.39 31-Jan-2021 patrick

Add basic support for BCM4378 as found on the Apple M1 SoCs. There's a
little bit more to do though before it can be enabled.


# 1.38 12-Dec-2020 jan

Rename the macro MCLGETI to MCLGETL and remove the dead parameter ifp.

OK dlg@, bluhm@
No Opinion mpi@
Not against it claudio@


Revision tags: OPENBSD_6_8_BASE
# 1.37 22-Jun-2020 dlg

use ifiq_input and use its return value to apply backpressure to rxrs.

this is a step toward deprecating softclock based livelock detection.


Revision tags: OPENBSD_6_7_BASE
# 1.36 07-Mar-2020 patrick

Use snprintf(9) to create the names for the firmware and NVRAM files. This
reduces the amount of duplicated lines per chip, and allows us to ship per-
board files in the future.

Based on a diff from jsg@
ok kurt@


# 1.35 06-Mar-2020 patrick

Process the NVRAM in bwfm(4) itself. So far we have relied on some
external tool to pre-process the NVRAM, even though it's simple to
do ourselves. This allows easier firmware distribution, especially
since on some x86 machines the NVRAM is stored in an EFI variable.


# 1.34 25-Feb-2020 patrick

Make bwfm(4) call if_input() only once per interrupt.

This reduces drops caused by the ifq pressure drop mechanism and hence
increases throughput.

ok tobhe@


# 1.33 15-Jan-2020 patrick

Sprinkle splnet() around the ringbuffer accesses, otherwise the
task and interrupt try to concurrently submit messages on the
control ring.


# 1.32 15-Jan-2020 patrick

Some PCIe firmwares drop TX packets when the pktid is 0. Add
an offset to make sure they start from 1.


# 1.31 15-Jan-2020 patrick

Fix off-by-one in ringbuffer code. When we insert items faster than
the hardware is processing them, the write index can catch up to the
read index. We must make sure that our write index stays smaller
than the hardware's read index, thus the difference between both has
to be bigger than 1.

ok tobhe@
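
The off-by-one described in 1.31 can be illustrated with a minimal sketch.
This is not the driver's code: the struct and function names below are
hypothetical, and it only assumes a ring with a read index, a write index
and a fixed number of slots. One slot is always left unused so that the
write index never catches up to the read index, which would otherwise be
indistinguishable from an empty ring.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical ring descriptor; field names are illustrative only. */
struct ring {
	uint16_t r_ptr;	/* next item the consumer will read */
	uint16_t w_ptr;	/* next slot the producer will fill */
	uint16_t nitem;	/* total number of slots */
};

/* Slots the producer may fill without w_ptr reaching r_ptr. */
static uint16_t
ring_write_avail(const struct ring *r)
{
	uint16_t used;

	assert(r->nitem > 1 && r->r_ptr < r->nitem && r->w_ptr < r->nitem);
	used = (uint16_t)((r->w_ptr + r->nitem - r->r_ptr) % r->nitem);
	/* Keep one slot free: a full ring must not look like an empty one. */
	return (uint16_t)(r->nitem - 1 - used);
}

/* Reserve one slot; returns its index, or -1 if the ring is full. */
static int
ring_reserve(struct ring *r)
{
	int slot;

	if (ring_write_avail(r) == 0)
		return -1;
	slot = r->w_ptr;
	r->w_ptr = (uint16_t)((r->w_ptr + 1) % r->nitem);
	return slot;
}
```

With four slots, only three reservations succeed before the consumer
advances, so the two indices can only be equal when the ring is empty.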


# 1.30 09-Jan-2020 mpi

Convert sleeps of 1sec or more to tsleep_nsec(9).

ok bluhm@


Revision tags: OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.29 07-Feb-2019 patrick

Consistently use m_freem(9). This fixes possible leaks in a few
error cases.


# 1.28 17-Jan-2019 mlarkin

Enable bwfm(4) in RAMDISK_CD

ok deraadt


Revision tags: OPENBSD_6_4_BASE
# 1.27 20-Aug-2018 patrick

Attach bwfm(4) to Broadcom BCM4371.

ok kettenis@


# 1.26 25-Jul-2018 patrick

Implement a MSGBUF control packet mechanism based on the command
request ids. So far we were only able to have one command in flight
at a time and race conditions could easily lead to unexpected
behaviour. With this rework we send and enqueue a control packet
command and wait for replies to happen. Thus we can have multiple
control packets in flight and a reply with the correct id will wake
us up.


# 1.25 06-Jul-2018 patrick

Add bus_dmamap_sync(9) calls to bwfm(4) so that we make sure the data
is synced properly before the CPU or the WiFi chip access the supplied
memory. Makes PCIe-connected bwfm(4) work on ARM-based machines.


# 1.24 05-Jul-2018 patrick

Cast physical addresses to 64 bits so we can shift them by 32 bits on
32-bit platforms without the compiler complaining. In the end the
value will turn out as 0 anyway. Allows enabling bwfm(4) on 32-bit
platforms.

ok stsp@
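
The cast described in 1.24 can be sketched as follows. The type and helper
names are hypothetical and stand in for whatever the driver actually uses;
the sketch only assumes a platform where the bus address type is 32 bits
wide. Shifting a 32-bit value right by 32 is not valid C, so the value is
widened first, and the high half then simply comes out as zero.

```c
#include <stdint.h>

/* Stand-in for a 32-bit bus address type; hypothetical name. */
typedef uint32_t paddr32_t;

/* Low 32 bits of the address, as written to a "low" descriptor field. */
static uint32_t
addr_lo(paddr32_t pa)
{
	return (uint32_t)((uint64_t)pa & 0xffffffffU);
}

/* High 32 bits; widening first makes the shift by 32 well defined. */
static uint32_t
addr_hi(paddr32_t pa)
{
	return (uint32_t)((uint64_t)pa >> 32);
}
```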


# 1.23 07-Jun-2018 patrick

Attach bwfm(4) to the Broadcom 4356 found in the GPD Pocket.

Tested by mlarkin@


# 1.22 07-Jun-2018 patrick

Some PCIe-based bwfm(4) chips also require that we supply an NVRAM
binary. In case we have an (optional) NVRAM binary, copy it to the
end of the chip's memory.

Tested by mlarkin@ on his GPD Pocket.


# 1.21 23-May-2018 patrick

Implement a separate initialization stage so that we can still use
and initialize bwfm(4) later in the case that the firmware was not
available on bootup and was only later installed.

ok stsp@


# 1.20 23-May-2018 patrick

Map the second bwfm(4) BAR first. The bwfm(4) PCIe devices have two
BARs, where the second one is much larger than the first. Both need
to be properly aligned in the given extent. Since the first one is
smaller, it will "unalign" the next free space and thus create a gap
so that the second BAR cannot be properly aligned in the given space.
By mapping the second BAR first, it will automatically have proper
alignment. The first BAR, which has fewer alignment requirements,
fits well after the initial allocation. Fixes bwfm(4) on APU 1.

Debugged and solved by kettenis@


# 1.61 27-Dec-2021 patrick

Send TxCap and WiFi calibration blobs to the chip.


# 1.60 27-Dec-2021 patrick

Switch module codename retrieval to use the newly proposed device tree
bindings.


# 1.59 27-Dec-2021 patrick

Bump rxpost and rxcomplete ring size to 1024 for newer chips.


# 1.58 20-Dec-2021 patrick

bus_dmamem_unmap() should not be called from interrupt context, so free
and close flowrings using bwfm_do_async().

Reported by and ok kettenis@


# 1.57 23-Oct-2021 kettenis

Make sure we have enough space to add padding and final token to the nvram
data. Also add the MAC address to the nvram data when there is a
"local-mac-address" property in the device tree. This makes bwfm(4) work
with the firmware/nvram/clm_blob files provided with macOS on the Apple
M1 Macs.

ok patrick@


Revision tags: OPENBSD_7_0_BASE
# 1.56 31-Aug-2021 patrick

Implement suspend/resume for bwfm(4) with PCIe backend. We try to send the
device into D3 and do a hot-resume if possible. Otherwise we need to clean
up the resources to allow complete HW re-initialization to take place.


# 1.55 31-Aug-2021 patrick

Properly deallocate some more structures upon detach, and make sure we're
not considered initialized anymore.


# 1.54 31-Aug-2021 patrick

Initialize some struct variables to make sure that upon reinit, caused by
a suspend/resume cycle, the values are set to a sane default.


# 1.53 31-Aug-2021 patrick

Initialize ring read/write pointers to make sure that upon reinit, caused
by a suspend/resume cycle, the pointers are set to a sane default.


# 1.52 22-Jun-2021 patrick

bwfm(4) on PCI isn't really MPSAFE, and I'm not sure how this flag
even got there in the first place. I've been wondering why I have
seen a bit of mbuf corruption here and there since I put the bwfm(4)
M.2 PCIe card into my arm64 machine. Well, duh.


Revision tags: OPENBSD_6_9_BASE
# 1.51 26-Feb-2021 patrick

Read and parse OTP on the BCM4378. There are quite a few firmware and
nvram files used for the different Apple devices. The device tree and
the OTP hold the information about which of those we will have to use.
For now this information will simply be printed, but depending on how we
choose to do the firmware distribution we could use it for loadfirmware().


# 1.50 26-Feb-2021 patrick

Attach to BCM4378.


# 1.49 26-Feb-2021 patrick

Add support for BCM4378 as implemented on the Apple M1. This chip seems
to use a different set of PCIE2REG registers. Accessing the "old" ones
even leads to faults. There are two surprises though. One is that it
seems that the interrupt status register always returns 0, and the other
one is that we receive the interrupts way too early, but both can be
worked around for now.


# 1.48 26-Feb-2021 patrick

Increase the number of RX buffers given to the bwfm(4) chip. We have seen
this already on previous chips, which only started giving us packets when
handed at least 128 of them. Apparently some now require 256, which
seems to get the Apple M1's WiFi going.


# 1.47 26-Feb-2021 patrick

Increase the buffer size for the ioctl response buffers to the same as
used in the wifi firmware to ensure responses can be received.


# 1.46 26-Feb-2021 patrick

Indicate hostready signal to inform the firmware that the rings have been
initialized.


# 1.45 26-Feb-2021 patrick

Refactor bwfm(4) firmware loading. The PCIe backend will need to be able
to load the CLM blob like the SDIO backend already does. Additionally it
is also helpful for the PCIe backend to try a file named after the device
tree compatible. Thus refactor the SDIO code and make it available for
both SDIO and PCIe.


# 1.44 26-Feb-2021 patrick

Fix prio2fifo mapping table.


# 1.43 25-Feb-2021 patrick

The firmware replaces the last 32 bits of RAM with a shared DRAM address.
While the for-loop checks that this value has changed since we wrote to
it, the timeout condition checked for non-zero, which is wrong. This
means that we didn't realize the firmware wasn't started. While there,
make sure the shared DRAM address is inside the chip's address space.


# 1.42 25-Feb-2021 patrick

Some newer chips have two D11/802.11 cores, and we need to reset both at
the same time.


# 1.41 25-Feb-2021 patrick

Support for version 7 of the bwfm(4) PCIe interface. The size of the items
on the rx/tx complete rings has increased slightly to accommodate possible
new features.


# 1.40 25-Feb-2021 dlg

we don't have to cast to caddr_t when calling m_copydata anymore.

the first cut of this diff was made with coccinelle using this spatch:

@rule@
type caddr_t;
expression m, off, len, cp;
@@
-m_copydata(m, off, len, (caddr_t)cp)
+m_copydata(m, off, len, cp)

i had to fix its opinionated idea of formatting by hand though, so
i'm not sure it was worth it.

ok deraadt@ bluhm@


# 1.39 31-Jan-2021 patrick

Add basic support for BCM4378 as found on the Apple M1 SoCs. There's a
little bit more to do though before it can be enabled.


# 1.38 12-Dec-2020 jan

Rename the macro MCLGETI to MCLGETL and remove the dead parameter ifp.

OK dlg@, bluhm@
No Opinion mpi@
Not against it claudio@


Revision tags: OPENBSD_6_8_BASE
# 1.37 22-Jun-2020 dlg

use ifiq_input and use its return value to apply backpressure to rxrs.

this is a step toward deprecating softclock based livelock detection.


Revision tags: OPENBSD_6_7_BASE
# 1.36 07-Mar-2020 patrick

Use snprintf(9) to create the names for the firmware and NVRAM files. This
reduces the amount of duplicated lines per chip, and allows us to ship per-
board files in the future.

Based on a diff from jsg@
ok kurt@


# 1.35 06-Mar-2020 patrick

Process the NVRAM in bwfm(4) itself. So far we have relied on some
external tool to pre-process the NVRAM, even though it's simple to
do ourselves. This allows easier firmware distribution, especially
since on some x86 machines the NVRAM is stored in an EFI variable.


# 1.34 25-Feb-2020 patrick

Make bwfm(4) call if_input() only once per interrupt.

This reduces drops caused by the ifq pressure drop mechanism and hence
increases throughput.

ok tobhe@


# 1.33 15-Jan-2020 patrick

Sprinkle splnet() around the ringbuffer accesses, otherwise the
task and interrupt try to concurrently submit messages on the
control ring.


# 1.32 15-Jan-2020 patrick

Some PCIe firmwares drop TX packets when the pktid is 0. Add
an offset to make sure they start from 1.


# 1.31 15-Jan-2020 patrick

Fix off-by-one in ringbuffer code. When we insert items faster than
the hardware is processing them, the write index can catch up to the
read index. We must make sure that our write index stays smaller
than the hardware's read index, thus the difference between both has
to be bigger than 1.

ok tobhe@
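
The off-by-one above can be illustrated with a minimal sketch. The helper
names and the nitem parameter are hypothetical and are not taken from
if_bwfm_pci.c; the sketch only assumes a ring of nitem slots in which one
slot always stays unused, so that equal indices can only mean "empty".

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of slots the writer may still fill.
 * One slot is always left unused so that w_ptr == r_ptr means "empty"
 * and never "full".
 */
static inline uint32_t
ring_free_slots(uint32_t w_ptr, uint32_t r_ptr, uint32_t nitem)
{
	/* Distance from the write index forward to the read index. */
	uint32_t gap = (r_ptr + nitem - w_ptr) % nitem;

	/* Equal indices: the ring is empty, nitem - 1 slots are usable. */
	if (gap == 0)
		return nitem - 1;
	/* Keep one slot free so the write index never reaches r_ptr. */
	return gap - 1;
}

/* Hypothetical helper: true if at least one more item may be written. */
static inline bool
ring_can_write(uint32_t w_ptr, uint32_t r_ptr, uint32_t nitem)
{
	return ring_free_slots(w_ptr, r_ptr, nitem) > 0;
}
```

With 8 slots, a writer at index 7 and a reader at index 0 must stop: one
more write would wrap the write index onto the read index and make a full
ring look empty.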


# 1.30 09-Jan-2020 mpi

Convert sleeps of 1sec or more to tsleep_nsec(9).

ok bluhm@


Revision tags: OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.29 07-Feb-2019 patrick

Consistently use m_freem(9). This fixes possible leaks in a few
error cases.


# 1.28 17-Jan-2019 mlarkin

Enable bwfm(4) in RAMDISK_CD

ok deraadt


Revision tags: OPENBSD_6_4_BASE
# 1.27 20-Aug-2018 patrick

Attach bwfm(4) to Broadcom BCM4371.

ok kettenis@


# 1.26 25-Jul-2018 patrick

Implement a MSGBUF control packet mechanism based on the command
request ids. So far we were only able to have one command in flight
at a time and race conditions could easily lead to unexpected
behaviour. With this rework we send and enqueue a control packet
command and wait for replies to happen. Thus we can have multiple
control packets in flight and a reply with the correct id will wake
us up.


# 1.25 06-Jul-2018 patrick

Add bus_dmamap_sync(9) calls to bwfm(4) so that we make sure the data
is synced properly before the CPU or the WiFi chip access the supplied
memory. Makes PCIe-connected bwfm(4) work on ARM-based machines.


# 1.24 05-Jul-2018 patrick

Cast physical addresses to 64 bits so we can shift them by 32 bits on
32-bit platforms without the compiler complaining. In the end the
value will turn out as 0 anyway. Allows enabling bwfm(4) on 32-bit
platforms.

ok stsp@
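
A minimal sketch of the cast described above, assuming a 32-bit address
type. The type paddr32_t and both helpers are hypothetical stand-ins and
are not the driver's code. Shifting a 32-bit value by 32 bits is undefined
in C, so the value is widened first.

```c
#include <stdint.h>

/* Hypothetical stand-in for a bus address type that is 32 bits wide. */
typedef uint32_t paddr32_t;

/* Upper 32 bits of the address; widen before shifting to stay defined. */
static inline uint32_t
addr_hi(paddr32_t pa)
{
	return (uint32_t)((uint64_t)pa >> 32);
}

/* Lower 32 bits of the address. */
static inline uint32_t
addr_lo(paddr32_t pa)
{
	return (uint32_t)((uint64_t)pa & 0xffffffffULL);
}
```

On a platform where addresses are 32 bits wide, the high half is always 0,
which matches the remark in the commit message.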


# 1.23 07-Jun-2018 patrick

Attach bwfm(4) to the Broadcom 4356 found in the GPD Pocket.

Tested by mlarkin@


# 1.22 07-Jun-2018 patrick

Some PCIe-based bwfm(4) chips also require that we supply an NVRAM
binary. In case we have an (optional) NVRAM binary, copy it to the
end of the chip's memory.

Tested by mlarkin@ on his GPD Pocket.


# 1.21 23-May-2018 patrick

Implement a separate initialization stage so that we can still use
and initialize bwfm(4) later in the case that the firmware was not
available on bootup and was only later installed.

ok stsp@


# 1.20 23-May-2018 patrick

Map the second bwfm(4) BAR first. The bwfm(4) PCIe devices have two
BARs, where the second one is much larger than the first. Both need
to be properly aligned in the given extent. Since the first one is
smaller, it will "unalign" the next free space and thus create a gap
so that the second BAR cannot be properly aligned in the given space.
By mapping the second BAR first, it will automatically have proper
alignment. The first BAR, which has fewer alignment requirements,
fits well after the initial allocation. Fixes bwfm(4) on APU 1.

Debugged and solved by kettenis@
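
The alignment effect described above can be reproduced with a small sketch
of a first-fit allocator. The struct, the function names and the window and
BAR sizes are hypothetical, chosen only for illustration, and each region
is assumed to be aligned to its own power-of-two size.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical address window: next free address and exclusive end. */
struct window {
	uint64_t next;
	uint64_t end;
};

/* Allocate a naturally aligned region of the given power-of-two size. */
static bool
window_alloc(struct window *w, uint64_t size, uint64_t *addr)
{
	/* Round the next free address up to a multiple of size. */
	uint64_t start = (w->next + size - 1) & ~(size - 1);

	if (start + size > w->end)
		return false;
	*addr = start;
	w->next = start + size;
	return true;
}

/* Try to place two regions in the given order; true if both fit. */
static bool
map_both(uint64_t base, uint64_t len, uint64_t first, uint64_t second)
{
	struct window w = { base, base + len };
	uint64_t a, b;

	return window_alloc(&w, first, &a) && window_alloc(&w, second, &b);
}
```

In a window of 0x408000 bytes starting at 0x10000000, placing a 0x8000-byte
region first pushes a following 0x400000-byte region up to 0x10400000, past
the end of the window. Placing the large region first leaves the small one
exactly enough room behind it.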


# 1.19 16-May-2018 patrick

Implement a BCDC control packet mechanism based on the command request
ids. So far we were only able to have one command in flight at a time
and race conditions could easily lead to unexpected behaviour,
especially combined with a slow bus and timeouts. With this rework we
send or enqueue a control packet command and wait for replies to happen.
Thus we can have multiple control packets in flight and a reply with the
correct id will wake us up.


Revision tags: OPENBSD_6_3_BASE
# 1.18 08-Feb-2018 patrick

Move bwfm(4) from ifq begin/commit/rollback semantics to the newer
ifq dequeue semantics. This basically means we need to check for
available space before dequeuing a packet. As soon as we dequeue
a packet we commit to it. On the PCIe backend this check cannot
be done easily since the flowring depends on the packet contents and
we cannot take a peek. When there is no flowring we cache the mbuf
and send it out as soon as the flowring has opened up. Then the ifq can
be restarted and traffic can flow. Typically we run out of
packet ids, which can be checked without consulting the packet. The
flowring probably never becomes full as the bwfm(4) firmware takes
the packets off the ring without actually sending them out.

Discussed with dlg@


# 1.17 07-Feb-2018 patrick

Move parsing the BCDC header on RX into a protocol specific RX
function so it can be shared with the SDIO attachment driver.


# 1.16 11-Jan-2018 patrick

The PCI bwfm(4) chips have no TX rings in the traditional sense, as on
the actual rings we only share messages. Sending a TX packet means
putting a message on the ring which contains a pktid (which for us maps
to an mbuf) and the physical address of the mbuf. On jcs@'s macbook he
seems to run out of TX pktids pretty quickly during a speedtest. This
would mean that there are 2048 TX packets in flight that we either want
to send out or that have not been "acked" by the firmware yet. Either
way, recover from that situation when we hit that arbitrary limit by
restarting the queue after we freed a packet from the TX pktid list.

Tested by jcs@


# 1.15 10-Jan-2018 jcs

Attach bwfm to the Broadcom 4350 found in the 2017 MacBook.

Easily handles >150Mbps transfers through a 5GHz AP.

ok patrick

(Committed via bwfm0, of course)


# 1.14 10-Jan-2018 patrick

Add firmware names for the two revisions of the Broadcom 4350 as seen
on a MacBook 12-inch (2017).

Tested by and with jcs@


# 1.13 10-Jan-2018 patrick

Don't reset the internal memory core on chips other than the Broadcom
43602, as it's only necessary on that specific chip.

Found the hard way by jcs@ on a MacBook 12-inch (2017)


# 1.12 10-Jan-2018 patrick

Move line for readability.


# 1.11 08-Jan-2018 patrick

In AP mode multicast packets share the flowrings with broadcast
packets.


# 1.10 08-Jan-2018 patrick

The bwfm(4) TX ring expects the ethernet header as part of the TX info
struct. The data length is the length of the frame without the header.
In the previous version m_adj(9) was used, but since that was changed we
need to decrease the length ourselves.


# 1.9 08-Jan-2018 patrick

Guard the debug printf function behind BWFM_DEBUG as well. Also only
print the firmware's dmesg(8) if we're running with a higher debug
mode.

Prompted by Michael W. Bombardieri


# 1.8 08-Jan-2018 patrick

Delete flowrings when we take the interface down or change its
settings.


# 1.7 07-Jan-2018 patrick

Create multiple transmit flowrings in station mode, four in total, based
on TOS values. In AP mode create multiple flowrings per connected node.


# 1.6 05-Jan-2018 patrick

To send out packets we need to create a flowring. Acting as station,
we typically have about four flowrings per priority. As access point
we apparently need one, or four considering the priorities, flowrings
per client. For now let's start with a single TX flowring. To set up
a flowring we need to send a create request and can only start sending
packets as soon as we are told that the ring is created. With this we
can now do actual network traffic.


# 1.5 03-Jan-2018 patrick

Since the PCI attachment code already uses mbufs for RX packets, we can
push the mbuf allocation down into the USB attachment code and now pass
an mbuf to the bwfm(4) receive function.


# 1.4 03-Jan-2018 patrick

Add size for free(9) in the bwfm(4) PCI attachment code.

From Michael W. Bombardieri


# 1.3 01-Jan-2018 patrick

For whatever reason the firmware needs more RX buffers available than
we typically use, which unfortunately creates a bigger memory
footprint. With this the receive path can be made to work.


# 1.2 01-Jan-2018 patrick

Put the code that prints the firmware's debug console into a function
so we can read and print the messages printed by the firmware when we
are debugging the driver.


# 1.1 24-Dec-2017 patrick

Add a PCI attachment driver for bwfm(4). It's not finished, but it's
already past the point where development can occur out of the tree.
With this I can successfully scan for access points and tell the chip
to attach to an SSID. RX path should work as well, but since I forgot
to bring the antenna with me to my parents, the reception is a bit
horrible in the metal enclosure.

There are a few reasons this driver is rather big. First we set up the
ARM Cores, uploading the firmware and kicking it off. Then we need to
read all needed information from the registers. Once that is done we
have to set up countless buffers. There are 2 TX rings and 3 RX rings,
plus N TX rings for the actual data that is yet to be implemented.

Merry Christmas!

ok kettenis@


# 1.58 20-Dec-2021 patrick

bus_dmamem_unmap() should not be called from interrupt context, so free
and close flowrings using bwfm_do_async().

Reported by and ok kettenis@


# 1.57 23-Oct-2021 kettenis

Make sure we have enough space to add padding and final token to the nvram
data. Also add the MAC address to the nvram data when there is a
"local-mac-address" property in the device tree. This makes bwfm(4) work
with the firmware/nvram/clm_blob files provided with MacOS on the Apple
M1 Macs.

ok patrick@


Revision tags: OPENBSD_7_0_BASE
# 1.56 31-Aug-2021 patrick

Implement suspend/resume for bwfm(4) with PCIe backend. We try to send the
device into D3 and do a hot-resume if possible. Otherwise we need to clean
up the resources to allow complete HW re-initialization to take place.


# 1.55 31-Aug-2021 patrick

Properly deallocate some more structures upon detach, and make sure we're
not considered initialized anymore.


# 1.54 31-Aug-2021 patrick

Initialize some struct variables to make sure that upon reinit, caused by
a suspend/resume cycle, the values are set to a sane default.


# 1.53 31-Aug-2021 patrick

Initialize ring read/write pointers to make sure that upon reinit, caused
by a suspend/resume cycle, the pointers are set to a sane default.


# 1.52 22-Jun-2021 patrick

bwfm(4) on PCI isn't really MPSAFE, and I'm not sure how this flag
even got there in the first place. I've been wondering why I have
seen a bit of mbuf corruption here and there since I put the bwfm(4)
M.2 PCIe card into my arm64 machine. Well, duh.


Revision tags: OPENBSD_6_9_BASE
# 1.51 26-Feb-2021 patrick

Read and parse OTP on the BCM4378. There are quite a few firmware and
nvram files used for the different Apple devices. The device tree and
the OTP hold the information which of those we will have to use. For
now this information will simply be printed, but depending on how we
choose to do the firmare distribution we could use it for loadfirmware().


# 1.50 26-Feb-2021 patrick

Attach to BCM4378.


# 1.49 26-Feb-2021 patrick

Add support for BCM4378 as implemented on the Apple M1. This chip seems
to use a different set of PCIE2REG registers. Accessing the "old" ones
even leads to faults. There are two surprises though. One is that it
seems that the interrupt status register always returns 0, and the other
one is that we receive the interrupts way too early, but both can be
worked around for now.


# 1.48 26-Feb-2021 patrick

Increase the amount of RX buffers given to the bwfm(4) chip. We haave seen
this already on previous chips, which only started giving us packets when
handing over at least 128 of them. Apparently some now require 256, which
seems to get the Apple M1's WiFi going.


# 1.47 26-Feb-2021 patrick

Increase the buffer size for the ioctl response buffers to the same as
used in the wifi firmware to ensure responses can be received.


# 1.46 26-Feb-2021 patrick

Indicate hostready signal to inform the firmware that the rings have been
initialized.


# 1.45 26-Feb-2021 patrick

Refactor bwfm(4) firmware loading. The PCIe backend will need to be able
to load the CLM blob like the SDIO backend already does. Additionally it
is also helpful for the PCIe backend to try a file named after the device
tree compatible. Thus refactor the SDIO code and make it available for
both SDIO and PCIe.


# 1.44 26-Feb-2021 patrick

Fix prio2fifo mapping table.


# 1.43 25-Feb-2021 patrick

The firmware replaces the last 32-bit on RAM with a shared DRAM address.
While the for-loop checks that thie value has changed since we wrote to
it, the timeout-condition checked for non-zero, which is wrong. This
means that we didn't realize the firmware wasn't started. While there,
make sure the shared DRAM address is inside the chip's address space.


# 1.42 25-Feb-2021 patrick

Some newer chips have two D11/802.11 cores, and we need to reset both at
the same time.


# 1.41 25-Feb-2021 patrick

Support for version 7 of the bwfm(4) PCIe interface. The size of the items
on the rx/tx complete rings has increased slightly to accomodate possible
new features.


# 1.40 25-Feb-2021 dlg

we don't have to cast to caddr_t when calling m_copydata anymore.

the first cut of this diff was made with coccinelle using this spatch:

@rule@
type caddr_t;
expression m, off, len, cp;
@@
-m_copydata(m, off, len, (caddr_t)cp)
+m_copydata(m, off, len, cp)

i had fix it's opinionated idea of formatting by hand though, so
i'm not sure it was worth it.

ok deraadt@ bluhm@


# 1.39 31-Jan-2021 patrick

Add basic support for BCM4378 as found on the Apple M1 SoCs. There's a
little bit more to do though before it can be enabled.


# 1.38 12-Dec-2020 jan

Rename the macro MCLGETI to MCLGETL and removes the dead parameter ifp.

OK dlg@, bluhm@
No Opinion mpi@
Not against it claudio@


Revision tags: OPENBSD_6_8_BASE
# 1.37 22-Jun-2020 dlg

use ifiq_input and use it's return value to apply backpressure to rxrs.

this is a step toward deprecating softclock based livelock detection.


Revision tags: OPENBSD_6_7_BASE
# 1.36 07-Mar-2020 patrick

Use snprintf(9) to create the names for the firmware and NVRAM files. This
reduces the amount of duplicated lines per chip, and allows us to ship per-
board files in the future.

Based on a diff from jsg@
ok kurt@


# 1.35 06-Mar-2020 patrick

Process the NVRAM in bwfm(4) itself. So far we have relied on some
external tool to pre-process the NVRAM, even though it's simple to
do ourselves. This allows easier firmware distribution, especially
since on some x86 machines the NVRAM is stored in an EFI variable.


# 1.34 25-Feb-2020 patrick

Make bwfm(4) call if_input() only once per interrupt.

This reduces drops caused by the ifq pressure drop mechanism and hence
increases throughput.

ok tobhe@


# 1.33 15-Jan-2020 patrick

Sprinkle splnet() around the ringbuffer accesses, otherwise the
task and interrupt try to concurrently submit messages on the
control ring.


# 1.32 15-Jan-2020 patrick

Some PCIe firmwares drop TX packets when the pktid is 0. Add
an offset to make sure they start from 1.


# 1.31 15-Jan-2020 patrick

Fix off-by-one in ringbuffer code. When we insert items faster than
the hardware is processing them, the write index can catch up to the
read index. We must make sure that our write index stays smaller
than the hardware's read index, thus the difference between both has
to be bigger than 1.

ok tobhe@


# 1.30 09-Jan-2020 mpi

Convert sleeps of 1sec or more to tsleep_nsec(9).

ok bluhm@


Revision tags: OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.29 07-Feb-2019 patrick

Consistently use m_freem(9). This fixes possible leaks in a few
error cases.


# 1.28 17-Jan-2019 mlarkin

Enable bwfm(4) in RAMDISK_CD

ok deraadt


Revision tags: OPENBSD_6_4_BASE
# 1.27 20-Aug-2018 patrick

Attach bwfm(4) to Broadcom BCM4371.

ok kettenis@


# 1.26 25-Jul-2018 patrick

Implement a MSGBUF control packet mechanism based on the command
request ids. So far we were only able to have one command in flight
at a time and race conditions could easily lead to unexpected
behaviour. With this rework we send and enqueue a control packet
command and wait for replies to happen. Thus we can have multiple
control packets in flight and a reply with the correct id will wake
us up.


# 1.25 06-Jul-2018 patrick

Add bus_dmamap_sync(9) calls to bwfm(4) so that we make sure the data
is synced properly before the CPU or the WiFi chip access the supplied
memory. Makes PCIe-connected bwfm(4) work on ARM-based machines.


# 1.24 05-Jul-2018 patrick

Cast physical addresses to 64-bits so we can shift them by 32-bit on
32-bit platforms without the compiler complaining. In the end the
value will turn out as 0 anyway. Allows enabling bwfm(4) on 32-bit
platforms.

ok stsp@


# 1.23 07-Jun-2018 patrick

Attach bwfm(4) to the Broadcom 4356 found in the GPD Pocket.

Tested by mlarkin@


# 1.22 07-Jun-2018 patrick

Some PCIe-based bwfm(4) chips also require that we supply an NVRAM
binary. In case we have an (optional) NVRAM binary, copy it to the
end of the chip's memory.

Tested by mlarkin@ on his GPD Pocket.


# 1.21 23-May-2018 patrick

Implement a separate initialization stage so that we can still use
and initialize bwfm(4) later in the case that the firmware was not
available on bootup and was only later installed.

ok stsp@


# 1.20 23-May-2018 patrick

Map the second bwfm(4) BAR first. The bwfm(4) PCIe devices have two
BARs, where the second one is much larger than the first. Both need
to be properly aligned in the given extent. Since the first one is
smaller, it will "unalign" the next free space and thus create a gap
so that the second BAR cannot be properly aligned in the given space.
By mapping the second BAR first, it will automatically have proper
alignment. The first BAR, which has fewer alignment requirements,
fits well after the initial allocation. Fixes bwfm(4) on APU 1.

Debugged and solved by kettenis@


# 1.19 16-May-2018 patrick

Implement a BCDC control packet mechanism based on the command request
ids. So far we were only able to have one command in flight at a time
and race conditions could easily lead to unexpected behaviour, especia-
lly combined with a slow bus and timeouts. With this rework we send or
enqueue a control packet command and wait for replies to happen. Thus
we can have multiple control packets in flight and a reply with the
correct id will wake us up.


Revision tags: OPENBSD_6_3_BASE
# 1.18 08-Feb-2018 patrick

Move bwfm(4) from ifq begin/commit/rollback semantics to the newer
ifq dequeue semantics. This basically means we need to check for
available space before dequeuing a packet. As soon as we dequeue
a packet we commit to it. On the PCIe backend this check can not
be done easily since the flowring depends on the packet contents and
we cannot take a peek. When there is no flowring we cache the mbuf
and send it out as soon as the flowring opened up. Then the ifq can
be restarted and traffic can flow. Typically we usually run out of
packet ids, which can be checked without consulting the packet. The
flowring probably never becomes full as the bwfm(4) firmware takes
the packets off the ring without actually sending them out.

Discussed with dlg@


# 1.17 07-Feb-2018 patrick

Move parsing the BCDC header on RX into a protocol specific RX
function so it can be shared with the SDIO attachment driver.


# 1.16 11-Jan-2018 patrick

The PCI bwfm(4) chips have no TX rings in the traditional sense, as on
the actual rings we only share messages. Sending a TX packet means
putting a message on the ring which contains a pktid (which for us maps
to an mbuf) and the physical address of the mbuf. On jcs@'s macbook he
seems to run out of TX pktids pretty quickly during a speedtest. This
would mean that there are 2048 TX packets in flight that we either want
to send out or that have not been "acked" by the firmware yet. Either
way, recover from that situation when we hit that arbitrary limit by
restarting the queue after we free'd a packet from the TX pktid list.

Tested by jcs@


# 1.15 10-Jan-2018 jcs

Attach bwfm to the Broadcom 4350 found in the 2017 MacBook.

Easily handles >150Mbps transfers through a 5Ghz AP.

ok patrick

(Committed via bwfm0, of course)


# 1.14 10-Jan-2018 patrick

Add firmware names for the two revisions of the Broadcom 4350 as seen
on a MacBook 12-inch (2017).

Tested by and with jcs@


# 1.13 10-Jan-2018 patrick

Don't reset the internal memory core on chips other than the Broadcom
43602, as it's only necessary on that specific chip.

Found the hard way by jcs@ on a MacBook 12-inch (2017)


# 1.12 10-Jan-2018 patrick

Move line for readability.


# 1.11 08-Jan-2018 patrick

In AP mode multicast packets share the flowrings with broadcast
packets.


# 1.10 08-Jan-2018 patrick

The bwfm(4) TX ring expects the ethernet header as part of the TX info
struct. The data length is the length of the frame without the header.
In the previous version m_adj(9) is used, but since that was changed we
need to decrease the length ourselves.


# 1.9 08-Jan-2018 patrick

Guard the debug printf function behind BWFM_DEBUG as well. Also only
print the firmware's dmesg(8) if we're running with a higher debug
mode.

Prompted by Michael W. Bombardieri


# 1.8 08-Jan-2018 patrick

Delete flowrings when we take the interface down or change its
settings.


# 1.7 07-Jan-2018 patrick

Create multiple transmit flowrings in station mode, four in total, based
on TOS values. In AP mode create multiple flowrings per connected node.


# 1.6 05-Jan-2018 patrick

To send out packets we need to create a flowring. Acting as station,
we typically have about four flowrings per priority. As access point
we apparently need one, or four considering the priorities, flowrings
per client. For now let's start with a single TX flowring. To setup
a flowring we need to send a create request and can only start sending
packets as soon as we are told that the ring is created. With this we
can now do actual network traffic.


# 1.5 03-Jan-2018 patrick

Since the PCI attachment code already uses mbufs for RX packets, we can
push the mbuf allocation down into the USB attachment code and now pass
an mbuf to the bwfm(4) receive function.


# 1.4 03-Jan-2018 patrick

Add size for free(9) in the bwfm(4) PCI attachment code.

From Michael W. Bombardieri


# 1.3 01-Jan-2018 patrick

For whatever reason the firmware needs more RX buffers available as
we typically use, which unfortunately creates a bigger memory foot-
print. With this the receive path can be made to work.


# 1.2 01-Jan-2018 patrick

Put the code that prints the firmware's debug console into a function
so we can read and print the messages printed by the firmware when we
are debugging the driver.


# 1.1 24-Dec-2017 patrick

Add a PCI attachment driver for bwfm(4). It's not finished, but it's
already past the point where development can occur out of the tree.
With this I can successfully scan for access points and tell the chip
to attach to an SSID. RX path should work as well, but since I forgot
to bring the antenna with me to my parents, the reception is a bit
horrible in the metal enclosure.

There are a few reasons this driver is rather big. First we set up the
ARM Cores, uploading the firmware and kicking it off. Then we need to
read all needed information from the registers. Once that is done we
have to set up countless buffers. There are 2 TX rings and 3 RX rings,
plus N TX rings for the actual data that is yet to be implemented.

Merry Christmas!

ok kettenis@


# 1.57 23-Oct-2021 kettenis

Make sure we have enough space to add padding and final token to the nvram
data. Also add the MAC address to the nvram data when there is a
"local-mac-address" property in the device tree. This makes bwfm(4) work
with the firmware/nvram/clm_blob files provided with MacOS on the Apple
M1 Macs.

ok patrick@


Revision tags: OPENBSD_7_0_BASE
# 1.56 31-Aug-2021 patrick

Implement suspend/resume for bwfm(4) with PCIe backend. We try to send the
device into D3 and do a hot-resume if possible. Otherwise we need to clean
up the resources to allow complete HW re-initialization to take place.


# 1.55 31-Aug-2021 patrick

Properly deallocate some more structures upon detach, and make sure we're
not considered initialized anymore.


# 1.54 31-Aug-2021 patrick

Initialize some struct variables to make sure that upon reinit, caused by
a suspend/resume cycle, the values are set to a sane default.


# 1.53 31-Aug-2021 patrick

Initialize ring read/write pointers to make sure that upon reinit, caused
by a suspend/resume cycle, the pointers are set to a sane default.


# 1.52 22-Jun-2021 patrick

bwfm(4) on PCI isn't really MPSAFE, and I'm not sure how this flag
even got there in the first place. I've been wondering why I have
seen a bit of mbuf corruption here and there since I put the bwfm(4)
M.2 PCIe card into my arm64 machine. Well, duh.


Revision tags: OPENBSD_6_9_BASE
# 1.51 26-Feb-2021 patrick

Read and parse OTP on the BCM4378. There are quite a few firmware and
nvram files used for the different Apple devices. The device tree and
the OTP hold the information which of those we will have to use. For
now this information will simply be printed, but depending on how we
choose to do the firmware distribution we could use it for loadfirmware().


# 1.50 26-Feb-2021 patrick

Attach to BCM4378.


# 1.49 26-Feb-2021 patrick

Add support for BCM4378 as implemented on the Apple M1. This chip seems
to use a different set of PCIE2REG registers. Accessing the "old" ones
even leads to faults. There are two surprises though. One is that it
seems that the interrupt status register always returns 0, and the other
one is that we receive the interrupts way too early, but both can be
worked around for now.


# 1.48 26-Feb-2021 patrick

Increase the amount of RX buffers given to the bwfm(4) chip. We have seen
this already on previous chips, which only started giving us packets when
handing over at least 128 of them. Apparently some now require 256, which
seems to get the Apple M1's WiFi going.


# 1.47 26-Feb-2021 patrick

Increase the buffer size for the ioctl response buffers to the same as
used in the wifi firmware to ensure responses can be received.


# 1.46 26-Feb-2021 patrick

Indicate hostready signal to inform the firmware that the rings have been
initialized.


# 1.45 26-Feb-2021 patrick

Refactor bwfm(4) firmware loading. The PCIe backend will need to be able
to load the CLM blob like the SDIO backend already does. Additionally it
is also helpful for the PCIe backend to try a file named after the device
tree compatible. Thus refactor the SDIO code and make it available for
both SDIO and PCIe.


# 1.44 26-Feb-2021 patrick

Fix prio2fifo mapping table.


# 1.43 25-Feb-2021 patrick

The firmware replaces the last 32 bits of RAM with a shared DRAM address.
While the for-loop checks that this value has changed since we wrote to
it, the timeout-condition checked for non-zero, which is wrong. This
means that we didn't realize the firmware wasn't started. While there,
make sure the shared DRAM address is inside the chip's address space.


# 1.42 25-Feb-2021 patrick

Some newer chips have two D11/802.11 cores, and we need to reset both at
the same time.


# 1.41 25-Feb-2021 patrick

Support for version 7 of the bwfm(4) PCIe interface. The size of the items
on the rx/tx complete rings has increased slightly to accommodate possible
new features.


# 1.40 25-Feb-2021 dlg

we don't have to cast to caddr_t when calling m_copydata anymore.

the first cut of this diff was made with coccinelle using this spatch:

@rule@
type caddr_t;
expression m, off, len, cp;
@@
-m_copydata(m, off, len, (caddr_t)cp)
+m_copydata(m, off, len, cp)

i had to fix its opinionated idea of formatting by hand though, so
i'm not sure it was worth it.

ok deraadt@ bluhm@


# 1.39 31-Jan-2021 patrick

Add basic support for BCM4378 as found on the Apple M1 SoCs. There's a
little bit more to do though before it can be enabled.


# 1.38 12-Dec-2020 jan

Rename the macro MCLGETI to MCLGETL and remove the dead parameter ifp.

OK dlg@, bluhm@
No Opinion mpi@
Not against it claudio@


Revision tags: OPENBSD_6_8_BASE
# 1.37 22-Jun-2020 dlg

use ifiq_input and use its return value to apply backpressure to rxrs.

this is a step toward deprecating softclock based livelock detection.


Revision tags: OPENBSD_6_7_BASE
# 1.36 07-Mar-2020 patrick

Use snprintf(9) to create the names for the firmware and NVRAM files. This
reduces the amount of duplicated lines per chip, and allows us to ship
per-board files in the future.

Based on a diff from jsg@
ok kurt@


# 1.35 06-Mar-2020 patrick

Process the NVRAM in bwfm(4) itself. So far we have relied on some
external tool to pre-process the NVRAM, even though it's simple to
do ourselves. This allows easier firmware distribution, especially
since on some x86 machines the NVRAM is stored in an EFI variable.


# 1.34 25-Feb-2020 patrick

Make bwfm(4) call if_input() only once per interrupt.

This reduces drops caused by the ifq pressure drop mechanism and hence
increases throughput.

ok tobhe@


# 1.33 15-Jan-2020 patrick

Sprinkle splnet() around the ringbuffer accesses, otherwise the
task and interrupt try to concurrently submit messages on the
control ring.


# 1.32 15-Jan-2020 patrick

Some PCIe firmwares drop TX packets when the pktid is 0. Add
an offset to make sure they start from 1.


# 1.31 15-Jan-2020 patrick

Fix off-by-one in ringbuffer code. When we insert items faster than
the hardware is processing them, the write index can catch up to the
read index. We must make sure that our write index stays smaller
than the hardware's read index, thus the difference between both has
to be bigger than 1.

ok tobhe@


# 1.30 09-Jan-2020 mpi

Convert sleeps of 1sec or more to tsleep_nsec(9).

ok bluhm@


Revision tags: OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.29 07-Feb-2019 patrick

Consistently use m_freem(9). This fixes possible leaks in a few
error cases.


# 1.28 17-Jan-2019 mlarkin

Enable bwfm(4) in RAMDISK_CD

ok deraadt


Revision tags: OPENBSD_6_4_BASE
# 1.27 20-Aug-2018 patrick

Attach bwfm(4) to Broadcom BCM4371.

ok kettenis@


# 1.26 25-Jul-2018 patrick

Implement a MSGBUF control packet mechanism based on the command
request ids. So far we were only able to have one command in flight
at a time and race conditions could easily lead to unexpected
behaviour. With this rework we send and enqueue a control packet
command and wait for replies to happen. Thus we can have multiple
control packets in flight and a reply with the correct id will wake
us up.


# 1.25 06-Jul-2018 patrick

Add bus_dmamap_sync(9) calls to bwfm(4) so that we make sure the data
is synced properly before the CPU or the WiFi chip access the supplied
memory. Makes PCIe-connected bwfm(4) work on ARM-based machines.


# 1.24 05-Jul-2018 patrick

Cast physical addresses to 64 bits so we can shift them by 32 bits on
32-bit platforms without the compiler complaining. In the end the
value will turn out as 0 anyway. Allows enabling bwfm(4) on 32-bit
platforms.

ok stsp@


# 1.23 07-Jun-2018 patrick

Attach bwfm(4) to the Broadcom 4356 found in the GPD Pocket.

Tested by mlarkin@


# 1.22 07-Jun-2018 patrick

Some PCIe-based bwfm(4) chips also require that we supply an NVRAM
binary. In case we have an (optional) NVRAM binary, copy it to the
end of the chip's memory.

Tested by mlarkin@ on his GPD Pocket.


# 1.21 23-May-2018 patrick

Implement a separate initialization stage so that we can still use
and initialize bwfm(4) later in the case that the firmware was not
available on bootup and was only later installed.

ok stsp@


# 1.20 23-May-2018 patrick

Map the second bwfm(4) BAR first. The bwfm(4) PCIe devices have two
BARs, where the second one is much larger than the first. Both need
to be properly aligned in the given extent. Since the first one is
smaller, it will "unalign" the next free space and thus create a gap
so that the second BAR cannot be properly aligned in the given space.
By mapping the second BAR first, it will automatically have proper
alignment. The first BAR, which has fewer alignment requirements,
fits well after the initial allocation. Fixes bwfm(4) on APU 1.

Debugged and solved by kettenis@


# 1.19 16-May-2018 patrick

Implement a BCDC control packet mechanism based on the command request
ids. So far we were only able to have one command in flight at a time
and race conditions could easily lead to unexpected behaviour,
especially combined with a slow bus and timeouts. With this rework we send or
enqueue a control packet command and wait for replies to happen. Thus
we can have multiple control packets in flight and a reply with the
correct id will wake us up.


Revision tags: OPENBSD_6_3_BASE
# 1.18 08-Feb-2018 patrick

Move bwfm(4) from ifq begin/commit/rollback semantics to the newer
ifq dequeue semantics. This basically means we need to check for
available space before dequeuing a packet. As soon as we dequeue
a packet we commit to it. On the PCIe backend this check can not
be done easily since the flowring depends on the packet contents and
we cannot take a peek. When there is no flowring we cache the mbuf
and send it out as soon as the flowring opened up. Then the ifq can
be restarted and traffic can flow. Typically we run out of
packet ids, which can be checked without consulting the packet. The
flowring probably never becomes full as the bwfm(4) firmware takes
the packets off the ring without actually sending them out.

Discussed with dlg@


# 1.17 07-Feb-2018 patrick

Move parsing the BCDC header on RX into a protocol specific RX
function so it can be shared with the SDIO attachment driver.


# 1.16 11-Jan-2018 patrick

The PCI bwfm(4) chips have no TX rings in the traditional sense, as on
the actual rings we only share messages. Sending a TX packet means
putting a message on the ring which contains a pktid (which for us maps
to an mbuf) and the physical address of the mbuf. On jcs@'s macbook he
seems to run out of TX pktids pretty quickly during a speedtest. This
would mean that there are 2048 TX packets in flight that we either want
to send out or that have not been "acked" by the firmware yet. Either
way, recover from that situation when we hit that arbitrary limit by
restarting the queue after we free'd a packet from the TX pktid list.

Tested by jcs@


# 1.15 10-Jan-2018 jcs

Attach bwfm to the Broadcom 4350 found in the 2017 MacBook.

Easily handles >150Mbps transfers through a 5GHz AP.

ok patrick

(Committed via bwfm0, of course)


# 1.14 10-Jan-2018 patrick

Add firmware names for the two revisions of the Broadcom 4350 as seen
on a MacBook 12-inch (2017).

Tested by and with jcs@


# 1.13 10-Jan-2018 patrick

Don't reset the internal memory core on chips other than the Broadcom
43602, as it's only necessary on that specific chip.

Found the hard way by jcs@ on a MacBook 12-inch (2017)


# 1.12 10-Jan-2018 patrick

Move line for readability.


# 1.11 08-Jan-2018 patrick

In AP mode multicast packets share the flowrings with broadcast
packets.


# 1.10 08-Jan-2018 patrick

The bwfm(4) TX ring expects the ethernet header as part of the TX info
struct. The data length is the length of the frame without the header.
In the previous version m_adj(9) was used, but since that was changed we
need to decrease the length ourselves.


# 1.9 08-Jan-2018 patrick

Guard the debug printf function behind BWFM_DEBUG as well. Also only
print the firmware's dmesg(8) if we're running with a higher debug
mode.

Prompted by Michael W. Bombardieri


# 1.8 08-Jan-2018 patrick

Delete flowrings when we take the interface down or change its
settings.


# 1.7 07-Jan-2018 patrick

Create multiple transmit flowrings in station mode, four in total, based
on TOS values. In AP mode create multiple flowrings per connected node.


# 1.6 05-Jan-2018 patrick

To send out packets we need to create a flowring. Acting as station,
we typically have about four flowrings per priority. As access point
we apparently need one, or four considering the priorities, flowrings
per client. For now let's start with a single TX flowring. To set up
a flowring we need to send a create request and can only start sending
packets as soon as we are told that the ring is created. With this we
can now do actual network traffic.


# 1.5 03-Jan-2018 patrick

Since the PCI attachment code already uses mbufs for RX packets, we can
push the mbuf allocation down into the USB attachment code and now pass
an mbuf to the bwfm(4) receive function.


# 1.4 03-Jan-2018 patrick

Add size for free(9) in the bwfm(4) PCI attachment code.

From Michael W. Bombardieri


# 1.3 01-Jan-2018 patrick

For whatever reason the firmware needs more RX buffers available than
we typically use, which unfortunately creates a bigger memory
footprint. With this the receive path can be made to work.


# 1.2 01-Jan-2018 patrick

Put the code that prints the firmware's debug console into a function
so we can read and print the messages printed by the firmware when we
are debugging the driver.


# 1.1 24-Dec-2017 patrick

Add a PCI attachment driver for bwfm(4). It's not finished, but it's
already past the point where development can occur out of the tree.
With this I can successfully scan for access points and tell the chip
to attach to an SSID. RX path should work as well, but since I forgot
to bring the antenna with me to my parents, the reception is a bit
horrible in the metal enclosure.

There are a few reasons this driver is rather big. First we set up the
ARM Cores, uploading the firmware and kicking it off. Then we need to
read all needed information from the registers. Once that is done we
have to set up countless buffers. There are 2 TX rings and 3 RX rings,
plus N TX rings for the actual data that is yet to be implemented.

Merry Christmas!

ok kettenis@


# 1.52 22-Jun-2021 patrick

bwfm(4) on PCI isn't really MPSAFE, and I'm not sure how this flag
even got there in the first place. I've been wondering why I have
seen a bit of mbuf corruption here and there since I put the bwfm(4)
M.2 PCIe card into my arm64 machine. Well, duh.


Revision tags: OPENBSD_6_9_BASE
# 1.51 26-Feb-2021 patrick

Read and parse OTP on the BCM4378. There are quite a few firmware and
nvram files used for the different Apple devices. The device tree and
the OTP hold the information which of those we will have to use. For
now this information will simply be printed, but depending on how we
choose to do the firmare distribution we could use it for loadfirmware().


# 1.50 26-Feb-2021 patrick

Attach to BCM4378.


# 1.49 26-Feb-2021 patrick

Add support for BCM4378 as implemented on the Apple M1. This chip seems
to use a different set of PCIE2REG registers. Accessing the "old" ones
even leads to faults. There are two surprises though. One is that it
seems that the interrupt status register always returns 0, and the other
one is that we receive the interrupts way too early, but both can be
worked around for now.


# 1.48 26-Feb-2021 patrick

Increase the amount of RX buffers given to the bwfm(4) chip. We haave seen
this already on previous chips, which only started giving us packets when
handing over at least 128 of them. Apparently some now require 256, which
seems to get the Apple M1's WiFi going.


# 1.47 26-Feb-2021 patrick

Increase the buffer size for the ioctl response buffers to the same as
used in the wifi firmware to ensure responses can be received.


# 1.46 26-Feb-2021 patrick

Indicate hostready signal to inform the firmware that the rings have been
initialized.


# 1.45 26-Feb-2021 patrick

Refactor bwfm(4) firmware loading. The PCIe backend will need to be able
to load the CLM blob like the SDIO backend already does. Additionally it
is also helpful for the PCIe backend to try a file named after the device
tree compatible. Thus refactor the SDIO code and make it available for
both SDIO and PCIe.


# 1.44 26-Feb-2021 patrick

Fix prio2fifo mapping table.


# 1.43 25-Feb-2021 patrick

The firmware replaces the last 32-bit on RAM with a shared DRAM address.
While the for-loop checks that thie value has changed since we wrote to
it, the timeout-condition checked for non-zero, which is wrong. This
means that we didn't realize the firmware wasn't started. While there,
make sure the shared DRAM address is inside the chip's address space.


# 1.42 25-Feb-2021 patrick

Some newer chips have two D11/802.11 cores, and we need to reset both at
the same time.


# 1.41 25-Feb-2021 patrick

Support for version 7 of the bwfm(4) PCIe interface. The size of the items
on the rx/tx complete rings has increased slightly to accomodate possible
new features.


# 1.40 25-Feb-2021 dlg

we don't have to cast to caddr_t when calling m_copydata anymore.

the first cut of this diff was made with coccinelle using this spatch:

@rule@
type caddr_t;
expression m, off, len, cp;
@@
-m_copydata(m, off, len, (caddr_t)cp)
+m_copydata(m, off, len, cp)

i had fix it's opinionated idea of formatting by hand though, so
i'm not sure it was worth it.

ok deraadt@ bluhm@


# 1.39 31-Jan-2021 patrick

Add basic support for BCM4378 as found on the Apple M1 SoCs. There's a
little bit more to do though before it can be enabled.


# 1.38 12-Dec-2020 jan

Rename the macro MCLGETI to MCLGETL and removes the dead parameter ifp.

OK dlg@, bluhm@
No Opinion mpi@
Not against it claudio@


Revision tags: OPENBSD_6_8_BASE
# 1.37 22-Jun-2020 dlg

use ifiq_input and use it's return value to apply backpressure to rxrs.

this is a step toward deprecating softclock based livelock detection.


Revision tags: OPENBSD_6_7_BASE
# 1.36 07-Mar-2020 patrick

Use snprintf(9) to create the names for the firmware and NVRAM files. This
reduces the amount of duplicated lines per chip, and allows us to ship per-
board files in the future.

Based on a diff from jsg@
ok kurt@


# 1.35 06-Mar-2020 patrick

Process the NVRAM in bwfm(4) itself. So far we have relied on some
external tool to pre-process the NVRAM, even though it's simple to
do ourselves. This allows easier firmware distribution, especially
since on some x86 machines the NVRAM is stored in an EFI variable.


# 1.34 25-Feb-2020 patrick

Make bwfm(4) call if_input() only once per interrupt.

This reduces drops caused by the ifq pressure drop mechanism and hence
increases throughput.

ok tobhe@


# 1.33 15-Jan-2020 patrick

Sprinkle splnet() around the ringbuffer accesses, otherwise the
task and interrupt try to concurrently submit messages on the
control ring.


# 1.32 15-Jan-2020 patrick

Some PCIe firmwares drop TX packets when the pktid is 0. Add
an offset to make sure they start from 1.


# 1.31 15-Jan-2020 patrick

Fix off-by-one in ringbuffer code. When we insert items faster than
the hardware is processing them, the write index can catch up to the
read index. We must make sure that our write index stays smaller
than the hardware's read index, thus the difference between both has
to be bigger than 1.

ok tobhe@
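
A minimal sketch of the invariant described above, not the driver's actual
code: the struct and function names are hypothetical. One slot is always
left unused, so a write index equal to the read index can only mean "empty".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical ring state; names are illustrative, not the driver's. */
struct ring {
	uint16_t w_ptr;	/* next slot the host writes */
	uint16_t r_ptr;	/* next slot the device reads */
	uint16_t nitem;	/* total number of slots */
};

/* Slots the host may still fill while keeping one slot unused. */
static uint16_t
ring_space(const struct ring *r)
{
	assert(r->nitem > 1);
	if (r->r_ptr > r->w_ptr)
		return r->r_ptr - r->w_ptr - 1;
	return r->nitem - r->w_ptr + r->r_ptr - 1;
}

/* Reserve one slot; returns its index, or -1 if the ring is full. */
static int
ring_reserve(struct ring *r)
{
	int slot;

	if (ring_space(r) == 0)
		return -1;
	slot = r->w_ptr;
	r->w_ptr = (r->w_ptr + 1) % r->nitem;
	return slot;
}
```

With four slots and an idle reader, the fourth reservation is refused, so
the write index never wraps around onto the read index.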


# 1.30 09-Jan-2020 mpi

Convert sleeps of 1sec or more to tsleep_nsec(9).

ok bluhm@


Revision tags: OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.29 07-Feb-2019 patrick

Consistently use m_freem(9). This fixes possible leaks in a few
error cases.


# 1.28 17-Jan-2019 mlarkin

Enable bwfm(4) in RAMDISK_CD

ok deraadt


Revision tags: OPENBSD_6_4_BASE
# 1.27 20-Aug-2018 patrick

Attach bwfm(4) to Broadcom BCM4371.

ok kettenis@


# 1.26 25-Jul-2018 patrick

Implement a MSGBUF control packet mechanism based on the command
request ids. So far we were only able to have one command in flight
at a time and race conditions could easily lead to unexpected
behaviour. With this rework we send and enqueue a control packet
command and wait for replies to happen. Thus we can have multiple
control packets in flight and a reply with the correct id will wake
us up.


# 1.25 06-Jul-2018 patrick

Add bus_dmamap_sync(9) calls to bwfm(4) so that we make sure the data
is synced properly before the CPU or the WiFi chip access the supplied
memory. Makes PCIe-connected bwfm(4) work on ARM-based machines.


# 1.24 05-Jul-2018 patrick

Cast physical addresses to 64 bits so we can shift them by 32 bits on
32-bit platforms without the compiler complaining. In the end the
value will turn out as 0 anyway. Allows enabling bwfm(4) on 32-bit
platforms.

ok stsp@
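
A small sketch of the cast, assuming a 32-bit physical address type; the
typedef and helper names are made up for illustration. Shifting a 32-bit
value by 32 is undefined in C, so the value is widened first.

```c
#include <stdint.h>

/* Stand-in for a platform's physical address type (assumed 32 bits). */
typedef uint32_t paddr32_t;

/* Upper 32 bits of the address; always 0 when the type is 32 bits wide. */
static uint32_t
addr_high(paddr32_t pa)
{
	return (uint32_t)((uint64_t)pa >> 32);
}

/* Lower 32 bits of the address. */
static uint32_t
addr_low(paddr32_t pa)
{
	return (uint32_t)((uint64_t)pa & 0xffffffff);
}
```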


# 1.23 07-Jun-2018 patrick

Attach bwfm(4) to the Broadcom 4356 found in the GPD Pocket.

Tested by mlarkin@


# 1.22 07-Jun-2018 patrick

Some PCIe-based bwfm(4) chips also require that we supply an NVRAM
binary. In case we have an (optional) NVRAM binary, copy it to the
end of the chip's memory.

Tested by mlarkin@ on his GPD Pocket.


# 1.21 23-May-2018 patrick

Implement a separate initialization stage so that we can still use
and initialize bwfm(4) later in the case that the firmware was not
available on bootup and was only later installed.

ok stsp@


# 1.20 23-May-2018 patrick

Map the second bwfm(4) BAR first. The bwfm(4) PCIe devices have two
BARs, where the second one is much larger than the first. Both need
to be properly aligned in the given extent. Since the first one is
smaller, it will "unalign" the next free space and thus create a gap
so that the second BAR cannot be properly aligned in the given space.
By mapping the second BAR first, it will automatically have proper
alignment. The first BAR, which has fewer alignment requirements,
fits well after the initial allocation. Fixes bwfm(4) on APU 1.

Debugged and solved by kettenis@
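
The ordering effect can be illustrated with a toy bump allocator. This is
only a sketch: the real extent allocator is more general, and the window
bounds and BAR sizes used with it are invented. Each region is aligned to
its own size.

```c
#include <stdint.h>

/* Hypothetical bump allocator over one address window. */
struct window {
	uint64_t next;	/* first free address */
	uint64_t end;	/* one past the last usable address */
};

/* Allocate size bytes aligned to size (a power of two); 0 on failure. */
static uint64_t
win_alloc(struct window *w, uint64_t size)
{
	uint64_t start = (w->next + size - 1) & ~(size - 1);

	if (start + size > w->end)
		return 0;
	w->next = start + size;
	return start;
}
```

In a window from 0x400000 to 0x808000, placing a 0x8000-byte region first
pushes the 0x400000-byte region up to 0x800000, where it no longer fits.
Placing the large region first leaves room for the small one right after it.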


# 1.19 16-May-2018 patrick

Implement a BCDC control packet mechanism based on the command request
ids. So far we were only able to have one command in flight at a time
and race conditions could easily lead to unexpected behaviour,
especially combined with a slow bus and timeouts. With this rework we
send or enqueue a control packet command and wait for replies to happen.
Thus we can have multiple control packets in flight and a reply with the
correct id will wake us up.


Revision tags: OPENBSD_6_3_BASE
# 1.18 08-Feb-2018 patrick

Move bwfm(4) from ifq begin/commit/rollback semantics to the newer
ifq dequeue semantics. This basically means we need to check for
available space before dequeuing a packet. As soon as we dequeue
a packet we commit to it. On the PCIe backend this check can not
be done easily since the flowring depends on the packet contents and
we cannot take a peek. When there is no flowring we cache the mbuf
and send it out as soon as the flowring has opened up. Then the ifq can
be restarted and traffic can flow. Typically we run out of
packet ids, which can be checked without consulting the packet. The
flowring probably never becomes full as the bwfm(4) firmware takes
the packets off the ring without actually sending them out.

Discussed with dlg@


# 1.17 07-Feb-2018 patrick

Move parsing the BCDC header on RX into a protocol specific RX
function so it can be shared with the SDIO attachment driver.


# 1.16 11-Jan-2018 patrick

The PCI bwfm(4) chips have no TX rings in the traditional sense, as on
the actual rings we only share messages. Sending a TX packet means
putting a message on the ring which contains a pktid (which for us maps
to an mbuf) and the physical address of the mbuf. On jcs@'s macbook he
seems to run out of TX pktids pretty quickly during a speedtest. This
would mean that there are 2048 TX packets in flight that we either want
to send out or that have not been "acked" by the firmware yet. Either
way, recover from that situation when we hit that arbitrary limit by
restarting the queue after we have freed a packet from the TX pktid list.

Tested by jcs@


# 1.15 10-Jan-2018 jcs

Attach bwfm to the Broadcom 4350 found in the 2017 MacBook.

Easily handles >150Mbps transfers through a 5GHz AP.

ok patrick

(Committed via bwfm0, of course)


# 1.14 10-Jan-2018 patrick

Add firmware names for the two revisions of the Broadcom 4350 as seen
on a MacBook 12-inch (2017).

Tested by and with jcs@


# 1.13 10-Jan-2018 patrick

Don't reset the internal memory core on chips other than the Broadcom
43602, as it's only necessary on that specific chip.

Found the hard way by jcs@ on a MacBook 12-inch (2017)


# 1.12 10-Jan-2018 patrick

Move line for readability.


# 1.11 08-Jan-2018 patrick

In AP mode multicast packets share the flowrings with broadcast
packets.


# 1.10 08-Jan-2018 patrick

The bwfm(4) TX ring expects the ethernet header as part of the TX info
struct. The data length is the length of the frame without the header.
In the previous version m_adj(9) was used, but since that was changed we
need to decrease the length ourselves.


# 1.9 08-Jan-2018 patrick

Guard the debug printf function behind BWFM_DEBUG as well. Also only
print the firmware's dmesg(8) if we're running with a higher debug
mode.

Prompted by Michael W. Bombardieri


# 1.8 08-Jan-2018 patrick

Delete flowrings when we take the interface down or change its
settings.


# 1.7 07-Jan-2018 patrick

Create multiple transmit flowrings in station mode, four in total, based
on TOS values. In AP mode create multiple flowrings per connected node.


# 1.6 05-Jan-2018 patrick

To send out packets we need to create a flowring. Acting as station,
we typically have about four flowrings per priority. As access point
we apparently need one, or four considering the priorities, flowrings
per client.  For now let's start with a single TX flowring.  To set up
a flowring we need to send a create request and can only start sending
packets as soon as we are told that the ring is created. With this we
can now do actual network traffic.


# 1.5 03-Jan-2018 patrick

Since the PCI attachment code already uses mbufs for RX packets, we can
push the mbuf allocation down into the USB attachment code and now pass
an mbuf to the bwfm(4) receive function.


# 1.4 03-Jan-2018 patrick

Add size for free(9) in the bwfm(4) PCI attachment code.

From Michael W. Bombardieri


# 1.3 01-Jan-2018 patrick

For whatever reason the firmware needs more RX buffers available than
we typically use, which unfortunately creates a bigger memory
footprint. With this the receive path can be made to work.


# 1.2 01-Jan-2018 patrick

Put the code that prints the firmware's debug console into a function
so we can read and print the messages printed by the firmware when we
are debugging the driver.


# 1.1 24-Dec-2017 patrick

Add a PCI attachment driver for bwfm(4). It's not finished, but it's
already past the point where development can occur out of the tree.
With this I can successfully scan for access points and tell the chip
to attach to an SSID. RX path should work as well, but since I forgot
to bring the antenna with me to my parents, the reception is a bit
horrible in the metal enclosure.

There are a few reasons this driver is rather big. First we set up the
ARM Cores, uploading the firmware and kicking it off. Then we need to
read all needed information from the registers. Once that is done we
have to set up countless buffers. There are 2 TX rings and 3 RX rings,
plus N TX rings for the actual data that is yet to be implemented.

Merry Christmas!

ok kettenis@


# 1.51 26-Feb-2021 patrick

Read and parse OTP on the BCM4378. There are quite a few firmware and
nvram files used for the different Apple devices. The device tree and
the OTP hold the information about which of those we will have to use. For
now this information will simply be printed, but depending on how we
choose to do the firmware distribution we could use it for loadfirmware().


# 1.50 26-Feb-2021 patrick

Attach to BCM4378.


# 1.49 26-Feb-2021 patrick

Add support for BCM4378 as implemented on the Apple M1. This chip seems
to use a different set of PCIE2REG registers. Accessing the "old" ones
even leads to faults. There are two surprises though. One is that it
seems that the interrupt status register always returns 0, and the other
one is that we receive the interrupts way too early, but both can be
worked around for now.


# 1.48 26-Feb-2021 patrick

Increase the number of RX buffers given to the bwfm(4) chip. We have seen
this already on previous chips, which only started giving us packets when
handed at least 128 of them. Apparently some now require 256, which
seems to get the Apple M1's WiFi going.


# 1.47 26-Feb-2021 patrick

Increase the buffer size for the ioctl response buffers to the same as
used in the wifi firmware to ensure responses can be received.


# 1.46 26-Feb-2021 patrick

Indicate hostready signal to inform the firmware that the rings have been
initialized.


# 1.45 26-Feb-2021 patrick

Refactor bwfm(4) firmware loading. The PCIe backend will need to be able
to load the CLM blob like the SDIO backend already does. Additionally it
is also helpful for the PCIe backend to try a file named after the device
tree compatible. Thus refactor the SDIO code and make it available for
both SDIO and PCIe.


# 1.44 26-Feb-2021 patrick

Fix prio2fifo mapping table.


# 1.43 25-Feb-2021 patrick

The firmware replaces the last 32 bits of RAM with a shared DRAM address.
While the for-loop checks that the value has changed since we wrote to
it, the timeout condition checked for non-zero, which is wrong. This
means that we didn't realize the firmware wasn't started. While there,
make sure the shared DRAM address is inside the chip's address space.
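
A sketch of the corrected check, with hypothetical names: the samples array
stands in for successive reads of the last word of RAM. Timing out means
the word still holds the value we wrote, not that it is zero.

```c
#include <stdint.h>

/*
 * Scan up to n successive reads of the last RAM word. Returns 0 and
 * stores the shared address in *out if the firmware replaced the value
 * we wrote with an address inside the chip's RAM, otherwise -1.
 */
static int
check_shared_addr(const uint32_t *samples, int n, uint32_t written,
    uint32_t ram_base, uint32_t ram_size, uint32_t *out)
{
	uint32_t v = written;
	int i;

	for (i = 0; i < n; i++) {
		v = samples[i];
		if (v != written)
			break;
	}
	/* Timeout: the firmware never touched the word. */
	if (v == written)
		return -1;
	/* Reject addresses outside the chip's RAM window. */
	if (v < ram_base || v - ram_base >= ram_size)
		return -1;
	*out = v;
	return 0;
}
```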


# 1.1 24-Dec-2017 patrick

Add a PCI attachment driver for bwfm(4). It's not finished, but it's
already past the point where development can occur out of the tree.
With this I can successfully scan for access points and tell the chip
to attach to an SSID. RX path should work as well, but since I forgot
to bring the antenna with me to my parents, the reception is a bit
horrible in the metal enclosure.

There are a few reasons this driver is rather big. First we set up the
ARM Cores, uploading the firmware and kicking it off. Then we need to
read all needed information from the registers. Once that is done we
have to set up countless buffers. There are 2 TX rings and 3 RX rings,
plus N TX rings for the actual data that is yet to be implemented.

Merry Christmas!

ok kettenis@


# 1.40 25-Feb-2021 dlg

we don't have to cast to caddr_t when calling m_copydata anymore.

the first cut of this diff was made with coccinelle using this spatch:

@rule@
type caddr_t;
expression m, off, len, cp;
@@
-m_copydata(m, off, len, (caddr_t)cp)
+m_copydata(m, off, len, cp)

i had to fix its opinionated idea of formatting by hand though, so
i'm not sure it was worth it.

ok deraadt@ bluhm@


# 1.39 31-Jan-2021 patrick

Add basic support for BCM4378 as found on the Apple M1 SoCs. There's a
little bit more to do though before it can be enabled.


# 1.38 12-Dec-2020 jan

Rename the macro MCLGETI to MCLGETL and remove the dead parameter ifp.

OK dlg@, bluhm@
No Opinion mpi@
Not against it claudio@


Revision tags: OPENBSD_6_8_BASE
# 1.37 22-Jun-2020 dlg

use ifiq_input and use its return value to apply backpressure to rxrs.

this is a step toward deprecating softclock based livelock detection.


Revision tags: OPENBSD_6_7_BASE
# 1.36 07-Mar-2020 patrick

Use snprintf(9) to create the names for the firmware and NVRAM files. This
reduces the amount of duplicated lines per chip, and allows us to ship per-
board files in the future.

Based on a diff from jsg@
ok kurt@
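
A minimal sketch of building a per-chip file name with snprintf and
checking for truncation. The helper name fw_name and the
"brcmfmac<chip>-pcie.<ext>" pattern are assumptions for illustration,
not taken from the driver.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper: format "brcmfmac<chip>-pcie.<ext>" into buf.
 * The name pattern is an assumption made for this example only.
 * Returns 0 on success, -1 on error or if the name was truncated.
 */
static int
fw_name(char *buf, size_t len, const char *chip, const char *ext)
{
	int n;

	n = snprintf(buf, len, "brcmfmac%s-pcie.%s", chip, ext);
	if (n < 0 || (size_t)n >= len)
		return -1;
	return 0;
}
```

Calling the helper once with "bin" and once with "nvram" covers both
files for a chip without a separate string literal per chip.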


# 1.35 06-Mar-2020 patrick

Process the NVRAM in bwfm(4) itself. So far we have relied on some
external tool to pre-process the NVRAM, even though it's simple to
do ourselves. This allows easier firmware distribution, especially
since on some x86 machines the NVRAM is stored in an EFI variable.


# 1.34 25-Feb-2020 patrick

Make bwfm(4) call if_input() only once per interrupt.

This reduces drops caused by the ifq pressure drop mechanism and hence
increases throughput.

ok tobhe@


# 1.33 15-Jan-2020 patrick

Sprinkle splnet() around the ringbuffer accesses, otherwise the
task and interrupt try to concurrently submit messages on the
control ring.


# 1.32 15-Jan-2020 patrick

Some PCIe firmwares drop TX packets when the pktid is 0. Add
an offset to make sure they start from 1.
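
A minimal sketch of such an offset, assuming the pktid is derived from
a slot index in a table. PKTID_OFFSET and both function names are
hypothetical and only illustrate the mapping.

```c
#include <stdint.h>

#define PKTID_OFFSET	1	/* hypothetical: keeps id 0 off the wire */

/* Map a table slot index to the id placed in the ring message. */
static inline uint32_t
slot_to_pktid(uint32_t slot)
{
	return slot + PKTID_OFFSET;
}

/* Map an id reported back by the firmware to its table slot. */
static inline uint32_t
pktid_to_slot(uint32_t pktid)
{
	return pktid - PKTID_OFFSET;
}
```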


# 1.31 15-Jan-2020 patrick

Fix off-by-one in ringbuffer code. When we insert items faster than
the hardware is processing them, the write index can catch up to the
read index. We must make sure that our write index stays smaller
than the hardware's read index, thus the difference between both has
to be bigger than 1.

ok tobhe@
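
A minimal sketch of the usual way to avoid this class of bug: always
leave one slot empty so that equal indices can only mean "empty". This
is a generic ring buffer illustration with a hypothetical function
name, not the driver's code, and it assumes both indices are already
smaller than nitem.

```c
#include <stdint.h>

/*
 * Number of slots a producer may still fill in a ring of nitem slots,
 * given the producer's write index w and the consumer's read index r.
 * One slot always stays empty, so w == r can only mean "empty".
 */
static uint32_t
ring_write_avail(uint32_t w, uint32_t r, uint32_t nitem)
{
	return (r + nitem - w - 1) % nitem;
}
```

With this check the producer stops one slot short of the read index
and never advances the write index onto it.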


# 1.30 09-Jan-2020 mpi

Convert sleeps of 1sec or more to tsleep_nsec(9).

ok bluhm@


Revision tags: OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.29 07-Feb-2019 patrick

Consistently use m_freem(9). This fixes possible leaks in a few
error cases.


# 1.28 17-Jan-2019 mlarkin

Enable bwfm(4) in RAMDISK_CD

ok deraadt


Revision tags: OPENBSD_6_4_BASE
# 1.27 20-Aug-2018 patrick

Attach bwfm(4) to Broadcom BCM4371.

ok kettenis@


# 1.26 25-Jul-2018 patrick

Implement a MSGBUF control packet mechanism based on the command
request ids. So far we were only able to have one command in flight
at a time and race conditions could easily lead to unexpected
behaviour. With this rework we send and enqueue a control packet
command and wait for replies to happen. Thus we can have multiple
control packets in flight and a reply with the correct id will wake
us up.


# 1.25 06-Jul-2018 patrick

Add bus_dmamap_sync(9) calls to bwfm(4) so that we make sure the data
is synced properly before the CPU or the WiFi chip access the supplied
memory. Makes PCIe-connected bwfm(4) work on ARM-based machines.


# 1.24 05-Jul-2018 patrick

Cast physical addresses to 64 bits so we can shift them by 32 bits on
32-bit platforms without the compiler complaining. In the end the
value will turn out as 0 anyway. Allows enabling bwfm(4) on 32-bit
platforms.

ok stsp@
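
A minimal sketch of the widening cast, assuming the address arrives in
an unsigned long, which is 32 bits wide on many 32-bit platforms. The
function name paddr_split is hypothetical. Shifting a 32-bit value by
32 is undefined in C, so the value is widened first.

```c
#include <stdint.h>

/* Split an address into the low and high 32-bit words of a 64-bit field. */
static void
paddr_split(unsigned long paddr, uint32_t *lo, uint32_t *hi)
{
	uint64_t pa = (uint64_t)paddr;	/* widen before shifting */

	*lo = (uint32_t)(pa & 0xffffffff);
	*hi = (uint32_t)(pa >> 32);	/* 0 when the address fits in 32 bits */
}
```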


# 1.23 07-Jun-2018 patrick

Attach bwfm(4) to the Broadcom 4356 found in the GPD Pocket.

Tested by mlarkin@


# 1.22 07-Jun-2018 patrick

Some PCIe-based bwfm(4) chips also require that we supply an NVRAM
binary. In case we have an (optional) NVRAM binary, copy it to the
end of the chip's memory.

Tested by mlarkin@ on his GPD Pocket.


# 1.21 23-May-2018 patrick

Implement a separate initialization stage so that we can still use
and initialize bwfm(4) later in the case that the firmware was not
available on bootup and was only later installed.

ok stsp@


# 1.20 23-May-2018 patrick

Map the second bwfm(4) BAR first. The bwfm(4) PCIe devices have two
BARs, where the second one is much larger than the first. Both need
to be properly aligned in the given extent. Since the first one is
smaller, it will "unalign" the next free space and thus create a gap
so that the second BAR cannot be properly aligned in the given space.
By mapping the second BAR first, it will automatically have proper
alignment. The first BAR, which has fewer alignment requirements,
fits well after the initial allocation. Fixes bwfm(4) on APU 1.

Debugged and solved by kettenis@
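
A toy model of why the mapping order matters, assuming a first-fit
allocator that places naturally aligned, power-of-two sized regions
one after the other. The function names and the sizes in the usage
note are hypothetical and not taken from any real device.

```c
#include <stdint.h>

/* Round addr up to the next multiple of align (a power of two). */
static uint64_t
roundup_pow2(uint64_t addr, uint64_t align)
{
	return (addr + align - 1) & ~(align - 1);
}

/*
 * Place two naturally aligned regions one after the other in
 * [0, extent). Returns 1 if both fit, 0 otherwise.
 */
static int
fits_in_order(uint64_t extent, uint64_t first, uint64_t second)
{
	uint64_t next;

	next = roundup_pow2(0, first) + first;
	next = roundup_pow2(next, second) + second;
	return next <= extent;
}
```

For a 0x500000 byte extent, fits_in_order(0x500000, 0x8000, 0x400000)
fails because the small region pushes the large one up to the next
0x400000 boundary, while fits_in_order(0x500000, 0x400000, 0x8000)
succeeds.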


# 1.19 16-May-2018 patrick

Implement a BCDC control packet mechanism based on the command request
ids. So far we were only able to have one command in flight at a time
and race conditions could easily lead to unexpected behaviour,
especially combined with a slow bus and timeouts. With this rework we send or
enqueue a control packet command and wait for replies to happen. Thus
we can have multiple control packets in flight and a reply with the
correct id will wake us up.


Revision tags: OPENBSD_6_3_BASE
# 1.18 08-Feb-2018 patrick

Move bwfm(4) from ifq begin/commit/rollback semantics to the newer
ifq dequeue semantics. This basically means we need to check for
available space before dequeuing a packet. As soon as we dequeue
a packet we commit to it. On the PCIe backend this check cannot
be done easily since the flowring depends on the packet contents and
we cannot take a peek. When there is no flowring we cache the mbuf
and send it out as soon as the flowring has opened up. Then the ifq can
be restarted and traffic can flow. Typically we run out of
packet ids, which can be checked without consulting the packet. The
flowring probably never becomes full as the bwfm(4) firmware takes
the packets off the ring without actually sending them out.

Discussed with dlg@


# 1.17 07-Feb-2018 patrick

Move parsing the BCDC header on RX into a protocol specific RX
function so it can be shared with the SDIO attachment driver.


# 1.16 11-Jan-2018 patrick

The PCI bwfm(4) chips have no TX rings in the traditional sense, as on
the actual rings we only share messages. Sending a TX packet means
putting a message on the ring which contains a pktid (which for us maps
to an mbuf) and the physical address of the mbuf. On jcs@'s macbook he
seems to run out of TX pktids pretty quickly during a speedtest. This
would mean that there are 2048 TX packets in flight that we either want
to send out or that have not been "acked" by the firmware yet. Either
way, recover from that situation when we hit that arbitrary limit by
restarting the queue after we free'd a packet from the TX pktid list.

Tested by jcs@


# 1.15 10-Jan-2018 jcs

Attach bwfm to the Broadcom 4350 found in the 2017 MacBook.

Easily handles >150Mbps transfers through a 5GHz AP.

ok patrick

(Committed via bwfm0, of course)


# 1.14 10-Jan-2018 patrick

Add firmware names for the two revisions of the Broadcom 4350 as seen
on a MacBook 12-inch (2017).

Tested by and with jcs@


# 1.13 10-Jan-2018 patrick

Don't reset the internal memory core on chips other than the Broadcom
43602, as it's only necessary on that specific chip.

Found the hard way by jcs@ on a MacBook 12-inch (2017)


# 1.12 10-Jan-2018 patrick

Move line for readability.


# 1.11 08-Jan-2018 patrick

In AP mode multicast packets share the flowrings with broadcast
packets.


# 1.10 08-Jan-2018 patrick

The bwfm(4) TX ring expects the ethernet header as part of the TX info
struct. The data length is the length of the frame without the header.
In the previous version m_adj(9) was used, but since that was changed we
need to decrease the length ourselves.


# 1.9 08-Jan-2018 patrick

Guard the debug printf function behind BWFM_DEBUG as well. Also only
print the firmware's dmesg(8) if we're running with a higher debug
mode.

Prompted by Michael W. Bombardieri


# 1.8 08-Jan-2018 patrick

Delete flowrings when we take the interface down or change its
settings.


# 1.7 07-Jan-2018 patrick

Create multiple transmit flowrings in station mode, four in total, based
on TOS values. In AP mode create multiple flowrings per connected node.


# 1.6 05-Jan-2018 patrick

To send out packets we need to create a flowring. Acting as station,
we typically have about four flowrings per priority. As access point
we apparently need one, or four considering the priorities, flowrings
per client. For now let's start with a single TX flowring. To set up
a flowring we need to send a create request and can only start sending
packets as soon as we are told that the ring is created. With this we
can now do actual network traffic.


# 1.5 03-Jan-2018 patrick

Since the PCI attachment code already uses mbufs for RX packets, we can
push the mbuf allocation down into the USB attachment code and now pass
an mbuf to the bwfm(4) receive function.


# 1.4 03-Jan-2018 patrick

Add size for free(9) in the bwfm(4) PCI attachment code.

From Michael W. Bombardieri


# 1.3 01-Jan-2018 patrick

For whatever reason the firmware needs more RX buffers available than
we typically use, which unfortunately creates a bigger memory
footprint. With this the receive path can be made to work.


# 1.2 01-Jan-2018 patrick

Put the code that prints the firmware's debug console into a function
so we can read and print the messages printed by the firmware when we
are debugging the driver.


# 1.1 24-Dec-2017 patrick

Add a PCI attachment driver for bwfm(4). It's not finished, but it's
already past the point where development can occur out of the tree.
With this I can successfully scan for access points and tell the chip
to attach to an SSID. RX path should work as well, but since I forgot
to bring the antenna with me to my parents', the reception is a bit
horrible in the metal enclosure.

There are a few reasons this driver is rather big. First we set up the
ARM Cores, uploading the firmware and kicking it off. Then we need to
read all needed information from the registers. Once that is done we
have to set up countless buffers. There are 2 TX rings and 3 RX rings,
plus N TX rings for the actual data that is yet to be implemented.

Merry Christmas!

ok kettenis@


# 1.37 22-Jun-2020 dlg

use ifiq_input and use its return value to apply backpressure to rxrs.

this is a step toward deprecating softclock based livelock detection.


Revision tags: OPENBSD_6_7_BASE
# 1.36 07-Mar-2020 patrick

Use snprintf(9) to create the names for the firmware and NVRAM files. This
reduces the number of duplicated lines per chip, and allows us to ship
per-board files in the future.

Based on a diff from jsg@
ok kurt@


# 1.35 06-Mar-2020 patrick

Process the NVRAM in bwfm(4) itself. So far we have relied on some
external tool to pre-process the NVRAM, even though it's simple to
do ourselves. This allows easier firmware distribution, especially
since on some x86 machines the NVRAM is stored in an EFI variable.


# 1.34 25-Feb-2020 patrick

Make bwfm(4) call if_input() only once per interrupt.

This reduces drops caused by the ifq pressure drop mechanism and hence
increases throughput.

ok tobhe@


# 1.33 15-Jan-2020 patrick

Sprinkle splnet() around the ringbuffer accesses, otherwise the
task and interrupt try to concurrently submit messages on the
control ring.


# 1.32 15-Jan-2020 patrick

Some PCIe firmwares drop TX packets when the pktid is 0. Add
an offset to make sure they start from 1.


# 1.31 15-Jan-2020 patrick

Fix off-by-one in ringbuffer code. When we insert items faster than
the hardware is processing them, the write index can catch up to the
read index. We must make sure that our write index stays smaller
than the hardware's read index, thus the difference between both has
to be bigger than 1.
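
As an illustration only, the rule can be sketched with a generic circular
buffer that always keeps one slot empty, so that a full ring is never
confused with an empty one. The helper names and the modulo arithmetic
are assumptions made for this sketch and are not taken from the driver.

```c
#include <stdint.h>

/*
 * Hypothetical helpers, not taken from if_bwfm_pci.c.
 * w is the producer's write index, r the consumer's read index,
 * nitems the ring size. One slot is always left unused, so w can
 * never advance onto r.
 */
static uint32_t
ring_free_slots(uint32_t w, uint32_t r, uint32_t nitems)
{
	/* Entries written but not yet consumed. */
	uint32_t used = (w + nitems - r) % nitems;

	/* Reserve one slot so that w == r always means "empty". */
	return nitems - 1 - used;
}

static int
ring_can_write(uint32_t w, uint32_t r, uint32_t nitems)
{
	return ring_free_slots(w, r, nitems) > 0;
}
```

With an 8-entry ring, a write index of 6 and a read index of 7, the
sketch refuses another insert, because advancing the write index would
make it equal to the read index.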

ok tobhe@


# 1.30 09-Jan-2020 mpi

Convert sleeps of 1sec or more to tsleep_nsec(9).

ok bluhm@


Revision tags: OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.29 07-Feb-2019 patrick

Consistently use m_freem(9). This fixes possible leaks in a few
error cases.


# 1.28 17-Jan-2019 mlarkin

Enable bwfm(4) in RAMDISK_CD

ok deraadt


Revision tags: OPENBSD_6_4_BASE
# 1.27 20-Aug-2018 patrick

Attach bwfm(4) to Broadcom BCM4371.

ok kettenis@


# 1.26 25-Jul-2018 patrick

Implement a MSGBUF control packet mechanism based on the command
request ids. So far we were only able to have one command in flight
at a time and race conditions could easily lead to unexpected
behaviour. With this rework we send and enqueue a control packet
command and wait for replies to happen. Thus we can have multiple
control packets in flight and a reply with the correct id will wake
us up.


# 1.25 06-Jul-2018 patrick

Add bus_dmamap_sync(9) calls to bwfm(4) so that we make sure the data
is synced properly before the CPU or the WiFi chip access the supplied
memory. Makes PCIe-connected bwfm(4) work on ARM-based machines.


# 1.24 05-Jul-2018 patrick

Cast physical addresses to 64 bits so we can shift them by 32 bits on
32-bit platforms without the compiler complaining. In the end the
value will turn out as 0 anyway. Allows enabling bwfm(4) on 32-bit
platforms.
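
A minimal sketch of the idea, assuming a 32-bit physical address type.
The typedef and function names are hypothetical and do not come from the
driver. Widening to 64 bits before shifting keeps the shift count below
the operand width, and the high word is simply 0 for a 32-bit address.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit platform's physical address type. */
typedef uint32_t sketch_paddr_t;

/* High 32 bits of the address; always 0 when the address is 32 bits wide. */
static uint32_t
sketch_addr_high(sketch_paddr_t pa)
{
	/* Widen first: shifting a 32-bit value by 32 is undefined in C. */
	return (uint32_t)((uint64_t)pa >> 32);
}

/* Low 32 bits of the address. */
static uint32_t
sketch_addr_low(sketch_paddr_t pa)
{
	return (uint32_t)((uint64_t)pa & 0xffffffffU);
}
```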

ok stsp@


# 1.23 07-Jun-2018 patrick

Attach bwfm(4) to the Broadcom 4356 found in the GPD Pocket.

Tested by mlarkin@


# 1.22 07-Jun-2018 patrick

Some PCIe-based bwfm(4) chips also require that we supply an NVRAM
binary. In case we have an (optional) NVRAM binary, copy it to the
end of the chip's memory.

Tested by mlarkin@ on his GPD Pocket.


# 1.21 23-May-2018 patrick

Implement a separate initialization stage so that we can still use
and initialize bwfm(4) later in the case that the firmware was not
available on bootup and was only later installed.

ok stsp@


# 1.20 23-May-2018 patrick

Map the second bwfm(4) BAR first. The bwfm(4) PCIe devices have two
BARs, where the second one is much larger than the first. Both need
to be properly aligned in the given extent. Since the first one is
smaller, it will "unalign" the next free space and thus create a gap
so that the second BAR cannot be properly aligned in the given space.
By mapping the second BAR first, it will automatically have proper
alignment. The first BAR, which has fewer alignment requirements,
fits well after the initial allocation. Fixes bwfm(4) on APU 1.

Debugged and solved by kettenis@


# 1.19 16-May-2018 patrick

Implement a BCDC control packet mechanism based on the command request
ids. So far we were only able to have one command in flight at a time
and race conditions could easily lead to unexpected behaviour,
especially combined with a slow bus and timeouts. With this rework we
send or enqueue a control packet command and wait for replies to happen.
Thus we can have multiple control packets in flight and a reply with the
correct id will wake us up.


Revision tags: OPENBSD_6_3_BASE
# 1.18 08-Feb-2018 patrick

Move bwfm(4) from ifq begin/commit/rollback semantics to the newer
ifq dequeue semantics. This basically means we need to check for
available space before dequeuing a packet. As soon as we dequeue
a packet we commit to it. On the PCIe backend this check cannot
be done easily since the flowring depends on the packet contents and
we cannot take a peek. When there is no flowring we cache the mbuf
and send it out as soon as the flowring has opened up. Then the ifq can
be restarted and traffic can flow. Typically we run out of
packet ids, which can be checked without consulting the packet. The
flowring probably never becomes full as the bwfm(4) firmware takes
the packets off the ring without actually sending them out.

Discussed with dlg@


# 1.17 07-Feb-2018 patrick

Move parsing the BCDC header on RX into a protocol specific RX
function so it can be shared with the SDIO attachment driver.


# 1.16 11-Jan-2018 patrick

The PCI bwfm(4) chips have no TX rings in the traditional sense, as on
the actual rings we only share messages. Sending a TX packet means
putting a message on the ring which contains a pktid (which for us maps
to an mbuf) and the physical address of the mbuf. On jcs@'s MacBook he
seems to run out of TX pktids pretty quickly during a speedtest. This
would mean that there are 2048 TX packets in flight that we either want
to send out or that have not been "acked" by the firmware yet. Either
way, recover from that situation when we hit that arbitrary limit by
restarting the queue after we have freed a packet from the TX pktid list.

Tested by jcs@


# 1.15 10-Jan-2018 jcs

Attach bwfm to the Broadcom 4350 found in the 2017 MacBook.

Easily handles >150Mbps transfers through a 5GHz AP.

ok patrick

(Committed via bwfm0, of course)


# 1.14 10-Jan-2018 patrick

Add firmware names for the two revisions of the Broadcom 4350 as seen
on a MacBook 12-inch (2017).

Tested by and with jcs@


# 1.13 10-Jan-2018 patrick

Don't reset the internal memory core on chips other than the Broadcom
43602, as it's only necessary on that specific chip.

Found the hard way by jcs@ on a MacBook 12-inch (2017)


# 1.12 10-Jan-2018 patrick

Move line for readability.


# 1.11 08-Jan-2018 patrick

In AP mode multicast packets share the flowrings with broadcast
packets.


# 1.10 08-Jan-2018 patrick

The bwfm(4) TX ring expects the ethernet header as part of the TX info
struct. The data length is the length of the frame without the header.
In the previous version m_adj(9) was used, but since that was changed we
need to decrease the length ourselves.


# 1.9 08-Jan-2018 patrick

Guard the debug printf function behind BWFM_DEBUG as well. Also only
print the firmware's dmesg(8) if we're running with a higher debug
mode.

Prompted by Michael W. Bombardieri


# 1.8 08-Jan-2018 patrick

Delete flowrings when we take the interface down or change its
settings.


# 1.7 07-Jan-2018 patrick

Create multiple transmit flowrings in station mode, four in total, based
on TOS values. In AP mode create multiple flowrings per connected node.


# 1.6 05-Jan-2018 patrick

To send out packets we need to create a flowring. Acting as station,
we typically have about four flowrings per priority. As access point
we apparently need one flowring per client, or four considering the
priorities. For now let's start with a single TX flowring. To set up
a flowring we need to send a create request and can only start sending
packets as soon as we are told that the ring is created. With this we
can now do actual network traffic.


# 1.5 03-Jan-2018 patrick

Since the PCI attachment code already uses mbufs for RX packets, we can
push the mbuf allocation down into the USB attachment code and now pass
an mbuf to the bwfm(4) receive function.


# 1.4 03-Jan-2018 patrick

Add size for free(9) in the bwfm(4) PCI attachment code.

From Michael W. Bombardieri


# 1.3 01-Jan-2018 patrick

For whatever reason the firmware needs more RX buffers available than
we typically use, which unfortunately creates a bigger memory
footprint. With this the receive path can be made to work.


# 1.2 01-Jan-2018 patrick

Put the code that prints the firmware's debug console into a function
so we can read and print the messages printed by the firmware when we
are debugging the driver.


# 1.1 24-Dec-2017 patrick

Add a PCI attachment driver for bwfm(4). It's not finished, but it's
already past the point where development can occur out of the tree.
With this I can successfully scan for access points and tell the chip
to attach to an SSID. RX path should work as well, but since I forgot
to bring the antenna with me to my parents, the reception is a bit
horrible in the metal enclosure.

There are a few reasons this driver is rather big. First we set up the
ARM Cores, uploading the firmware and kicking it off. Then we need to
read all needed information from the registers. Once that is done we
have to set up countless buffers. There are 2 TX rings and 3 RX rings,
plus N TX rings for the actual data that is yet to be implemented.

Merry Christmas!

ok kettenis@


# 1.34 25-Feb-2020 patrick

Make bwfm(4) call if_input() only once per interrupt.

This reduces drops caused by the ifq pressure drop mechanism and hence
increases throughput.

ok tobhe@


# 1.33 15-Jan-2020 patrick

Sprinkle splnet() around the ringbuffer accesses, otherwise the
task and interrupt try to concurrently submit messages on the
control ring.


# 1.32 15-Jan-2020 patrick

Some PCIe firmwares drop TX packets when the pktid is 0. Add
an offset to make sure they start from 1.


# 1.31 15-Jan-2020 patrick

Fix off-by-one in ringbuffer code. When we insert items faster than
the hardware is processing them, the write index can catch up to the
read index. We must make sure that our write index stays smaller
than the hardware's read index, thus the difference between both has
to be bigger than 1.

ok tobhe@


# 1.30 09-Jan-2020 mpi

Convert sleeps of 1sec or more to tsleep_nsec(9).

ok bluhm@


Revision tags: OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.29 07-Feb-2019 patrick

Consistently use m_freem(9). This fixes possible leaks in a few
error cases.


# 1.28 17-Jan-2019 mlarkin

Enable bwfm(4) in RAMDISK_CD

ok deraadt


Revision tags: OPENBSD_6_4_BASE
# 1.27 20-Aug-2018 patrick

Attach bwfm(4) to Broadcom BCM4371.

ok kettenis@


# 1.26 25-Jul-2018 patrick

Implement a MSGBUF control packet mechanism based on the command
request ids. So far we were only able to have one command in flight
at a time and race conditions could easily lead to unexpected
behaviour. With this rework we send and enqueue a control packet
command and wait for replies to happen. Thus we can have multiple
control packets in flight and a reply with the correct id will wake
us up.


# 1.25 06-Jul-2018 patrick

Add bus_dmamap_sync(9) calls to bwfm(4) so that we make sure the data
is synced properly before the CPU or the WiFi chip access the supplied
memory. Makes PCIe-connected bwfm(4) work on ARM-based machines.


# 1.24 05-Jul-2018 patrick

Cast physical addresses to 64-bits so we can shift them by 32-bit on
32-bit platforms without the compiler complaining. In the end the
value will turn out as 0 anyway. Allows enabling bwfm(4) on 32-bit
platforms.

ok stsp@


# 1.23 07-Jun-2018 patrick

Attach bwfm(4) to the Broadcom 4356 found in the GPD Pocket.

Tested by mlarkin@


# 1.22 07-Jun-2018 patrick

Some PCIe-based bwfm(4) chips also require that we supply an NVRAM
binary. In case we have an (optional) NVRAM binary, copy it to the
end of the chip's memory.

Tested by mlarkin@ on his GPD Pocket.


# 1.21 23-May-2018 patrick

Implement a separate initialization stage so that we can still use
and initialize bwfm(4) later in the case that the firmware was not
available on bootup and was only later installed.

ok stsp@


# 1.20 23-May-2018 patrick

Map the second bwfm(4) BAR first. The bwfm(4) PCIe devices have two
BARs, where the second one is much larger than the first. Both need
to be properly aligned in the given extent. Since the first one is
smaller, it will "unalign" the next free space and thus create a gap
so that the second BAR cannot be properly aligned in the given space.
By mapping the second BAR first, it will automatically have proper
alignment. The first BAR, which has fewer alignment requirements,
fits well after the initial allocation. Fixes bwfm(4) on APU 1.

Debugged and solved by kettenis@
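
The alignment effect can be illustrated with a toy allocator. This sketch assumes a simple bump allocator and made-up window and BAR sizes; it is not the kernel's extent code. It relies only on the fact that PCI memory BARs have power-of-two sizes and must be aligned to their size.

```c
#include <stdint.h>

/* Hypothetical address window handed out from low to high. */
struct extent {
	uint64_t next;	/* first free address */
	uint64_t end;	/* one past the last usable address */
};

/*
 * Allocate a naturally aligned region (size must be a power of two).
 * Returns the start address, or UINT64_MAX if it does not fit.
 */
static uint64_t
ext_alloc(struct extent *e, uint64_t size)
{
	uint64_t start = (e->next + size - 1) & ~(size - 1);

	if (start + size > e->end)
		return UINT64_MAX;
	e->next = start + size;
	return start;
}
```

In a window of 4MB + 32KB, allocating the 32KB region first pushes the 4MB region to the next 4MB boundary, where it no longer fits; allocating the 4MB region first leaves room for both.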


# 1.19 16-May-2018 patrick

Implement a BCDC control packet mechanism based on the command request
ids. So far we were only able to have one command in flight at a time
and race conditions could easily lead to unexpected behaviour,
especially combined with a slow bus and timeouts. With this rework we send or
enqueue a control packet command and wait for replies to happen. Thus
we can have multiple control packets in flight and a reply with the
correct id will wake us up.


Revision tags: OPENBSD_6_3_BASE
# 1.18 08-Feb-2018 patrick

Move bwfm(4) from ifq begin/commit/rollback semantics to the newer
ifq dequeue semantics. This basically means we need to check for
available space before dequeuing a packet. As soon as we dequeue
a packet we commit to it. On the PCIe backend this check can not
be done easily since the flowring depends on the packet contents and
we cannot take a peek. When there is no flowring we cache the mbuf
and send it out as soon as the flowring has opened up. Then the ifq can
be restarted and traffic can flow. Typically we run out of
packet ids, which can be checked without consulting the packet. The
flowring probably never becomes full as the bwfm(4) firmware takes
the packets off the ring without actually sending them out.

Discussed with dlg@


# 1.17 07-Feb-2018 patrick

Move parsing the BCDC header on RX into a protocol specific RX
function so it can be shared with the SDIO attachment driver.


# 1.16 11-Jan-2018 patrick

The PCI bwfm(4) chips have no TX rings in the traditional sense, as on
the actual rings we only share messages. Sending a TX packet means
putting a message on the ring which contains a pktid (which for us maps
to an mbuf) and the physical address of the mbuf. On jcs@'s macbook he
seems to run out of TX pktids pretty quickly during a speedtest. This
would mean that there are 2048 TX packets in flight that we either want
to send out or that have not been "acked" by the firmware yet. Either
way, recover from that situation when we hit that arbitrary limit by
restarting the queue after we free'd a packet from the TX pktid list.

Tested by jcs@
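
The recovery path described in this entry can be sketched as a pktid table that records when the queue stalled and asks for a restart on the next free. The table size and all names are hypothetical (the entry mentions a limit of 2048); packets are plain pointers here instead of mbufs.

```c
#include <stddef.h>

#define NPKTS 4	/* tiny for illustration */

struct pktid_table {
	void *pkt[NPKTS];	/* NULL means the id is free */
	int stalled;		/* set when TX stopped for lack of ids */
};

/* Map a packet to a free id. Returns the id, or -1 and marks the stall. */
static int
pktid_get(struct pktid_table *t, void *pkt)
{
	int i;

	for (i = 0; i < NPKTS; i++) {
		if (t->pkt[i] == NULL) {
			t->pkt[i] = pkt;
			return i;
		}
	}
	t->stalled = 1;
	return -1;
}

/*
 * Called when the firmware completes an id. Hands back the packet and
 * returns 1 if the caller should restart the transmit queue.
 */
static int
pktid_put(struct pktid_table *t, int id, void **pkt)
{
	*pkt = t->pkt[id];
	t->pkt[id] = NULL;
	if (t->stalled) {
		t->stalled = 0;
		return 1;
	}
	return 0;
}
```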


# 1.15 10-Jan-2018 jcs

Attach bwfm to the Broadcom 4350 found in the 2017 MacBook.

Easily handles >150Mbps transfers through a 5GHz AP.

ok patrick

(Committed via bwfm0, of course)


# 1.14 10-Jan-2018 patrick

Add firmware names for the two revisions of the Broadcom 4350 as seen
on a MacBook 12-inch (2017).

Tested by and with jcs@


# 1.13 10-Jan-2018 patrick

Don't reset the internal memory core on chips other than the Broadcom
43602, as it's only necessary on that specific chip.

Found the hard way by jcs@ on a MacBook 12-inch (2017)


# 1.12 10-Jan-2018 patrick

Move line for readability.


# 1.11 08-Jan-2018 patrick

In AP mode multicast packets share the flowrings with broadcast
packets.


# 1.10 08-Jan-2018 patrick

The bwfm(4) TX ring expects the ethernet header as part of the TX info
struct. The data length is the length of the frame without the header.
In the previous version m_adj(9) was used, but since that was changed we
need to decrease the length ourselves.


# 1.9 08-Jan-2018 patrick

Guard the debug printf function behind BWFM_DEBUG as well. Also only
print the firmware's dmesg(8) if we're running with a higher debug
mode.

Prompted by Michael W. Bombardieri


# 1.8 08-Jan-2018 patrick

Delete flowrings when we take the interface down or change its
settings.


# 1.7 07-Jan-2018 patrick

Create multiple transmit flowrings in station mode, four in total, based
on TOS values. In AP mode create multiple flowrings per connected node.


# 1.6 05-Jan-2018 patrick

To send out packets we need to create a flowring. Acting as station,
we typically have about four flowrings per priority. As access point
we apparently need one, or four considering the priorities, flowrings
per client. For now let's start with a single TX flowring. To set up
a flowring we need to send a create request and can only start sending
packets as soon as we are told that the ring is created. With this we
can now do actual network traffic.


# 1.5 03-Jan-2018 patrick

Since the PCI attachment code already uses mbufs for RX packets, we can
push the mbuf allocation down into the USB attachment code and now pass
an mbuf to the bwfm(4) receive function.


# 1.4 03-Jan-2018 patrick

Add size for free(9) in the bwfm(4) PCI attachment code.

From Michael W. Bombardieri


# 1.3 01-Jan-2018 patrick

For whatever reason the firmware needs more RX buffers available than
we typically use, which unfortunately creates a bigger memory
footprint. With this the receive path can be made to work.


# 1.2 01-Jan-2018 patrick

Put the code that prints the firmware's debug console into a function
so we can read and print the messages printed by the firmware when we
are debugging the driver.


# 1.1 24-Dec-2017 patrick

Add a PCI attachment driver for bwfm(4). It's not finished, but it's
already past the point where development can occur out of the tree.
With this I can successfully scan for access points and tell the chip
to attach to an SSID. RX path should work as well, but since I forgot
to bring the antenna with me to my parents, the reception is a bit
horrible in the metal enclosure.

There are a few reasons this driver is rather big. First we set up the
ARM Cores, uploading the firmware and kicking it off. Then we need to
read all needed information from the registers. Once that is done we
have to set up countless buffers. There are 2 TX rings and 3 RX rings,
plus N TX rings for the actual data that is yet to be implemented.

Merry Christmas!

ok kettenis@


# 1.28 17-Jan-2019 mlarkin

Enable bwfm(4) in RAMDISK_CD

ok deraadt


Revision tags: OPENBSD_6_4_BASE
# 1.27 20-Aug-2018 patrick

Attach bwfm(4) to Broadcom BCM4371.

ok kettenis@


# 1.26 25-Jul-2018 patrick

Implement a MSGBUF control packet mechanism based on the command
request ids. So far we were only able to have one command in flight
at a time and race conditions could easily lead to unexpected
behaviour. With this rework we send and enqueue a control packet
command and wait for replies to happen. Thus we can have multiple
control packets in flight and a reply with the correct id will wake
us up.


# 1.25 06-Jul-2018 patrick

Add bus_dmamap_sync(9) calls to bwfm(4) so that we make sure the data
is synced properly before the CPU or the WiFi chip access the supplied
memory. Makes PCIe-connected bwfm(4) work on ARM-based machines.


# 1.24 05-Jul-2018 patrick

Cast physical addresses to 64 bits so we can shift them by 32 bits on
32-bit platforms without the compiler complaining. In the end the
value will turn out as 0 anyway. Allows enabling bwfm(4) on 32-bit
platforms.

ok stsp@
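
The cast described in 1.24 can be sketched as follows. This is a minimal
illustration, not the driver's code: the type paddr32_t and the helpers
paddr_lo() and paddr_hi() are hypothetical names that model a 32-bit
physical address being split into two 32-bit halves.

```c
#include <stdint.h>

/* Hypothetical stand-in for a physical address on a 32-bit platform. */
typedef uint32_t paddr32_t;

/* Low 32 bits of the address. */
static uint32_t
paddr_lo(paddr32_t pa)
{
	return (uint32_t)((uint64_t)pa & 0xffffffffU);
}

/*
 * High 32 bits of the address. Widening to 64 bits first matters:
 * shifting a 32-bit value by 32 is undefined behaviour in C and
 * compilers warn about it. With a 32-bit input the result is 0.
 */
static uint32_t
paddr_hi(paddr32_t pa)
{
	return (uint32_t)((uint64_t)pa >> 32);
}
```

On a 64-bit platform the same expression would yield the real upper half,
so one code path can serve both cases.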


# 1.23 07-Jun-2018 patrick

Attach bwfm(4) to the Broadcom 4356 found in the GPD Pocket.

Tested by mlarkin@


# 1.22 07-Jun-2018 patrick

Some PCIe-based bwfm(4) chips also require that we supply an NVRAM
binary. In case we have an (optional) NVRAM binary, copy it to the
end of the chip's memory.

Tested by mlarkin@ on his GPD Pocket.


# 1.21 23-May-2018 patrick

Implement a separate initialization stage so that we can still use
and initialize bwfm(4) later in the case that the firmware was not
available on bootup and was only later installed.

ok stsp@


# 1.20 23-May-2018 patrick

Map the second bwfm(4) BAR first. The bwfm(4) PCIe devices have two
BARs, where the second one is much larger than the first. Both need
to be properly aligned in the given extent. Since the first one is
smaller, it will "unalign" the next free space and thus create a gap
so that the second BAR cannot be properly aligned in the given space.
By mapping the second BAR first, it will automatically have proper
alignment. The first BAR, which has fewer alignment requirements,
fits well after the initial allocation. Fixes bwfm(4) on APU 1.

Debugged and solved by kettenis@
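
The alignment effect described in 1.20 can be shown with a small
calculation. This is only a sketch under stated assumptions: it models the
extent as a simple bump allocator with naturally aligned regions, and the
helper names and the BAR sizes used in the note below are illustrative,
not taken from the driver or the PCI code.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to a power-of-two alignment. */
static uint64_t
align_up(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}

/*
 * Place two naturally aligned regions one after the other, starting
 * at base, and return the end address of the second region.
 */
static uint64_t
place_two(uint64_t base, uint64_t first, uint64_t second)
{
	uint64_t end = align_up(base, first) + first;

	return align_up(end, second) + second;
}
```

With an assumed 32KB BAR and an assumed 4MB BAR, place_two(0, 0x8000,
0x400000) ends at 0x800000 because of the gap left in front of the large
region, while place_two(0, 0x400000, 0x8000) ends at 0x408000.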


# 1.19 16-May-2018 patrick

Implement a BCDC control packet mechanism based on the command request
ids. So far we were only able to have one command in flight at a time
and race conditions could easily lead to unexpected behaviour,
especially combined with a slow bus and timeouts. With this rework we send or
enqueue a control packet command and wait for replies to happen. Thus
we can have multiple control packets in flight and a reply with the
correct id will wake us up.


Revision tags: OPENBSD_6_3_BASE
# 1.18 08-Feb-2018 patrick

Move bwfm(4) from ifq begin/commit/rollback semantics to the newer
ifq dequeue semantics. This basically means we need to check for
available space before dequeuing a packet. As soon as we dequeue
a packet we commit to it. On the PCIe backend this check cannot
be done easily since the flowring depends on the packet contents and
we cannot take a peek. When there is no flowring we cache the mbuf
and send it out as soon as the flowring has opened up. Then the ifq can
be restarted and traffic can flow. Typically we run out of
packet ids, which can be checked without consulting the packet. The
flowring probably never becomes full as the bwfm(4) firmware takes
the packets off the ring without actually sending them out.

Discussed with dlg@


# 1.17 07-Feb-2018 patrick

Move parsing the BCDC header on RX into a protocol specific RX
function so it can be shared with the SDIO attachment driver.


# 1.16 11-Jan-2018 patrick

The PCI bwfm(4) chips have no TX rings in the traditional sense, as on
the actual rings we only share messages. Sending a TX packet means
putting a message on the ring which contains a pktid (which for us maps
to an mbuf) and the physical address of the mbuf. On jcs@'s MacBook he
seems to run out of TX pktids pretty quickly during a speedtest. This
would mean that there are 2048 TX packets in flight that we either want
to send out or that have not been "acked" by the firmware yet. Either
way, recover from that situation when we hit that arbitrary limit by
restarting the queue after we freed a packet from the TX pktid list.

Tested by jcs@
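
The recovery described in 1.16 can be sketched with a small pktid table.
This is a hypothetical illustration, not the driver's data structure: the
names pktid_table, pktid_alloc() and pktid_free() are made up, packets are
plain pointers instead of mbufs, and the table holds 4 entries where the
log entry mentions 2048.

```c
#include <stddef.h>

#define NPKTID	4	/* tiny for illustration */

struct pktid_table {
	void	*slot[NPKTID];	/* packet owned by each id, or NULL */
	int	 stalled;	/* set when an allocation found no id */
};

/* Reserve an id for pkt. Returns the id, or -1 if all ids are in use. */
static int
pktid_alloc(struct pktid_table *t, void *pkt)
{
	int i;

	for (i = 0; i < NPKTID; i++) {
		if (t->slot[i] == NULL) {
			t->slot[i] = pkt;
			return i;
		}
	}
	t->stalled = 1;
	return -1;
}

/*
 * Release an id once the firmware has completed the packet. Returns 1
 * if the caller should restart the transmit queue because an earlier
 * allocation failed, 0 otherwise.
 */
static int
pktid_free(struct pktid_table *t, int id)
{
	int restart;

	if (id < 0 || id >= NPKTID || t->slot[id] == NULL)
		return 0;
	t->slot[id] = NULL;
	restart = t->stalled;
	t->stalled = 0;
	return restart;
}
```

The point of the stalled flag in this sketch is that the queue is only
restarted when a previous allocation actually failed.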


# 1.15 10-Jan-2018 jcs

Attach bwfm to the Broadcom 4350 found in the 2017 MacBook.

Easily handles >150Mbps transfers through a 5GHz AP.

ok patrick

(Committed via bwfm0, of course)


# 1.14 10-Jan-2018 patrick

Add firmware names for the two revisions of the Broadcom 4350 as seen
on a MacBook 12-inch (2017).

Tested by and with jcs@


# 1.13 10-Jan-2018 patrick

Don't reset the internal memory core on chips other than the Broadcom
43602, as it's only necessary on that specific chip.

Found the hard way by jcs@ on a MacBook 12-inch (2017)


# 1.12 10-Jan-2018 patrick

Move line for readability.


# 1.11 08-Jan-2018 patrick

In AP mode multicast packets share the flowrings with broadcast
packets.


# 1.10 08-Jan-2018 patrick

The bwfm(4) TX ring expects the ethernet header as part of the TX info
struct. The data length is the length of the frame without the header.
In the previous version m_adj(9) was used, but since that was changed we
need to decrease the length ourselves.


# 1.9 08-Jan-2018 patrick

Guard the debug printf function behind BWFM_DEBUG as well. Also only
print the firmware's dmesg(8) if we're running with a higher debug
mode.

Prompted by Michael W. Bombardieri


# 1.8 08-Jan-2018 patrick

Delete flowrings when we take the interface down or change its
settings.


# 1.7 07-Jan-2018 patrick

Create multiple transmit flowrings in station mode, four in total, based
on TOS values. In AP mode create multiple flowrings per connected node.


# 1.6 05-Jan-2018 patrick

To send out packets we need to create a flowring. Acting as station,
we typically have about four flowrings per priority. As access point
we apparently need one flowring per client, or four considering the
priorities. For now let's start with a single TX flowring. To set up
a flowring we need to send a create request and can only start sending
packets as soon as we are told that the ring is created. With this we
can now do actual network traffic.


# 1.5 03-Jan-2018 patrick

Since the PCI attachment code already uses mbufs for RX packets, we can
push the mbuf allocation down into the USB attachment code and now pass
an mbuf to the bwfm(4) receive function.


# 1.4 03-Jan-2018 patrick

Add size for free(9) in the bwfm(4) PCI attachment code.

From Michael W. Bombardieri


# 1.3 01-Jan-2018 patrick

For whatever reason the firmware needs more RX buffers available than
we typically use, which unfortunately creates a bigger memory
footprint. With this the receive path can be made to work.


# 1.2 01-Jan-2018 patrick

Put the code that prints the firmware's debug console into a function
so we can read and print the messages printed by the firmware when we
are debugging the driver.


# 1.1 24-Dec-2017 patrick

Add a PCI attachment driver for bwfm(4). It's not finished, but it's
already past the point where development can occur out of the tree.
With this I can successfully scan for access points and tell the chip
to attach to an SSID. RX path should work as well, but since I forgot
to bring the antenna with me to my parents, the reception is a bit
horrible in the metal enclosure.

There are a few reasons this driver is rather big. First we set up the
ARM Cores, uploading the firmware and kicking it off. Then we need to
read all needed information from the registers. Once that is done we
have to set up countless buffers. There are 2 TX rings and 3 RX rings,
plus N TX rings for the actual data that is yet to be implemented.

Merry Christmas!

ok kettenis@


# 1.18 08-Feb-2018 patrick

Move bwfm(4) from ifq begin/commit/rollback semantics to the newer
ifq dequeue semantics. This basically means we need to check for
available space before dequeuing a packet. As soon as we dequeue
a packet we commit to it. On the PCIe backend this check can not
be done easily since the flowring depends on the packet contents and
we cannot take a peek. When there is no flowring we cache the mbuf
and send it out as soon as the flowring opened up. Then the ifq can
be restarted and traffic can flow. Typically we usually run out of
packet ids, which can be checked without consulting the packet. The
flowring probably never becomes full as the bwfm(4) firmware takes
the packets off the ring without actually sending them out.

Discussed with dlg@


# 1.17 07-Feb-2018 patrick

Move parsing the BCDC header on RX into a protocol specific RX
function so it can be shared with the SDIO attachment driver.


# 1.16 11-Jan-2018 patrick

The PCI bwfm(4) chips have no TX rings in the traditional sense, as on
the actual rings we only share messages. Sending a TX packet means
putting a message on the ring which contains a pktid (which for us maps
to an mbuf) and the physical address of the mbuf. On jcs@'s macbook he
seems to run out of TX pktids pretty quickly during a speedtest. This
would mean that there are 2048 TX packets in flight that we either want
to send out or that have not been "acked" by the firmware yet. Either
way, recover from that situation when we hit that arbitrary limit by
restarting the queue after we free'd a packet from the TX pktid list.

Tested by jcs@


# 1.15 10-Jan-2018 jcs

Attach bwfm to the Broadcom 4350 found in the 2017 MacBook.

Easily handles >150Mbps transfers through a 5Ghz AP.

ok patrick

(Committed via bwfm0, of course)


# 1.14 10-Jan-2018 patrick

Add firmware names for the two revisions of the Broadcom 4350 as seen
on a MacBook 12-inch (2017).

Tested by and with jcs@


# 1.13 10-Jan-2018 patrick

Don't reset the internal memory core on chips other than the Broadcom
43602, as it's only necessary on that specific chip.

Found the hard way by jcs@ on a MacBook 12-inch (2017)


# 1.12 10-Jan-2018 patrick

Move line for readability.


# 1.11 08-Jan-2018 patrick

In AP mode multicast packets share the flowrings with broadcast
packets.


# 1.10 08-Jan-2018 patrick

The bwfm(4) TX ring expects the ethernet header as part of the TX info
struct. The data length is the length of the frame without the header.
In the previous version m_adj(9) is used, but since that was changed we
need to decrease the length ourselves.


# 1.9 08-Jan-2018 patrick

Guard the debug printf function behind BWFM_DEBUG as well. Also only
print the firmware's dmesg(8) if we're running with a higher debug
mode.

Prompted by Michael W. Bombardieri


# 1.8 08-Jan-2018 patrick

Delete flowrings when we take the interface down or change its
settings.


# 1.7 07-Jan-2018 patrick

Create multiple transmit flowrings in station mode, four in total, based
on TOS values. In AP mode create multiple flowrings per connected node.


# 1.6 05-Jan-2018 patrick

To send out packets we need to create a flowring. Acting as station,
we typically have about four flowrings per priority. As access point
we apparently need one, or four considering the priorities, flowrings
per client. For now let's start with a single TX flowring. To set up
a flowring we need to send a create request and can only start sending
packets as soon as we are told that the ring is created. With this we
can now do actual network traffic.


# 1.5 03-Jan-2018 patrick

Since the PCI attachment code already uses mbufs for RX packets, we can
push the mbuf allocation down into the USB attachment code and now pass
an mbuf to the bwfm(4) receive function.


# 1.4 03-Jan-2018 patrick

Add size for free(9) in the bwfm(4) PCI attachment code.

From Michael W. Bombardieri


# 1.3 01-Jan-2018 patrick

For whatever reason the firmware needs more RX buffers available than
we typically use, which unfortunately creates a bigger memory
footprint. With this the receive path can be made to work.


# 1.2 01-Jan-2018 patrick

Put the code that prints the firmware's debug console into a function
so we can read and print the messages printed by the firmware when we
are debugging the driver.


# 1.1 24-Dec-2017 patrick

Add a PCI attachment driver for bwfm(4). It's not finished, but it's
already past the point where development can occur out of the tree.
With this I can successfully scan for access points and tell the chip
to attach to an SSID. RX path should work as well, but since I forgot
to bring the antenna with me to my parents, the reception is a bit
horrible in the metal enclosure.

There are a few reasons this driver is rather big. First we set up the
ARM Cores, uploading the firmware and kicking it off. Then we need to
read all needed information from the registers. Once that is done we
have to set up countless buffers. There are 2 TX rings and 3 RX rings,
plus N TX rings for the actual data that is yet to be implemented.

Merry Christmas!

ok kettenis@

