History log of /freebsd-11-stable/sys/dev/ath/if_athvar.h
Revision Date Author Comments
(<<< Hide modified files)
(Show modified files >>>)
# 331722 29-Mar-2018 eadler

Revert r330897:

This was intended to be a non-functional change. It wasn't. The commit
message was thus wrong. In addition it broke arm, and merged crypto
related code.

Revert with prejudice.

This revert skips files touched in r316370 since that commit was since
MFCed. This revert also skips files that require $FreeBSD$ property
changes.

Thank you to those who helped me get out of this mess including but not
limited to gonzo, kevans, rgrimes.

Requested by: gjb (re)


# 330897 14-Mar-2018 eadler

Partial merge of the SPDX changes

These changes are incomplete but are making it difficult
to determine what other changes can/should be merged.

No objections from: pfg


# 302408 07-Jul-2016 gjb

Copy head@r302406 to stable/11 as part of the 11.0-RELEASE cycle.
Prune svn:mergeinfo from the new branch, as nothing has been merged
here.

Additional commits post-branch will follow.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation


/freebsd-11-stable/MAINTAINERS
/freebsd-11-stable/cddl
/freebsd-11-stable/cddl/contrib/opensolaris
/freebsd-11-stable/cddl/contrib/opensolaris/cmd/dtrace/test/tst/common/print
/freebsd-11-stable/cddl/contrib/opensolaris/cmd/zfs
/freebsd-11-stable/cddl/contrib/opensolaris/lib/libzfs
/freebsd-11-stable/contrib/amd
/freebsd-11-stable/contrib/apr
/freebsd-11-stable/contrib/apr-util
/freebsd-11-stable/contrib/atf
/freebsd-11-stable/contrib/binutils
/freebsd-11-stable/contrib/bmake
/freebsd-11-stable/contrib/byacc
/freebsd-11-stable/contrib/bzip2
/freebsd-11-stable/contrib/com_err
/freebsd-11-stable/contrib/compiler-rt
/freebsd-11-stable/contrib/dialog
/freebsd-11-stable/contrib/dma
/freebsd-11-stable/contrib/dtc
/freebsd-11-stable/contrib/ee
/freebsd-11-stable/contrib/elftoolchain
/freebsd-11-stable/contrib/elftoolchain/ar
/freebsd-11-stable/contrib/elftoolchain/brandelf
/freebsd-11-stable/contrib/elftoolchain/elfdump
/freebsd-11-stable/contrib/expat
/freebsd-11-stable/contrib/file
/freebsd-11-stable/contrib/gcc
/freebsd-11-stable/contrib/gcclibs/libgomp
/freebsd-11-stable/contrib/gdb
/freebsd-11-stable/contrib/gdtoa
/freebsd-11-stable/contrib/groff
/freebsd-11-stable/contrib/ipfilter
/freebsd-11-stable/contrib/ldns
/freebsd-11-stable/contrib/ldns-host
/freebsd-11-stable/contrib/less
/freebsd-11-stable/contrib/libarchive
/freebsd-11-stable/contrib/libarchive/cpio
/freebsd-11-stable/contrib/libarchive/libarchive
/freebsd-11-stable/contrib/libarchive/libarchive_fe
/freebsd-11-stable/contrib/libarchive/tar
/freebsd-11-stable/contrib/libc++
/freebsd-11-stable/contrib/libc-vis
/freebsd-11-stable/contrib/libcxxrt
/freebsd-11-stable/contrib/libexecinfo
/freebsd-11-stable/contrib/libpcap
/freebsd-11-stable/contrib/libstdc++
/freebsd-11-stable/contrib/libucl
/freebsd-11-stable/contrib/libxo
/freebsd-11-stable/contrib/llvm
/freebsd-11-stable/contrib/llvm/projects/libunwind
/freebsd-11-stable/contrib/llvm/tools/clang
/freebsd-11-stable/contrib/llvm/tools/lldb
/freebsd-11-stable/contrib/llvm/tools/llvm-dwarfdump
/freebsd-11-stable/contrib/llvm/tools/llvm-lto
/freebsd-11-stable/contrib/mdocml
/freebsd-11-stable/contrib/mtree
/freebsd-11-stable/contrib/ncurses
/freebsd-11-stable/contrib/netcat
/freebsd-11-stable/contrib/ntp
/freebsd-11-stable/contrib/nvi
/freebsd-11-stable/contrib/one-true-awk
/freebsd-11-stable/contrib/openbsm
/freebsd-11-stable/contrib/openpam
/freebsd-11-stable/contrib/openresolv
/freebsd-11-stable/contrib/pf
/freebsd-11-stable/contrib/sendmail
/freebsd-11-stable/contrib/serf
/freebsd-11-stable/contrib/sqlite3
/freebsd-11-stable/contrib/subversion
/freebsd-11-stable/contrib/tcpdump
/freebsd-11-stable/contrib/tcsh
/freebsd-11-stable/contrib/tnftp
/freebsd-11-stable/contrib/top
/freebsd-11-stable/contrib/top/install-sh
/freebsd-11-stable/contrib/tzcode/stdtime
/freebsd-11-stable/contrib/tzcode/zic
/freebsd-11-stable/contrib/tzdata
/freebsd-11-stable/contrib/unbound
/freebsd-11-stable/contrib/vis
/freebsd-11-stable/contrib/wpa
/freebsd-11-stable/contrib/xz
/freebsd-11-stable/crypto/heimdal
/freebsd-11-stable/crypto/openssh
/freebsd-11-stable/crypto/openssl
/freebsd-11-stable/gnu/lib
/freebsd-11-stable/gnu/usr.bin/binutils
/freebsd-11-stable/gnu/usr.bin/cc/cc_tools
/freebsd-11-stable/gnu/usr.bin/gdb
/freebsd-11-stable/lib/libc/locale/ascii.c
/freebsd-11-stable/sys/cddl/contrib/opensolaris
/freebsd-11-stable/sys/contrib/dev/acpica
/freebsd-11-stable/sys/contrib/ipfilter
/freebsd-11-stable/sys/contrib/libfdt
/freebsd-11-stable/sys/contrib/octeon-sdk
/freebsd-11-stable/sys/contrib/x86emu
/freebsd-11-stable/sys/contrib/xz-embedded
/freebsd-11-stable/usr.sbin/bhyve/atkbdc.h
/freebsd-11-stable/usr.sbin/bhyve/bhyvegc.c
/freebsd-11-stable/usr.sbin/bhyve/bhyvegc.h
/freebsd-11-stable/usr.sbin/bhyve/console.c
/freebsd-11-stable/usr.sbin/bhyve/console.h
/freebsd-11-stable/usr.sbin/bhyve/pci_fbuf.c
/freebsd-11-stable/usr.sbin/bhyve/pci_xhci.c
/freebsd-11-stable/usr.sbin/bhyve/pci_xhci.h
/freebsd-11-stable/usr.sbin/bhyve/ps2kbd.c
/freebsd-11-stable/usr.sbin/bhyve/ps2kbd.h
/freebsd-11-stable/usr.sbin/bhyve/ps2mouse.c
/freebsd-11-stable/usr.sbin/bhyve/ps2mouse.h
/freebsd-11-stable/usr.sbin/bhyve/rfb.c
/freebsd-11-stable/usr.sbin/bhyve/rfb.h
/freebsd-11-stable/usr.sbin/bhyve/sockstream.c
/freebsd-11-stable/usr.sbin/bhyve/sockstream.h
/freebsd-11-stable/usr.sbin/bhyve/usb_emul.c
/freebsd-11-stable/usr.sbin/bhyve/usb_emul.h
/freebsd-11-stable/usr.sbin/bhyve/usb_mouse.c
/freebsd-11-stable/usr.sbin/bhyve/vga.c
/freebsd-11-stable/usr.sbin/bhyve/vga.h
# 302100 22-Jun-2016 adrian

[ath] fix comments!

I keep asking myself "what do these fields mean" and so now I've clarified
it for myself.

Tested:

* Reading the comments, going "a-ha!" a couple times.

Approved by: re (gjb)


# 301181 01-Jun-2016 adrian

[ath] commit initial bluetooth coexistence support for the MCI NICs.

This is the initial framework to call into the MCI HAL routines and drive
the basic state engine.

The MCI bluetooth coex model uses a command channel between wlan and
bluetooth, rather than a 2-wire or 3-wire signaling protocol to control things.
This means the wlan and bluetooth chip exchange a lot more information and
signaling, even at the per-packet level. The NICs in question can share
the input LNA and output PA on the die, so they absolutely can't stomp
on each other in a silly fashion. It also allows for the bluetooth side
to signal when profiles come and go, so the driver can take appropriate
control. There's also the possibility of dynamic bluetooth/wlan duty cycle
control which I haven't yet really played with.

It configures things up with a static "wlan wins everything" coexistence,
configures up the available 2GHz channel map for bluetooth, sets a static
duty cycle for bluetooth/wifi traffic priority and drives the basics needed to
keep the MCI HAL code happy.

It doesn't do any actual coexistence except to default to "wlan wins everything",
which at least demonstrates that things do indeed work. Bluetooth inquiry frames
still trump wifi (including beacons), so that demonstrates things really do
indeed seem to work.

Tested:

* AR9462 (WB222), STA mode + bt
* QCA9565 (WB335), STA mode + bt

TODO:

* .. the rest of coexistence. yes, bluetooth, not people. That stuff's hard.
* It doesn't do the initial BT side calibration, which requires a WLAN chip
reset. I'll fix up the reset path a bit more first before I enable that.
* The 1-ant and 2-ant configuration bits aren't being set correctly in
if_ath_btcoex.c - I'll dig into that and fix it in a subsequent commit.
* It's not enabled by default for WB222/WB225 even though I believe it now
can be - I'll chase that up in a subsequent commit.

Obtained from: Qualcomm Atheros, Linux ath9k


# 298939 02-May-2016 pfg

dev/ath: minor spelling fixes in comments.

No functional change.

Reviewed by: adrian


# 298608 25-Apr-2016 adrian

[ath] add LDPC capability support and LDPC RX support.

This enables LDPC receive support for the AR9300 chips that support it.
It'll announce LDPC support via net80211.

Tested:

* AR9380, STA mode
* AR9331, (to verify the HAL didn't attach it to a chip which
doesn't support LDPC.)

TODO:

* Add in net80211 machinery to make this configurable at runtime.


# 290612 09-Nov-2015 adrian

ath(4): begin fleshing out a "reset type" extension to force cold/warn resets.

Right now the only way to force a cold reset is:

* The HAL itself detects it's needed, or
* The sysctl, setting all resets to be cold.

Trouble is, cold resets take quite a bit longer than warm resets.

However, there are situations where a cold reset would be nice.
Specifically, after a stuck beacon, BB/MAC hang, stuck calibration results,
etc.

The vendor HAL has a separate method to set the reset reason (which is
how HAL_RESET_BBPANIC gets set) which informs the HAL during the reset path
why it occured. This is almost but not quite the same; I may eventually
unify both approaches in the future.

This commit just extends HAL_RESET_TYPE to include both status (eg BBPANIC)
and type (eg do COLD.) None of the HAL code uses it yet though; that'll
come later.

It also is a big no-op in each HAL - I need to go teach each of the HALs
about cold/warm reset through this path.


# 290474 06-Nov-2015 adrian

ath(4) - reflect whether this is a full or fast channel change.

It's no longer "outdoor."


# 288349 29-Sep-2015 adrian

Remove the references to the TX IC lock - i ended up solving this
using net80211 to seralise encap+xmit, so now it's a non-issue.


# 288095 22-Sep-2015 adrian

net80211: include one copy of struct ieee80211_beacon_offsets into ieee80211vap

Submitted by: Andriy Voskoboinyk <s3erios@gmail.com>
Differential Revision: https://reviews.freebsd.org/D3658


# 287197 27-Aug-2015 glebius

Replay r286410. Change KPI of how device drivers that provide wireless
connectivity interact with the net80211 stack.

Historical background: originally wireless devices created an interface,
just like Ethernet devices do. Name of an interface matched the name of
the driver that created. Later, wlan(4) layer was introduced, and the
wlanX interfaces become the actual interface, leaving original ones as
"a parent interface" of wlanX. Kernelwise, the KPI between net80211 layer
and a driver became a mix of methods that pass a pointer to struct ifnet
as identifier and methods that pass pointer to struct ieee80211com. From
user point of view, the parent interface just hangs on in the ifconfig
list, and user can't do anything useful with it.

Now, the struct ifnet goes away. The struct ieee80211com is the only
KPI between a device driver and net80211. Details:

- The struct ieee80211com is embedded into drivers softc.
- Packets are sent via new ic_transmit method, which is very much like
the previous if_transmit.
- Bringing parent up/down is done via new ic_parent method, which notifies
driver about any changes: number of wlan(4) interfaces, number of them
in promisc or allmulti state.
- Device specific ioctls (if any) are received on new ic_ioctl method.
- Packets/errors accounting are done by the stack. In certain cases, when
driver experiences errors and can not attribute them to any specific
interface, driver updates ic_oerrors or ic_ierrors counters.

Details on interface configuration with new world order:
- A sequence of commands needed to bring up wireless DOESN"T change.
- /etc/rc.conf parameters DON'T change.
- List of devices that can be used to create wlan(4) interfaces is
now provided by net.wlan.devices sysctl.

Most drivers in this change were converted by me, except of wpi(4),
that was done by Andriy Voskoboinyk. Big thanks to Kevin Lo for testing
changes to at least 8 drivers. Thanks to pluknet@, Oliver Hartmann,
Olivier Cochard, gjb@, mmoll@, op@ and lev@, who also participated in
testing.

Reviewed by: adrian
Sponsored by: Netflix
Sponsored by: Nginx, Inc.


# 286437 07-Aug-2015 adrian

Revert the wifi ifnet changes until things are more baked and tested.

* 286410
* 286413
* 286416

The initial commit broke a variety of debug and features that aren't
in the GENERIC kernels but are enabled in other platforms.


# 286410 07-Aug-2015 glebius

Change KPI of how device drivers that provide wireless connectivity interact
with the net80211 stack.

Historical background: originally wireless devices created an interface,
just like Ethernet devices do. Name of an interface matched the name of
the driver that created. Later, wlan(4) layer was introduced, and the
wlanX interfaces become the actual interface, leaving original ones as
"a parent interface" of wlanX. Kernelwise, the KPI between net80211 layer
and a driver became a mix of methods that pass a pointer to struct ifnet
as identifier and methods that pass pointer to struct ieee80211com. From
user point of view, the parent interface just hangs on in the ifconfig
list, and user can't do anything useful with it.

Now, the struct ifnet goes away. The struct ieee80211com is the only
KPI between a device driver and net80211. Details:

- The struct ieee80211com is embedded into drivers softc.
- Packets are sent via new ic_transmit method, which is very much like
the previous if_transmit.
- Bringing parent up/down is done via new ic_parent method, which notifies
driver about any changes: number of wlan(4) interfaces, number of them
in promisc or allmulti state.
- Device specific ioctls (if any) are received on new ic_ioctl method.
- Packets/errors accounting are done by the stack. In certain cases, when
driver experiences errors and can not attribute them to any specific
interface, driver updates ic_oerrors or ic_ierrors counters.

Details on interface configuration with new world order:
- A sequence of commands needed to bring up wireless DOESN"T change.
- /etc/rc.conf parameters DON'T change.
- List of devices that can be used to create wlan(4) interfaces is
now provided by net.wlan.devices sysctl.

Most drivers in this change were converted by me, except of wpi(4),
that was done by Andriy Voskoboinyk. Big thanks to Kevin Lo for testing
changes to at least 8 drivers. Thanks to Olivier Cochard, gjb@, mmoll@,
op@ and lev@, who also participated in testing. Details here:

https://wiki.freebsd.org/projects/ifnet/net80211

Still, drivers: ndis, wtap, mwl, ipw, bwn, wi, upgt, uath were not
tested. Changes to mwl, ipw, bwn, wi, upgt are trivial and chances
of problems are low. The wtap wasn't compilable even before this change.
But the ndis driver is complex, and it is likely to be broken with this
commit. Help with testing and debugging it is appreciated.

Differential Revision: D2655, D2740
Sponsored by: Nginx, Inc.
Sponsored by: Netflix


# 283535 25-May-2015 adrian

Begin plumbing ieee80211_rx_stats through the receive path.

Smart NICs with firmware (eg wpi, iwn, the new atheros parts, the intel 7260
series, etc) support doing a lot of things in firmware. This includes but
isn't limited to things like scanning, sending probe requests and receiving
probe responses. However, net80211 doesn't know about any of this - it still
drives the whole scan/probe infrastructure itself.

In order to move towards suppoting smart NICs, the receive path needs to
know about the channel/details for each received packet. In at least
the iwn and 7260 firmware (and I believe wpi, but I haven't tried it yet)
it will do the scanning, power-save and off-channel buffering for you -
all you need to do is handle receiving beacons and probe responses on
channels that aren't what you're currently on. However the whole receive
path is peppered with ic->ic_curchan and manual scan/powersave handling.
The beacon parsing code also checks ic->ic_curchan to determine if the
received beacon is on the correct channel or not.[1]

So:

* add freq/ieee values to ieee80211_rx_stats;
* change ieee80211_parse_beacon() to accept the 'current' channel
as an argument;
* modify the iv_input() and iv_recv_mgmt() methods to include the rx_stats;
* add a new method - ieee80211_lookup_channel_rxstats() - that looks up
a channel based on the contents of ieee80211_rx_stats;
* if it exists, use it in the mgmt path to switch the current channel
(which still defaults to ic->ic_curchan) over to something determined
by rx_stats.

This is enough to kick-start scan offload support in the Intel 7260
driver that Rui/I are working on. It also is a good start for scan
offload support for a handful of existing NICs (wpi, iwn, some USB
parts) and it'll very likely dramatically improve stability/performance
there. It's not the whole thing - notably, we don't need to do powersave,
we should not scan all channels, and we should leave probe request sending
to the firmware and not do it ourselves. But, this allows for continued
development on the above features whilst actually having a somewhat
working NIC.

TODO:

* Finish tidying up how the net80211 input path works.
Right now ieee80211_input / ieee80211_input_all act as the top-level
that everything feeds into; it should change so the MIMO input routines
are those and the legacy routines are phased out.

* The band selection should be done by the driver, not by the net80211
layer.

* ieee80211_lookup_channel_rxstats() only determines 11b or 11g channels
for now - this is enough for scanning, but not 100% true in all cases.
If we ever need to handle off-channel scan support for things like
static-40MHz or static-80MHz, or turbo-G, or half/quarter rates,
then we should extend this.

[1] This is a side effect of frequency-hopping and CCK modes - you
can receive beacons when you think you're on a different channel.
In particular, CCK (which is used by the low 11b rates, eg beacons!)
is decodable from adjacent channels - just at a low SNR.
FH is a side effect of having the hardware/firmware do the frequency
hopping - it may pick up beacons transmitted from other FH networks
that are in a different phase of hopping frequencies.


# 272292 30-Sep-2014 adrian

Add initial support for the AR9485 CUS198 / CUS230 variants.

These variants have a few differences from the default AR9485 NIC,
namely:

* a non-default antenna switch config;
* slightly different RX gain table setup;
* an external XLNA hooked up to a GPIO pin;
* (and not yet done) RSSI threshold differences when
doing slow diversity.

To make this possible:

* Add the PCI device list from Linux ath9k, complete with vendor and
sub-vendor IDs for various things to be enabled;
* .. and until FreeBSD learns about a PCI device list like this,
write a search function inspired by the USB device enumeration code;
* add HAL_OPS_CONFIG to the HAL attach methods; the HAL can use this
to initialise its local driver parameters upon attach;
* copy these parameters over in the AR9300 HAL;
* don't default to override the antenna switch - only do it for
the chips that require it;
* I brought over ar9300_attenuation_apply() from ath9k which is cleaner
and easier to read for this particular NIC.

This is a work in progress. I'm worried that there's some post-AR9380
NIC out there which doesn't work without the antenna override set as
I currently haven't implemented bluetooth coexistence for the AR9380
and later HAL. But I'd rather have this code in the tree and fix it
up before 11.0-RELEASE happens versus having a set of newer NICs
in laptops be effectively RX deaf.

Tested:

* AR9380 (STA)
* AR9485 CUS198 (STA)

Obtained from: Qualcomm Atheros, Linux ath9k


# 271887 19-Sep-2014 adrian

Fix up the EDMA RX setup path to correctly initialise and reset the RX FIFO.

The original code was .. well, slightly more than incorrect.

It showed up as stalled RX queues if the NIC needed to be frequently
reinitialised (eg during scans.)

This is inspired by work done by Matt Dillon over at the DragonflyBSD
project.

So:

* track when EDMA RX has been stopped and when the MAC has been reset;
* re-initialise the ring only after a reset;
* track whether RX has been stopped/started - just for debugging now;
* don't bother with the RX EOL stuff for EDMA - we don't need the
interrupt at all. We also don't need to disable/enable the interrupt
or start DMA - once new frames are pushed into the ring via the
normal RX path, it'll just restart RX DMA on its own.

Tested:

* AR9380, STA mode
* AR9380, AP mode
* AR9485, STA mode
* AR9462, STA mode


# 265409 05-May-2014 adrian

Modify the RX path to keep the previous RX descriptor around once it's
used.

It turns out that the RX DMA engine does the same last-descriptor-link-
pointer-re-reading trick that the TX DMA engine. That is, the hardware
re-reads the link pointer before it moves onto the next descriptor.
Thus we can't free a descriptor before we move on; it's possible the
hardware will need to re-read the link pointer before we overwrite
it with a new one.

Tested:

* AR5416, STA mode

TODO:

* more thorough AP and STA mode testing!
* test on other pre-AR9380 NICs, just to be sure.
* Break out the RX descriptor grabbing bits from the RX completion
bits, like what is done in the RX EDMA code, so ..
* .. the RX lock can be held during ath_rx_proc(), but not across
packet input.


# 265205 01-May-2014 adrian

Add tracking for self-generated frames when the VAP is in sleep state.

The hardware can generate its own frames (eg RTS/CTS exchanges, other
kinds of 802.11 management stuff, especially when it comes to 802.11n)
and these also have PWRMGT flags. So if the VAP is asleep but the
NIC is in force-awake for some reason, ensure that the self-generated
frames have PWRMGT set to 1.

Now, this (like basically everything to do with powersave) is still
racy - the only way to guarantee that it's all actually consistent
is to pause transmit and let it finish before transitioning the VAP
to sleep, but this at least gets the basic method of tracking and
updating the state debugged.

Tested:

* AR5416, STA mode
* AR9380, STA mode


# 265115 30-Apr-2014 adrian

Bring over some initial power save management support, reset path
fixes and beacon programming / debugging into the ath(4) driver.

The basic power save tracking:

* Add some new code to track the current desired powersave state; and
* Add some reference count tracking so we know when the NIC is awake; then
* Add code in all the points where we're about to touch the hardware and
push it to force-wake.

Then, how things are moved into power save:

* Only move into network-sleep during a RUN->SLEEP transition;
* Force wake the hardware up everywhere that we're about to touch
the hardware.

The net80211 stack takes care of doing RUN<->SLEEP<->(other) state
transitions so we don't have to do it in the driver.

Next, when to wake things up:

* In short - everywhere we touch the hardware.
* The hardware will take care of staying awake if things are queued
in the transmit queue(s); it'll then transit down to sleep if
there's nothing left. This way we don't have to track the
software / hardware transmit queue(s) and keep the hardware
awake for those.

Then, some transmit path fixes that aren't related but useful:

* Force EAPOL frames to go out at the lowest rate. This improves
reliability during the encryption handshake after 802.11
negotiation.

Next, some reset path fixes!

* Fix the overlap between reset and transmit pause so we don't
transmit frames during a reset.
* Some noisy environments will end up taking a lot longer to reset
than normal, so extend the reset period and drop the raise the
reset interval to be more realistic and give the hardware some
time to finish calibration.
* Skip calibration during the reset path. Tsk!

Then, beacon fixes in station mode!

* Add a _lot_ more debugging in the station beacon reset path.
This is all quite fluid right now.
* Modify the STA beacon programming code to try and take
the TU gap between desired TSF and the target TU into
account. (Lifted from QCA.)

Tested:

* AR5210
* AR5211
* AR5212
* AR5413
* AR5416
* AR9280
* AR9285

TODO:

* More AP, IBSS, mesh, TDMA testing
* Thorough AR9380 and later testing!
* AR9160 and AR9287 testing

Obtained from: QCA


# 251655 12-Jun-2013 adrian

Migrate the LNA mixing diversity machinery from the AR9285 HAL to the driver.

The AR9485 chip and AR933x SoC both implement LNA diversity.
There are a few extra things that need to happen before this can be
flipped on for those chips (mostly to do with setting up the different
bias values and LNA1/LNA2 RSSI differences) but the first stage is
putting this code into the driver layer so it can be reused.

This has the added benefit of making it easier to expose configuration
options and diagnostic information via the ioctl API. That's not yet
being done but it sure would be nice to do so.

Tested:

* AR9285, with LNA diversity enabled
* AR9285, with LNA diversity disabled in EEPROM


# 251484 07-Jun-2013 adrian

Add accessor macros for the bluetooth coexistence routines.


# 251401 04-Jun-2013 adrian

Implement a bit of a hack to store the AR9285/AR9485 RX LNA configuration in
the RX antenna field.

The AR9285/AR9485 use an LNA mixer to determine how to combine the signals
from the two antennas. This is encoded in the RSSI fields (ctl/ext) for
chain 2. So, let's use that here.

This maps RX antennas 0->3 to the RX mixer configuration used to
receive a frame. There's more that can be done but this is good enough
to diagnose if the hardware is doing "odd" things like trying to
receive frames on LNA2 (ie, antenna 2 or "alt" antenna) when there's
only one antenna connected.

Tested:

* AR9285, STA mode


# 251014 26-May-2013 adrian

Migrate ath(4) to now use if_transmit instead of the legacy if_start
and if queue mechanism; also fix up (non-11n) TX fragment handling.

This may result in a bit of a performance drop for now but I plan on
debugging and resolving this at a later stage.

Whilst here, fix the transmit path so fragment transmission works.

The TX fragmentation handling is a bit more special. In order to
correctly transmit TX fragments, there's a bunch of corner cases that
need to be handled:

* They must be transmitted back to back, in the same order..
* .. ie, you need to hold the TX lock whilst transmitting this
set of fragments rather than interleaving it with other MSDUs
destined to other nodes;
* The length of the next fragment is required when transmitting, in
order to correctly set the NAV field in the current frame to the
length of the next frame; which requires ..
* .. that we know the transmit duration of the next frame, which ..
* .. requires us to set the rate of all fragments to the same length,
or make the decision up-front, etc.

To facilitate this, I've added a new ath_buf field to describe the
length of the next fragment. This avoids having to keep the mbuf
chain together. This used to work before my 11n TX path work because
the ath_tx_start() routine would be handed a single mbuf with m_nextpkt
pointing to the next frame, and that would be maintained all the way
up to when the duration calculation was done. This doesn't hold
true any longer - the actual queuing may occur at any point in the
future (think ath_node TID software queuing) so this information
needs to be maintained.

Right now this does work for non-11n frames but it doesn't at all
enforce the same rate control decision for all frames in the fragment.
I plan on fixing this in a followup commit.

RTS/CTS has the same issue, I'll look at fixing this in a subsequent
commit.

Finaly, 11n fragment support requires the driver to have fully
decided what the rate scenario setup is - including 20/40MHz,
short/long GI, STBC, LDPC, number of streams, etc. Right now that
decision is (currently) made _after_ the NAV field value is updated.
I'll fix all of this in subsequent commits.

Tested:

* AR5416, STA, transmitting 11abg fragments
* AR5416, STA, 11n fragments work but the NAV field is incorrect for
the reasons above.

TODO:

* It would be nice to be able to queue mbufs per-node and per-TID so
we can only queue ath_buf entries when it's time to assemble frames
to send to the hardware.

But honestly, we should just do that level of software queue management
in net80211 rather than ath(4), so I'm going to leave this alone for now.

* More thorough AP, mesh and adhoc testing.

* Ensure that net80211 doesn't hand us fragmented frames when A-MPDU has
been negotiated, as we can't do software retransmission of fragments.

* .. set CLRDMASK when transmitting fragments, just to ensure.


# 250866 21-May-2013 adrian

Implement a separate hardware queue threshold for aggregate and non-aggr
traffic.

When transmitting non-aggregate traffic, we need to keep the hardware
busy whilst transmitting or small bursts in txdone/tx latency will
kill us.

This restores non-aggregate iperf performance, especially when doing
TDMA.

Tested:

* AR5416<->AR5416, TDMA
* AR5416 STA <-> AR9280 AP


# 250865 21-May-2013 adrian

Enable the use of TDMA on an 802.11n channel (with aggregation disabled,
of course.)

There's a few things that needed to happen:

* In case someone decides to set the beacon transmission rate to be
at an MCS rate, use the MCS-aware version of the duration calculation
to figure out how long the received beacon frame was.

* If TxOP enforcing is available on the hardware and we're doing TDMA,
enable it after a reset and set the TDMA guard interval to zero.
This seems to behave fine.

TODO:

* Although I haven't yet seen packet loss, the PHY errors that would be
triggered (specifically Transmit-Override-Receive) aren't enabled
by the 11n HAL. I'll have to do some work to enable these PHY errors
for debugging.

What broke:

* My recent changes to the TX queue handling has resulted in the driver
not keeping the hardware queue properly filled when doing non-aggregate
traffic. I have a patch to commit soon which fixes this situation
(albeit by reminding me about how my ath driver locking isn't working
out, sigh.)

So if you want to test this without updating to the next set of patches
that I commit, just bump the sysctl dev.ath.X.hwq_limit from 2 to 32.

Tested:

* AR5416 <-> AR5416, with ampdu disabled, HT40, 5GHz, MCS12+Short-GI.
I saw 30mbit/sec in both directions using a bidirectional UDP test.


# 250783 18-May-2013 adrian

Be (very) careful about how to add more TX DMA work.

The list-based DMA engine has the following behaviour:

* When the DMA engine is in the init state, you can write the first
descriptor address to the QCU TxDP register and it will work.

* Then when it hits the end of the list (ie, it either hits a NULL
link pointer, OR it hits a descriptor with VEOL set) the QCU
stops, and the TxDP points to the last descriptor that was transmitted.

* Then when you want to transmit a new frame, you can then either:
+ write the head of the new list into TxDP, or
+ you write the head of the new list into the link pointer of the
last completed descriptor (ie, where TxDP points), then kick
TxE to restart transmission on that QCU>

* The hardware then will re-read the descriptor to pick up the link
pointer and then jump to that.

Now, the quirks:

* If you write a TxDP when there's been no previous TxDP (ie, it's 0),
it works.

* If you write a TxDP in any other instance, the TxDP write may actually
fail. Thus, when you start transmission, it will re-read the last
transmitted descriptor to get the link pointer, NOT just start a new
transmission.

So the correct thing to do here is:

* ALWAYS use the holding descriptor (ie, the last transmitted descriptor
that we've kept safe) and use the link pointer in _THAT_ to transmit
the next frame.

* NEVER write to the TxDP after you've done the initial write.

* .. also, don't do this whilst you're also resetting the NIC.

With this in mind, the following patch does basically the above.

* Since this encapsulates Sam's issues with the QCU behaviour w/ TDMA,
kill the TDMA special case and replace it with the above.

* Add a new TXQ flag - PUTRUNNING - which indicates that we've started
DMA.

* Clear that flag when DMA has been shutdown.

* Ensure that we're not restarting DMA with PUTRUNNING enabled.

* Fix the link pointer logic during TXQ drain - we should always ensure
the link pointer does point to something if there's a list of frames.
Having it be NULL as an indication that DMA has finished or during
a reset causes trouble.

Now, given all of this, i want to nuke axq_link from orbit. There's now HAL
methods to get and set the link pointer of a descriptor, so what we
should do instead is to update the right link pointer.

* If there's a holding descriptor and an empty TXQ list, set the
link pointer of said holding descriptor to the new frame.

* If there's a non-empty TXQ list, set the link pointer of the
last descriptor in the list to the new frame.

* Nuke axq_link from orbit.

Note:

* The AR9380 doesn't need this. FIFO TX writes are atomic. As long as
we don't append to a list of frames that we've already passed to the
hardware, all of the above doesn't apply. The holding descriptor stuff
is still needed to ensure the hardware can re-read a completed
descriptor to move onto the next one, but we restart DMA by pushing in
a new FIFO entry into the TX QCU. That doesn't require any real
gymnastics.

Tested:

* AR5210, AR5211, AR5212, AR5416, AR9380 - STA mode.


# 250665 15-May-2013 adrian

Implement my first cut at "correct" node power-save and
PS-POLL support.

This implements PS-POLL awareness i nthe

* Implement frame "leaking", which allows for a software queue
to be scheduled even though it's asleep
* Track whether a frame has been leaked or not
* Leak out a single non-AMPDU frame when transmitting aggregates
* Queue BAR frames if the node is asleep
* Direct-dispatch the rest of control and management frames.
This allows for things like re-association to occur (which involves
sending probe req/resp as well as assoc request/response) when
the node is asleep and then tries reassociating.
* Limit how many frames can set in the software node queue whilst
the node is asleep. net80211 is already buffering frames for us
so this is mostly just paranoia.
* Add a PS-POLL method which leaks out a frame if there's something
in the software queue, else it calls net80211's ps-poll routine.
Since the ath PS-POLL routine marks the node as having a single frame
to leak, either a software queued frame would leak, OR the next queued
frame would leak. The next queued frame could be something from the
net80211 power save queue, OR it could be a NULL frame from net80211.

TODO:

* Don't transmit further BAR frames (eg via a timeout) if the node is
currently asleep. Otherwise we may end up exhausting management frames
due to the lots of queued BAR frames.

I may just undo this bit later on and direct-dispatch BAR frames
even if the node is asleep.

* It would be nice to burst out a single A-MPDU frame if both ends
support this. I may end adding a FreeBSD IE soon to negotiate
this power save behaviour.

* I should make STAs timeout of power save mode if they've been in power
save for more than a handful of seconds. This way cards that get
"stuck" in power save mode don't stay there for the "inactivity" timeout
in net80211.

* Move the queue depth check into the driver layer (ath_start / ath_transmit)
rather than doing it in the TX path.

* There could be some naughty corner cases with ps-poll leaking.
Specifically, if net80211 generates a NULL data frame whilst another
transmitter sends a normal data frame out net80211 output / transmit,
we need to ensure that the NULL data frame goes out first.
This is one of those things that should occur inside the VAP/ic TX lock.
Grr, more investigations to do..

Tested:

* STA: AR5416, AR9280
* AP: AR5416, AR9280, AR9160


# 250609 13-May-2013 adrian

Since the node state is 100% back under the TX lock, just kill the use
of atomics.

I'll re-think this nonsense later.


# 250607 13-May-2013 adrian

This lock only protects the rate control state for now, mention this.


# 250391 08-May-2013 adrian

Fix the holding descriptor logic to actually be "right" (for values
of "right".)

Flip back on the "always continue TX DMA using the holding descriptor"
code - by always setting ATH_BUF_BUSY and never setting axq_link to NULL.

Since the holding descriptor is accessed via txq->axq_link and _that_
is done behind the TXQ lock rather than the TX path lock, the holding
descriptor stuff itself needs to be behind the TXQ lock.

So, do the mental gymnastics needed to do this.

I've not seen any of the hardware failures that I was seeing when
I last tried to do this.

Tested:

* AR5416, STA mode


# 250326 07-May-2013 adrian

Re-work how transmit buffer limits are enforced - partly to fix the PR,
but partly to just tidy up things.

The problem here - there are too many TX buffers in the queue! By the
time one needs to transmit an EAPOL frame (for this PR, it's the response
to the group rekey notification from the AP) there are no ath_buf entries
free and the EAPOL frame doesn't go out.

Now, the problem!

* Enforcing the TX buffer limitation _before_ we dequeue the frame?
Bad idea. Because..
* .. it means I can't check whether the mbuf has M_EAPOL set.

The solution(s):

* De-queue the frame first
* Don't bother doing the TX buffer minimum free check until after
we know whether it's an EAPOL frame or not.
* If it's an EAPOL frame, allocate the buffer from the mgmt pool
rather than the default pool.

Whilst I'm here:

* Add a tweak to limit how many buffers a single node can acquire.
* Don't enforce that for EAPOL frames.
* .. set that to default to 1/4 of the available buffers, or 32,
whichever is more sane.

This doesn't fix issues due to a sleeping node or a very poor performing
node; but this doesn't make it worse.

Tested:

* AR5416 STA, TX'ing 100+ mbit UDP to an AP, but only 50mbit being received
(thus the TX queue fills up.)
* .. with CCMP / WPA2 encryption configured
* .. and the group rekey time set to 10 seconds, just to elicit the
behaviour very quickly.

PR: kern/138379


# 249639 19-Apr-2013 adrian

Use uint32_t for fields that are fetched via ath_hal_getcapability().


# 249565 16-Apr-2013 adrian

Use a per-RX-queue deferred list, rather than a single deferred list for
both queues.

Since ath_rx_pkt() does multi-mbuf frame recombining based on the RX queue,
this needs to occur.

Tested:

* AR9380 (XB112), hostap mode


# 248745 26-Mar-2013 adrian

Add per-TXQ EDMA FIFO staging queue support.

Each set of frames pushed into a FIFO is represented by a list of
ath_bufs - the first ath_buf in the FIFO list is marked with
ATH_BUF_FIFOPTR; the last ath_buf in the FIFO list is marked with
ATH_BUF_FIFOEND.

Multiple lists of frames are just glued together in the TAILQ as per
normal - except that at the end of a FIFO list, the descriptor link
pointer will be NULL and it'll be tagged with ATH_BUF_FIFOEND.

For non-EDMA chipsets this is a no-op - the ath_txq frame list (axq_q)
stays the same and is treated the same.

For EDMA chipsets the frames are pushed into axq_q and then when
the FIFO is to be (re) filled, frames will be moved onto the FIFO
queue and then pushed into the FIFO.

So:

* Add a new queue in each hardware TXQ (ath_txq) for staging FIFO frame
lists. It's a TAILQ (like the normal hardware frame queue) rather than
the ath9k list-of-lists to represent FIFO entries.

* Add new ath_buf flags - ATH_TX_FIFOPTR and ATH_TX_FIFOEND.

* When allocating ath_buf entries, clear out the flag value before
returning it or it'll end up having stale flags.

* When cloning ath_buf entries, only clone ATH_BUF_MGMT. Don't clone
the FIFO related flags.

* Extend ath_tx_draintxq() to first drain the FIFO staging queue, _then_
drain the normal hardware queue.

Tested:

* AR9280, hostap
* AR9280, STA
* AR9380/AR9580 - hostap

TODO:

* Test on other chipsets, just to be thorough.


# 248671 23-Mar-2013 adrian

Overhaul the TXQ locking (again!) as part of some beacon/cabq timing
related issues.

Moving the TX locking under one lock made things easier to progress on
but it had one important side-effect - it increased the latency when
handling CABQ setup when sending beacons.

This commit introduces a bunch of new changes and a few unrelated changs
that are just easier to lump in here.

The aim is to have the CABQ locking separate from other locking.
The CABQ transmit path in the beacon process thus doesn't have to grab
the general TX lock, reducing lock contention/latency and making it
more likely that we'll make the beacon TX timing.

The second half of this commit is the CABQ related setup changes needed
for sane looking EDMA CABQ support. Right now the EDMA TX code naively
assumes that only one frame (MPDU or A-MPDU) is being pushed into each
FIFO slot. For the CABQ this isn't true - a whole list of frames is
being pushed in - and thus CABQ handling breaks very quickly.

The aim here is to setup the CABQ list and then push _that list_ to
the hardware for transmission. I can then extend the EDMA TX code
to stamp that list as being "one" FIFO entry (likely by tagging the
last buffer in that list as "FIFO END") so the EDMA TX completion code
correctly tracks things.

Major:

* Migrate the per-TXQ add/removal locking back to per-TXQ, rather than
a single lock.

* Leave the software queue side of things under the ATH_TX_LOCK lock,
(continuing) to serialise things as they are.

* Add a new function which is called whenever there's a beacon miss,
to print out some debugging. This is primarily designed to help
me figure out if the beacon miss events are due to a noisy environment,
issues with the PHY/MAC, or other.

* Move the CABQ setup/enable to occur _after_ all the VAPs have been
looked at. This means that for multiple VAPS in bursted mode, the
CABQ gets primed once all VAPs are checked, rather than being primed
on the first VAP and then having frames appended after this.

Minor:

* Add a (disabled) twiddle to let me enable/disable cabq traffic.
It's primarily there to let me easily debug what's going on with beacon
and CABQ setup/traffic; there's some DMA engine hangs which I'm finally
trying to trace down.

* Clear bf_next when flushing frames; it should quieten some warnings
that show up when a node goes away.

Tested:

* AR9280, STA/hostap, up to 4 vaps (staggered)
* AR5416, STA/hostap, up to 4 vaps (staggered)

TODO:

* (Lots) more AR9380 and later testing, as I may have missed something here.
* Leverage this to fix CABQ hanling for AR9380 and later chips.
* Force bursted beaconing on the chips that default to staggered beacons and
ensure the CABQ stuff is all sane (eg, the MORE bits that aren't being
correctly set when chaining descriptors.)


# 248529 19-Mar-2013 adrian

Break out the RX completion path into "FIFO check / refill" and
"complete RX frames."

The 128 entry RX FIFO is really easy to fill up and miss refilling
when it's done in the ath taskq - as that gets blocked up doing
RX completion, TX completion and other random things.

So the 128 entry RX FIFO now gets emptied and refilled in the ath_intr()
task (and it grabs / releases locks, so now ath_intr() can't just be
a FAST handler yet!) but the locks aren't held for very long. The
completion part is done in the ath taskqueue context.

Details:

* Create a new completed frame list - sc->sc_rx_rxlist;
* Split the EDMA RX process queue into two halves - one that
processes the RX FIFO and refills it with new frames; another
that completes the completed frame list;
* When tearing down the driver, flush whatever is in the deferred
queue as well as what's in the FIFO;
* Create two new RX methods - one that processes all RX queues,
one that processes the given RX queue. When MSI is implemented,
we get told which RX queue the interrupt came in on so we can
specifically schedule that. (And I can do that with the non-MSI
path too; I'll figure that out later.)
* Convert the legacy code over to use these new RX methods;
* Replace all the instances of the RX taskqueue enqueue with a call
to a relevant RX method to enqueue one or all RX queues.

Tested:

* AR9380, STA
* AR9580, STA
* AR5413, STA


# 248311 15-Mar-2013 adrian

Add locking around the new holdingbf code.

Since this is being done during buffer free, it's a crap shoot whether
the TX path lock is held or not. I tried putting the ath_freebuf() code
inside the TX lock and I got all kinds of locking issues - it turns out
that the buffer free path sometimes is called with the lock held and
sometimes isn't. So I'll go and fix that soon.

Hence for now the holdingbf buffers are protected by the TXBUF lock.


# 248264 14-Mar-2013 adrian

Implement "holding buffers" per TX queue rather than globally.

When working on TDMA, Sam Leffler found that the MAC DMA hardware
would re-read the last TX descriptor when getting ready to transmit
the next one. Thus the whole ATH_BUF_BUSY came into existance -
the descriptor must be left alone (very specifically the link pointer
must be maintained) until the hardware has moved onto the next frame.

He saw this in TDMA because the MAC would be frequently stopping during
active transmit (ie, when it wasn't its turn to transmit.)

Fast-forward to today. It turns out that this is a problem not with
a single MAC DMA instance, but with each QCU (from 0->9). They each
maintain separate descriptor pointers and will re-read the last
descriptor when starting to transmit the next.

So when your AP is busy transmitting from multiple TX queues, you'll
(more) frequently see one QCU stopped, waiting for a higher-priority QCU
to finsh transmitting, before it'll go ahead and continue. If you mess
up the descriptor (ie by freeing it) then you're short of luck.

Thanks to rpaulo for sticking with me whilst I diagnosed this issue
that he was quite reliably triggering in his environment.

This is a reimplementation; it doesn't have anything in common with
the ath9k or the Qualcomm Atheros reference driver.

Now - it in theory doesn't apply on the EDMA chips, as long as you
push one complete frame into the FIFO at a time. But the MAC can DMA
from a list of frames pushed into the hardware queue (ie, you concat
'n' frames together with link pointers, and then push the head pointer
into the TXQ FIFO.) Since that's likely how I'm going to implement
CABQ handling in hostap mode, it's likely that I will end up teaching
the EDMA TX completion code about busy buffers, just to be "sure"
this doesn't creep up.

Tested - iperf ap->sta and sta->ap (with both sides running this code):

* AR5416 STA
* AR9160/AR9220 hostap

To validate that it doesn't break the EDMA (FIFO) chips:

* AR9380, AR9485, AR9462 STA

Using iperf with the -S <tos byte decimal value> to set the TCP client
side DSCP bits, mapping to different TIDs and thus different TX queues.

TODO:

* Make this work on the EDMA chips, if we end up pushing lists of frames
to the hardware (eg how we eventually will handle cabq in hostap/ibss
mode.)


# 247774 04-Mar-2013 adrian

add a method to set/clear the VMF field in the TX descriptor.

Obtained from: Qualcomm Atheros


# 247366 26-Feb-2013 adrian

Add in the STBC TX/RX capability support into the HAL and driver.

The HAL already included the STBC fields; it just needed to be exposed
to the driver and net80211 stack.

This should allow single-stream STBC TX and RX to be negotiated; however
the driver and rate control code currently don't do anything with it.


# 247286 25-Feb-2013 adrian

Begin adding support to explicitly set the current chainmask.

Right now the only way to set the chainmask is to set the hardware
configured chainmask through capabilities. This is fine for forcing
the chainmask to be something other than what the hardware is capable
of (eg to reduce TX/RX to one connected antenna) but it does change what
the HAL hardware chainmask configuration is.

For operational mode changes, it (may?) make sense to separately control
the TX/RX chainmask.

Right now it's done as part of ar5416_reset.c - ar5416UpdateChainMasks()
calculates which TX/RX chainmasks to enable based on the operating mode.
(1 for legacy and whatever is supported for 11n operation.) But doing
this in the HAL is suboptimal - the driver needs to know the currently
configured chainmask in order to correctly enable things for each
TX descriptor. This is currently done by overriding the chainmask
config in the ar5416 TX routines but this has to disappear - the AR9300
HAL support requires the driver to dynamically set the TX chainmask based
on the TX power and TX rate in order to meet mini-PCIe slot power
requirements.

So:

* Introduce a new HAL method to set the operational chainmask variables;
* Introduce null methods for the previous generation chipsets;
* Add new driver state to record the current chainmask separate from
the hardware configured chainmask.

Part #2 of this will involve disabling ar5416UpdateChainMasks() and moving
it into the driver; as well as properly programming the TX chainmask
based on the currently configured HAL chainmask.

Tested:

* AR5416, STA mode - both legacy (11a/11bg) and 11n rates - verified
that AR_SELFGEN_MASK (the chainmask used for self-generated frames like
ACKs and RTSes) is correct, as well as the TX descriptor contents is
correct.


# 247087 21-Feb-2013 adrian

Add an option to allow the minimum number of delimiters to be tweaked.

This is primarily for debugging purposes.

Tested:

* AR5416, STA mode


# 247085 21-Feb-2013 adrian

Add a new option to limit the maximum size of aggregates.
The default is to limit them to what the hardware is capable of.

Add sysctl twiddles for both the non-RTS and RTS protected aggregate
generation.

Whilst here, add some comments about stuff that I've discovered during
my exploration of the TX aggregate / delimiter setup path from the
reference driver.


# 246745 13-Feb-2013 adrian

Pull out the if_transmit() work and revert back to ath_start().

My changed had some rather significant behavioural changes to throughput.
The two issues I noticed:

* With if_start and the ifnet mbuf queue, any temporary latency
would get eaten up by some mbufs being queued. With ath_transmit()
queuing things to ath_buf's, I'd only get 512 TX buffers before I
couldn't queue any further frames.

* There's also some non-zero latency involved with TX being pushed
into a taskqueue via direct dispatch. Any time the scheduler didn't
immediately schedule the ath TX task would cause extra latency.
Various 1ge/10ge drivers implement both direct dispatch (if the TX
lock can be acquired) and deferred task transmission (if the TX lock
can't be acquired), with frames being pushed into a drbd queue.
I'll have to do this at some point, but until I figure out how to
deal with 802.11 fragments, I'll have to wait a while longer.

So what I saw:

* lots of extra latency, specially under load - if the taskqueue
wasn't immediately scheduled, things went pear shaped;

* any extra latency would result in TX ath_buf's taking their sweet time
being replenished, so any further calls to ath_transmit() would drop
mbufs.

* .. yes, there's no explicit backpressure here - things are just dropped.
Eek.

With this, the general performance has gone up, but those subtle if_start()
related race conditions are back. For some reason, this is doubly-obvious
with the AR5416 NIC and I don't quite understand why yet.

There's an unrelated issue with AR5416 performance in STA mode (it's
fine in AP mode when bridging frames, weirdly..) that requires a little
further investigation. Specifically - it works fine on a Lenovo T40
(single core CPU) running a March 2012 9-STABLE kernel, but a Lenovo T60
(dual core) running an early November 2012 kernel behaves very poorly.
The same hardware with an AR9160 or AR9280 behaves perfectly.


# 246453 07-Feb-2013 adrian

Create a new TX lock specifically for queuing frames.

This now separates out the act of queuing frames from the act of running
TX and TX completion.


# 245708 21-Jan-2013 adrian

Migrate CLRDMASK to be a per-node flag, rather than a per-TID flag.

This is easily possible now that the TX is protected by a single
lock, rather than a per-TXQ (and thus per-TID) lock.

Only set CLRDMASK if none of the destinations are filtered.
This likely will need some tuning when it comes time to do UASPD/PS-POLL
TX, however at that point it should be manually set anyway.

Tested:

* AR9280, STA mode

TODO:

* More thorough testing in AP mode
* test other chipsets, just to be safe/sure.


# 245465 15-Jan-2013 adrian

Implement frame (data) transmission using if_transmit(), rather than
if_start().

This removes the overlapping data path TX from occuring, which
solves quite a number of the potential TX queue races in ath(4).
It doesn't fix the net80211 layer TX queue races and it doesn't
fix the raw TX path yet, but it's an important step towards this.

This hasn't dropped the TX performance in my testing; primarily
because now the TX path can quickly queue frames and continue
along processing.

This involves a few rather deep changes:

* Use the ath_buf as a queue placeholder for now, as we need to be
able to support queuing a list of mbufs (ie, when transmitting
fragments) and m_nextpkt can't be used here (because it's what is
joining the fragments together)

* if_transmit() now simply allocates the ath_buf and queues it to
a driver TX staging queue.

* TX is now moved into a taskqueue function.

* The TX taskqueue function now dequeues and transmits frames.

* Fragments are handled correctly here - as the current API passes
the fragment list as one mbuf list (joined with m_nextpkt) through
to the driver if_transmit().

* For the couple of places where ath_start() may be called (mostly
from net80211 when starting the VAP up again), just reimplement
it using the new enqueue and taskqueue methods.

What I don't like (about this work and the TX code in general):

* I'm using the same lock for the staging TX queue management and the
actual TX. This isn't required; I'm just being slack.

* I haven't yet moved TX to a separate taskqueue (but the taskqueue is
created); it's easy enough to do this later if necessary. I just need
to make sure it's a higher priority queue, so TX has the same
behaviour as it used to (where it would preempt existing RX..)

* I need to re-review the TX path a little more and make sure that
ieee80211_node_*() functions aren't called within the TX lock.
When queueing, I should just push failed frames into a queue and
when I'm wrapping up the TX code, unlock the TX lock and
call ieee80211_node_free() on each.

* It would be nice if I could hold the TX lock for the entire
TX and TX completion, rather than this release/re-acquire behaviour.
But that requires that I shuffle around the TX completion code
to handle actual ath_buf free and net80211 callback/free outside
of the TX lock. That's one of my next projects.

* the ic_raw_xmit() path doesn't use this yet - so it still has
sequencing problems with parallel, overlapping calls to the
data path. I'll fix this later.

Tested:

* Hostap - AR9280, AR9220
* STA - AR5212, AR9280, AR5416


# 245002 03-Jan-2013 adrian

Don't call the spectral methods for NICS that don't implement them.


# 244951 02-Jan-2013 adrian

Add a new (skeleton) spectral mode manager module.


# 244947 01-Jan-2013 adrian

Add spectral HAL accessor methods.


# 244109 11-Dec-2012 adrian

There's no need to use a TXQ pointer here; we specifically need the
hardware queue ID when queuing to EDMA descriptors.

This is a small part of trying to reduce the size of ath_buf entries.


# 243786 02-Dec-2012 adrian

Delete the per-TXQ locks and replace them with a single TX lock.

I couldn't think of a way to maintain the hardware TXQ locks _and_ layer
on top of that per-TXQ software queuing and any other kind of fine-grained
locks (eg per-TID, or per-node locks.)

So for now, to facilitate some further code refactoring and development
as part of the final push to get software queue ps-poll and u-apsd handling
into this driver, just do away with them entirely.

I may eventually bring them back at some point, when it looks slightly more
architectually cleaner to do so. But as it stands at the present, it's
not really buying us much:

* in order to properly serialise things and not get bitten by scheduling
and locking interactions with things higher up in the stack, we need to
wrap the whole TX path in a long held lock. Otherwise we can end up
being pre-empted during frame handling, resulting in some out of order
frame handling between sequence number allocation and encryption handling
(ie, the seqno and the CCMP IV get out of sequence);

* .. so whilst that's the case, holding the lock for that long means that
we're acquiring and releasing the TXQ lock _inside_ that context;

* And we also acquire it per-frame during frame completion, but we currently
can't hold the lock for the duration of the TX completion as we need
to call net80211 layer things with the locks _unheld_ to avoid LOR.

* .. the other places were grab that lock are reset/flush, which don't happen
often.

My eventual aim is to change the TX path so all rejected frame transmissions
and all frame completions result in any ieee80211_free_node() calls to occur
outside of the TX lock; then I can cut back on the amount of locking that
goes on here.

There may be some LORs that occur when ieee80211_free_node() is called when
the TX queue path fails; I'll begin to address these in follow-up commits.


# 243425 23-Nov-2012 adrian

Add the HAL wrapper for settsf64.


# 242853 10-Nov-2012 kevlo

Fix the build.


# 242782 08-Nov-2012 adrian

Add some hooks into the driver to attach, detach and record EDMA descriptor
events.

This is primarily for the TX EDMA and TX EDMA completion. I haven't yet
tied it into the EDMA RX path or the legacy TX/RX path.

Things that I don't quite like:

* Make the pointer type 'void' in ath_softc and have if_ath_alq*()
return a malloc'ed buffer. That would remove the need to include
if_ath_alq.h in if_athvar.h.
* The sysctl setup needs to be cleaned up.


# 242527 03-Nov-2012 adrian

Add a new HAL call to extract out the HAL enterprise bits from the
AR9300 HAL.


# 242510 03-Nov-2012 adrian

HAL API updates, from the previous couple of HAL commits.


# 242391 31-Oct-2012 adrian

I give up - introduce a TX lock to serialise TX operations.

I've tried serialising TX using queues and such but unfortunately
due to how this interacts with the locking going on elsewhere in the
networking stack, the TX task gets delayed, resulting in quite a
noticable throughput loss:

* baseline TCP for 2x2 11n HT40 is ~ 170mbit/sec;
* TCP for TX task in the ath taskq, with the RX also going on - 80mbit/sec;
* TCP for TX task in a separate, second taskq - 100mbit/sec.

So for now I'm going with the Linux wireless stack approach - lock tx
early. The linux code does in the wireless stack, before the 802.11
state stuff happens and before it's punted to the driver.
But TX locking needs to also occur at the driver layer as the TX
completion code _also_ begins to drain the ifnet TX queue.

Whilst I'm here, add some KTR traces for the TX path.

Note:

* This really should be done at the net80211 layer (as well, at least.)
But that'll have to wait for a little more thought to happen.


# 242271 28-Oct-2012 adrian

Begin fleshing out some software queue awareness for TIM handling with
the power save queue.

* introduce some new ATH_NODE lock protected fields, tracking the
net80211 psq and TIM state;
* when doing buffer transitions - ie, when sending and completing
buffers - check the state of the SWQ and update the TIM appropriately.
* when clearing the TIM bit, if the SWQ is not empty then delay clearing
it.

This is racy, but it's no less racy than the current net80211 power
save queue management code. Specifically, with multiple TX threads,
it's quite plausible that parallel state updates will race and the
TIM will be left in an inconsistent state. I'll address that in
a follow-up commit.


# 241567 14-Oct-2012 adrian

Track the total number of software queued frames in an atomic variable
stashed away in ath_node.

As much as I tried to stuff that behind the ATH_NODE lock, unfortunately
the locking is just too plain hairy (for me! And I wrote it!) to do
cleanly. Hence using atomics here instead of a lock. The ATH_NODE lock
just isn't currently used anywhere besides the rate control updates.

If in the future everything gets migrated back to using a single ATH_NODE
lock or a single global ATH_TX lock (ie, a single TX lock for all TX and
TX completion) then fine, I'll remove the atomics.


# 241566 14-Oct-2012 adrian

Stop abusing the ATH_TID_*() queue macros for filtered frames and give
them their own macro set.


# 241559 14-Oct-2012 adrian

Push the actual TX processing into the ath taskqueue, rather than having
it run out of multiple concurrent contexts.

Right now the ath(4) TX processing is a bit hairy. Specifically:

* It was running out of ath_start(), which could occur from multiple
concurrent sending processes (as if_start() can be started from multiple
sending threads nowdays.. sigh)

* during RX if fast frames are enabled (so not really at the moment, not
until I fix this particular feature again..)

* during ath_reset() - so anything which calls that

* during ath_tx_proc*() in the ath taskqueue - ie, TX is attempted again
after TX completion, as there's now hopefully some ath_bufs available.

* Then, the ic_raw_xmit() method can queue raw frames for transmission
at any time, from any net80211 TX context. Ew.

This has caused packet ordering issues in the past - specifically,
there's absolutely no guarantee that preemption won't occuring _during_
ath_start() by the TX completion processing, which will call ath_start()
again. It's a mess - 802.11 really, really wants things to be in
sequence or things go all kinds of loopy.

So:

* create a new task struct for TX'ing;
* make the if_start method simply queue the task on the ath taskqueue;
* make ath_start() just be called by the new TX task;
* make ath_tx_kick() just schedule the ath TX task, rather than directly
calling ath_start().

Now yes, this means that I've taken a step backwards in terms of
concurrency - TX -and- RX now occur in the same single-task taskqueue.
But there's nothing stopping me from separating out the TX / TX completion
code into a separate taskqueue which runs in parallel with the RX path,
if that ends up being appropriate for some platforms.

This fixes the CCMP/seqno concurrency issues that creep up when you
transmit large amounts of uni-directional UDP traffic (>200MBit) on a
FreeBSD STA -> AP, as now there's only one TX context no matter what's
going on (TX completion->retry/software queue,
userland->net80211->ath_start(), TX completion -> ath_start());
but it won't fix any concurrency issues between raw transmitted frames
and non-raw transmitted frames (eg EAPOL frames on TID 16 and any other
TID 16 multicast traffic that gets put on the CABQ.) That is going to
require a bunch more re-architecture before it's feasible to fix.

In any case, this is a big step towards making the majority of the TX
path locking irrelevant, as now almost all TX activity occurs in the
taskqueue.

Phew.


# 241336 07-Oct-2012 adrian

Migrate the TID TXQ accesses to a new set of macros, rather than reusing
the ATH_TXQ_* macros.

* Introduce the new macros;
* rename the TID queue and TID filtered frame queue so the compiler
tells me I'm using the wrong macro.

These should correspond 1:1 to the existing code.


# 241170 03-Oct-2012 adrian

Pause and unpause the software queues for a given node based on the
net80211 node power save state.

* Add an ATH_NODE_UNLOCK_ASSERT() check
* Add a new node field - an_is_powersave
* Pause/unpause the queue based on the node state
* Attempt to handle net80211 concurrency issues so the queue
doesn't get paused/unpaused more than once at a time from
the net80211 power save code.

Whilst here (and breaking my usual rule), set CLRDMASK when a queue
is unpaused, regardless of whether the queue has some pending traffic.
This means the first frame from that TID (now or later) will hvae
CLRDMASK set.

Also whilst here, bump the swretrymax counters whenever the
filtered frames code expires a frame. Again, breaking my rule, but
this is just a statistics thing rather than a functional change.

This doesn't fix ps-poll (but it doesn't break it too much worse
than it is at the present) or correcting the TID updates.
That's next on the list.

Tested:
* AR9220 AP (Atheros AP96 reference design)
* Macbook Pro and LG Optimus 1 Android phone, both setting
and clearing power save state (but not using PS-POLL.)


# 240899 24-Sep-2012 adrian

Migrate the ath(4) KTR logging to use an ATH_KTR() macro.

This should eventually be unified with ATH_DEBUG() so I can get both
from one macro; that may take some time.

Add some new probes for TX and TX completion.


# 240585 16-Sep-2012 adrian

Add a per-TID filter queue and filter state bits.

These are intended for software TX filtering support, where the NIC
decides there has been too many successive failues to a destination
and will filter it.

Although the filtering is done per-destination (via the keycache),
the state and queue is kept per-TID for now. It simplifies the overall
architecture design and locking.

Whilst here, add ATH_TID_UNLOCK_ASSERT().


# 239656 24-Aug-2012 adrian

Add an accessor macro for getting access to the default DFS parameters.

PR: kern/170904


# 239282 15-Aug-2012 adrian

Implement a sequential descriptor ID value and stuff it in the ath_buf.

This will be used by the EDMA TX code to assign descriptor IDs in order
to provide some debugging.


# 239261 14-Aug-2012 adrian

Add an assertion to check that the given TXQ is _not_ locked.


# 239205 11-Aug-2012 adrian

Revert the ath_tx_draintxq() method, and instead teach it the minimum
necessary to "do" EDMA.

It was just using the TX completion status for logging information about
the descriptor completion. Since with EDMA we don't know this without
checking the TX completion FIFO, we can't provide this information.
So don't.


# 239204 11-Aug-2012 adrian

Break out ath_draintxq() into a method and un-methodize ath_tx_processq().

Now that I understand what's going on with this, I've realised that
it's going to be quite difficult to implement a processq method in
the EDMA case. Because there's a separate TX status FIFO, I can't
just run processq() on each EDMA TXQ to see what's finished.
i have to actually run the TX status queue and handle individual
TXQs.

So:

* unmethodize ath_tx_processq();
* leave ath_tx_draintxq() as a method, as it only uses the completion status
for debugging rather than actively completing the frames (ie, all frames
here are failed);
* Methodize ath_draintxq().

The EDMA ath_draintxq() will have to take care of running the TX
completion FIFO before (potentially) freeing frames in the queue.

The only two places where ath_tx_draintxq() (on a single TXQ) are used:

* ath_draintxq(); and
* the CABQ handling in the beacon setup code - it drains the CABQ before
populating the CABQ with frames for a new beacon (when doing multi-VAP
operation.)

So it's quite possible that once I methodize the CABQ and beacon handling,
I can just drop ath_tx_draintxq() in its entirety.

Finally, it's also quite possible that I can remove ath_tx_draintxq()
in the future and just "teach" it to not check the status when doing
EDMA.


# 239197 11-Aug-2012 adrian

Begin fleshing out the TX FIFO support.

* Add ATH_TXQ_FIRST() for easy tasting of what's on the list;
* Add an "axq_fifo_depth" for easy tracking of how deep the current
FIFO is;
* Flesh out the handoff (mcast, hw) functions;
* Begin fleshing out a TX ISR proc, which tastes the TX status FIFO.

The legacy hardware stuffs the TX completion at the end of the final frame
descriptor (or final sub-frame when doing aggregate.) So it's feasible
to do a per-TXQ drain and process, as the needed info is right there.

For EDMA hardware, there's a separate TX completion FIFO. So the TX
process routine needs to read the single FIFO and then process the
frames in each hardware queue.

This makes it difficult to do a per-queue process, as you'll end up with
frames in the TX completion FIFO for a different TXQ to the one you've
passed to ath_tx_draintxq() or ath_tx_processq().

Testing:

I've tested the TX queue and TX completion code in hostap mode on an
AR9380. Beacon frames successfully transmit and the completion routine
is called. Occasional data frames end up in TXQ 1 and are also
successfully completed.

However, this requires some changes to the beacon code path as:

* The AR9380 beacon configuration API is now in TU/8, rather than
TU;
* The AR9380 TX API requires the rate control is setup using a call
to setup11nratescenario, rather than having the try0 series setup
(rate/tries for the first series); so the beacon won't go out.

I'll follow this up with commits to the beacon code.


# 239053 05-Aug-2012 adrian

Migrate the 802.11n ath_hal_chaintxdesc() API to use a buffer/segment
array, similar to what filltxdesc() uses.

This removes the last reference to ds_data in the TX path outside of
debugging statements. These need to be adjusted/fixed.

Tested:

* AR9280 STA/AP with iperf TCP traffic


# 239051 05-Aug-2012 adrian

Migrate the ath_hal_filltxdesc() API to take a list of buffer/seglen values.

The existing API only exposes 'seglen' (the current buffer (segment) length)
with the data buffer pointer set in 'ds_data'. This is fine for the legacy
DMA engine but it won't work for the EDMA engines.

The EDMA engine has a significantly different TX descriptor layout.

* The legacy DMA engine had a ds_data pointer at the same offset in the
descriptor for both TX and RX buffers;
* The EDMA engine has no ds_data for RX - the data is DMAed after the
descriptor;
* The EDMA engine has support for 4 TX buffer/segment pairs in the TX
DMA descriptor;
* The EDMA TX completion is in a different FIFO, and the driver will
'link' the status completion entry to a QCU by a "QCU ID".
I don't know why it's just not filled in by the hardware, alas.

So given that, here are the changes:

* Instead of directly fondling 'ds_data' in ath_desc, change the
ath_hal_filltxdesc() to take an array of buffer pointers as well
as segment len pointers;
* The EDMA TX completion status wants a descriptor and queue id.
This (for now) uses bf_state.bfs_txq and will extract the hardware QCU
ID from that.
* .. and this is ugly and wasteful; it should change to just store
the QCU in the bf_state and save 3/7 bytes in the process.

Now, the weird crap:

* The aggregate TX path was using bf_state->bfs_txq for the TXQ, rather than
taking a function argument. I've tidied that up.
* The multicast queue frames get put on a software TXQ and then that is
appended to the hardware CABQ when appropriate. So for now, make sure
that bf_state->bfs_txq points at the CABQ when adding frames to the
multicast queue.
* .. but the multicast queue TX path for now doesn't use the software
queue and instead
(a) directly sets up the descriptor contents at that point;
(b) the frames on the vap->avp_mcastq are then just appended wholesale
to the CABQ.
So for now, I don't have to worry about making the multicast path
work with aggregation or the per-TID software queue. Phew.

What's left to do:

* I need to modify the 11n ath_hal_chaintxdesc() API to do the same.
I'll do that in a subsequent commit.
* Remove bf_state.bfs_txq entirely and store the QCU as appropriate.
* .. then do the runtime "is this going on the right HWQ?" checks using
that, rather than comparing pointer values.

Tested on:

* AR9280 STA/AP
* AR5416 STA/AP


# 238961 31-Jul-2012 adrian

Allow 802.11n hardware to support multi-rate retry when RTS/CTS is
enabled.

The legacy (pre-802.11n) hardware doesn't support this - although
the AR5212 era hardware supports MRR, it doesn't have all the bits
needed to support MRR + RTS/CTS. The AR5416 and later support
a packet duration and RTS/CTS flags per rate scenario, so we should
support it.

Tested:

* AR9280, STA

PR: kern/170302


# 238931 31-Jul-2012 adrian

Migrate some more TX side setup routines to be methods.


# 238855 28-Jul-2012 adrian

Flesh out the initial TX FIFO storage for each hardware TX queue.


# 238838 27-Jul-2012 adrian

Bring this API in line with what the reference driver and Linux ath9k
was doing.

Obtained from: Qualcomm Atheros, Linux ath9k


# 238836 27-Jul-2012 adrian

Allocate a descriptor ring for EDMA TX completion status.

Configure the hardware with said ring physical address and size.


# 238731 23-Jul-2012 adrian

Add a new HAL method - the AR93xx and later NICs have a separate
TX descriptor ring for TX status completion. This API call will pass
the allocated buffer details to the HAL.


# 238710 23-Jul-2012 adrian

Begin separating out the TX DMA setup in preparation for TX EDMA support.

* Introduce TX DMA setup/teardown methods, mirroring what's done in
the RX path.

Although the TX DMA descriptor is setup via ath_desc_alloc() /
ath_desc_free(), there TX status descriptor ring will be allocated
in this path.

* Remove some of the TX EDMA capability probing from the RX path and
push it into the new TX EDMA path.


# 238709 23-Jul-2012 adrian

Flesh out a new DMA map for the EDMA TX completion status, as well
as a lock to go with that whole code path.


# 238708 23-Jul-2012 adrian

Begin modifying the descriptor allocation functions to support a variable
sized TX descriptor.

This is required for the AR93xx EDMA support which requires 128 byte
TX descriptors (which is significantly larger than the earlier
hardware.)


# 238608 19-Jul-2012 adrian

Use HAL_NUM_RX_QUEUES rather than a magic constant.


# 238607 19-Jul-2012 adrian

Break out the TX descriptor link field into HAL methods.

The DMA FIFO chips (AR93xx and later) differ slightly to th elegacy
chips:

* The RX DMA descriptors don't have a ds_link field;
* The TX DMA descriptors have a ds_link field however at a different
offset.

This is a reimplementation based on what the reference driver and ath9k
does.

A subsequent commit will enable it in the TX and beacon paths.

Obtained from: Linux ath9k, Qualcomm Atheros


# 238436 14-Jul-2012 adrian

Change the RX EDMA path to first complete the FIFO, then re-populate it
with fresh descriptors, before handling the frames.

Wrap it all in the RX locks.

Since the FIFO is very shallow (16 for HP, 128 for LP) it needs to be
drained and replenished very quickly. Ideally, I'll eventually move this
RX FIFO drain/fill into the interrupt handler, only deferring the actual
frame completion.


# 238433 14-Jul-2012 adrian

Create an RX queue lock.

Ideally these locks would go away and there'd be a single driver lock,
like what iwn(4) does. I'll worry about that later.


# 238316 09-Jul-2012 adrian

Convert sc_rxpending to a per-EDMA queue, and use that for the legacy code.

Prepare ath_rx_pkt() to handle multiple RX queues, and default the legacy
RX queue to use the HP queue.


# 238284 09-Jul-2012 adrian

Further preparations for the RX EDMA support.

Break out the DMA descriptor setup/teardown code into a method.
The EDMA RX code doesn't allocate descriptors, just ath_buf entries.


# 238280 09-Jul-2012 adrian

Introduce the EDMA related HAL capabilities.

Whilst here, fix a typo in a previous commit.

Obtained from: Qualcomm Atheros


# 238278 09-Jul-2012 adrian

Extend the RX HAL API to include the RX queue identifier.

The AR93xx and later chips support two RX FIFO queues - a high and low
priority queue.

For legacy chips, just assume the queues are high priority.

This is inspired by the reference driver but is a reimplementation of
the API and code.


# 238055 03-Jul-2012 adrian

Begin abstracting out the RX path in preparation for RX EDMA support.

The RX EDMA support requires a modified approach to the RX descriptor
handling.

Specifically:

* There's now two RX queues - high and low priority;
* The RX queues are implemented as FIFOs; they're now an array of pointers
to buffers;
* .. and the RX buffer and descriptor are in the same "buffer", rather than
being separate.

So to that end, this commit abstracts out most of the RX related functions
from the bulk of the driver. Notably, the RX DMA/buffer allocation isn't
updated, primarily because I haven't yet fleshed out what it should look
like.

Whilst I'm here, create a set of matching but mostly unimplemented EDMA
stubs.

Tested:

* AR9280, station mode

TODO:

* Thorough AP and other mode testing for non-EDMA chips;
* Figure out how to allocate RX buffers suitable for RX EDMA, including
correctly setting the mbuf length to compensate for the RX descriptor
and completion status area.


# 237953 02-Jul-2012 adrian

Bring over some further HAL capabilities from the Atheros HAL, as well
as an EDMA check function.

For the AR9003 and later NICs, different TX/RX DMA and descriptor handling
code will be conditional on the EDMA check.

Obtained from: Qualcomm Atheros


# 237153 16-Jun-2012 adrian

Shuffle some more fields in ath_buf so it's not too big.

This shaves off 20 bytes - from 288 bytes to 268 bytes.

However, it's still too big.


# 237152 16-Jun-2012 adrian

Shave four (or eight) bytes off of ath_buf - this field isn't used.


# 237046 14-Jun-2012 adrian

Shrink ath_buf a little more:

* Resize some types. In particular, bfs_seqno can be uint16_t for now.
Previous work would assign the unassigned seqno a value of -1, which
I obviously can't do here.

* Remove bfs_pktdur. It was in the original code but nothing so far uses
it.

This gets ath_buf down (on my i386 system) to 292 bytes from 300 bytes.
I'd rather it be much, much smaller.


# 237038 13-Jun-2012 adrian

Implement a global (all non-mgmt traffic) TX ath_buf limitation when
ath_start() is called.

This (defaults to 10 frames) gives for a little headway in the TX ath_buf
allocation, so buffer cloning is still possible.

This requires a lot omre experimenting and tuning.

It also doesn't stop a node/TID from consuming all of the available
ath_buf's, especially when the node is going through high packet loss
or only talking at a low TX rate. It also doesn't stop a paused TID
from taking all of the ath_bufs. I'll look at fixing that up in subsequent
commits.

PR: kern/168170


# 237000 13-Jun-2012 adrian

Implement a separate, smaller pool of ath_buf entries for use by management
traffic.

* Create sc_mgmt_txbuf and sc_mgmt_txdesc, initialise/free them appropriately.
* Create an enum to represent buffer types in the API.
* Extend ath_getbuf() and _ath_getbuf_locked() to take the above enum.
* Right now anything sent via ic_raw_xmit() allocates via ATH_BUFTYPE_MGMT.
This may not be very useful.
* Add ATH_BUF_MGMT flag (ath_buf.bf_flags) which indicates the current buffer
is a mgmt buffer and should go back onto the mgmt free list.
* Extend 'txagg' to include debugging output for both normal and mgmt txbufs.
* When checking/clearing ATH_BUF_BUSY, do it on both TX pools.

Tested:

* STA mode, with heavy UDP injection via iperf. This filled the TX queue
however BARs were still going out successfully.

TODO:

* Initialise the mgmt buffers with ATH_BUF_MGMT and then ensure the right
type is being allocated and freed on the appropriate list. That'd save
a write operation (to bf->bf_flags) on each buffer alloc/free.

* Test on AP mode, ensure that BAR TX and probe responses go out nicely
when the main TX queue is filled (eg with paused traffic to a TID,
awaiting a BAR to complete.)

PR: kern/168170


# 236873 11-Jun-2012 adrian

Introduce a new lock debug which is specifically for making sure the
_TID_ lock is held.

For now the TID lock is also the TXQ lock. This is just to make sure
that the right TXQ lock is held for the given TID.


# 236872 11-Jun-2012 adrian

Revert r233227 and followup commits as it breaks CCMP PN replay detection.

This showed up when doing heavy UDP throughput on SMP machines.

The problem with this is because the 802.11 sequence number is being
allocated separately to the CCMP PN replay number (which is assigned
during ieee80211_crypto_encap()).

Under significant throughput (200+ MBps) the TX path would be stressed
enough that frame TX/retry would force sequence number and PN allocation
to be out of order. So once the frames were reordered via 802.11 seqnos,
the CCMP PN would be far out of order, causing most frames to be discarded
by the receiver.

I've fixed this in some local work by being forced to:

(a) deal with the issues that lead to the parallel TX causing out of
order sequence numbers in the first place;
(b) fix all the packet queuing issues which lead to strange (but mostly
valid) TX.

I'll begin fixing these in a subsequent commit or five.

PR: kern/166190


# 236599 05-Jun-2012 adrian

Mostly revert previous commit(s). After doing a bunch of local testing,
it turns out that it negatively affects performance. I'm stil investigating
exactly why deferring the IO causes such negative TCP performance but
doesn't affect UDP preformance.

Leave the ath_tx_kick() change in there however; it's going to be useful
to have that there for if_transmit() work.

PR: kern/168649


# 236583 04-Jun-2012 adrian

Migrate the TX path to a taskqueue for now, until a better way of
implementing parallel TX and TX/RX completion can be done without
simply abusing long-held locks.

Right now, multiple concurrent ath_start() entries can result in
frames being dequeued out of order. Well, they're dequeued in order
fine, but if there's any preemption or race between CPUs between:

* removing the frame from the ifnet, and
* calling and runningath_tx_start(), until the frame is placed on a
software or hardware TXQ

Then although dequeueing the frame is in-order, queueing it to the hardware
may be out of order.

This is solved in a lot of other drivers by just holding a TX lock over
a rather long period of time. This lets them continue to direct dispatch
without races between dequeue and hardware queue.

Note to observers: if_transmit() doesn't necessarily solve this.
It removes the ifnet from the main path, but the same issue exists if
there's some intermediary queue (eg a bufring, which as an aside also
may pull in ifnet when you're using ALTQ.)

So, until I can sit down and code up a much better way of doing parallel
TX, I'm going to leave the TX path using a deferred taskqueue task.
What I will likely head towards is doing a direct dispatch to hardware
or software via if_transmit(), but it'll require some driver changes to
allow queues to be made without using the really large ath_buf / ath_desc
entries.

TODO:

* Look at how feasible it'll be to just do direct dispatch to
ath_tx_start() from if_transmit(), avoiding doing _any_ intermediary
serialisation into a global queue. This may break ALTQ for example,
so I have to be delicate.

* It's quite likely that I should break up ath_tx_start() so it
deposits frames onto the software queues first, and then only fill
in the 802.11 fields when it's being queued to the hardware.
That will make the if_transmit() -> software queue path very
quick and lightweight.

* This has some very bad behaviour when using ACPI and Cx states.
I'll do some subsequent analysis using KTR and schedgraph and file
a follow-up PR or two.

PR: kern/168649


# 236036 25-May-2012 adrian

Remove an unneeded field from ath_buf.


# 235972 25-May-2012 adrian

oops - ath_hal_disablepcie is actually destined for another purpose,
not to disable the PCIe PHY in prepration for reset.

Extend the enablepci method to have a "poweroff" flag, which if equal
to true means the hardware is about to go to sleep.


# 235957 25-May-2012 adrian

Prepare for improved (read: pcie) suspend/resume support.

* Flesh out the pcie disable method for 11n chips, as they were defaulting
to the AR5212 (empty) PCIe disable method.

* Add accessor macros for the HAL PCIe enable/disable calls.

* Call disable on ath_suspend()

* Call enable on ath_resume()

NOTE:

* This has nothing to do with the NIC sleep/run state - the NIC still
will stay in network-run state rather than supporting network-sleep
state. This is preparation work for supporting correct suspend/resume
WARs for the 11n PCIe NICs.

TODO:

* It may be feasible at this point to keep the chip powered down during
initial probe/attach and only power it up upon the first configure/reset
pass. This however would require correct (for values of "correct")
tracking of the NIC power configuration state from the driver and that
just isn't attempted at the moment.

Tested:

* AR9280 on my Lenovo T60, but with no suspend/resume pass (yet).


# 235804 22-May-2012 adrian

Re-up the TX ath_buf limit from 128 to 512.

I'll have to leave this high for now, until I've done some significant
surgery with how ath_bufs (and descriptors) are handled.

This should significantly cut down on the opportunities for a full TX
queue hanging traffic. I'll continue making things work though; I'm
mostly doing this for users. :)


# 235774 22-May-2012 adrian

Fix up some corner cases with aggregation handling.

I've come across a weird scenario in net80211 where two TX streams will
happily attempt to setup an aggregation session together.
If we're very lucky, it happens concurrently on separate CPUs and the
total lack of locking in the net80211 aggregation code causes this stuff
to race. Badly.

So >1 call would occur to the ath(4) addba start, but only one call would
complete to addba complete or timeout. The TID would thus stay paused.

The real fix is to implement some proper per-node (or maybe per-TID)
locking in net80211, which then could be leveraged by the ath(4) TX
aggregation code.

Whilst I'm at it, shuffle around the debugging messages a bit.
I like to keep people on their toes.


# 235491 15-May-2012 adrian

Migrate ath_debug and sc_debug from an int to a uint64_t / QUAD;
add some more BAR debugging logic.

* Change the definition of ath_debug and ath_softc.sc_debug from
int to uint64_t;
* Change the relevant sysctls;
* Add a new BAR TX debugging field;
* Use this in if_ath_tx.

This has been tested by using the sysctl program, which happily allows
for fields > 32 bits to be configured.


# 234873 01-May-2012 adrian

Change the MIB cycle count API to return HAL_BOOL, rather than uint32_t,
to return whether it was successful.

Add placeholder (blank) methods for previous chips, for both it and
the 11n extension channel busy call.


# 234369 17-Apr-2012 adrian

Run the fatal proc as a proc, rather than where it currently is.

Otherwise the reset path will sleep, which it can't do in this context.


# 234323 15-Apr-2012 adrian

Drop this down from 512 to 128 for now.

This may result in a bit of a throughput drop. However, any throughput
drop at this point should be investigated and root caused, as it's likely
because TX scheduling (all the way down to how preemption, scheduler work,
etc) is happening in a sub-optimal fashion.

This also makes it much more likely to be reloadable on a live machine.
Allocating 5120 TX ath_buf entries via contigmalloc is very unlikely
after a few hours of using X/Chromium.


# 234109 10-Apr-2012 adrian

Convert the flags over to a set of bit flags.


# 234090 10-Apr-2012 adrian

Squirrel away SYNC interrupt debugging if it's enabled in the HAL.

Bus errors will show up as various SYNC interrupts which will be passed
back up to ath_intr().


# 233967 07-Apr-2012 adrian

Store away the RTS aggregate limit from the HAL.

This will be used by some upcoming code to ensure that aggregates
are enforced to be a certain size. The AR5416 has a limitation on
RTS protected aggregates (8KiB).


# 233966 07-Apr-2012 adrian

Remove duplicate txflags field from ath_buf.

rename bf_state.bfs_flags to bf_state.bfs_txflags, as that is what
it effectively is.


# 233908 04-Apr-2012 adrian

Implement BAR TX.

A BAR frame must be transmitted when an frame in an A-MPDU session fails
to transmit - it's retried too often, or it can't be cloned for
re-transmission. The BAR frame tells the remote side to advance the
left edge of the block-ack window (BAW) to a new value.

In order to do this:

* TX for that particular node/TID must be paused;
* The existing frames in the hardware queue needs to be completed, whether
they're TXed successfully or otherwise;
* The new left edge of the BAW is then communicated to the remote side
via a BAR frame;
* Once the BAR frame has been sucessfully TXed, aggregation can resume;
* If the BAR frame can't be successfully TXed, the aggregation session
is torn down.

This is a first pass that implements the above. What needs to be done/
tested:

* What happens during say, a channel reset / stuck beacon _and_ BAR
TX. It _should_ be correctly buffered and retried once the
reset has completed. But if a bgscan occurs (and they shouldn't,
grr) the BAR frame will be forcibly failed and the aggregation session
will be torn down.

Yes, another reason to disable bgscan until I've figured this out.

* There's way too much locking going on here. I'm going to do a couple
of further passes of sanitising and refactoring so the (re) locking
isn't so heavy. Right now I'm going for correctness, not speed.

* The BAR TX can fail if the hardware TX queue is full. Since there's
no "free" space kept for management frames, a full TX queue (from eg
an iperf test) can race with your ability to allocate ath_buf/mbufs
and cause issues. I'll knock this on the head with a subsequent
commit.

* I need to do some _much_ more thorough testing in hostap mode to ensure
that many concurrent traffic streams to different end nodes are correctly
handled. I'll find and squish whichever bugs show up here.

But, this is an important step to being able to flip on 802.11n by default.
The last issue (besides bug fixes, of course) is HT frame protection and
I'll address that in a subsequent commit.


# 233895 04-Apr-2012 adrian

Correctly handle AR_MoreAggr when assembling multi-descriptor final frames.

Linux ath9k doesn't have this issue as it doesn't try queuing multi-
descriptor frames to the hardware.

Before, I was only setting the first and last descriptor in the final
frame correctly - and that was done by accident. The first descriptor in
the last sub-frame was being correctly updated by ath_tx_setds_11n();
the last descriptor in the last sub-frame was being correctly updated
by ath_buf_set_rate(). But both of those are "incorrect".

The correct behaviour is:

* AR_IsAggr is set for all descriptors for all subframes in an aggregate.
* AR_MoreAggr is set for all descriptors for all non-final sub-frames
in an aggregate.

Ie, all descriptors in the last sub-frame of an aggregate must have this
field set to 0.

I still need to do a couple of extra passes to ensure the pad delimiter
field is being correctly handled in all descriptors in the last sub-frame.


# 233673 29-Mar-2012 adrian

Defer the rescheduling of TID -> TXQ frames in some instances.

Right now ath_txq_sched() is mainly called from the TX ath_tx_processq()
routine, which is (mostly) done as part of the taskqueue. It shouldn't
be called outside the taskqueue.

But now that I'm about to flip back on BAR TX, I'm going to start
stressing the ath_tx_tid_pause() and ath_tx_tid_resume() paths.
What I don't want to have happen is a reschedule of the TID traffic
_during_ the completion of TX frames.

Ideally I'd like to have a way to flag back up to the processing code
that the current hardware queue should be rechecked for software TID
queue frames. But for now, this should suffice for the BAR TX case.

I may eventually delete this code once I've brought some further
sanity to the general TX queue/completion path.


# 233227 20-Mar-2012 adrian

Delay sequence number allocation for A-MPDU until just before the frame
is queued to the hardware.

Because multiple concurrent paths can execute ath_start(), multiple
concurrent paths can push frames into the software/hardware TX queue
and since preemption/interrupting can occur, there's the possibility
that a gap in time will occur between allocating the sequence number
and queuing it to the hardware.

Because of this, it's possible that a thread will have allocated a
sequence number and then be preempted by another thread doing the same.
If the second thread sneaks the frame into the BAW, the (earlier) sequence
number of the first frame will be now outside the BAW and will result
in the frame being constantly re-added to the tail of the queue.
There it will live until the sequence numbers cycle around again.

This also creates a hole in the RX BAW tracking which can also cause
issues.

This patch delays the sequence number allocation to occur only just before
the frame is going to be added to the BAW. I've been wanting to do this
anyway as part of a general code tidyup but I've not gotten around to it.
This fixes the PR.

However, it still makes it quite difficult to try and ensure in-order
queuing and dequeuing of frames. Since multiple copies of ath_start()
can be run at the same time (eg one TXing process thread, one TX completion
task/one RX task) the driver may end up having frames dequeued and pushed
into the hardware slightly/occasionally out of order.

And, to make matters more annoying, net80211 may have the same behaviour -
in the non-aggregation case, the TX code allocates sequence numbers
before it's thrown to the driver. I'll open another PR to investigate
this and potentially introduce some kind of final-pass TX serialisation
before frames are thrown to the hardware. It's also very likely worthwhile
adding some debugging code into ath(4) and net80211 to catch when/if this
does occur.

PR: kern/166190


# 232794 10-Mar-2012 adrian

Fix a panic introduced in a previous commit - non-beaconing modes (eg STA)
don't setup the avp mcast queue.

This is a bit annoying though - it turns out the mcast queue isn't
initialised for STA mode but it's then touched to see whether anything
is in it. That should be fixed in a subsequent commit.

Noticed by: gperez@entel.upc.edu
PR: kern/165895


# 232764 10-Mar-2012 adrian

Don't flood the cabq/mcastq with frames.

In a very noisy 2.4GHz environment (with HT/40 enabled, making it worse)
I saw the following occur:

* the air was considered "busy" a lot of the time;
* the cabq time is quite short due to staggered beacons being enabled;
* it just wasn't able to keep up TX'ing CABQ frames;
* .. and the cabq would swallow up all the TX ath_buf's.

This patch introduces a twiddle which allows the maximum cabq depth to be
set, forcing further frames to be dropped.

It defaults to the TX buffer count at the moment, so the default behaviour
isn't changed.

I've also started fleshing out a similar setup for the data path, so
it doesn't swallow up all the available TX buffers and preventing management
frames (such as ADDBA) out.

PR: kern/165895


# 232163 25-Feb-2012 adrian

Attempt to further fix some of the concurrency/reset issues that occur.

* ath_reset() is being called in softclock context, which may have the
thing sleep on a lock. To avoid this, since we really _shouldn't_
be sleeping on any locks, break out the no-loss reset path into a tasklet
and call that from:

+ ath_calibrate()
+ ath_watchdog()

This has the added advantage that it'll end up also doing the frame
RX cleanup from within the taskqueue context, rather than the softclock
context.

* Shuffle around the taskqueue_block() call to be before we grab the lock
and disable interrupts.

The trouble here is that taskqueue_block() doesn't block currently
queued (but not yet running) tasks so calling it doesn't guarantee
no further tasks (that weren't running on _A_ CPU at the time of this
call) will complete. Calling taskqueue_drain() on these tasks won't
work because if any _other_ thread calls taskqueue_enqueue() for whatever
reason, everything gets very angry and stops working.

This slightly changes the race condition enough to let ath_rx_tasklet()
run before we try disabling it, and thus quietens the warnings a bit.

The (more) true solution will be doing something like the following:

* having a taskqueue_blocked mask in ath_softc;
* having an interrupt_blocked mask in ath_softc;
* only calling taskqueue_drain() on each individual task _after_ the
lock has been acquired - that way no further tasklet scheduling
is going to occur.
* Then once the tasks have been blocked _and_ the interrupt has been
disabled, call taskqueue_drain() on each, ensuring that anything
that _was_ scheduled or running is removed.

The trouble is if something calls taskqueue_enqueue() on a task
after taskqueue_blocked() has been called but BEFORE taskqueue_drain()
has been called, ta_pending will be set to 1 and taskqueue_drain()
will sit there stuck in msleep() until you hard-kill the machine.

PR: kern/165382
PR: kern/165220


# 231369 10-Feb-2012 adrian

Add in a new driver feature to allow the TX and RX chainmask to be
overridden at attach time.

Some 802.11n NICs may only have one physical antenna connected.
The radios will be very upset if you try enabling radios which aren't
connected to antennas.

This allows hints to override the TX and RX chainmask.

These hints are:

hint.ath.X.rx_chainmask
hint.ath.X.tx_chainmask

They can be set at either boot time or in kenv before the module is loaded.

This and the previous HAL commit were sponsored in late 2011 by Hobnob, Inc.

Sponsored by: Hobnob, Inc.


# 230493 24-Jan-2012 adrian

Fix up some style(9) indenting and reorganise some of the hal methods.

There should be no functional change due to this commit.


# 230492 24-Jan-2012 adrian

Add a missing HAL method macro. I'm using this as part of some personal
DFS radar stuff.


# 228891 26-Dec-2011 adrian

Flesh out configurable hardware based LED blinking.

The hardware (MAC) LED blinking involves a few things:

* Selecting which GPIO pins map to the MAC "power" and "network" lines;
* Configuring the MAC LED state (associated, scanning, idle);
* Configuring the MAC LED blinking type and speed.

The AR5416 HAL configures the normal blinking setup - ie, blink rate based
on TX/RX throughput. The default AR5212 HAL doesn't program in any
specific blinking type, but the default of 0 is the same.

This code introduces a few things:

* The hardware led override is configured via sysctl 'hardled';
* The MAC network and power LED GPIO lines can be set, or left at -1
if needed. This is intended to allow only one of the hardware MUX
entries to be configured (eg for PCIe cards which only have one LED
exposed.)

TODO:

* For AR2417, the software LED blinking involves software blinking the
Network LED. For the AR5416 and later, this can just be configured
as a GPIO output line. I'll chase that up with a subsequent commit.

* Add another software LED blink for "Link", separate from "activity",
which blinks based on the association state. This would make my
D-Link DWA-552 have consistent and useful LED behaviour (as they're
marked "Link" and "Activity."

* Don't expose the hardware LED override unless it's an AR5416 or later,
as the previous generation hardware doesn't have this multiplexing
setup.


# 227651 18-Nov-2011 adrian

Flesh out some slightly dirty reset/channel change serialisation code
for the ath(4) driver.

Currently, there's nothing stopping reset, channel change and general
TX/RX from overlapping with each other. This wasn't a big deal with
pre-11n traffic as it just results in some dropped frames.
It's possible this may have also caused some inconsistencies and
badly-setup hardware.

Since locks can't be held across all of this (the Linux solution)
due to LORs with the network stack locks, some state counter
variables are used to track what parts of the code the driver is
currently in.

When the hardware is being reset, it disables the taskqueue and
waits for pending interrupts, tx, rx and tx completion before
it begins the reset or channel change.

TX and RX both abort if called during an active reset or channel
change.

Finally, the reset path now doesn't flush frames if ATH_RESET_NOLOSS
is set. Instead, completed TX and RX frames are passed back up to
net80211 before the reset occurs.

This is not without problems:

* Raw frame xmit are just dropped, rather than placed on a queue.
The net80211 stack should be the one which queues these frames
rather than the driver.

* It's all very messy. It'd be better if these hardware operations
were serialised on some kind of work queue, rather than hoping
they can be run in parallel.

* The taskqueue block/unblock may occur in parallel with the
newstate() function - which shuts down the taskqueue and restarts
it once the new state is known. It's likely these operations should
be refcounted so the taskqueue is restored once no other areas
in the code wish to suspend operations.

* .. interrupt disable/enable should likely be refcounted as well.

With this work, the driver does not drop frames during stuck beacon
or fatal errors and thus 11n traffic continues to run correctly.
Default and full resets however do still drop frames and it's possible
this may occur, causing traffic loss and session stalls.

Sponsored by: Hobnob, Inc.


# 227346 08-Nov-2011 adrian

Merge in some fixes from the if_ath_tx branch.

* Close down some of the kickpcu races, where the interrupt handler
can and will run concurrently with the taskqueue.
* Close down the TXQ active/completed race between the interrupt
handler and the concurrently running tx completion taskqueue
function.
* Add some tx and rx interrupt count tracking, for debugging.
* Fix the kickpcu logic in ath_rx_proc() to not simply drain and
restart the TX queue - instead, assume the hardware isn't
(too) confused and just restart RX DMA. This may break on
previous chipsets, so if it does I'll add a HAL flag and
conditionally handle this (ie, for broken chipsets, I'll
just restore the "stop PCU / flush things / restart PCU"
logic.)
* Misc stuff

Sponsored by: Hobnob, Inc.


# 227344 08-Nov-2011 adrian

Migrate the STAILQ lists to TAILQs.

A bunch of the 11n TX aggregation logic wants to traverse lists of buffers
in various ways. In order to provide O(1) behaviour in this instance,
use TAILQs.

This does blow out the memory footprint and CPU cycles slightly for some
of these operations. I may convert some of these back to STAILQs once
the rest of the software transmit queue handling has been stabilised.

Sponsored by: Hobnob, Inc.


# 227328 08-Nov-2011 adrian

Begin merging in some of my 802.11n TX aggregation driver changes.

* Add a PCU lock, which isn't currently used but will eventually be
used to serialise some of the driver access.

* Add in all the software TX aggregation state, that's kept per-node
and per-TID.

* Add in the software and aggregation state to ath_buf.

* Add in hooks to ath_softc for aggregation state and the (upcoming)
aggregation TX state calls.

* Add / fix the HAL access macros.

Obtained from: Linux, ath9k
Sponsored by: Hobnob, Inc.


# 225818 28-Sep-2011 adrian

Update the default AIFS value for hostap mode.

Obtained from: Linux ath9k, Atheros reference


# 225444 07-Sep-2011 adrian

Update the TSF and next-TBTT methods to work for the AR5416 and later NICs.
This is another commit in a series of TDMA support fixes for the 11n NICs.

* Move ath_hal_getnexttbtt() into the HAL; write methods for it.
This returns a timer value in TSF, rather than TU.

* Move ath_hal_getcca() and ath_hal_setcca() into the HAL too, where they
likely now belong.

* Create a new HAL capability: HAL_CAP_LONG_RXDESC_TSF.
The pre-11n NICs write 15 bit TSF snapshots into the RX descriptor;
the AR5416 and later write 32 bit TSF snapshots into the RX descriptor.
* Use the new capability to choose between 15 and 31 bit TSF adjustment
functions in ath_extend_tsf().

* Write ar5416GetTsf64() and ar5416SetTsf64() methods.
ar5416GetTsf64() tries to compensate for TSF changes at the 32 bit boundary.

According to yin, this fixes the TDMA beaconing on 11n chipsets and TDMA
stations can now associate/talk, but there are still issues with traffic
stability which need to be investigated.

The ath_hal_extendtsf() function is also used in RX packet timestamping;
this may improve adhoc mode on the 11n chipsets. It also will affect the
timestamps seen in radiotap frames.

Submitted by: Kang Yin Su <cantona@cantona.net>
Approved by: re (kib)


# 224720 08-Aug-2011 adrian

And add another missing brace. Another pointy hat moment.
This one however isn't used by any public code yet, so it
didn't break the build.

Approved by: re (kib, blanket)


# 224715 08-Aug-2011 adrian

.. and add a missing bracket.

Approved by: re (kib, blanket)


# 224714 08-Aug-2011 adrian

Fix method naming to match the reference HAL definition.

Obtained from: Atheros
Approved by: re (kib, blanket)


# 224709 08-Aug-2011 adrian

Add another HAL method - ah_isFastClockEnabled - which returns AH_TRUE
if 5ghz fast clock is enabled in the current operating mode.

It's slightly dirty, but it's part of the reference HAL and used by
the (currently closed-source) radar event code to map radar pulses
back to microsecond durations.

Obtained from: Atheros
Approved by: re (kib, blanket)


# 224588 02-Aug-2011 adrian

Fix a corner case in RXEOL handling which was likely introduced by yours
truly.

Before 802.11n, the RX descriptor list would employ the "self-linked tail
descriptor" trick which linked the last descriptor back to itself.
This way, the RX engine would never hit the "end" of the list and stop
processing RX (and assert RXEOL) as it never hit a descriptor whose next
pointer was 0. It would just keep overwriting the last descriptor until
the software freed up some more RX descriptors and chained them onto the
end.

For 802.11n, this needs to stop as a self-linked RX descriptor tickles the
block-ack logic into ACK'ing whatever frames are received into that
self-linked descriptor - so in very busy periods, you could end up with
A-MPDU traffic that is ACKed but never received by the 802.11 stack.
This would cause some confusion as the ADDBA windows would suddenly
be out of sync.

So when that occured here, the last descriptor would be hit and the PCU
logic would stop. It would only start again when the RX descriptor list
was updated and the PCU RX engine was re-tickled. That wasn't being done,
so RXEOL would be continuously asserted and no RX would continue.

This patch introduces a new flag - sc->sc_kickpcu - which when set,
signals the RX task to kick the PCU after its processed whatever packets
it can. This way completed packets aren't discarded.

In case some other task gets called which resets the hardware, don't
update sc->sc_imask - instead, just update the hardware interrupt mask
directly and let either ath_rx_proc() or ath_reset() restore the imask
to its former setting.

Note: this bug was only triggered when doing a whole lot of frame snooping
with serial console IO in the RX task. This would defer interrupt processing
enough to cause an RX descriptor overflow. It doesn't happen in normal
conditions.

Approved by: re (kib, blanket)


# 224540 31-Jul-2011 adrian

Fix typo!

Approved by: re (kib)


# 222815 07-Jun-2011 adrian

Flesh out a new HAL method to fetch the radar PHY error frame information.

For the AR5211/AR5212, this is apparently a one byte pulse duration
counter value. It is only coded up here for the AR5212 as I don't have
any AR5211-series hardware to test it on.

This information was extracted from the Madwifi DFS branch along with
some local additions.

Please note - all this does is extract out the radar event duration,
it in no way reflects the presence of a radar. Further code is needed
to take a set of radar events and filter them to extract out correct
radar pulse trains (and ignore other events.)

For further information, please see:

http://wiki.freebsd.org/dev/ath_hal%284%29/RadarDetection

This includes references to the relevant patents which describe what
is going on.

Obtained from: Madwifi


# 222668 04-Jun-2011 adrian

A few changes to make radar detection implementable in a hal_dfs/
module.

* If sc->sc_dodfs is set to 1 by the ath_dfs_radar_enable(),
set the relevant rx filter bit to begin receiving radar PHY
errors. The HAL code already knows how to set the relevant
error mask register to enable radar events.

* Add a missing call to ath_dfs_radar_enable() after ath_hal_reset()

* change ath_dfs_process_phyerr() to take a const char *buf for now,
rather than a descriptor. This way it can get access to the packet
buffer contents.


# 222585 01-Jun-2011 adrian

Flesh out the radar detection related operations for the ath driver.

This is in no way a complete DFS/radar detection implementation!
It merely creates an abstracted interface which allows for future
development of the DFS radar detection code.

Note: Net80211 already handles the bulk of the DFS machinery,
all we need to do here is figure out that a radar event has occured
and inform it as such. It then drives the DFS state engine for us.

The "null" DFS radar detection module is included by default;
it doesn't require a device line.

This commit:

* Adds a simple abstracted layer for radar detection state -
sys/dev/ath/ath_dfs/;
* Implements a null DFS module which doesn't do anything;
(ie, implements the exact behaviour at the moment);
* Adds hooks to the ath driver to process received radar events
and gives the DFS module a chance to determine whether
a radar has been detected.

Obtained from: Atheros


# 222277 25-May-2011 adrian

The current ANI capability information uses a different set of
values for the commands, compared to the internal command values
(HAL_ANI_CMD.)

My eventual aim is to make the HAL_ANI_CMD internal enum match
the public API and then remove all this messiness.

This now allows HAL_CAP_INTMIT users to use a public HAL_CAP_INTMIT_
enum rather than magic constants.

The only magic constants currently used by if_ath are "enable" and
"present". Some local tools of mine allow for direct, manual fiddling
of the ANI variables and I'll convert these to use the public enum API
before I commit them.


# 220772 18-Apr-2011 adrian

Add global TX timeout handling.

The global TX timeout counter increments whenever a frame is ready
to be transmitted and the medium is busy.


# 220324 04-Apr-2011 adrian

Add a HAL capability bit for supporting self-linked RX descriptors and disable it for the 11n chipsets.

From the ath9k source:

==

11N: we can no longer afford to self link the last descriptor.
MAC acknowledges BA status as long as it copies frames to host
buffer (or rx fifo). This can incorrectly acknowledge packets
to a sender if last desc is self-linked.

==

Since this is useful for pre-AR5416 chips that communicate PHY errors
via error frames rather than by on-chip counters, leave the support
in there, but disable it for AR5416 and later.


# 220053 27-Mar-2011 adrian

Rename AH_ENABLE_11N to ATH_ENABLE_11 - the HAL supports 11n by
default but the ath driver doesn't. This is a much more consistent
name.


# 220033 26-Mar-2011 adrian

If 802.11n is enabled, bump the number of buffers used up to a larger
level.

This is important for AMPDU RX as each burst is multiple packets in a row.


# 218490 09-Feb-2011 adrian

Expose the 4k transaction workaround hooks to the driver, but don't (yet)
fix the descriptor allocation.


# 218151 01-Feb-2011 adrian

Add TX/RX chainmask info to if_ath - this is needed for the 11n TX rate series.


# 218067 29-Jan-2011 adrian

Fix some errors introduced w/ the last commit; fix setting RTS/CTS in the 11n rate scenario.

* I messed up a couple of things in if_athvar.h; so fix that.
* Undo some guesswork done in ar5416Set11nRateScenario() and introduce a
flags parameter which lets the caller set a few things. To begin with,
this includes whether to do RTS or CTS protection.
* If both RTS and CTS is set, only do RTS. Both RTS and CTS shouldn't be
set on a frame.


# 218066 29-Jan-2011 adrian

Link in the 11n specific TX methods into the HAL.


# 217684 21-Jan-2011 adrian

ANI changes #1 - split out the ANI polling from the RxMonitor hook.

The rxmonitor hook is called on each received packet. This can get very,
very busy as the tx/rx/chanbusy registers are thus read each time a packet
is received.

Instead, shuffle out the true per-packet processing which is needed and move
the rest of the ANI processing into a periodic event which runs every 100ms
by default.


# 217627 20-Jan-2011 adrian

Add in the public method to access the tx completion rates.


# 217624 20-Jan-2011 adrian

Include the initial support for external EEPROMs.

The AR9100 at least doesn't have an external serial EEPROM
attached to the MAC; it instead stores the calibration data
in the normal system flash.

I believe earlier parts can do something similar but I haven't
experienced it first-hand.

This commit introduces an eepromdata pointer into the API but
doesn't at all commit to using it. A future commit will
include the glue needed to allow the AR9100 support code
to use this data pointer as the EEPROM.


# 203683 08-Feb-2010 rpaulo

Add multicast key search support. This fixes corrupted mcast packets
when we have more than one hostap vap.

Submitted by: Russell Yount <russell.yount at gmail.com>
MFC after: 2 weeks


# 195807 21-Jul-2009 sam

track whether any mesh vaps are present to correctly setup the rx filter
when, for example, an ap vap is created first

Reviewed by: rpaulo
Approved by: re (kib)


# 195618 11-Jul-2009 rpaulo

Implementation of the upcoming Wireless Mesh standard, 802.11s, on the
net80211 wireless stack. This work is based on the March 2009 D3.0 draft
standard. This standard is expected to become final next year.
This includes two main net80211 modules, ieee80211_mesh.c
which deals with peer link management, link metric calculation,
routing table control and mesh configuration and ieee80211_hwmp.c
which deals with the actually routing process on the mesh network.
HWMP is the mandatory routing protocol on by the mesh standard, but
others, such as RA-OLSR, can be implemented.

Authentication and encryption are not implemented.

There are several scripts under tools/tools/net80211/scripts that can be
used to test different mesh network topologies and they also teach you
how to setup a mesh vap (for the impatient: ifconfig wlan0 create
wlandev ... wlanmode mesh).

A new build option is available: IEEE80211_SUPPORT_MESH and it's enabled
by default on GENERIC kernels for i386, amd64, sparc64 and pc98.

Drivers that support mesh networks right now are: ath, ral and mwl.

More information at: http://wiki.freebsd.org/WifiMesh

Please note that this work is experimental. Also, please note that
bridging a mesh vap with another network interface is not yet supported.

Many thanks to the FreeBSD Foundation for sponsoring this project and to
Sam Leffler for his support.
Also, I would like to thank Gateworks Corporation for sending me a
Cambria board which was used during the development of this project.

Reviewed by: sam
Approved by: re (kensmith)
Obtained from: projects/mesh11s


# 195114 27-Jun-2009 sam

Add HAL_RX_FILTER_BSSID support (to disable bssid match):
o add HAL_CAP_BSSIDMATCH to identify parts that have the support for
disabling bssid match
o honor capability for set/get rx filter
o use HAL_CAP_BSSIDMATCH in driver to decide whether to use the bssid
match disable or fall back to promisc mode

Reviewed by: rpaulo
Approved by: re (rwatson)


# 192468 20-May-2009 sam

Overhaul monitor mode handling:
o replace DLT_IEEE802_11 support in net80211 with DLT_IEEE802_11_RADIO
and remove explicit bpf support from wireless drivers; drivers now
use ieee80211_radiotap_attach to setup shared data structures that
hold the radiotap header for each packet tx/rx
o remove rx timestamp from the rx path; it was used only by the tdma support
for debugging and was mostly useless due to it being 32-bits and mostly
unavailable
o track DLT_IEEE80211_RADIO bpf attachments and maintain per-vap and
per-com state when there are active taps
o track the number of monitor mode vaps
o use bpf tap and monitor mode vap state to decide when to collect radiotap
state and dispatch frames; drivers no longer explicitly directly check
bpf state or use bpf calls to tap frames
o handle radiotap state updates on channel change in net80211; drivers
should not do this (unless they bypass net80211 which is almost always
a mistake)
o update various drivers to be more consistent/correct in handling radiotap
o update ral to include TSF in radiotap'd frames
o add promisc mode callback to wi

Reviewed by: cbzimmer, rpaulo, thompsa


# 190848 08-Apr-2009 sam

remove unused struct member


# 190579 30-Mar-2009 sam

Hoist 802.11 encapsulation up into net80211:
o call ieee80211_encap in ieee80211_start so frames passed down to drivers
are already encapsulated
o remove ieee80211_encap calls in drivers
o fixup wi so it recreates the 802.3 head it requires from the 802.11
header contents
o move fast-frame aggregation from ath to net80211 (conditional on
IEEE80211_SUPPORT_SUPERG):
- aggregation is now done in ieee80211_start; it is enabled when the
packets/sec exceeds ieee80211_ffppsmin (net.wlan.ffppsmin) and frames
are held on a staging queue according to ieee80211_ffagemax
(net.wlan.ffagemax) to wait for a frame to combine with
- drivers must call back to age/flush the staging queue (ath does this
on tx done, at swba, and on rx according to the state of the tx queues
and/or the contents of the staging queue)
- remove fast-frame-related data structures from ath
- add ieee80211_ff_node_init and ieee80211_ff_node_cleanup to handle
per-node fast-frames state (we reuse 11n tx ampdu state)
o change ieee80211_encap calling convention to include an explicit vap
so frames coming through a WDS vap are recognized w/o setting M_WDS

With these changes any device able to tx/rx 3Kbyte+ frames can use fast-frames.

Reviewed by: thompsa, rpaulo, avatar, imp, sephe


# 190571 30-Mar-2009 sam

Remove ATH_SUPPORT_TDMA and use IEEE80211_SUPPORT_TDMA instead. It
doesn't make much sense to configure driver support w/o net80211.
Note this means ath now depends on opt_wlan.h.


# 190096 19-Mar-2009 sam

purge hal abi support; now that the hal is merged w/ the driver
we cannot be out of sync

MFC after: 1 week


# 189605 09-Mar-2009 sam

replace if_watchdog w/ private callout; probably can merge this with the
calibration work sometime in the future


# 189380 04-Mar-2009 sam

add a sysctl to ena/dis frobbing cca


# 188974 23-Feb-2009 sam

5416 and later parts mux the gpio outputs; extend the api to include
a signal type that's used to select the appropriate mux


# 188783 19-Feb-2009 sam

remove private support for IEEE80211_MODE_HALF and IEEE80211_MODE_QUARTER
now that net80211 has them


# 187831 28-Jan-2009 sam

Overhaul regulatory support:
o remove HAL_CHANNEL; convert the hal to use net80211 channels; this
mostly involves mechanical changes to variable names and channel
attribute macros
o gut HAL_CHANNEL_PRIVATE as most of the contents are now redundant
with the net80211 channel available
o change api for ath_hal_init_channels: no more reglass id's, no more outdoor
indication (was a noop), anM contents
o add ath_hal_getchannels to have the hal construct a channel list without
altering runtime state; this is used to retrieve the calibration list for
the device in ath_getradiocaps
o add ath_hal_set_channels to take a channel list and regulatory data from
above and construct internal state to match (maps frequencies for 900MHz
cards, setup for CTL lookups, etc)
o compact the private channel table: we keep one private channel
per frequency instead of one per HAL_CHANNEL; this gives a big
space savings and potentially improves ani and calibration by
sharing state (to be seen; didn't see anything in testing); a new config
option AH_MAXCHAN controls the table size (default to 96 which
was chosen to be ~3x the largest expected size)
o shrink ani state and change to mirror private channel table (one entry per
frequency indexed by ic_devdata)
o move ani state flags to private channel state
o remove country codes; use net80211 definitions instead
o remove GSM regulatory support; it's no longer needed now that we
pass in channel lists from above
o consolidate ADHOC_NO_11A attribute with DISALLOW_ADHOC_11A
o simplify initial channel list construction based on the EEPROM contents;
we preserve country code support for now but may want to just fallback
to a WWR sku and dispatch the discovered country code up to user space
so the channel list can be constructed using the master regdomain tables
o defer to net80211 for max antenna gain
o eliminate sorting of internal channel table; now that we use ic_devdata
as an index, table lookups are O(1)
o remove internal copy of the country code; the public one is sufficient
o remove AH_SUPPORT_11D conditional compilation; we always support 11d
o remove ath_hal_ispublicsafetysku; not needed any more
o remove ath_hal_isgsmsku; no more GSM stuff
o move Conformance Test Limit (CTL) state from private channel to a lookup
using per-band pointers cached in the private state block
o remove regulatory class id support; was unused and belongs in net80211
o fix channel list construction to set IEEE80211_CHAN_NOADHOC,
IEEE80211_CHAN_NOHOSTAP, and IEEE80211_CHAN_4MSXMIT
o remove private channel flags CHANNEL_DFS and CHANNEL_4MS_LIMIT; these are
now set in the constructed net80211 channel
o store CHANNEL_NFCREQUIRED (Noise Floor Required) channel attribute in one
of the driver-private flag bits of the net80211 channel
o move 900MHz frequency mapping into the hal; the mapped frequency is stored
in the private channel and used throughout the hal (no more mapping in the
driver and/or net80211)
o remove ath_hal_mhz2ieee; it's no longer needed as net80211 does the
calculation and available in the net80211 channel
o change noise floor calibration logic to work with compacted private channel
table setup; this may require revisiting as we no longer can distinguish
channel attributes (e.g. 11b vs 11g vs turbo) but since the data is used
only to calculate status data we can live with it for now
o change ah_getChipPowerLimits internal method to operate on a single channel
instead of all channels in the private channel table
o add ath_hal_gethwchannel to map a net80211 channel to a h/w frequency
(always the same except for 900MHz channels)
o add HAL_EEBADREG and HAL_EEBADCC status codes to better identify regulatory
problems
o remove CTRY_DEBUG and CTRY_DEFAULT enum's; these come from net80211 now
o change ath_hal_getwirelessmodes to really return wireless modes supported
by the hardware (was previously applying regulatory constraints)
o return channel interference status with IEEE80211_CHANSTATE_CWINT (should
change to a callback so hal api's can take const pointers)
o remove some #define's no longer needed with the inclusion of
<net80211/_ieee80211.h>

Sponsored by: Carlson Wireless


# 186904 08-Jan-2009 sam

TDMA support for long distance point-to-point links using ath devices:
o add net80211 support for a tdma vap that is built on top of the
existing adhoc-demo support
o add tdma scheduling of frame transmission to the ath driver; it's
conceivable other devices might be capable of this too in which case
they can make use of the 802.11 protocol additions etc.
o add minor bits to user tools that need to know: ifconfig to setup and
configure, new statistics in athstats, and new debug mask bits

While the architecture can support >2 slots in a TDMA BSS the current
design is intended (and tested) for only 2 slots.

Sponsored by: Intel


# 185744 07-Dec-2008 sam

New periodic calibration scheme needed for 11n parts that have
multiple algorithms and potentially collect multiple samples.
Instead of a single calibration interval we now have short and long
intervals; the long interval roughly corresponds to the previous
single interval. The short interval is used to speedup collection
of samples and happens much quicker. We make calls using the short
interval until we're told the calibration work is complete at which
point we fallback to the long interval. In addition there is a
much longer reset interval used to flush all calibration state and
cause everthing to start anew.

With these changes you can also disable calibration entirely by
setting the long interval to zero.


# 185522 01-Dec-2008 sam

Switch to ath hal source code. Note this removes the ath_hal
module; the ath module now brings in the hal support. Kernel
config files are almost backwards compatible; supplying

device ath_hal

gives you the same chip support that the binary hal did but you
must also include

options AH_SUPPORT_AR5416

to enable the extended format descriptors used by 11n parts.
It is now possible to control the chip support included in a
build by specifying exactly which chips are to be supported
in the config file; consult ath_hal(4) for information.


# 185242 23-Nov-2008 sam

nuke special handling of RXORN interrupt; the hal marks the FATAL
bit in the interrupt status when RXORN is hit and the chip requires
a reset so our special handling was causing useless resets


# 184369 27-Oct-2008 sam

prepare for a new hal


# 184368 27-Oct-2008 sam

o With the addition of HT rates the set of h/w codes has a much wider range
making the use of sc_hwmap to do direct mapping impractical. Switch to
indexing by the rate index instead of the rate code and adjust associated
state and logic appropriately. This has several benefits including
simplification of the led code.
o fix radiotap capture of HT rates
o fix conditional compilation of HT radiotap support to be based on the
hal having 5416 support; not the ABI version as hal builds may or may
not include 5416 support


# 184358 27-Oct-2008 sam

Fixup statistics:
o update tx rssi data only when an ACK was received
o return tx rssi from sampled data instead of the last frame
o track noise floor
o return rx rssi and noise floor (was broken)


# 184354 27-Oct-2008 sam

add sys.dev.ath.X.intmit knob to enable/disable ANI
(the intmit name is historical)


# 184351 27-Oct-2008 sam

rename bf_flags to bf_txflags in preparation for the addition of flags
separate from the tx descriptor flags currently recorded


# 184347 27-Oct-2008 sam

remove driver-private equivalent of ni_txparms; it's now superfluous


# 183221 20-Sep-2008 sam

fix memory smash on lp64 platforms; mostly noticeable in user mode
as being unable to associate


# 182893 09-Sep-2008 rpaulo

Update for new HAL.

Reviewed by: sam


# 179401 28-May-2008 sam

Cleanup power handling and fix suspend/resume:
o do not put the chip into full sleep in ath_stop as it gains
nothing and causes many parts to hang in ath_detach because we
may touch the chip during vap teardown; this may also fix issues
with unloading the module
o add a note in ath_detach to explain ath_hal_detach puts the
chip in low power mode; this is useful to know as it means
unloading the module will place a pci device in the lowest
possible power state
o leave an #ifdef notyet marker for powering down the chip when
a device is marked down; we can't do that until we handle all
the ways the driver may be entered and touch the chip
o fix resume by reloading the h/w key cache as it's been clobbered
(for pci) by the socket being powered off; for station mode we
directly stop+init the chip and then simulate a beacon miss to
get the upper layers sync'd up; for other configs we must brute
force stop+start the vaps so they go through the state machine


# 178751 03-May-2008 sam

add back sysctl's to display the regdomain and country code from eeprom;
useful for debugging


# 178354 20-Apr-2008 sam

Multi-bss (aka vap) support for 802.11 devices.

Note this includes changes to all drivers and moves some device firmware
loading to use firmware(9) and a separate module (e.g. ral). Also there
no longer are separate wlan_scan* modules; this functionality is now
bundled into the wlan module.

Supported by: Hobnob and Marvell
Reviewed by: many
Obtained from: Atheros (some bits)


# 170530 11-Jun-2007 sam

Update 802.11 wireless support:
o major overhaul of the way channels are handled: channels are now
fully enumerated and uniquely identify the operating characteristics;
these changes are visible to user applications which require changes
o make scanning support independent of the state machine to enable
background scanning and roaming
o move scanning support into loadable modules based on the operating
mode to enable different policies and reduce the memory footprint
on systems w/ constrained resources
o add background scanning in station mode (no support for adhoc/ibss
mode yet)
o significantly speedup sta mode scanning with a variety of techniques
o add roaming support when background scanning is supported; for now
we use a simple algorithm to trigger a roam: we threshold the rssi
and tx rate, if either drops too low we try to roam to a new ap
o add tx fragmentation support
o add first cut at 802.11n support: this code works with forthcoming
drivers but is incomplete; it's included now to establish a baseline
for other drivers to be developed and for user applications
o adjust max_linkhdr et. al. to reflect 802.11 requirements; this eliminates
prepending mbufs for traffic generated locally
o add support for Atheros protocol extensions; mainly the fast frames
encapsulation (note this can be used with any card that can tx+rx
large frames correctly)
o add sta support for ap's that beacon both WPA1+2 support
o change all data types from bsd-style to posix-style
o propagate noise floor data from drivers to net80211 and on to user apps
o correct various issues in the sta mode state machine related to handling
authentication and association failures
o enable the addition of sta mode power save support for drivers that need
net80211 support (not in this commit)
o remove old WI compatibility ioctls (wicontrol is officially dead)
o change the data structures returned for get sta info and get scan
results so future additions will not break user apps
o fixed tx rate is now maintained internally as an ieee rate and not an
index into the rate set; this needs to be extended to deal with
multi-mode operation
o add extended channel specifications to radiotap to enable 11n sniffing

Drivers:
o ath: add support for bg scanning, tx fragmentation, fast frames,
dynamic turbo (lightly tested), 11n (sniffing only and needs
new hal)
o awi: compile tested only
o ndis: lightly tested
o ipw: lightly tested
o iwi: add support for bg scanning (well tested but may have some
rough edges)
o ral, ural, rum: add suppoort for bg scanning, calibrate rssi data
o wi: lightly tested

This work is based on contributions by Atheros, kmacy, sephe, thompsa,
mlaier, kevlo, and others. Much of the scanning work was supported by
Atheros. The 11n work was supported by Marvell.


# 170375 06-Jun-2007 sam

update copyrights to 2007 and convert to be 2-clause bsd-only


# 167252 05-Mar-2007 sam

Change mtx's to use the formulated name as type so witness does not
complain on nested tx q lock acquisitions when processing the cab q.

MFC after: 2 weeks


# 166954 24-Feb-2007 sam

set the antenna switch when fixing the tx antenna using the
dev.ath.X.txantenna sysctl; this is typically what folks
want but beware this has the side effect of disabling rx
diversity

MFC after: 2 weeks


# 166016 15-Jan-2007 sam

add compat shim for ath_hal_isgsmsku until the new hal gets committed


# 166013 14-Jan-2007 sam

Add initial support for 900MHz cards like the Ubiquiti SR9:
o eliminate assumptions that half/quarter rate channels on exist in 11a
o handle frequency mapping between hal and net80211; hal gives us freq's
in the range 2422..2437 that we remap

MFC after: 1 month


# 165571 27-Dec-2006 sam

Add half/quarter rate 11a channel support:
o change handling of regdomain-related mib knobs so they can be set
post-attach: regdomain, countrycode, outdoor, and xchanmode; the
hal will not permit changing the regdomain but we expose it for now
o on regdomain/countrycode change recalculate the channel list and
push it to the net80211 layer (NB: looks to need more tweaking)
o setup rate tables for half/quarter rate channels
o honor half/quarter rate channel configs when changing channels
o honor half/quarter rate channel configs when setting the slot time
o use hack/nonstandard channel numbering scheme for the public safety
band to avoid overlapping 2.4G channels on dual-band cards
o remove setup of ic_sup_rates; the net80211 layer can do this for us
and it simplifies handling of half/quarter rate channels

Tested only in Public Safety Band with cards that have RF5112.


# 165185 13-Dec-2006 sam

Track v0.9.20.3 hal:

o no more ds_vdata in tx/rx descriptors
o split h/w tx/rx descriptor from s/w status
o as part of the descriptor split change the rate control module api
so the ath_buf is passed in to the module so it can fetch both
descriptor and status information as needed
o add some const poisoning

Also for sample rate control algorithm:

o split debug msgs (node, rate, any)
o uniformly bounds check rate indices (and in some cases correct checks)
o move array index ops to after bounds checking
o use final tsi from the status block instead of the h/w descriptor
o replace h/w descriptor struct's with proper mask+shift defs (this
doesn't belong here; everything is known by the driver and should
just be sent down so there's no h/w-specific knowledge)

MFC after: 1 month


# 163187 09-Oct-2006 sam

correct diag request to fetch isr state on fatal interrupts

MFC after: 1 week


# 162410 18-Sep-2006 sam

Add support for newer parts that do not require separate keycache
entries for tx+rx mic keys. This requires a newer hal, but works
fine with the current hal in cvs.

MFC after: 2 weeks


# 162409 18-Sep-2006 sam

remove stub radar support; it's never been used and future
hal's will not include the calls (due to redesign)

MFC after: 1 week


# 161425 17-Aug-2006 imp

while (0); -> while (0) in multi-line macros


# 159938 26-Jun-2006 sam

Close race in handling mcast traffic when operating as an ap with
stations in power save: add a new q where mcast frames are stashed
and on beacon update (at DTIM) move frames from the mcast q to the
cabq and start it. This ensures the cabq is only manipulated in
one place.

Sponsored by: Hobnob
MFC after: 2 weeks


# 159290 05-Jun-2006 sam

move hal bus+tag externalization to the bus glue code where it belongs;
this is a noop on all current freebsd architectures

MFC after: 1 month


# 158298 05-May-2006 sam

correct type

MFC after: 2 weeks


# 156073 27-Feb-2006 sam

backout 1.136 until we can resolve report that it causes output to stall


# 155991 24-Feb-2006 sam

fix a race whereby a tx descriptor might get reused before the hardware
is finished with it; this may only occur when the tx queue is setup as
dba-gated but since the fix is cheap apply it to all queues

while here make the queue depth signed for use in assertions

Reviewed by: apatti
MFC after: 2 weeks


# 155732 15-Feb-2006 sam

o handle fatal errors directly instead of via the task queue
o temporarily dump some h/w state for diagnosis; this will be
removed once some issues are resolved

MFC after: 2 weeks


# 155515 10-Feb-2006 sam

Update for rev 0.9.16.16 hal:
o add dfs+radar hooks; DFS is presently disabled in the hal
o channel and mode handling changes
o various api changes
o be more aggressive about iq calibration settling so ap mode
operation is better immediately after startup
o rfkill/rfsilent sysctl support
o tpc ack/cts sysctl support

MFC after: 2 weeks


# 155496 09-Feb-2006 sam

Beacon timer setup fixes:
o pull nexttbtt forward in adhoc mode too
o resync beacon timers on joining a bss or ibss as the tstamp we
collected while scanning is almost certainly out of date

Note we may need to refine the ibss mode check in ath_recv_mgmt.

Reviewed by: avatar, dyoung
Obtained from: atheros
MFC after: 2 weeks


# 155492 09-Feb-2006 sam

Phantom beacon miss workaround: track the tsf of the last received
frame and if we get a beacon miss interrupt ignore it if we've received
a frame within the beacon miss interval. This should never trigger
and the handling at the net80211 layer should likewise deal with this
but it doesn't hurt and can suppress extranous probe request frames.
Note that we can legtimately get a bmiss when under heavy load.

MFC after: 2 weeks


# 155491 09-Feb-2006 sam

use a private task queue thread

MFC after: 2 weeks


# 155490 09-Feb-2006 sam

add adhoc demo mode support

MFC after: 2 weeks


# 155489 09-Feb-2006 sam

make regdomain sysctl r/w in case it's possible to do this in the future

MFC after: 2 weeks


# 155486 09-Feb-2006 sam

add tx99 hooks

MFC after: 2 weeks


# 155485 09-Feb-2006 sam

move hal statistics to softc; the per-node stats are overkill, they're
only used when operating in station mode

MFC after: 2 weeks


# 155483 09-Feb-2006 sam

honor net80211 mcast tx rate

MFC after: 2 weeks


# 155482 09-Feb-2006 sam

craft unique names for tx q + buffer mtx's to help with interpreting ktr data

MFC after: 2 weeks


# 155481 09-Feb-2006 sam

allow the size of tx+rx buffer pools to be tuned

MFC after: 2 weeks


# 155480 09-Feb-2006 sam

lower try count on mgt (and ctl) frames to avoid clogging the tx queue
and loading the bss when operating in ap mode under load; adjust recognition
of multi-rate retry to match

MFC after: 2 weeks


# 155477 09-Feb-2006 sam

move mgt frame tx rate responsibility from the rate control modules
to the driver; this avoids redundant logic and will be necessary
for future additions

MFC after: 2 weeks


# 154140 09-Jan-2006 sam

Update monitoring support:
o record tsf in tx+rx frames
o switch from raw rssi to dbm for signal data and record both
signal and noise floor data (hacked for now to assume a fixed
noise floor; is correct with new hal)
o add monpass sysctl to control which rx'd frames are passed
up with errors; especially useful to see frames with CRC errors
o mark 'd packets w/ a CRC error with radiotap's BADFCS flag

Also add placeholder code for calibrating the noise floor when
using newer hals.

Reviewed by: avatar
MFC after: 1 week


# 152448 15-Nov-2005 sam

nuke special handling to extend cts when bursting; it was race prone

MFC after: 7 days


# 148863 08-Aug-2005 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table loookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechansim in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks


# 148362 24-Jul-2005 sam

o fix setup of sc_diversity; the hal does not give us reliable
status after attach, only after a reset
o when setting diversity via the sysctl don't update sc_diversity
until we know the hal requested worked
o while here eliminate sc_hasdiversity and sc_hastpc; just query
the hal each time since these are the only places we need to know

MFC after: 3 days


# 147803 06-Jul-2005 sam

only invoke ath_rate_tx_complete to update rate control state when the
frame being sent is to be ack'd and hasn't been filtered by the h/w;
this insures we don't pass in tx descriptors that have no meaningful
state (e.g. mcast/bcast frames are not acked and so have no tx retry
counts)

Approved by: re (scottl)
Obtained from: Atheros


# 147256 10-Jun-2005 brooks

Stop embedding struct ifnet at the top of driver softcs. Instead the
struct ifnet or the layer 2 common structure it was embedded in have
been replaced with a struct ifnet pointer to be filled by a call to the
new function, if_alloc(). The layer 2 common structure is also allocated
via if_alloc() based on the interface type. It is hung off the new
struct ifnet member, if_l2com.

This change removes the size of these structures from the kernel ABI and
will allow us to better manage them as interfaces come and go.

Other changes of note:
- Struct arpcom is no longer referenced in normal interface code.
Instead the Ethernet address is accessed via the IFP2ENADDR() macro.
To enforce this ac_enaddr has been renamed to _ac_enaddr.
- The second argument to ether_ifattach is now always the mac address
from driver private storage rather than sometimes being ac_enaddr.

Reviewed by: sobomax, sam


# 147067 06-Jun-2005 sam

Set the correct IFS parameters for the beacon tx queue
when operating in ap and adhoc modes.


# 147057 06-Jun-2005 sam

Misc keycache changes:
o purge ath_initkeytable; it's not needed
o add multicast key search support for supporting multiple group keys
(disabled for now; requires updated hal)
o create keycache entry for stations using open auth so they get h/w
antenna management support
o add keycache -> node mapping table; eliminates mac-based lookup in
the net80211 layer


# 144961 12-Apr-2005 sam

honor new IEEE80211_KEY_GROUP key flag

Reviewed by: Tai-hwa Liang


# 144617 04-Apr-2005 sam

use frame type returned by ieee80211_input to drive softled code
instead of monitoring the input packet count


# 144346 30-Mar-2005 sam

o extend cts to cover packet burst when operating in 11g w/ protection
o check current channel parameters, not shadow state, for acm policy
on data frames


# 140761 24-Jan-2005 sam

Fixup radiotap handling of FCS and QoS frames per discussion with David Young:
o mark rx frames including FCS in the payload with the
IEEE80211_RADIOTAP_F_FCS flag
o remove hack to copy 802.11 headers with padding out of line; instead mark
the frames with IEEE80211_RADIOTAP_F_DATAPAD and require applications to
do the work
o split precalculated radiotap flags into tx+rx now that they can be different

Note the full usefulness of these changes depends on updates to applications
that process radiotap data.


# 140438 18-Jan-2005 sam

adjust tx buffer allocation based on empirical testing:
o increase the max per-frame tx descriptor count and the number of tx
buffers for forthcoming fast frame support
o correct the max scatter/gather count; it cannot be larger than the
max(tx,rx,beacon) descriptor counts


# 140432 18-Jan-2005 sam

better led blinking


# 139530 31-Dec-2004 sam

bump copyright for 2005


# 139500 31-Dec-2004 sam

Radiotap fixups:
o catch one place where we were not using ath_chan_change to
switch channels; this fixes a problem where the channel
settings were not being correctly reported in captured packets
o return unique channel identification in the channel flags;
ethereal gets confused if you return merged flags (e.g. ofdm,
cck, and 2Ghz) (this is workaround and should be removed if
we can ever cleanup radiotap consumers)
o correct short/long preamble flag state for rx and treat tx
the same--use a new hwflags array that gives us the data
based on the h/w rate index/cookie
o add gross hack to handle radiotap capture of frames that
come in with hardware padding; should be replaced by a
flag in the radiotap header and more smarts in the apps
that decode radiotap data


# 138570 08-Dec-2004 sam

Update with last year of work.


# 127784 03-Apr-2004 sam

do proper subclassing of node free+copy; the previous hack falls apart when
the 802.11 layer does useful work

Obtained from: madwifi


# 127781 02-Apr-2004 sam

transmit beacon frames directly instead of defering them to a swi; there
was too much delay

Obtained from: madwifi


# 127780 02-Apr-2004 sam

update copyright notice for 2004


# 127698 31-Mar-2004 sam

radiotap updates:

o force little-endian byte order for header
o pad header to 32-bit boundary to guard against applications that assume
packet data alignment


# 123044 28-Nov-2003 sam

o track API change for HAL v0.9.6.1
o fix race condition when processing rx descriptors: because we use
a self-linked descriptor at the end of the rx descriptor list to
avoid rx overruns (which can easily happen for 5212 parts that enable
PHY errors) we must carefully check that a descriptor is "done" by
looking ahead to the next descriptor before believing the done bit
in the current descriptor (this is all handled in the HAL since the
rx descriptor format is chip-specific so we need to pass in two
additional parameters--the physical address of the current descriptor
and the virtual address of the next descriptor in the list)
o check copyout return status for SIOCGATHSTATS ioctl

Approved by: re (scottl)


# 121100 14-Oct-2003 sam

o convert mutex calls to #defines for portability, etc.
o destroy mutex's on detach (was missing)


# 120105 15-Sep-2003 sam

Maintain a history of data associated with received frames and use this to
calculate smoothed signal quality data for each node.

o add a 16-deep history buffer to each driver-private node storage that
holds rssi and antenna info for received frames
o override the default per-node "get rssi" method to return an average
rssi value based on samples collected over the last second
o enable beacon reception so even idle systems maintain a running history
of signal quality

This data may also be useful for improving the rate control algorithm.
Based on work by Tom Marshall <tommy@home.tig-grr.com> for MADWIFI.


# 120071 14-Sep-2003 sam

o mark the device capable of short preamble (meaningless for the 5210 but
safe since the 802.11 layer does the right thing for 11a operation)
o select short preamble operation based on the negotiated capabilities; not
just the local state/capability
o fillin the duration field in the 802.11 header as appropriate
o remove detection of 11g support; no longer needed

Obtained from: MADWIFI (with modifications)


# 119783 05-Sep-2003 sam

Add support for the experimental radiotap capture format. With this
we no longer need the debugging code to dump packets.


# 119150 19-Aug-2003 sam

MFp4 changes to fix locking issues and correct reference
count handling of station entries in hostap mode:

Input path:

o driver is now expected to find the node associated with the
sender of a received frame; use ic_bss if none is located
o driver passes the (referenced) node into ieee80211_input for
use within the wlan module and is responsible for cleaning up
on return
o the antenna state is no longer passed up with each frame; this
is now considered driver-private state and drivers are responsible
for keeping it in the driver-private part of a node

Output path:

Revamp output path for management frames to eliminate redundant
locking that causes problems and to correct reference counting
bogosity that occurs when stations are timed out due to inactivity
(in AP mode). On output the refcnt'd node is stashed in the pkthdr's
recvif field (yech) and retrieved by the driver. This eliminates
an unref/ref scenario and related node table unlock/lock due to the
driver looking up the node. This is particularly important when
stations are timed out as this causes a lock order reversal that
can result in a deadlock. As a byproduct we also reduce the overhead
for sending management frames (minimal). Additional fallout from
this is a change to ieee80211_encap to return a refcn't node for
tieing to the outbound frame. Node refcnts are not reclaimed until
after a frame is completely processed (e.g. in the tx interrupt
handler). This is especially important for timed out stations as
this deref will be the final one causing the node entry to be
reclaimed.

Additional semi-related changes:
o replace m_copym use with m_copypacket (optimization)
o add assert to verify ic_bss is never free'd during normal operation
o add comments explaining calling conventions by drivers for frames
going in each direction
o remove extraneous code that "cannot be executed" (e.g. because
pointers may never be null)


# 119144 19-Aug-2003 sam

maintain a table for mapping hardware rate codes to 802.11 rates for
calculating the rate for each rx'd frame


# 117812 20-Jul-2003 sam

track changes to 802.11 code:

o override new_state method per new model
o use ieee80211_state_name instead of private copy


# 117516 13-Jul-2003 sam

o add read-only sysctls to view regulatory domain, country code, and
outdoor use controls
o use sysctl-visible values in setting up channel list


# 116743 23-Jun-2003 sam

Atheros 802.11 driver. Requires Atheros Hardware Access Lay (HAL).

Supported by: Atheros Comunications