Cross Reference: /freebsd-11-stable/sys/dev/ath/if

History log of /freebsd-11-stable/sys/dev/ath/if_athvar.h
Revision	Date	Author	Comments (<<< Hide modified files) (Show modified files >>>)
# 331722	29-Mar-2018	eadler	Revert r330897: This was intended to be a non-functional change. It wasn't. The commit message was thus wrong. In addition it broke arm, and merged crypto related code. Revert with prejudice. This revert skips files touched in r316370 since that commit was since MFCed. This revert also skips files that require $FreeBSD$ property changes. Thank you to those who helped me get out of this mess including but not limited to gonzo, kevans, rgrimes. Requested by: gjb (re)
# 330897	14-Mar-2018	eadler	Partial merge of the SPDX changes These changes are incomplete but are making it difficult to determine what other changes can/should be merged. No objections from: pfg
# 302408	07-Jul-2016	gjb	Copy head@r302406 to stable/11 as part of the 11.0-RELEASE cycle. Prune svn:mergeinfo from the new branch, as nothing has been merged here. Additional commits post-branch will follow. Approved by: re (implicit) Sponsored by: The FreeBSD Foundation /freebsd-11-stable/MAINTAINERS /freebsd-11-stable/cddl /freebsd-11-stable/cddl/contrib/opensolaris /freebsd-11-stable/cddl/contrib/opensolaris/cmd/dtrace/test/tst/common/print /freebsd-11-stable/cddl/contrib/opensolaris/cmd/zfs /freebsd-11-stable/cddl/contrib/opensolaris/lib/libzfs /freebsd-11-stable/contrib/amd /freebsd-11-stable/contrib/apr /freebsd-11-stable/contrib/apr-util /freebsd-11-stable/contrib/atf /freebsd-11-stable/contrib/binutils /freebsd-11-stable/contrib/bmake /freebsd-11-stable/contrib/byacc /freebsd-11-stable/contrib/bzip2 /freebsd-11-stable/contrib/com_err /freebsd-11-stable/contrib/compiler-rt /freebsd-11-stable/contrib/dialog /freebsd-11-stable/contrib/dma /freebsd-11-stable/contrib/dtc /freebsd-11-stable/contrib/ee /freebsd-11-stable/contrib/elftoolchain /freebsd-11-stable/contrib/elftoolchain/ar /freebsd-11-stable/contrib/elftoolchain/brandelf /freebsd-11-stable/contrib/elftoolchain/elfdump /freebsd-11-stable/contrib/expat /freebsd-11-stable/contrib/file /freebsd-11-stable/contrib/gcc /freebsd-11-stable/contrib/gcclibs/libgomp /freebsd-11-stable/contrib/gdb /freebsd-11-stable/contrib/gdtoa /freebsd-11-stable/contrib/groff /freebsd-11-stable/contrib/ipfilter /freebsd-11-stable/contrib/ldns /freebsd-11-stable/contrib/ldns-host /freebsd-11-stable/contrib/less /freebsd-11-stable/contrib/libarchive /freebsd-11-stable/contrib/libarchive/cpio /freebsd-11-stable/contrib/libarchive/libarchive /freebsd-11-stable/contrib/libarchive/libarchive_fe /freebsd-11-stable/contrib/libarchive/tar /freebsd-11-stable/contrib/libc++ /freebsd-11-stable/contrib/libc-vis /freebsd-11-stable/contrib/libcxxrt /freebsd-11-stable/contrib/libexecinfo /freebsd-11-stable/contrib/libpcap /freebsd-11-stable/contrib/libstdc++ /freebsd-11-stable/contrib/libucl /freebsd-11-stable/contrib/libxo /freebsd-11-stable/contrib/llvm /freebsd-11-stable/contrib/llvm/projects/libunwind /freebsd-11-stable/contrib/llvm/tools/clang /freebsd-11-stable/contrib/llvm/tools/lldb /freebsd-11-stable/contrib/llvm/tools/llvm-dwarfdump /freebsd-11-stable/contrib/llvm/tools/llvm-lto /freebsd-11-stable/contrib/mdocml /freebsd-11-stable/contrib/mtree /freebsd-11-stable/contrib/ncurses /freebsd-11-stable/contrib/netcat /freebsd-11-stable/contrib/ntp /freebsd-11-stable/contrib/nvi /freebsd-11-stable/contrib/one-true-awk /freebsd-11-stable/contrib/openbsm /freebsd-11-stable/contrib/openpam /freebsd-11-stable/contrib/openresolv /freebsd-11-stable/contrib/pf /freebsd-11-stable/contrib/sendmail /freebsd-11-stable/contrib/serf /freebsd-11-stable/contrib/sqlite3 /freebsd-11-stable/contrib/subversion /freebsd-11-stable/contrib/tcpdump /freebsd-11-stable/contrib/tcsh /freebsd-11-stable/contrib/tnftp /freebsd-11-stable/contrib/top /freebsd-11-stable/contrib/top/install-sh /freebsd-11-stable/contrib/tzcode/stdtime /freebsd-11-stable/contrib/tzcode/zic /freebsd-11-stable/contrib/tzdata /freebsd-11-stable/contrib/unbound /freebsd-11-stable/contrib/vis /freebsd-11-stable/contrib/wpa /freebsd-11-stable/contrib/xz /freebsd-11-stable/crypto/heimdal /freebsd-11-stable/crypto/openssh /freebsd-11-stable/crypto/openssl /freebsd-11-stable/gnu/lib /freebsd-11-stable/gnu/usr.bin/binutils /freebsd-11-stable/gnu/usr.bin/cc/cc_tools /freebsd-11-stable/gnu/usr.bin/gdb /freebsd-11-stable/lib/libc/locale/ascii.c /freebsd-11-stable/sys/cddl/contrib/opensolaris /freebsd-11-stable/sys/contrib/dev/acpica /freebsd-11-stable/sys/contrib/ipfilter /freebsd-11-stable/sys/contrib/libfdt /freebsd-11-stable/sys/contrib/octeon-sdk /freebsd-11-stable/sys/contrib/x86emu /freebsd-11-stable/sys/contrib/xz-embedded /freebsd-11-stable/usr.sbin/bhyve/atkbdc.h /freebsd-11-stable/usr.sbin/bhyve/bhyvegc.c /freebsd-11-stable/usr.sbin/bhyve/bhyvegc.h /freebsd-11-stable/usr.sbin/bhyve/console.c /freebsd-11-stable/usr.sbin/bhyve/console.h /freebsd-11-stable/usr.sbin/bhyve/pci_fbuf.c /freebsd-11-stable/usr.sbin/bhyve/pci_xhci.c /freebsd-11-stable/usr.sbin/bhyve/pci_xhci.h /freebsd-11-stable/usr.sbin/bhyve/ps2kbd.c /freebsd-11-stable/usr.sbin/bhyve/ps2kbd.h /freebsd-11-stable/usr.sbin/bhyve/ps2mouse.c /freebsd-11-stable/usr.sbin/bhyve/ps2mouse.h /freebsd-11-stable/usr.sbin/bhyve/rfb.c /freebsd-11-stable/usr.sbin/bhyve/rfb.h /freebsd-11-stable/usr.sbin/bhyve/sockstream.c /freebsd-11-stable/usr.sbin/bhyve/sockstream.h /freebsd-11-stable/usr.sbin/bhyve/usb_emul.c /freebsd-11-stable/usr.sbin/bhyve/usb_emul.h /freebsd-11-stable/usr.sbin/bhyve/usb_mouse.c /freebsd-11-stable/usr.sbin/bhyve/vga.c /freebsd-11-stable/usr.sbin/bhyve/vga.h
# 302100	22-Jun-2016	adrian	[ath] fix comments! I keep asking myself "what do these fields mean" and so now I've clarified it for myself. Tested: * Reading the comments, going "a-ha!" a couple times. Approved by: re (gjb)
# 301181	01-Jun-2016	adrian	[ath] commit initial bluetooth coexistence support for the MCI NICs. This is the initial framework to call into the MCI HAL routines and drive the basic state engine. The MCI bluetooth coex model uses a command channel between wlan and bluetooth, rather than a 2-wire or 3-wire signaling protocol to control things. This means the wlan and bluetooth chip exchange a lot more information and signaling, even at the per-packet level. The NICs in question can share the input LNA and output PA on the die, so they absolutely can't stomp on each other in a silly fashion. It also allows for the bluetooth side to signal when profiles come and go, so the driver can take appropriate control. There's also the possibility of dynamic bluetooth/wlan duty cycle control which I haven't yet really played with. It configures things up with a static "wlan wins everything" coexistence, configures up the available 2GHz channel map for bluetooth, sets a static duty cycle for bluetooth/wifi traffic priority and drives the basics needed to keep the MCI HAL code happy. It doesn't do any actual coexistence except to default to "wlan wins everything", which at least demonstrates that things do indeed work. Bluetooth inquiry frames still trump wifi (including beacons), so that demonstrates things really do indeed seem to work. Tested: * AR9462 (WB222), STA mode + bt * QCA9565 (WB335), STA mode + bt TODO: * .. the rest of coexistence. yes, bluetooth, not people. That stuff's hard. * It doesn't do the initial BT side calibration, which requires a WLAN chip reset. I'll fix up the reset path a bit more first before I enable that. * The 1-ant and 2-ant configuration bits aren't being set correctly in if_ath_btcoex.c - I'll dig into that and fix it in a subsequent commit. * It's not enabled by default for WB222/WB225 even though I believe it now can be - I'll chase that up in a subsequent commit. Obtained from: Qualcomm Atheros, Linux ath9k
# 298939	02-May-2016	pfg	dev/ath: minor spelling fixes in comments. No functional change. Reviewed by: adrian
# 298608	25-Apr-2016	adrian	[ath] add LDPC capability support and LDPC RX support. This enables LDPC receive support for the AR9300 chips that support it. It'll announce LDPC support via net80211. Tested: * AR9380, STA mode * AR9331, (to verify the HAL didn't attach it to a chip which doesn't support LDPC.) TODO: * Add in net80211 machinery to make this configurable at runtime.
# 290612	09-Nov-2015	adrian	ath(4): begin fleshing out a "reset type" extension to force cold/warn resets. Right now the only way to force a cold reset is: * The HAL itself detects it's needed, or * The sysctl, setting all resets to be cold. Trouble is, cold resets take quite a bit longer than warm resets. However, there are situations where a cold reset would be nice. Specifically, after a stuck beacon, BB/MAC hang, stuck calibration results, etc. The vendor HAL has a separate method to set the reset reason (which is how HAL_RESET_BBPANIC gets set) which informs the HAL during the reset path why it occured. This is almost but not quite the same; I may eventually unify both approaches in the future. This commit just extends HAL_RESET_TYPE to include both status (eg BBPANIC) and type (eg do COLD.) None of the HAL code uses it yet though; that'll come later. It also is a big no-op in each HAL - I need to go teach each of the HALs about cold/warm reset through this path.
# 290474	06-Nov-2015	adrian	ath(4) - reflect whether this is a full or fast channel change. It's no longer "outdoor."
# 288349	29-Sep-2015	adrian	Remove the references to the TX IC lock - i ended up solving this using net80211 to seralise encap+xmit, so now it's a non-issue.
# 288095	22-Sep-2015	adrian	net80211: include one copy of struct ieee80211_beacon_offsets into ieee80211vap Submitted by: Andriy Voskoboinyk <s3erios@gmail.com> Differential Revision: https://reviews.freebsd.org/D3658
# 287197	27-Aug-2015	glebius	Replay r286410. Change KPI of how device drivers that provide wireless connectivity interact with the net80211 stack. Historical background: originally wireless devices created an interface, just like Ethernet devices do. Name of an interface matched the name of the driver that created. Later, wlan(4) layer was introduced, and the wlanX interfaces become the actual interface, leaving original ones as "a parent interface" of wlanX. Kernelwise, the KPI between net80211 layer and a driver became a mix of methods that pass a pointer to struct ifnet as identifier and methods that pass pointer to struct ieee80211com. From user point of view, the parent interface just hangs on in the ifconfig list, and user can't do anything useful with it. Now, the struct ifnet goes away. The struct ieee80211com is the only KPI between a device driver and net80211. Details: - The struct ieee80211com is embedded into drivers softc. - Packets are sent via new ic_transmit method, which is very much like the previous if_transmit. - Bringing parent up/down is done via new ic_parent method, which notifies driver about any changes: number of wlan(4) interfaces, number of them in promisc or allmulti state. - Device specific ioctls (if any) are received on new ic_ioctl method. - Packets/errors accounting are done by the stack. In certain cases, when driver experiences errors and can not attribute them to any specific interface, driver updates ic_oerrors or ic_ierrors counters. Details on interface configuration with new world order: - A sequence of commands needed to bring up wireless DOESN"T change. - /etc/rc.conf parameters DON'T change. - List of devices that can be used to create wlan(4) interfaces is now provided by net.wlan.devices sysctl. Most drivers in this change were converted by me, except of wpi(4), that was done by Andriy Voskoboinyk. Big thanks to Kevin Lo for testing changes to at least 8 drivers. Thanks to pluknet@, Oliver Hartmann, Olivier Cochard, gjb@, mmoll@, op@ and lev@, who also participated in testing. Reviewed by: adrian Sponsored by: Netflix Sponsored by: Nginx, Inc.
# 286437	07-Aug-2015	adrian	Revert the wifi ifnet changes until things are more baked and tested. * 286410 * 286413 * 286416 The initial commit broke a variety of debug and features that aren't in the GENERIC kernels but are enabled in other platforms.
# 286410	07-Aug-2015	glebius	Change KPI of how device drivers that provide wireless connectivity interact with the net80211 stack. Historical background: originally wireless devices created an interface, just like Ethernet devices do. Name of an interface matched the name of the driver that created. Later, wlan(4) layer was introduced, and the wlanX interfaces become the actual interface, leaving original ones as "a parent interface" of wlanX. Kernelwise, the KPI between net80211 layer and a driver became a mix of methods that pass a pointer to struct ifnet as identifier and methods that pass pointer to struct ieee80211com. From user point of view, the parent interface just hangs on in the ifconfig list, and user can't do anything useful with it. Now, the struct ifnet goes away. The struct ieee80211com is the only KPI between a device driver and net80211. Details: - The struct ieee80211com is embedded into drivers softc. - Packets are sent via new ic_transmit method, which is very much like the previous if_transmit. - Bringing parent up/down is done via new ic_parent method, which notifies driver about any changes: number of wlan(4) interfaces, number of them in promisc or allmulti state. - Device specific ioctls (if any) are received on new ic_ioctl method. - Packets/errors accounting are done by the stack. In certain cases, when driver experiences errors and can not attribute them to any specific interface, driver updates ic_oerrors or ic_ierrors counters. Details on interface configuration with new world order: - A sequence of commands needed to bring up wireless DOESN"T change. - /etc/rc.conf parameters DON'T change. - List of devices that can be used to create wlan(4) interfaces is now provided by net.wlan.devices sysctl. Most drivers in this change were converted by me, except of wpi(4), that was done by Andriy Voskoboinyk. Big thanks to Kevin Lo for testing changes to at least 8 drivers. Thanks to Olivier Cochard, gjb@, mmoll@, op@ and lev@, who also participated in testing. Details here: https://wiki.freebsd.org/projects/ifnet/net80211 Still, drivers: ndis, wtap, mwl, ipw, bwn, wi, upgt, uath were not tested. Changes to mwl, ipw, bwn, wi, upgt are trivial and chances of problems are low. The wtap wasn't compilable even before this change. But the ndis driver is complex, and it is likely to be broken with this commit. Help with testing and debugging it is appreciated. Differential Revision: D2655, D2740 Sponsored by: Nginx, Inc. Sponsored by: Netflix
# 283535	25-May-2015	adrian	Begin plumbing ieee80211_rx_stats through the receive path. Smart NICs with firmware (eg wpi, iwn, the new atheros parts, the intel 7260 series, etc) support doing a lot of things in firmware. This includes but isn't limited to things like scanning, sending probe requests and receiving probe responses. However, net80211 doesn't know about any of this - it still drives the whole scan/probe infrastructure itself. In order to move towards suppoting smart NICs, the receive path needs to know about the channel/details for each received packet. In at least the iwn and 7260 firmware (and I believe wpi, but I haven't tried it yet) it will do the scanning, power-save and off-channel buffering for you - all you need to do is handle receiving beacons and probe responses on channels that aren't what you're currently on. However the whole receive path is peppered with ic->ic_curchan and manual scan/powersave handling. The beacon parsing code also checks ic->ic_curchan to determine if the received beacon is on the correct channel or not.[1] So: * add freq/ieee values to ieee80211_rx_stats; * change ieee80211_parse_beacon() to accept the 'current' channel as an argument; * modify the iv_input() and iv_recv_mgmt() methods to include the rx_stats; * add a new method - ieee80211_lookup_channel_rxstats() - that looks up a channel based on the contents of ieee80211_rx_stats; * if it exists, use it in the mgmt path to switch the current channel (which still defaults to ic->ic_curchan) over to something determined by rx_stats. This is enough to kick-start scan offload support in the Intel 7260 driver that Rui/I are working on. It also is a good start for scan offload support for a handful of existing NICs (wpi, iwn, some USB parts) and it'll very likely dramatically improve stability/performance there. It's not the whole thing - notably, we don't need to do powersave, we should not scan all channels, and we should leave probe request sending to the firmware and not do it ourselves. But, this allows for continued development on the above features whilst actually having a somewhat working NIC. TODO: * Finish tidying up how the net80211 input path works. Right now ieee80211_input / ieee80211_input_all act as the top-level that everything feeds into; it should change so the MIMO input routines are those and the legacy routines are phased out. * The band selection should be done by the driver, not by the net80211 layer. * ieee80211_lookup_channel_rxstats() only determines 11b or 11g channels for now - this is enough for scanning, but not 100% true in all cases. If we ever need to handle off-channel scan support for things like static-40MHz or static-80MHz, or turbo-G, or half/quarter rates, then we should extend this. [1] This is a side effect of frequency-hopping and CCK modes - you can receive beacons when you think you're on a different channel. In particular, CCK (which is used by the low 11b rates, eg beacons!) is decodable from adjacent channels - just at a low SNR. FH is a side effect of having the hardware/firmware do the frequency hopping - it may pick up beacons transmitted from other FH networks that are in a different phase of hopping frequencies.
# 272292	30-Sep-2014	adrian	Add initial support for the AR9485 CUS198 / CUS230 variants. These variants have a few differences from the default AR9485 NIC, namely: * a non-default antenna switch config; * slightly different RX gain table setup; * an external XLNA hooked up to a GPIO pin; * (and not yet done) RSSI threshold differences when doing slow diversity. To make this possible: * Add the PCI device list from Linux ath9k, complete with vendor and sub-vendor IDs for various things to be enabled; * .. and until FreeBSD learns about a PCI device list like this, write a search function inspired by the USB device enumeration code; * add HAL_OPS_CONFIG to the HAL attach methods; the HAL can use this to initialise its local driver parameters upon attach; * copy these parameters over in the AR9300 HAL; * don't default to override the antenna switch - only do it for the chips that require it; * I brought over ar9300_attenuation_apply() from ath9k which is cleaner and easier to read for this particular NIC. This is a work in progress. I'm worried that there's some post-AR9380 NIC out there which doesn't work without the antenna override set as I currently haven't implemented bluetooth coexistence for the AR9380 and later HAL. But I'd rather have this code in the tree and fix it up before 11.0-RELEASE happens versus having a set of newer NICs in laptops be effectively RX deaf. Tested: * AR9380 (STA) * AR9485 CUS198 (STA) Obtained from: Qualcomm Atheros, Linux ath9k
# 271887	19-Sep-2014	adrian	Fix up the EDMA RX setup path to correctly initialise and reset the RX FIFO. The original code was .. well, slightly more than incorrect. It showed up as stalled RX queues if the NIC needed to be frequently reinitialised (eg during scans.) This is inspired by work done by Matt Dillon over at the DragonflyBSD project. So: * track when EDMA RX has been stopped and when the MAC has been reset; * re-initialise the ring only after a reset; * track whether RX has been stopped/started - just for debugging now; * don't bother with the RX EOL stuff for EDMA - we don't need the interrupt at all. We also don't need to disable/enable the interrupt or start DMA - once new frames are pushed into the ring via the normal RX path, it'll just restart RX DMA on its own. Tested: * AR9380, STA mode * AR9380, AP mode * AR9485, STA mode * AR9462, STA mode
# 265409	05-May-2014	adrian	Modify the RX path to keep the previous RX descriptor around once it's used. It turns out that the RX DMA engine does the same last-descriptor-link- pointer-re-reading trick that the TX DMA engine. That is, the hardware re-reads the link pointer before it moves onto the next descriptor. Thus we can't free a descriptor before we move on; it's possible the hardware will need to re-read the link pointer before we overwrite it with a new one. Tested: * AR5416, STA mode TODO: * more thorough AP and STA mode testing! * test on other pre-AR9380 NICs, just to be sure. * Break out the RX descriptor grabbing bits from the RX completion bits, like what is done in the RX EDMA code, so .. * .. the RX lock can be held during ath_rx_proc(), but not across packet input.
# 265205	01-May-2014	adrian	Add tracking for self-generated frames when the VAP is in sleep state. The hardware can generate its own frames (eg RTS/CTS exchanges, other kinds of 802.11 management stuff, especially when it comes to 802.11n) and these also have PWRMGT flags. So if the VAP is asleep but the NIC is in force-awake for some reason, ensure that the self-generated frames have PWRMGT set to 1. Now, this (like basically everything to do with powersave) is still racy - the only way to guarantee that it's all actually consistent is to pause transmit and let it finish before transitioning the VAP to sleep, but this at least gets the basic method of tracking and updating the state debugged. Tested: * AR5416, STA mode * AR9380, STA mode
# 265115	30-Apr-2014	adrian	Bring over some initial power save management support, reset path fixes and beacon programming / debugging into the ath(4) driver. The basic power save tracking: * Add some new code to track the current desired powersave state; and * Add some reference count tracking so we know when the NIC is awake; then * Add code in all the points where we're about to touch the hardware and push it to force-wake. Then, how things are moved into power save: * Only move into network-sleep during a RUN->SLEEP transition; * Force wake the hardware up everywhere that we're about to touch the hardware. The net80211 stack takes care of doing RUN<->SLEEP<->(other) state transitions so we don't have to do it in the driver. Next, when to wake things up: * In short - everywhere we touch the hardware. * The hardware will take care of staying awake if things are queued in the transmit queue(s); it'll then transit down to sleep if there's nothing left. This way we don't have to track the software / hardware transmit queue(s) and keep the hardware awake for those. Then, some transmit path fixes that aren't related but useful: * Force EAPOL frames to go out at the lowest rate. This improves reliability during the encryption handshake after 802.11 negotiation. Next, some reset path fixes! * Fix the overlap between reset and transmit pause so we don't transmit frames during a reset. * Some noisy environments will end up taking a lot longer to reset than normal, so extend the reset period and drop the raise the reset interval to be more realistic and give the hardware some time to finish calibration. * Skip calibration during the reset path. Tsk! Then, beacon fixes in station mode! * Add a _lot_ more debugging in the station beacon reset path. This is all quite fluid right now. * Modify the STA beacon programming code to try and take the TU gap between desired TSF and the target TU into account. (Lifted from QCA.) Tested: * AR5210 * AR5211 * AR5212 * AR5413 * AR5416 * AR9280 * AR9285 TODO: * More AP, IBSS, mesh, TDMA testing * Thorough AR9380 and later testing! * AR9160 and AR9287 testing Obtained from: QCA
# 251655	12-Jun-2013	adrian	Migrate the LNA mixing diversity machinery from the AR9285 HAL to the driver. The AR9485 chip and AR933x SoC both implement LNA diversity. There are a few extra things that need to happen before this can be flipped on for those chips (mostly to do with setting up the different bias values and LNA1/LNA2 RSSI differences) but the first stage is putting this code into the driver layer so it can be reused. This has the added benefit of making it easier to expose configuration options and diagnostic information via the ioctl API. That's not yet being done but it sure would be nice to do so. Tested: * AR9285, with LNA diversity enabled * AR9285, with LNA diversity disabled in EEPROM
# 251484	07-Jun-2013	adrian	Add accessor macros for the bluetooth coexistence routines.
# 251401	04-Jun-2013	adrian	Implement a bit of a hack to store the AR9285/AR9485 RX LNA configuration in the RX antenna field. The AR9285/AR9485 use an LNA mixer to determine how to combine the signals from the two antennas. This is encoded in the RSSI fields (ctl/ext) for chain 2. So, let's use that here. This maps RX antennas 0->3 to the RX mixer configuration used to receive a frame. There's more that can be done but this is good enough to diagnose if the hardware is doing "odd" things like trying to receive frames on LNA2 (ie, antenna 2 or "alt" antenna) when there's only one antenna connected. Tested: * AR9285, STA mode
# 251014	26-May-2013	adrian	Migrate ath(4) to now use if_transmit instead of the legacy if_start and if queue mechanism; also fix up (non-11n) TX fragment handling. This may result in a bit of a performance drop for now but I plan on debugging and resolving this at a later stage. Whilst here, fix the transmit path so fragment transmission works. The TX fragmentation handling is a bit more special. In order to correctly transmit TX fragments, there's a bunch of corner cases that need to be handled: * They must be transmitted back to back, in the same order.. * .. ie, you need to hold the TX lock whilst transmitting this set of fragments rather than interleaving it with other MSDUs destined to other nodes; * The length of the next fragment is required when transmitting, in order to correctly set the NAV field in the current frame to the length of the next frame; which requires .. * .. that we know the transmit duration of the next frame, which .. * .. requires us to set the rate of all fragments to the same length, or make the decision up-front, etc. To facilitate this, I've added a new ath_buf field to describe the length of the next fragment. This avoids having to keep the mbuf chain together. This used to work before my 11n TX path work because the ath_tx_start() routine would be handed a single mbuf with m_nextpkt pointing to the next frame, and that would be maintained all the way up to when the duration calculation was done. This doesn't hold true any longer - the actual queuing may occur at any point in the future (think ath_node TID software queuing) so this information needs to be maintained. Right now this does work for non-11n frames but it doesn't at all enforce the same rate control decision for all frames in the fragment. I plan on fixing this in a followup commit. RTS/CTS has the same issue, I'll look at fixing this in a subsequent commit. Finaly, 11n fragment support requires the driver to have fully decided what the rate scenario setup is - including 20/40MHz, short/long GI, STBC, LDPC, number of streams, etc. Right now that decision is (currently) made _after_ the NAV field value is updated. I'll fix all of this in subsequent commits. Tested: * AR5416, STA, transmitting 11abg fragments * AR5416, STA, 11n fragments work but the NAV field is incorrect for the reasons above. TODO: * It would be nice to be able to queue mbufs per-node and per-TID so we can only queue ath_buf entries when it's time to assemble frames to send to the hardware. But honestly, we should just do that level of software queue management in net80211 rather than ath(4), so I'm going to leave this alone for now. * More thorough AP, mesh and adhoc testing. * Ensure that net80211 doesn't hand us fragmented frames when A-MPDU has been negotiated, as we can't do software retransmission of fragments. * .. set CLRDMASK when transmitting fragments, just to ensure.
# 250866	21-May-2013	adrian	Implement a separate hardware queue threshold for aggregate and non-aggr traffic. When transmitting non-aggregate traffic, we need to keep the hardware busy whilst transmitting or small bursts in txdone/tx latency will kill us. This restores non-aggregate iperf performance, especially when doing TDMA. Tested: * AR5416<->AR5416, TDMA * AR5416 STA <-> AR9280 AP
# 250865	21-May-2013	adrian	Enable the use of TDMA on an 802.11n channel (with aggregation disabled, of course.) There's a few things that needed to happen: * In case someone decides to set the beacon transmission rate to be at an MCS rate, use the MCS-aware version of the duration calculation to figure out how long the received beacon frame was. * If TxOP enforcing is available on the hardware and we're doing TDMA, enable it after a reset and set the TDMA guard interval to zero. This seems to behave fine. TODO: * Although I haven't yet seen packet loss, the PHY errors that would be triggered (specifically Transmit-Override-Receive) aren't enabled by the 11n HAL. I'll have to do some work to enable these PHY errors for debugging. What broke: * My recent changes to the TX queue handling has resulted in the driver not keeping the hardware queue properly filled when doing non-aggregate traffic. I have a patch to commit soon which fixes this situation (albeit by reminding me about how my ath driver locking isn't working out, sigh.) So if you want to test this without updating to the next set of patches that I commit, just bump the sysctl dev.ath.X.hwq_limit from 2 to 32. Tested: * AR5416 <-> AR5416, with ampdu disabled, HT40, 5GHz, MCS12+Short-GI. I saw 30mbit/sec in both directions using a bidirectional UDP test.
# 250783	18-May-2013	adrian	Be (very) careful about how to add more TX DMA work. The list-based DMA engine has the following behaviour: * When the DMA engine is in the init state, you can write the first descriptor address to the QCU TxDP register and it will work. * Then when it hits the end of the list (ie, it either hits a NULL link pointer, OR it hits a descriptor with VEOL set) the QCU stops, and the TxDP points to the last descriptor that was transmitted. * Then when you want to transmit a new frame, you can then either: + write the head of the new list into TxDP, or + you write the head of the new list into the link pointer of the last completed descriptor (ie, where TxDP points), then kick TxE to restart transmission on that QCU> * The hardware then will re-read the descriptor to pick up the link pointer and then jump to that. Now, the quirks: * If you write a TxDP when there's been no previous TxDP (ie, it's 0), it works. * If you write a TxDP in any other instance, the TxDP write may actually fail. Thus, when you start transmission, it will re-read the last transmitted descriptor to get the link pointer, NOT just start a new transmission. So the correct thing to do here is: * ALWAYS use the holding descriptor (ie, the last transmitted descriptor that we've kept safe) and use the link pointer in _THAT_ to transmit the next frame. * NEVER write to the TxDP after you've done the initial write. * .. also, don't do this whilst you're also resetting the NIC. With this in mind, the following patch does basically the above. * Since this encapsulates Sam's issues with the QCU behaviour w/ TDMA, kill the TDMA special case and replace it with the above. * Add a new TXQ flag - PUTRUNNING - which indicates that we've started DMA. * Clear that flag when DMA has been shutdown. * Ensure that we're not restarting DMA with PUTRUNNING enabled. * Fix the link pointer logic during TXQ drain - we should always ensure the link pointer does point to something if there's a list of frames. Having it be NULL as an indication that DMA has finished or during a reset causes trouble. Now, given all of this, i want to nuke axq_link from orbit. There's now HAL methods to get and set the link pointer of a descriptor, so what we should do instead is to update the right link pointer. * If there's a holding descriptor and an empty TXQ list, set the link pointer of said holding descriptor to the new frame. * If there's a non-empty TXQ list, set the link pointer of the last descriptor in the list to the new frame. * Nuke axq_link from orbit. Note: * The AR9380 doesn't need this. FIFO TX writes are atomic. As long as we don't append to a list of frames that we've already passed to the hardware, all of the above doesn't apply. The holding descriptor stuff is still needed to ensure the hardware can re-read a completed descriptor to move onto the next one, but we restart DMA by pushing in a new FIFO entry into the TX QCU. That doesn't require any real gymnastics. Tested: * AR5210, AR5211, AR5212, AR5416, AR9380 - STA mode.
# 250665	15-May-2013	adrian	Implement my first cut at "correct" node power-save and PS-POLL support. This implements PS-POLL awareness i nthe * Implement frame "leaking", which allows for a software queue to be scheduled even though it's asleep * Track whether a frame has been leaked or not * Leak out a single non-AMPDU frame when transmitting aggregates * Queue BAR frames if the node is asleep * Direct-dispatch the rest of control and management frames. This allows for things like re-association to occur (which involves sending probe req/resp as well as assoc request/response) when the node is asleep and then tries reassociating. * Limit how many frames can set in the software node queue whilst the node is asleep. net80211 is already buffering frames for us so this is mostly just paranoia. * Add a PS-POLL method which leaks out a frame if there's something in the software queue, else it calls net80211's ps-poll routine. Since the ath PS-POLL routine marks the node as having a single frame to leak, either a software queued frame would leak, OR the next queued frame would leak. The next queued frame could be something from the net80211 power save queue, OR it could be a NULL frame from net80211. TODO: * Don't transmit further BAR frames (eg via a timeout) if the node is currently asleep. Otherwise we may end up exhausting management frames due to the lots of queued BAR frames. I may just undo this bit later on and direct-dispatch BAR frames even if the node is asleep. * It would be nice to burst out a single A-MPDU frame if both ends support this. I may end adding a FreeBSD IE soon to negotiate this power save behaviour. * I should make STAs timeout of power save mode if they've been in power save for more than a handful of seconds. This way cards that get "stuck" in power save mode don't stay there for the "inactivity" timeout in net80211. * Move the queue depth check into the driver layer (ath_start / ath_transmit) rather than doing it in the TX path. * There could be some naughty corner cases with ps-poll leaking. Specifically, if net80211 generates a NULL data frame whilst another transmitter sends a normal data frame out net80211 output / transmit, we need to ensure that the NULL data frame goes out first. This is one of those things that should occur inside the VAP/ic TX lock. Grr, more investigations to do.. Tested: * STA: AR5416, AR9280 * AP: AR5416, AR9280, AR9160
# 250609	13-May-2013	adrian	Since the node state is 100% back under the TX lock, just kill the use of atomics. I'll re-think this nonsense later.
# 250607	13-May-2013	adrian	This lock only protects the rate control state for now, mention this.
# 250391	08-May-2013	adrian	Fix the holding descriptor logic to actually be "right" (for values of "right".) Flip back on the "always continue TX DMA using the holding descriptor" code - by always setting ATH_BUF_BUSY and never setting axq_link to NULL. Since the holding descriptor is accessed via txq->axq_link and _that_ is done behind the TXQ lock rather than the TX path lock, the holding descriptor stuff itself needs to be behind the TXQ lock. So, do the mental gymnastics needed to do this. I've not seen any of the hardware failures that I was seeing when I last tried to do this. Tested: * AR5416, STA mode
# 250326	07-May-2013	adrian	Re-work how transmit buffer limits are enforced - partly to fix the PR, but partly to just tidy up things. The problem here - there are too many TX buffers in the queue! By the time one needs to transmit an EAPOL frame (for this PR, it's the response to the group rekey notification from the AP) there are no ath_buf entries free and the EAPOL frame doesn't go out. Now, the problem! * Enforcing the TX buffer limitation _before_ we dequeue the frame? Bad idea. Because.. * .. it means I can't check whether the mbuf has M_EAPOL set. The solution(s): * De-queue the frame first * Don't bother doing the TX buffer minimum free check until after we know whether it's an EAPOL frame or not. * If it's an EAPOL frame, allocate the buffer from the mgmt pool rather than the default pool. Whilst I'm here: * Add a tweak to limit how many buffers a single node can acquire. * Don't enforce that for EAPOL frames. * .. set that to default to 1/4 of the available buffers, or 32, whichever is more sane. This doesn't fix issues due to a sleeping node or a very poor performing node; but this doesn't make it worse. Tested: * AR5416 STA, TX'ing 100+ mbit UDP to an AP, but only 50mbit being received (thus the TX queue fills up.) * .. with CCMP / WPA2 encryption configured * .. and the group rekey time set to 10 seconds, just to elicit the behaviour very quickly. PR: kern/138379
# 249639	19-Apr-2013	adrian	Use uint32_t for fields that are fetched via ath_hal_getcapability().
# 249565	16-Apr-2013	adrian	Use a per-RX-queue deferred list, rather than a single deferred list for both queues. Since ath_rx_pkt() does multi-mbuf frame recombining based on the RX queue, this needs to occur. Tested: * AR9380 (XB112), hostap mode
# 248745	26-Mar-2013	adrian	Add per-TXQ EDMA FIFO staging queue support. Each set of frames pushed into a FIFO is represented by a list of ath_bufs - the first ath_buf in the FIFO list is marked with ATH_BUF_FIFOPTR; the last ath_buf in the FIFO list is marked with ATH_BUF_FIFOEND. Multiple lists of frames are just glued together in the TAILQ as per normal - except that at the end of a FIFO list, the descriptor link pointer will be NULL and it'll be tagged with ATH_BUF_FIFOEND. For non-EDMA chipsets this is a no-op - the ath_txq frame list (axq_q) stays the same and is treated the same. For EDMA chipsets the frames are pushed into axq_q and then when the FIFO is to be (re) filled, frames will be moved onto the FIFO queue and then pushed into the FIFO. So: * Add a new queue in each hardware TXQ (ath_txq) for staging FIFO frame lists. It's a TAILQ (like the normal hardware frame queue) rather than the ath9k list-of-lists to represent FIFO entries. * Add new ath_buf flags - ATH_TX_FIFOPTR and ATH_TX_FIFOEND. * When allocating ath_buf entries, clear out the flag value before returning it or it'll end up having stale flags. * When cloning ath_buf entries, only clone ATH_BUF_MGMT. Don't clone the FIFO related flags. * Extend ath_tx_draintxq() to first drain the FIFO staging queue, _then_ drain the normal hardware queue. Tested: * AR9280, hostap * AR9280, STA * AR9380/AR9580 - hostap TODO: * Test on other chipsets, just to be thorough.
# 248671	23-Mar-2013	adrian	Overhaul the TXQ locking (again!) as part of some beacon/cabq timing related issues. Moving the TX locking under one lock made things easier to progress on but it had one important side-effect - it increased the latency when handling CABQ setup when sending beacons. This commit introduces a bunch of new changes and a few unrelated changs that are just easier to lump in here. The aim is to have the CABQ locking separate from other locking. The CABQ transmit path in the beacon process thus doesn't have to grab the general TX lock, reducing lock contention/latency and making it more likely that we'll make the beacon TX timing. The second half of this commit is the CABQ related setup changes needed for sane looking EDMA CABQ support. Right now the EDMA TX code naively assumes that only one frame (MPDU or A-MPDU) is being pushed into each FIFO slot. For the CABQ this isn't true - a whole list of frames is being pushed in - and thus CABQ handling breaks very quickly. The aim here is to setup the CABQ list and then push _that list_ to the hardware for transmission. I can then extend the EDMA TX code to stamp that list as being "one" FIFO entry (likely by tagging the last buffer in that list as "FIFO END") so the EDMA TX completion code correctly tracks things. Major: * Migrate the per-TXQ add/removal locking back to per-TXQ, rather than a single lock. * Leave the software queue side of things under the ATH_TX_LOCK lock, (continuing) to serialise things as they are. * Add a new function which is called whenever there's a beacon miss, to print out some debugging. This is primarily designed to help me figure out if the beacon miss events are due to a noisy environment, issues with the PHY/MAC, or other. * Move the CABQ setup/enable to occur _after_ all the VAPs have been looked at. This means that for multiple VAPS in bursted mode, the CABQ gets primed once all VAPs are checked, rather than being primed on the first VAP and then having frames appended after this. Minor: * Add a (disabled) twiddle to let me enable/disable cabq traffic. It's primarily there to let me easily debug what's going on with beacon and CABQ setup/traffic; there's some DMA engine hangs which I'm finally trying to trace down. * Clear bf_next when flushing frames; it should quieten some warnings that show up when a node goes away. Tested: * AR9280, STA/hostap, up to 4 vaps (staggered) * AR5416, STA/hostap, up to 4 vaps (staggered) TODO: * (Lots) more AR9380 and later testing, as I may have missed something here. * Leverage this to fix CABQ hanling for AR9380 and later chips. * Force bursted beaconing on the chips that default to staggered beacons and ensure the CABQ stuff is all sane (eg, the MORE bits that aren't being correctly set when chaining descriptors.)
# 248529	19-Mar-2013	adrian	Break out the RX completion path into "FIFO check / refill" and "complete RX frames." The 128 entry RX FIFO is really easy to fill up and miss refilling when it's done in the ath taskq - as that gets blocked up doing RX completion, TX completion and other random things. So the 128 entry RX FIFO now gets emptied and refilled in the ath_intr() task (and it grabs / releases locks, so now ath_intr() can't just be a FAST handler yet!) but the locks aren't held for very long. The completion part is done in the ath taskqueue context. Details: * Create a new completed frame list - sc->sc_rx_rxlist; * Split the EDMA RX process queue into two halves - one that processes the RX FIFO and refills it with new frames; another that completes the completed frame list; * When tearing down the driver, flush whatever is in the deferred queue as well as what's in the FIFO; * Create two new RX methods - one that processes all RX queues, one that processes the given RX queue. When MSI is implemented, we get told which RX queue the interrupt came in on so we can specifically schedule that. (And I can do that with the non-MSI path too; I'll figure that out later.) * Convert the legacy code over to use these new RX methods; * Replace all the instances of the RX taskqueue enqueue with a call to a relevant RX method to enqueue one or all RX queues. Tested: * AR9380, STA * AR9580, STA * AR5413, STA
# 248311	15-Mar-2013	adrian	Add locking around the new holdingbf code. Since this is being done during buffer free, it's a crap shoot whether the TX path lock is held or not. I tried putting the ath_freebuf() code inside the TX lock and I got all kinds of locking issues - it turns out that the buffer free path sometimes is called with the lock held and sometimes isn't. So I'll go and fix that soon. Hence for now the holdingbf buffers are protected by the TXBUF lock.
# 248264	14-Mar-2013	adrian	Implement "holding buffers" per TX queue rather than globally. When working on TDMA, Sam Leffler found that the MAC DMA hardware would re-read the last TX descriptor when getting ready to transmit the next one. Thus the whole ATH_BUF_BUSY came into existance - the descriptor must be left alone (very specifically the link pointer must be maintained) until the hardware has moved onto the next frame. He saw this in TDMA because the MAC would be frequently stopping during active transmit (ie, when it wasn't its turn to transmit.) Fast-forward to today. It turns out that this is a problem not with a single MAC DMA instance, but with each QCU (from 0->9). They each maintain separate descriptor pointers and will re-read the last descriptor when starting to transmit the next. So when your AP is busy transmitting from multiple TX queues, you'll (more) frequently see one QCU stopped, waiting for a higher-priority QCU to finsh transmitting, before it'll go ahead and continue. If you mess up the descriptor (ie by freeing it) then you're short of luck. Thanks to rpaulo for sticking with me whilst I diagnosed this issue that he was quite reliably triggering in his environment. This is a reimplementation; it doesn't have anything in common with the ath9k or the Qualcomm Atheros reference driver. Now - it in theory doesn't apply on the EDMA chips, as long as you push one complete frame into the FIFO at a time. But the MAC can DMA from a list of frames pushed into the hardware queue (ie, you concat 'n' frames together with link pointers, and then push the head pointer into the TXQ FIFO.) Since that's likely how I'm going to implement CABQ handling in hostap mode, it's likely that I will end up teaching the EDMA TX completion code about busy buffers, just to be "sure" this doesn't creep up. Tested - iperf ap->sta and sta->ap (with both sides running this code): * AR5416 STA * AR9160/AR9220 hostap To validate that it doesn't break the EDMA (FIFO) chips: * AR9380, AR9485, AR9462 STA Using iperf with the -S <tos byte decimal value> to set the TCP client side DSCP bits, mapping to different TIDs and thus different TX queues. TODO: * Make this work on the EDMA chips, if we end up pushing lists of frames to the hardware (eg how we eventually will handle cabq in hostap/ibss mode.)
# 247774	04-Mar-2013	adrian	add a method to set/clear the VMF field in the TX descriptor. Obtained from: Qualcomm Atheros
# 247366	26-Feb-2013	adrian	Add in the STBC TX/RX capability support into the HAL and driver. The HAL already included the STBC fields; it just needed to be exposed to the driver and net80211 stack. This should allow single-stream STBC TX and RX to be negotiated; however the driver and rate control code currently don't do anything with it.
# 247286	25-Feb-2013	adrian	Begin adding support to explicitly set the current chainmask. Right now the only way to set the chainmask is to set the hardware configured chainmask through capabilities. This is fine for forcing the chainmask to be something other than what the hardware is capable of (eg to reduce TX/RX to one connected antenna) but it does change what the HAL hardware chainmask configuration is. For operational mode changes, it (may?) make sense to separately control the TX/RX chainmask. Right now it's done as part of ar5416_reset.c - ar5416UpdateChainMasks() calculates which TX/RX chainmasks to enable based on the operating mode. (1 for legacy and whatever is supported for 11n operation.) But doing this in the HAL is suboptimal - the driver needs to know the currently configured chainmask in order to correctly enable things for each TX descriptor. This is currently done by overriding the chainmask config in the ar5416 TX routines but this has to disappear - the AR9300 HAL support requires the driver to dynamically set the TX chainmask based on the TX power and TX rate in order to meet mini-PCIe slot power requirements. So: * Introduce a new HAL method to set the operational chainmask variables; * Introduce null methods for the previous generation chipsets; * Add new driver state to record the current chainmask separate from the hardware configured chainmask. Part #2 of this will involve disabling ar5416UpdateChainMasks() and moving it into the driver; as well as properly programming the TX chainmask based on the currently configured HAL chainmask. Tested: * AR5416, STA mode - both legacy (11a/11bg) and 11n rates - verified that AR_SELFGEN_MASK (the chainmask used for self-generated frames like ACKs and RTSes) is correct, as well as the TX descriptor contents is correct.
# 247087	21-Feb-2013	adrian	Add an option to allow the minimum number of delimiters to be tweaked. This is primarily for debugging purposes. Tested: * AR5416, STA mode
# 247085	21-Feb-2013	adrian	Add a new option to limit the maximum size of aggregates. The default is to limit them to what the hardware is capable of. Add sysctl twiddles for both the non-RTS and RTS protected aggregate generation. Whilst here, add some comments about stuff that I've discovered during my exploration of the TX aggregate / delimiter setup path from the reference driver.
# 246745	13-Feb-2013	adrian	Pull out the if_transmit() work and revert back to ath_start(). My changed had some rather significant behavioural changes to throughput. The two issues I noticed: * With if_start and the ifnet mbuf queue, any temporary latency would get eaten up by some mbufs being queued. With ath_transmit() queuing things to ath_buf's, I'd only get 512 TX buffers before I couldn't queue any further frames. * There's also some non-zero latency involved with TX being pushed into a taskqueue via direct dispatch. Any time the scheduler didn't immediately schedule the ath TX task would cause extra latency. Various 1ge/10ge drivers implement both direct dispatch (if the TX lock can be acquired) and deferred task transmission (if the TX lock can't be acquired), with frames being pushed into a drbd queue. I'll have to do this at some point, but until I figure out how to deal with 802.11 fragments, I'll have to wait a while longer. So what I saw: * lots of extra latency, specially under load - if the taskqueue wasn't immediately scheduled, things went pear shaped; * any extra latency would result in TX ath_buf's taking their sweet time being replenished, so any further calls to ath_transmit() would drop mbufs. * .. yes, there's no explicit backpressure here - things are just dropped. Eek. With this, the general performance has gone up, but those subtle if_start() related race conditions are back. For some reason, this is doubly-obvious with the AR5416 NIC and I don't quite understand why yet. There's an unrelated issue with AR5416 performance in STA mode (it's fine in AP mode when bridging frames, weirdly..) that requires a little further investigation. Specifically - it works fine on a Lenovo T40 (single core CPU) running a March 2012 9-STABLE kernel, but a Lenovo T60 (dual core) running an early November 2012 kernel behaves very poorly. The same hardware with an AR9160 or AR9280 behaves perfectly.
# 246453	07-Feb-2013	adrian	Create a new TX lock specifically for queuing frames. This now separates out the act of queuing frames from the act of running TX and TX completion.
# 245708	21-Jan-2013	adrian	Migrate CLRDMASK to be a per-node flag, rather than a per-TID flag. This is easily possible now that the TX is protected by a single lock, rather than a per-TXQ (and thus per-TID) lock. Only set CLRDMASK if none of the destinations are filtered. This likely will need some tuning when it comes time to do UASPD/PS-POLL TX, however at that point it should be manually set anyway. Tested: * AR9280, STA mode TODO: * More thorough testing in AP mode * test other chipsets, just to be safe/sure.
# 245465	15-Jan-2013	adrian	Implement frame (data) transmission using if_transmit(), rather than if_start(). This removes the overlapping data path TX from occuring, which solves quite a number of the potential TX queue races in ath(4). It doesn't fix the net80211 layer TX queue races and it doesn't fix the raw TX path yet, but it's an important step towards this. This hasn't dropped the TX performance in my testing; primarily because now the TX path can quickly queue frames and continue along processing. This involves a few rather deep changes: * Use the ath_buf as a queue placeholder for now, as we need to be able to support queuing a list of mbufs (ie, when transmitting fragments) and m_nextpkt can't be used here (because it's what is joining the fragments together) * if_transmit() now simply allocates the ath_buf and queues it to a driver TX staging queue. * TX is now moved into a taskqueue function. * The TX taskqueue function now dequeues and transmits frames. * Fragments are handled correctly here - as the current API passes the fragment list as one mbuf list (joined with m_nextpkt) through to the driver if_transmit(). * For the couple of places where ath_start() may be called (mostly from net80211 when starting the VAP up again), just reimplement it using the new enqueue and taskqueue methods. What I don't like (about this work and the TX code in general): * I'm using the same lock for the staging TX queue management and the actual TX. This isn't required; I'm just being slack. * I haven't yet moved TX to a separate taskqueue (but the taskqueue is created); it's easy enough to do this later if necessary. I just need to make sure it's a higher priority queue, so TX has the same behaviour as it used to (where it would preempt existing RX..) * I need to re-review the TX path a little more and make sure that ieee80211_node_() functions aren't called within the TX lock. When queueing, I should just push failed frames into a queue and when I'm wrapping up the TX code, unlock the TX lock and call ieee80211_node_free() on each. It would be nice if I could hold the TX lock for the entire TX and TX completion, rather than this release/re-acquire behaviour. But that requires that I shuffle around the TX completion code to handle actual ath_buf free and net80211 callback/free outside of the TX lock. That's one of my next projects. * the ic_raw_xmit() path doesn't use this yet - so it still has sequencing problems with parallel, overlapping calls to the data path. I'll fix this later. Tested: * Hostap - AR9280, AR9220 * STA - AR5212, AR9280, AR5416
# 245002	03-Jan-2013	adrian	Don't call the spectral methods for NICS that don't implement them.
# 244951	02-Jan-2013	adrian	Add a new (skeleton) spectral mode manager module.
# 244947	01-Jan-2013	adrian	Add spectral HAL accessor methods.
# 244109	11-Dec-2012	adrian	There's no need to use a TXQ pointer here; we specifically need the hardware queue ID when queuing to EDMA descriptors. This is a small part of trying to reduce the size of ath_buf entries.
# 243786	02-Dec-2012	adrian	Delete the per-TXQ locks and replace them with a single TX lock. I couldn't think of a way to maintain the hardware TXQ locks _and_ layer on top of that per-TXQ software queuing and any other kind of fine-grained locks (eg per-TID, or per-node locks.) So for now, to facilitate some further code refactoring and development as part of the final push to get software queue ps-poll and u-apsd handling into this driver, just do away with them entirely. I may eventually bring them back at some point, when it looks slightly more architectually cleaner to do so. But as it stands at the present, it's not really buying us much: * in order to properly serialise things and not get bitten by scheduling and locking interactions with things higher up in the stack, we need to wrap the whole TX path in a long held lock. Otherwise we can end up being pre-empted during frame handling, resulting in some out of order frame handling between sequence number allocation and encryption handling (ie, the seqno and the CCMP IV get out of sequence); * .. so whilst that's the case, holding the lock for that long means that we're acquiring and releasing the TXQ lock _inside_ that context; * And we also acquire it per-frame during frame completion, but we currently can't hold the lock for the duration of the TX completion as we need to call net80211 layer things with the locks _unheld_ to avoid LOR. * .. the other places were grab that lock are reset/flush, which don't happen often. My eventual aim is to change the TX path so all rejected frame transmissions and all frame completions result in any ieee80211_free_node() calls to occur outside of the TX lock; then I can cut back on the amount of locking that goes on here. There may be some LORs that occur when ieee80211_free_node() is called when the TX queue path fails; I'll begin to address these in follow-up commits.
# 243425	23-Nov-2012	adrian	Add the HAL wrapper for settsf64.
# 242853	10-Nov-2012	kevlo	Fix the build.
# 242782	08-Nov-2012	adrian	Add some hooks into the driver to attach, detach and record EDMA descriptor events. This is primarily for the TX EDMA and TX EDMA completion. I haven't yet tied it into the EDMA RX path or the legacy TX/RX path. Things that I don't quite like: * Make the pointer type 'void' in ath_softc and have if_ath_alq() return a malloc'ed buffer. That would remove the need to include if_ath_alq.h in if_athvar.h. The sysctl setup needs to be cleaned up.
# 242527	03-Nov-2012	adrian	Add a new HAL call to extract out the HAL enterprise bits from the AR9300 HAL.
# 242510	03-Nov-2012	adrian	HAL API updates, from the previous couple of HAL commits.
# 242391	31-Oct-2012	adrian	I give up - introduce a TX lock to serialise TX operations. I've tried serialising TX using queues and such but unfortunately due to how this interacts with the locking going on elsewhere in the networking stack, the TX task gets delayed, resulting in quite a noticable throughput loss: * baseline TCP for 2x2 11n HT40 is ~ 170mbit/sec; * TCP for TX task in the ath taskq, with the RX also going on - 80mbit/sec; * TCP for TX task in a separate, second taskq - 100mbit/sec. So for now I'm going with the Linux wireless stack approach - lock tx early. The linux code does in the wireless stack, before the 802.11 state stuff happens and before it's punted to the driver. But TX locking needs to also occur at the driver layer as the TX completion code _also_ begins to drain the ifnet TX queue. Whilst I'm here, add some KTR traces for the TX path. Note: * This really should be done at the net80211 layer (as well, at least.) But that'll have to wait for a little more thought to happen.
# 242271	28-Oct-2012	adrian	Begin fleshing out some software queue awareness for TIM handling with the power save queue. * introduce some new ATH_NODE lock protected fields, tracking the net80211 psq and TIM state; * when doing buffer transitions - ie, when sending and completing buffers - check the state of the SWQ and update the TIM appropriately. * when clearing the TIM bit, if the SWQ is not empty then delay clearing it. This is racy, but it's no less racy than the current net80211 power save queue management code. Specifically, with multiple TX threads, it's quite plausible that parallel state updates will race and the TIM will be left in an inconsistent state. I'll address that in a follow-up commit.
# 241567	14-Oct-2012	adrian	Track the total number of software queued frames in an atomic variable stashed away in ath_node. As much as I tried to stuff that behind the ATH_NODE lock, unfortunately the locking is just too plain hairy (for me! And I wrote it!) to do cleanly. Hence using atomics here instead of a lock. The ATH_NODE lock just isn't currently used anywhere besides the rate control updates. If in the future everything gets migrated back to using a single ATH_NODE lock or a single global ATH_TX lock (ie, a single TX lock for all TX and TX completion) then fine, I'll remove the atomics.
# 241566	14-Oct-2012	adrian	Stop abusing the ATH_TID_*() queue macros for filtered frames and give them their own macro set.
# 241559	14-Oct-2012	adrian	Push the actual TX processing into the ath taskqueue, rather than having it run out of multiple concurrent contexts. Right now the ath(4) TX processing is a bit hairy. Specifically: * It was running out of ath_start(), which could occur from multiple concurrent sending processes (as if_start() can be started from multiple sending threads nowdays.. sigh) * during RX if fast frames are enabled (so not really at the moment, not until I fix this particular feature again..) * during ath_reset() - so anything which calls that * during ath_tx_proc() in the ath taskqueue - ie, TX is attempted again after TX completion, as there's now hopefully some ath_bufs available. Then, the ic_raw_xmit() method can queue raw frames for transmission at any time, from any net80211 TX context. Ew. This has caused packet ordering issues in the past - specifically, there's absolutely no guarantee that preemption won't occuring _during_ ath_start() by the TX completion processing, which will call ath_start() again. It's a mess - 802.11 really, really wants things to be in sequence or things go all kinds of loopy. So: * create a new task struct for TX'ing; * make the if_start method simply queue the task on the ath taskqueue; * make ath_start() just be called by the new TX task; * make ath_tx_kick() just schedule the ath TX task, rather than directly calling ath_start(). Now yes, this means that I've taken a step backwards in terms of concurrency - TX -and- RX now occur in the same single-task taskqueue. But there's nothing stopping me from separating out the TX / TX completion code into a separate taskqueue which runs in parallel with the RX path, if that ends up being appropriate for some platforms. This fixes the CCMP/seqno concurrency issues that creep up when you transmit large amounts of uni-directional UDP traffic (>200MBit) on a FreeBSD STA -> AP, as now there's only one TX context no matter what's going on (TX completion->retry/software queue, userland->net80211->ath_start(), TX completion -> ath_start()); but it won't fix any concurrency issues between raw transmitted frames and non-raw transmitted frames (eg EAPOL frames on TID 16 and any other TID 16 multicast traffic that gets put on the CABQ.) That is going to require a bunch more re-architecture before it's feasible to fix. In any case, this is a big step towards making the majority of the TX path locking irrelevant, as now almost all TX activity occurs in the taskqueue. Phew.
# 241336	07-Oct-2012	adrian	Migrate the TID TXQ accesses to a new set of macros, rather than reusing the ATH_TXQ_* macros. * Introduce the new macros; * rename the TID queue and TID filtered frame queue so the compiler tells me I'm using the wrong macro. These should correspond 1:1 to the existing code.
# 241170	03-Oct-2012	adrian	Pause and unpause the software queues for a given node based on the net80211 node power save state. * Add an ATH_NODE_UNLOCK_ASSERT() check * Add a new node field - an_is_powersave * Pause/unpause the queue based on the node state * Attempt to handle net80211 concurrency issues so the queue doesn't get paused/unpaused more than once at a time from the net80211 power save code. Whilst here (and breaking my usual rule), set CLRDMASK when a queue is unpaused, regardless of whether the queue has some pending traffic. This means the first frame from that TID (now or later) will hvae CLRDMASK set. Also whilst here, bump the swretrymax counters whenever the filtered frames code expires a frame. Again, breaking my rule, but this is just a statistics thing rather than a functional change. This doesn't fix ps-poll (but it doesn't break it too much worse than it is at the present) or correcting the TID updates. That's next on the list. Tested: * AR9220 AP (Atheros AP96 reference design) * Macbook Pro and LG Optimus 1 Android phone, both setting and clearing power save state (but not using PS-POLL.)
# 240899	24-Sep-2012	adrian	Migrate the ath(4) KTR logging to use an ATH_KTR() macro. This should eventually be unified with ATH_DEBUG() so I can get both from one macro; that may take some time. Add some new probes for TX and TX completion.
# 240585	16-Sep-2012	adrian	Add a per-TID filter queue and filter state bits. These are intended for software TX filtering support, where the NIC decides there has been too many successive failues to a destination and will filter it. Although the filtering is done per-destination (via the keycache), the state and queue is kept per-TID for now. It simplifies the overall architecture design and locking. Whilst here, add ATH_TID_UNLOCK_ASSERT().
# 239656	24-Aug-2012	adrian	Add an accessor macro for getting access to the default DFS parameters. PR: kern/170904
# 239282	15-Aug-2012	adrian	Implement a sequential descriptor ID value and stuff it in the ath_buf. This will be used by the EDMA TX code to assign descriptor IDs in order to provide some debugging.
# 239261	14-Aug-2012	adrian	Add an assertion to check that the given TXQ is _not_ locked.
# 239205	11-Aug-2012	adrian	Revert the ath_tx_draintxq() method, and instead teach it the minimum necessary to "do" EDMA. It was just using the TX completion status for logging information about the descriptor completion. Since with EDMA we don't know this without checking the TX completion FIFO, we can't provide this information. So don't.
# 239204	11-Aug-2012	adrian	Break out ath_draintxq() into a method and un-methodize ath_tx_processq(). Now that I understand what's going on with this, I've realised that it's going to be quite difficult to implement a processq method in the EDMA case. Because there's a separate TX status FIFO, I can't just run processq() on each EDMA TXQ to see what's finished. i have to actually run the TX status queue and handle individual TXQs. So: * unmethodize ath_tx_processq(); * leave ath_tx_draintxq() as a method, as it only uses the completion status for debugging rather than actively completing the frames (ie, all frames here are failed); * Methodize ath_draintxq(). The EDMA ath_draintxq() will have to take care of running the TX completion FIFO before (potentially) freeing frames in the queue. The only two places where ath_tx_draintxq() (on a single TXQ) are used: * ath_draintxq(); and * the CABQ handling in the beacon setup code - it drains the CABQ before populating the CABQ with frames for a new beacon (when doing multi-VAP operation.) So it's quite possible that once I methodize the CABQ and beacon handling, I can just drop ath_tx_draintxq() in its entirety. Finally, it's also quite possible that I can remove ath_tx_draintxq() in the future and just "teach" it to not check the status when doing EDMA.
# 239197	11-Aug-2012	adrian	Begin fleshing out the TX FIFO support. * Add ATH_TXQ_FIRST() for easy tasting of what's on the list; * Add an "axq_fifo_depth" for easy tracking of how deep the current FIFO is; * Flesh out the handoff (mcast, hw) functions; * Begin fleshing out a TX ISR proc, which tastes the TX status FIFO. The legacy hardware stuffs the TX completion at the end of the final frame descriptor (or final sub-frame when doing aggregate.) So it's feasible to do a per-TXQ drain and process, as the needed info is right there. For EDMA hardware, there's a separate TX completion FIFO. So the TX process routine needs to read the single FIFO and then process the frames in each hardware queue. This makes it difficult to do a per-queue process, as you'll end up with frames in the TX completion FIFO for a different TXQ to the one you've passed to ath_tx_draintxq() or ath_tx_processq(). Testing: I've tested the TX queue and TX completion code in hostap mode on an AR9380. Beacon frames successfully transmit and the completion routine is called. Occasional data frames end up in TXQ 1 and are also successfully completed. However, this requires some changes to the beacon code path as: * The AR9380 beacon configuration API is now in TU/8, rather than TU; * The AR9380 TX API requires the rate control is setup using a call to setup11nratescenario, rather than having the try0 series setup (rate/tries for the first series); so the beacon won't go out. I'll follow this up with commits to the beacon code.
# 239053	05-Aug-2012	adrian	Migrate the 802.11n ath_hal_chaintxdesc() API to use a buffer/segment array, similar to what filltxdesc() uses. This removes the last reference to ds_data in the TX path outside of debugging statements. These need to be adjusted/fixed. Tested: * AR9280 STA/AP with iperf TCP traffic
# 239051	05-Aug-2012	adrian	Migrate the ath_hal_filltxdesc() API to take a list of buffer/seglen values. The existing API only exposes 'seglen' (the current buffer (segment) length) with the data buffer pointer set in 'ds_data'. This is fine for the legacy DMA engine but it won't work for the EDMA engines. The EDMA engine has a significantly different TX descriptor layout. * The legacy DMA engine had a ds_data pointer at the same offset in the descriptor for both TX and RX buffers; * The EDMA engine has no ds_data for RX - the data is DMAed after the descriptor; * The EDMA engine has support for 4 TX buffer/segment pairs in the TX DMA descriptor; * The EDMA TX completion is in a different FIFO, and the driver will 'link' the status completion entry to a QCU by a "QCU ID". I don't know why it's just not filled in by the hardware, alas. So given that, here are the changes: * Instead of directly fondling 'ds_data' in ath_desc, change the ath_hal_filltxdesc() to take an array of buffer pointers as well as segment len pointers; * The EDMA TX completion status wants a descriptor and queue id. This (for now) uses bf_state.bfs_txq and will extract the hardware QCU ID from that. * .. and this is ugly and wasteful; it should change to just store the QCU in the bf_state and save 3/7 bytes in the process. Now, the weird crap: * The aggregate TX path was using bf_state->bfs_txq for the TXQ, rather than taking a function argument. I've tidied that up. * The multicast queue frames get put on a software TXQ and then that is appended to the hardware CABQ when appropriate. So for now, make sure that bf_state->bfs_txq points at the CABQ when adding frames to the multicast queue. * .. but the multicast queue TX path for now doesn't use the software queue and instead (a) directly sets up the descriptor contents at that point; (b) the frames on the vap->avp_mcastq are then just appended wholesale to the CABQ. So for now, I don't have to worry about making the multicast path work with aggregation or the per-TID software queue. Phew. What's left to do: * I need to modify the 11n ath_hal_chaintxdesc() API to do the same. I'll do that in a subsequent commit. * Remove bf_state.bfs_txq entirely and store the QCU as appropriate. * .. then do the runtime "is this going on the right HWQ?" checks using that, rather than comparing pointer values. Tested on: * AR9280 STA/AP * AR5416 STA/AP
# 238961	31-Jul-2012	adrian	Allow 802.11n hardware to support multi-rate retry when RTS/CTS is enabled. The legacy (pre-802.11n) hardware doesn't support this - although the AR5212 era hardware supports MRR, it doesn't have all the bits needed to support MRR + RTS/CTS. The AR5416 and later support a packet duration and RTS/CTS flags per rate scenario, so we should support it. Tested: * AR9280, STA PR: kern/170302
# 238931	31-Jul-2012	adrian	Migrate some more TX side setup routines to be methods.
# 238855	28-Jul-2012	adrian	Flesh out the initial TX FIFO storage for each hardware TX queue.
# 238838	27-Jul-2012	adrian	Bring this API in line with what the reference driver and Linux ath9k was doing. Obtained from: Qualcomm Atheros, Linux ath9k
# 238836	27-Jul-2012	adrian	Allocate a descriptor ring for EDMA TX completion status. Configure the hardware with said ring physical address and size.
# 238731	23-Jul-2012	adrian	Add a new HAL method - the AR93xx and later NICs have a separate TX descriptor ring for TX status completion. This API call will pass the allocated buffer details to the HAL.
# 238710	23-Jul-2012	adrian	Begin separating out the TX DMA setup in preparation for TX EDMA support. * Introduce TX DMA setup/teardown methods, mirroring what's done in the RX path. Although the TX DMA descriptor is setup via ath_desc_alloc() / ath_desc_free(), there TX status descriptor ring will be allocated in this path. * Remove some of the TX EDMA capability probing from the RX path and push it into the new TX EDMA path.
# 238709	23-Jul-2012	adrian	Flesh out a new DMA map for the EDMA TX completion status, as well as a lock to go with that whole code path.
# 238708	23-Jul-2012	adrian	Begin modifying the descriptor allocation functions to support a variable sized TX descriptor. This is required for the AR93xx EDMA support which requires 128 byte TX descriptors (which is significantly larger than the earlier hardware.)
# 238608	19-Jul-2012	adrian	Use HAL_NUM_RX_QUEUES rather than a magic constant.
# 238607	19-Jul-2012	adrian	Break out the TX descriptor link field into HAL methods. The DMA FIFO chips (AR93xx and later) differ slightly to th elegacy chips: * The RX DMA descriptors don't have a ds_link field; * The TX DMA descriptors have a ds_link field however at a different offset. This is a reimplementation based on what the reference driver and ath9k does. A subsequent commit will enable it in the TX and beacon paths. Obtained from: Linux ath9k, Qualcomm Atheros
# 238436	14-Jul-2012	adrian	Change the RX EDMA path to first complete the FIFO, then re-populate it with fresh descriptors, before handling the frames. Wrap it all in the RX locks. Since the FIFO is very shallow (16 for HP, 128 for LP) it needs to be drained and replenished very quickly. Ideally, I'll eventually move this RX FIFO drain/fill into the interrupt handler, only deferring the actual frame completion.
# 238433	14-Jul-2012	adrian	Create an RX queue lock. Ideally these locks would go away and there'd be a single driver lock, like what iwn(4) does. I'll worry about that later.
# 238316	09-Jul-2012	adrian	Convert sc_rxpending to a per-EDMA queue, and use that for the legacy code. Prepare ath_rx_pkt() to handle multiple RX queues, and default the legacy RX queue to use the HP queue.
# 238284	09-Jul-2012	adrian	Further preparations for the RX EDMA support. Break out the DMA descriptor setup/teardown code into a method. The EDMA RX code doesn't allocate descriptors, just ath_buf entries.
# 238280	09-Jul-2012	adrian	Introduce the EDMA related HAL capabilities. Whilst here, fix a typo in a previous commit. Obtained from: Qualcomm Atheros
# 238278	09-Jul-2012	adrian	Extend the RX HAL API to include the RX queue identifier. The AR93xx and later chips support two RX FIFO queues - a high and low priority queue. For legacy chips, just assume the queues are high priority. This is inspired by the reference driver but is a reimplementation of the API and code.
# 238055	03-Jul-2012	adrian	Begin abstracting out the RX path in preparation for RX EDMA support. The RX EDMA support requires a modified approach to the RX descriptor handling. Specifically: * There's now two RX queues - high and low priority; * The RX queues are implemented as FIFOs; they're now an array of pointers to buffers; * .. and the RX buffer and descriptor are in the same "buffer", rather than being separate. So to that end, this commit abstracts out most of the RX related functions from the bulk of the driver. Notably, the RX DMA/buffer allocation isn't updated, primarily because I haven't yet fleshed out what it should look like. Whilst I'm here, create a set of matching but mostly unimplemented EDMA stubs. Tested: * AR9280, station mode TODO: * Thorough AP and other mode testing for non-EDMA chips; * Figure out how to allocate RX buffers suitable for RX EDMA, including correctly setting the mbuf length to compensate for the RX descriptor and completion status area.
# 237953	02-Jul-2012	adrian	Bring over some further HAL capabilities from the Atheros HAL, as well as an EDMA check function. For the AR9003 and later NICs, different TX/RX DMA and descriptor handling code will be conditional on the EDMA check. Obtained from: Qualcomm Atheros
# 237153	16-Jun-2012	adrian	Shuffle some more fields in ath_buf so it's not too big. This shaves off 20 bytes - from 288 bytes to 268 bytes. However, it's still too big.
# 237152	16-Jun-2012	adrian	Shave four (or eight) bytes off of ath_buf - this field isn't used.
# 237046	14-Jun-2012	adrian	Shrink ath_buf a little more: * Resize some types. In particular, bfs_seqno can be uint16_t for now. Previous work would assign the unassigned seqno a value of -1, which I obviously can't do here. * Remove bfs_pktdur. It was in the original code but nothing so far uses it. This gets ath_buf down (on my i386 system) to 292 bytes from 300 bytes. I'd rather it be much, much smaller.
# 237038	13-Jun-2012	adrian	Implement a global (all non-mgmt traffic) TX ath_buf limitation when ath_start() is called. This (defaults to 10 frames) gives for a little headway in the TX ath_buf allocation, so buffer cloning is still possible. This requires a lot omre experimenting and tuning. It also doesn't stop a node/TID from consuming all of the available ath_buf's, especially when the node is going through high packet loss or only talking at a low TX rate. It also doesn't stop a paused TID from taking all of the ath_bufs. I'll look at fixing that up in subsequent commits. PR: kern/168170
# 237000	13-Jun-2012	adrian	Implement a separate, smaller pool of ath_buf entries for use by management traffic. * Create sc_mgmt_txbuf and sc_mgmt_txdesc, initialise/free them appropriately. * Create an enum to represent buffer types in the API. * Extend ath_getbuf() and _ath_getbuf_locked() to take the above enum. * Right now anything sent via ic_raw_xmit() allocates via ATH_BUFTYPE_MGMT. This may not be very useful. * Add ATH_BUF_MGMT flag (ath_buf.bf_flags) which indicates the current buffer is a mgmt buffer and should go back onto the mgmt free list. * Extend 'txagg' to include debugging output for both normal and mgmt txbufs. * When checking/clearing ATH_BUF_BUSY, do it on both TX pools. Tested: * STA mode, with heavy UDP injection via iperf. This filled the TX queue however BARs were still going out successfully. TODO: * Initialise the mgmt buffers with ATH_BUF_MGMT and then ensure the right type is being allocated and freed on the appropriate list. That'd save a write operation (to bf->bf_flags) on each buffer alloc/free. * Test on AP mode, ensure that BAR TX and probe responses go out nicely when the main TX queue is filled (eg with paused traffic to a TID, awaiting a BAR to complete.) PR: kern/168170
# 236873	11-Jun-2012	adrian	Introduce a new lock debug which is specifically for making sure the _TID_ lock is held. For now the TID lock is also the TXQ lock. This is just to make sure that the right TXQ lock is held for the given TID.
# 236872	11-Jun-2012	adrian	Revert r233227 and followup commits as it breaks CCMP PN replay detection. This showed up when doing heavy UDP throughput on SMP machines. The problem with this is because the 802.11 sequence number is being allocated separately to the CCMP PN replay number (which is assigned during ieee80211_crypto_encap()). Under significant throughput (200+ MBps) the TX path would be stressed enough that frame TX/retry would force sequence number and PN allocation to be out of order. So once the frames were reordered via 802.11 seqnos, the CCMP PN would be far out of order, causing most frames to be discarded by the receiver. I've fixed this in some local work by being forced to: (a) deal with the issues that lead to the parallel TX causing out of order sequence numbers in the first place; (b) fix all the packet queuing issues which lead to strange (but mostly valid) TX. I'll begin fixing these in a subsequent commit or five. PR: kern/166190
# 236599	05-Jun-2012	adrian	Mostly revert previous commit(s). After doing a bunch of local testing, it turns out that it negatively affects performance. I'm stil investigating exactly why deferring the IO causes such negative TCP performance but doesn't affect UDP preformance. Leave the ath_tx_kick() change in there however; it's going to be useful to have that there for if_transmit() work. PR: kern/168649
# 236583	04-Jun-2012	adrian	Migrate the TX path to a taskqueue for now, until a better way of implementing parallel TX and TX/RX completion can be done without simply abusing long-held locks. Right now, multiple concurrent ath_start() entries can result in frames being dequeued out of order. Well, they're dequeued in order fine, but if there's any preemption or race between CPUs between: * removing the frame from the ifnet, and * calling and runningath_tx_start(), until the frame is placed on a software or hardware TXQ Then although dequeueing the frame is in-order, queueing it to the hardware may be out of order. This is solved in a lot of other drivers by just holding a TX lock over a rather long period of time. This lets them continue to direct dispatch without races between dequeue and hardware queue. Note to observers: if_transmit() doesn't necessarily solve this. It removes the ifnet from the main path, but the same issue exists if there's some intermediary queue (eg a bufring, which as an aside also may pull in ifnet when you're using ALTQ.) So, until I can sit down and code up a much better way of doing parallel TX, I'm going to leave the TX path using a deferred taskqueue task. What I will likely head towards is doing a direct dispatch to hardware or software via if_transmit(), but it'll require some driver changes to allow queues to be made without using the really large ath_buf / ath_desc entries. TODO: * Look at how feasible it'll be to just do direct dispatch to ath_tx_start() from if_transmit(), avoiding doing _any_ intermediary serialisation into a global queue. This may break ALTQ for example, so I have to be delicate. * It's quite likely that I should break up ath_tx_start() so it deposits frames onto the software queues first, and then only fill in the 802.11 fields when it's being queued to the hardware. That will make the if_transmit() -> software queue path very quick and lightweight. * This has some very bad behaviour when using ACPI and Cx states. I'll do some subsequent analysis using KTR and schedgraph and file a follow-up PR or two. PR: kern/168649
# 236036	25-May-2012	adrian	Remove an unneeded field from ath_buf.
# 235972	25-May-2012	adrian	oops - ath_hal_disablepcie is actually destined for another purpose, not to disable the PCIe PHY in prepration for reset. Extend the enablepci method to have a "poweroff" flag, which if equal to true means the hardware is about to go to sleep.
# 235957	25-May-2012	adrian	Prepare for improved (read: pcie) suspend/resume support. * Flesh out the pcie disable method for 11n chips, as they were defaulting to the AR5212 (empty) PCIe disable method. * Add accessor macros for the HAL PCIe enable/disable calls. * Call disable on ath_suspend() * Call enable on ath_resume() NOTE: * This has nothing to do with the NIC sleep/run state - the NIC still will stay in network-run state rather than supporting network-sleep state. This is preparation work for supporting correct suspend/resume WARs for the 11n PCIe NICs. TODO: * It may be feasible at this point to keep the chip powered down during initial probe/attach and only power it up upon the first configure/reset pass. This however would require correct (for values of "correct") tracking of the NIC power configuration state from the driver and that just isn't attempted at the moment. Tested: * AR9280 on my Lenovo T60, but with no suspend/resume pass (yet).
# 235804	22-May-2012	adrian	Re-up the TX ath_buf limit from 128 to 512. I'll have to leave this high for now, until I've done some significant surgery with how ath_bufs (and descriptors) are handled. This should significantly cut down on the opportunities for a full TX queue hanging traffic. I'll continue making things work though; I'm mostly doing this for users. :)
# 235774	22-May-2012	adrian	Fix up some corner cases with aggregation handling. I've come across a weird scenario in net80211 where two TX streams will happily attempt to setup an aggregation session together. If we're very lucky, it happens concurrently on separate CPUs and the total lack of locking in the net80211 aggregation code causes this stuff to race. Badly. So >1 call would occur to the ath(4) addba start, but only one call would complete to addba complete or timeout. The TID would thus stay paused. The real fix is to implement some proper per-node (or maybe per-TID) locking in net80211, which then could be leveraged by the ath(4) TX aggregation code. Whilst I'm at it, shuffle around the debugging messages a bit. I like to keep people on their toes.
# 235491	15-May-2012	adrian	Migrate ath_debug and sc_debug from an int to a uint64_t / QUAD; add some more BAR debugging logic. * Change the definition of ath_debug and ath_softc.sc_debug from int to uint64_t; * Change the relevant sysctls; * Add a new BAR TX debugging field; * Use this in if_ath_tx. This has been tested by using the sysctl program, which happily allows for fields > 32 bits to be configured.
# 234873	01-May-2012	adrian	Change the MIB cycle count API to return HAL_BOOL, rather than uint32_t, to return whether it was successful. Add placeholder (blank) methods for previous chips, for both it and the 11n extension channel busy call.
# 234369	17-Apr-2012	adrian	Run the fatal proc as a proc, rather than where it currently is. Otherwise the reset path will sleep, which it can't do in this context.
# 234323	15-Apr-2012	adrian	Drop this down from 512 to 128 for now. This may result in a bit of a throughput drop. However, any throughput drop at this point should be investigated and root caused, as it's likely because TX scheduling (all the way down to how preemption, scheduler work, etc) is happening in a sub-optimal fashion. This also makes it much more likely to be reloadable on a live machine. Allocating 5120 TX ath_buf entries via contigmalloc is very unlikely after a few hours of using X/Chromium.
# 234109	10-Apr-2012	adrian	Convert the flags over to a set of bit flags.
# 234090	10-Apr-2012	adrian	Squirrel away SYNC interrupt debugging if it's enabled in the HAL. Bus errors will show up as various SYNC interrupts which will be passed back up to ath_intr().
# 233967	07-Apr-2012	adrian	Store away the RTS aggregate limit from the HAL. This will be used by some upcoming code to ensure that aggregates are enforced to be a certain size. The AR5416 has a limitation on RTS protected aggregates (8KiB).
# 233966	07-Apr-2012	adrian	Remove duplicate txflags field from ath_buf. rename bf_state.bfs_flags to bf_state.bfs_txflags, as that is what it effectively is.
# 233908	04-Apr-2012	adrian	Implement BAR TX. A BAR frame must be transmitted when an frame in an A-MPDU session fails to transmit - it's retried too often, or it can't be cloned for re-transmission. The BAR frame tells the remote side to advance the left edge of the block-ack window (BAW) to a new value. In order to do this: * TX for that particular node/TID must be paused; * The existing frames in the hardware queue needs to be completed, whether they're TXed successfully or otherwise; * The new left edge of the BAW is then communicated to the remote side via a BAR frame; * Once the BAR frame has been sucessfully TXed, aggregation can resume; * If the BAR frame can't be successfully TXed, the aggregation session is torn down. This is a first pass that implements the above. What needs to be done/ tested: * What happens during say, a channel reset / stuck beacon _and_ BAR TX. It _should_ be correctly buffered and retried once the reset has completed. But if a bgscan occurs (and they shouldn't, grr) the BAR frame will be forcibly failed and the aggregation session will be torn down. Yes, another reason to disable bgscan until I've figured this out. * There's way too much locking going on here. I'm going to do a couple of further passes of sanitising and refactoring so the (re) locking isn't so heavy. Right now I'm going for correctness, not speed. * The BAR TX can fail if the hardware TX queue is full. Since there's no "free" space kept for management frames, a full TX queue (from eg an iperf test) can race with your ability to allocate ath_buf/mbufs and cause issues. I'll knock this on the head with a subsequent commit. * I need to do some _much_ more thorough testing in hostap mode to ensure that many concurrent traffic streams to different end nodes are correctly handled. I'll find and squish whichever bugs show up here. But, this is an important step to being able to flip on 802.11n by default. The last issue (besides bug fixes, of course) is HT frame protection and I'll address that in a subsequent commit.
# 233895	04-Apr-2012	adrian	Correctly handle AR_MoreAggr when assembling multi-descriptor final frames. Linux ath9k doesn't have this issue as it doesn't try queuing multi- descriptor frames to the hardware. Before, I was only setting the first and last descriptor in the final frame correctly - and that was done by accident. The first descriptor in the last sub-frame was being correctly updated by ath_tx_setds_11n(); the last descriptor in the last sub-frame was being correctly updated by ath_buf_set_rate(). But both of those are "incorrect". The correct behaviour is: * AR_IsAggr is set for all descriptors for all subframes in an aggregate. * AR_MoreAggr is set for all descriptors for all non-final sub-frames in an aggregate. Ie, all descriptors in the last sub-frame of an aggregate must have this field set to 0. I still need to do a couple of extra passes to ensure the pad delimiter field is being correctly handled in all descriptors in the last sub-frame.
# 233673	29-Mar-2012	adrian	Defer the rescheduling of TID -> TXQ frames in some instances. Right now ath_txq_sched() is mainly called from the TX ath_tx_processq() routine, which is (mostly) done as part of the taskqueue. It shouldn't be called outside the taskqueue. But now that I'm about to flip back on BAR TX, I'm going to start stressing the ath_tx_tid_pause() and ath_tx_tid_resume() paths. What I don't want to have happen is a reschedule of the TID traffic _during_ the completion of TX frames. Ideally I'd like to have a way to flag back up to the processing code that the current hardware queue should be rechecked for software TID queue frames. But for now, this should suffice for the BAR TX case. I may eventually delete this code once I've brought some further sanity to the general TX queue/completion path.
# 233227	20-Mar-2012	adrian	Delay sequence number allocation for A-MPDU until just before the frame is queued to the hardware. Because multiple concurrent paths can execute ath_start(), multiple concurrent paths can push frames into the software/hardware TX queue and since preemption/interrupting can occur, there's the possibility that a gap in time will occur between allocating the sequence number and queuing it to the hardware. Because of this, it's possible that a thread will have allocated a sequence number and then be preempted by another thread doing the same. If the second thread sneaks the frame into the BAW, the (earlier) sequence number of the first frame will be now outside the BAW and will result in the frame being constantly re-added to the tail of the queue. There it will live until the sequence numbers cycle around again. This also creates a hole in the RX BAW tracking which can also cause issues. This patch delays the sequence number allocation to occur only just before the frame is going to be added to the BAW. I've been wanting to do this anyway as part of a general code tidyup but I've not gotten around to it. This fixes the PR. However, it still makes it quite difficult to try and ensure in-order queuing and dequeuing of frames. Since multiple copies of ath_start() can be run at the same time (eg one TXing process thread, one TX completion task/one RX task) the driver may end up having frames dequeued and pushed into the hardware slightly/occasionally out of order. And, to make matters more annoying, net80211 may have the same behaviour - in the non-aggregation case, the TX code allocates sequence numbers before it's thrown to the driver. I'll open another PR to investigate this and potentially introduce some kind of final-pass TX serialisation before frames are thrown to the hardware. It's also very likely worthwhile adding some debugging code into ath(4) and net80211 to catch when/if this does occur. PR: kern/166190
# 232794	10-Mar-2012	adrian	Fix a panic introduced in a previous commit - non-beaconing modes (eg STA) don't setup the avp mcast queue. This is a bit annoying though - it turns out the mcast queue isn't initialised for STA mode but it's then touched to see whether anything is in it. That should be fixed in a subsequent commit. Noticed by: gperez@entel.upc.edu PR: kern/165895
# 232764	10-Mar-2012	adrian	Don't flood the cabq/mcastq with frames. In a very noisy 2.4GHz environment (with HT/40 enabled, making it worse) I saw the following occur: * the air was considered "busy" a lot of the time; * the cabq time is quite short due to staggered beacons being enabled; * it just wasn't able to keep up TX'ing CABQ frames; * .. and the cabq would swallow up all the TX ath_buf's. This patch introduces a twiddle which allows the maximum cabq depth to be set, forcing further frames to be dropped. It defaults to the TX buffer count at the moment, so the default behaviour isn't changed. I've also started fleshing out a similar setup for the data path, so it doesn't swallow up all the available TX buffers and preventing management frames (such as ADDBA) out. PR: kern/165895
# 232163	25-Feb-2012	adrian	Attempt to further fix some of the concurrency/reset issues that occur. * ath_reset() is being called in softclock context, which may have the thing sleep on a lock. To avoid this, since we really _shouldn't_ be sleeping on any locks, break out the no-loss reset path into a tasklet and call that from: + ath_calibrate() + ath_watchdog() This has the added advantage that it'll end up also doing the frame RX cleanup from within the taskqueue context, rather than the softclock context. * Shuffle around the taskqueue_block() call to be before we grab the lock and disable interrupts. The trouble here is that taskqueue_block() doesn't block currently queued (but not yet running) tasks so calling it doesn't guarantee no further tasks (that weren't running on _A_ CPU at the time of this call) will complete. Calling taskqueue_drain() on these tasks won't work because if any _other_ thread calls taskqueue_enqueue() for whatever reason, everything gets very angry and stops working. This slightly changes the race condition enough to let ath_rx_tasklet() run before we try disabling it, and thus quietens the warnings a bit. The (more) true solution will be doing something like the following: * having a taskqueue_blocked mask in ath_softc; * having an interrupt_blocked mask in ath_softc; * only calling taskqueue_drain() on each individual task _after_ the lock has been acquired - that way no further tasklet scheduling is going to occur. * Then once the tasks have been blocked _and_ the interrupt has been disabled, call taskqueue_drain() on each, ensuring that anything that _was_ scheduled or running is removed. The trouble is if something calls taskqueue_enqueue() on a task after taskqueue_blocked() has been called but BEFORE taskqueue_drain() has been called, ta_pending will be set to 1 and taskqueue_drain() will sit there stuck in msleep() until you hard-kill the machine. PR: kern/165382 PR: kern/165220
# 231369	10-Feb-2012	adrian	Add in a new driver feature to allow the TX and RX chainmask to be overridden at attach time. Some 802.11n NICs may only have one physical antenna connected. The radios will be very upset if you try enabling radios which aren't connected to antennas. This allows hints to override the TX and RX chainmask. These hints are: hint.ath.X.rx_chainmask hint.ath.X.tx_chainmask They can be set at either boot time or in kenv before the module is loaded. This and the previous HAL commit were sponsored in late 2011 by Hobnob, Inc. Sponsored by: Hobnob, Inc.
# 230493	24-Jan-2012	adrian	Fix up some style(9) indenting and reorganise some of the hal methods. There should be no functional change due to this commit.
# 230492	24-Jan-2012	adrian	Add a missing HAL method macro. I'm using this as part of some personal DFS radar stuff.
# 228891	26-Dec-2011	adrian	Flesh out configurable hardware based LED blinking. The hardware (MAC) LED blinking involves a few things: * Selecting which GPIO pins map to the MAC "power" and "network" lines; * Configuring the MAC LED state (associated, scanning, idle); * Configuring the MAC LED blinking type and speed. The AR5416 HAL configures the normal blinking setup - ie, blink rate based on TX/RX throughput. The default AR5212 HAL doesn't program in any specific blinking type, but the default of 0 is the same. This code introduces a few things: * The hardware led override is configured via sysctl 'hardled'; * The MAC network and power LED GPIO lines can be set, or left at -1 if needed. This is intended to allow only one of the hardware MUX entries to be configured (eg for PCIe cards which only have one LED exposed.) TODO: * For AR2417, the software LED blinking involves software blinking the Network LED. For the AR5416 and later, this can just be configured as a GPIO output line. I'll chase that up with a subsequent commit. * Add another software LED blink for "Link", separate from "activity", which blinks based on the association state. This would make my D-Link DWA-552 have consistent and useful LED behaviour (as they're marked "Link" and "Activity." * Don't expose the hardware LED override unless it's an AR5416 or later, as the previous generation hardware doesn't have this multiplexing setup.
# 227651	18-Nov-2011	adrian	Flesh out some slightly dirty reset/channel change serialisation code for the ath(4) driver. Currently, there's nothing stopping reset, channel change and general TX/RX from overlapping with each other. This wasn't a big deal with pre-11n traffic as it just results in some dropped frames. It's possible this may have also caused some inconsistencies and badly-setup hardware. Since locks can't be held across all of this (the Linux solution) due to LORs with the network stack locks, some state counter variables are used to track what parts of the code the driver is currently in. When the hardware is being reset, it disables the taskqueue and waits for pending interrupts, tx, rx and tx completion before it begins the reset or channel change. TX and RX both abort if called during an active reset or channel change. Finally, the reset path now doesn't flush frames if ATH_RESET_NOLOSS is set. Instead, completed TX and RX frames are passed back up to net80211 before the reset occurs. This is not without problems: * Raw frame xmit are just dropped, rather than placed on a queue. The net80211 stack should be the one which queues these frames rather than the driver. * It's all very messy. It'd be better if these hardware operations were serialised on some kind of work queue, rather than hoping they can be run in parallel. * The taskqueue block/unblock may occur in parallel with the newstate() function - which shuts down the taskqueue and restarts it once the new state is known. It's likely these operations should be refcounted so the taskqueue is restored once no other areas in the code wish to suspend operations. * .. interrupt disable/enable should likely be refcounted as well. With this work, the driver does not drop frames during stuck beacon or fatal errors and thus 11n traffic continues to run correctly. Default and full resets however do still drop frames and it's possible this may occur, causing traffic loss and session stalls. Sponsored by: Hobnob, Inc.
# 227346	08-Nov-2011	adrian	Merge in some fixes from the if_ath_tx branch. * Close down some of the kickpcu races, where the interrupt handler can and will run concurrently with the taskqueue. * Close down the TXQ active/completed race between the interrupt handler and the concurrently running tx completion taskqueue function. * Add some tx and rx interrupt count tracking, for debugging. * Fix the kickpcu logic in ath_rx_proc() to not simply drain and restart the TX queue - instead, assume the hardware isn't (too) confused and just restart RX DMA. This may break on previous chipsets, so if it does I'll add a HAL flag and conditionally handle this (ie, for broken chipsets, I'll just restore the "stop PCU / flush things / restart PCU" logic.) * Misc stuff Sponsored by: Hobnob, Inc.
# 227344	08-Nov-2011	adrian	Migrate the STAILQ lists to TAILQs. A bunch of the 11n TX aggregation logic wants to traverse lists of buffers in various ways. In order to provide O(1) behaviour in this instance, use TAILQs. This does blow out the memory footprint and CPU cycles slightly for some of these operations. I may convert some of these back to STAILQs once the rest of the software transmit queue handling has been stabilised. Sponsored by: Hobnob, Inc.
# 227328	08-Nov-2011	adrian	Begin merging in some of my 802.11n TX aggregation driver changes. * Add a PCU lock, which isn't currently used but will eventually be used to serialise some of the driver access. * Add in all the software TX aggregation state, that's kept per-node and per-TID. * Add in the software and aggregation state to ath_buf. * Add in hooks to ath_softc for aggregation state and the (upcoming) aggregation TX state calls. * Add / fix the HAL access macros. Obtained from: Linux, ath9k Sponsored by: Hobnob, Inc.
# 225818	28-Sep-2011	adrian	Update the default AIFS value for hostap mode. Obtained from: Linux ath9k, Atheros reference
# 225444	07-Sep-2011	adrian	Update the TSF and next-TBTT methods to work for the AR5416 and later NICs. This is another commit in a series of TDMA support fixes for the 11n NICs. * Move ath_hal_getnexttbtt() into the HAL; write methods for it. This returns a timer value in TSF, rather than TU. * Move ath_hal_getcca() and ath_hal_setcca() into the HAL too, where they likely now belong. * Create a new HAL capability: HAL_CAP_LONG_RXDESC_TSF. The pre-11n NICs write 15 bit TSF snapshots into the RX descriptor; the AR5416 and later write 32 bit TSF snapshots into the RX descriptor. * Use the new capability to choose between 15 and 31 bit TSF adjustment functions in ath_extend_tsf(). * Write ar5416GetTsf64() and ar5416SetTsf64() methods. ar5416GetTsf64() tries to compensate for TSF changes at the 32 bit boundary. According to yin, this fixes the TDMA beaconing on 11n chipsets and TDMA stations can now associate/talk, but there are still issues with traffic stability which need to be investigated. The ath_hal_extendtsf() function is also used in RX packet timestamping; this may improve adhoc mode on the 11n chipsets. It also will affect the timestamps seen in radiotap frames. Submitted by: Kang Yin Su <cantona@cantona.net> Approved by: re (kib)
# 224720	08-Aug-2011	adrian	And add another missing brace. Another pointy hat moment. This one however isn't used by any public code yet, so it didn't break the build. Approved by: re (kib, blanket)
# 224715	08-Aug-2011	adrian	.. and add a missing bracket. Approved by: re (kib, blanket)
# 224714	08-Aug-2011	adrian	Fix method naming to match the reference HAL definition. Obtained from: Atheros Approved by: re (kib, blanket)
# 224709	08-Aug-2011	adrian	Add another HAL method - ah_isFastClockEnabled - which returns AH_TRUE if 5ghz fast clock is enabled in the current operating mode. It's slightly dirty, but it's part of the reference HAL and used by the (currently closed-source) radar event code to map radar pulses back to microsecond durations. Obtained from: Atheros Approved by: re (kib, blanket)
# 224588	02-Aug-2011	adrian	Fix a corner case in RXEOL handling which was likely introduced by yours truly. Before 802.11n, the RX descriptor list would employ the "self-linked tail descriptor" trick which linked the last descriptor back to itself. This way, the RX engine would never hit the "end" of the list and stop processing RX (and assert RXEOL) as it never hit a descriptor whose next pointer was 0. It would just keep overwriting the last descriptor until the software freed up some more RX descriptors and chained them onto the end. For 802.11n, this needs to stop as a self-linked RX descriptor tickles the block-ack logic into ACK'ing whatever frames are received into that self-linked descriptor - so in very busy periods, you could end up with A-MPDU traffic that is ACKed but never received by the 802.11 stack. This would cause some confusion as the ADDBA windows would suddenly be out of sync. So when that occured here, the last descriptor would be hit and the PCU logic would stop. It would only start again when the RX descriptor list was updated and the PCU RX engine was re-tickled. That wasn't being done, so RXEOL would be continuously asserted and no RX would continue. This patch introduces a new flag - sc->sc_kickpcu - which when set, signals the RX task to kick the PCU after its processed whatever packets it can. This way completed packets aren't discarded. In case some other task gets called which resets the hardware, don't update sc->sc_imask - instead, just update the hardware interrupt mask directly and let either ath_rx_proc() or ath_reset() restore the imask to its former setting. Note: this bug was only triggered when doing a whole lot of frame snooping with serial console IO in the RX task. This would defer interrupt processing enough to cause an RX descriptor overflow. It doesn't happen in normal conditions. Approved by: re (kib, blanket)
# 224540	31-Jul-2011	adrian	Fix typo! Approved by: re (kib)
# 222815	07-Jun-2011	adrian	Flesh out a new HAL method to fetch the radar PHY error frame information. For the AR5211/AR5212, this is apparently a one byte pulse duration counter value. It is only coded up here for the AR5212 as I don't have any AR5211-series hardware to test it on. This information was extracted from the Madwifi DFS branch along with some local additions. Please note - all this does is extract out the radar event duration, it in no way reflects the presence of a radar. Further code is needed to take a set of radar events and filter them to extract out correct radar pulse trains (and ignore other events.) For further information, please see: http://wiki.freebsd.org/dev/ath_hal%284%29/RadarDetection This includes references to the relevant patents which describe what is going on. Obtained from: Madwifi
# 222668	04-Jun-2011	adrian	A few changes to make radar detection implementable in a hal_dfs/ module. * If sc->sc_dodfs is set to 1 by the ath_dfs_radar_enable(), set the relevant rx filter bit to begin receiving radar PHY errors. The HAL code already knows how to set the relevant error mask register to enable radar events. * Add a missing call to ath_dfs_radar_enable() after ath_hal_reset() * change ath_dfs_process_phyerr() to take a const char *buf for now, rather than a descriptor. This way it can get access to the packet buffer contents.
# 222585	01-Jun-2011	adrian	Flesh out the radar detection related operations for the ath driver. This is in no way a complete DFS/radar detection implementation! It merely creates an abstracted interface which allows for future development of the DFS radar detection code. Note: Net80211 already handles the bulk of the DFS machinery, all we need to do here is figure out that a radar event has occured and inform it as such. It then drives the DFS state engine for us. The "null" DFS radar detection module is included by default; it doesn't require a device line. This commit: * Adds a simple abstracted layer for radar detection state - sys/dev/ath/ath_dfs/; * Implements a null DFS module which doesn't do anything; (ie, implements the exact behaviour at the moment); * Adds hooks to the ath driver to process received radar events and gives the DFS module a chance to determine whether a radar has been detected. Obtained from: Atheros
# 222277	25-May-2011	adrian	The current ANI capability information uses a different set of values for the commands, compared to the internal command values (HAL_ANI_CMD.) My eventual aim is to make the HAL_ANI_CMD internal enum match the public API and then remove all this messiness. This now allows HAL_CAP_INTMIT users to use a public HAL_CAP_INTMIT_ enum rather than magic constants. The only magic constants currently used by if_ath are "enable" and "present". Some local tools of mine allow for direct, manual fiddling of the ANI variables and I'll convert these to use the public enum API before I commit them.
# 220772	18-Apr-2011	adrian	Add global TX timeout handling. The global TX timeout counter increments whenever a frame is ready to be transmitted and the medium is busy.
# 220324	04-Apr-2011	adrian	Add a HAL capability bit for supporting self-linked RX descriptors and disable it for the 11n chipsets. From the ath9k source: == 11N: we can no longer afford to self link the last descriptor. MAC acknowledges BA status as long as it copies frames to host buffer (or rx fifo). This can incorrectly acknowledge packets to a sender if last desc is self-linked. == Since this is useful for pre-AR5416 chips that communicate PHY errors via error frames rather than by on-chip counters, leave the support in there, but disable it for AR5416 and later.
# 220053	27-Mar-2011	adrian	Rename AH_ENABLE_11N to ATH_ENABLE_11 - the HAL supports 11n by default but the ath driver doesn't. This is a much more consistent name.
# 220033	26-Mar-2011	adrian	If 802.11n is enabled, bump the number of buffers used up to a larger level. This is important for AMPDU RX as each burst is multiple packets in a row.
# 218490	09-Feb-2011	adrian	Expose the 4k transaction workaround hooks to the driver, but don't (yet) fix the descriptor allocation.
# 218151	01-Feb-2011	adrian	Add TX/RX chainmask info to if_ath - this is needed for the 11n TX rate series.
# 218067	29-Jan-2011	adrian	Fix some errors introduced w/ the last commit; fix setting RTS/CTS in the 11n rate scenario. * I messed up a couple of things in if_athvar.h; so fix that. * Undo some guesswork done in ar5416Set11nRateScenario() and introduce a flags parameter which lets the caller set a few things. To begin with, this includes whether to do RTS or CTS protection. * If both RTS and CTS is set, only do RTS. Both RTS and CTS shouldn't be set on a frame.
# 218066	29-Jan-2011	adrian	Link in the 11n specific TX methods into the HAL.
# 217684	21-Jan-2011	adrian	ANI changes #1 - split out the ANI polling from the RxMonitor hook. The rxmonitor hook is called on each received packet. This can get very, very busy as the tx/rx/chanbusy registers are thus read each time a packet is received. Instead, shuffle out the true per-packet processing which is needed and move the rest of the ANI processing into a periodic event which runs every 100ms by default.
# 217627	20-Jan-2011	adrian	Add in the public method to access the tx completion rates.
# 217624	20-Jan-2011	adrian	Include the initial support for external EEPROMs. The AR9100 at least doesn't have an external serial EEPROM attached to the MAC; it instead stores the calibration data in the normal system flash. I believe earlier parts can do something similar but I haven't experienced it first-hand. This commit introduces an eepromdata pointer into the API but doesn't at all commit to using it. A future commit will include the glue needed to allow the AR9100 support code to use this data pointer as the EEPROM.
# 203683	08-Feb-2010	rpaulo	Add multicast key search support. This fixes corrupted mcast packets when we have more than one hostap vap. Submitted by: Russell Yount <russell.yount at gmail.com> MFC after: 2 weeks
# 195807	21-Jul-2009	sam	track whether any mesh vaps are present to correctly setup the rx filter when, for example, an ap vap is created first Reviewed by: rpaulo Approved by: re (kib)
# 195618	11-Jul-2009	rpaulo	Implementation of the upcoming Wireless Mesh standard, 802.11s, on the net80211 wireless stack. This work is based on the March 2009 D3.0 draft standard. This standard is expected to become final next year. This includes two main net80211 modules, ieee80211_mesh.c which deals with peer link management, link metric calculation, routing table control and mesh configuration and ieee80211_hwmp.c which deals with the actually routing process on the mesh network. HWMP is the mandatory routing protocol on by the mesh standard, but others, such as RA-OLSR, can be implemented. Authentication and encryption are not implemented. There are several scripts under tools/tools/net80211/scripts that can be used to test different mesh network topologies and they also teach you how to setup a mesh vap (for the impatient: ifconfig wlan0 create wlandev ... wlanmode mesh). A new build option is available: IEEE80211_SUPPORT_MESH and it's enabled by default on GENERIC kernels for i386, amd64, sparc64 and pc98. Drivers that support mesh networks right now are: ath, ral and mwl. More information at: http://wiki.freebsd.org/WifiMesh Please note that this work is experimental. Also, please note that bridging a mesh vap with another network interface is not yet supported. Many thanks to the FreeBSD Foundation for sponsoring this project and to Sam Leffler for his support. Also, I would like to thank Gateworks Corporation for sending me a Cambria board which was used during the development of this project. Reviewed by: sam Approved by: re (kensmith) Obtained from: projects/mesh11s
# 195114	27-Jun-2009	sam	Add HAL_RX_FILTER_BSSID support (to disable bssid match): o add HAL_CAP_BSSIDMATCH to identify parts that have the support for disabling bssid match o honor capability for set/get rx filter o use HAL_CAP_BSSIDMATCH in driver to decide whether to use the bssid match disable or fall back to promisc mode Reviewed by: rpaulo Approved by: re (rwatson)
# 192468	20-May-2009	sam	Overhaul monitor mode handling: o replace DLT_IEEE802_11 support in net80211 with DLT_IEEE802_11_RADIO and remove explicit bpf support from wireless drivers; drivers now use ieee80211_radiotap_attach to setup shared data structures that hold the radiotap header for each packet tx/rx o remove rx timestamp from the rx path; it was used only by the tdma support for debugging and was mostly useless due to it being 32-bits and mostly unavailable o track DLT_IEEE80211_RADIO bpf attachments and maintain per-vap and per-com state when there are active taps o track the number of monitor mode vaps o use bpf tap and monitor mode vap state to decide when to collect radiotap state and dispatch frames; drivers no longer explicitly directly check bpf state or use bpf calls to tap frames o handle radiotap state updates on channel change in net80211; drivers should not do this (unless they bypass net80211 which is almost always a mistake) o update various drivers to be more consistent/correct in handling radiotap o update ral to include TSF in radiotap'd frames o add promisc mode callback to wi Reviewed by: cbzimmer, rpaulo, thompsa
# 190848	08-Apr-2009	sam	remove unused struct member
# 190579	30-Mar-2009	sam	Hoist 802.11 encapsulation up into net80211: o call ieee80211_encap in ieee80211_start so frames passed down to drivers are already encapsulated o remove ieee80211_encap calls in drivers o fixup wi so it recreates the 802.3 head it requires from the 802.11 header contents o move fast-frame aggregation from ath to net80211 (conditional on IEEE80211_SUPPORT_SUPERG): - aggregation is now done in ieee80211_start; it is enabled when the packets/sec exceeds ieee80211_ffppsmin (net.wlan.ffppsmin) and frames are held on a staging queue according to ieee80211_ffagemax (net.wlan.ffagemax) to wait for a frame to combine with - drivers must call back to age/flush the staging queue (ath does this on tx done, at swba, and on rx according to the state of the tx queues and/or the contents of the staging queue) - remove fast-frame-related data structures from ath - add ieee80211_ff_node_init and ieee80211_ff_node_cleanup to handle per-node fast-frames state (we reuse 11n tx ampdu state) o change ieee80211_encap calling convention to include an explicit vap so frames coming through a WDS vap are recognized w/o setting M_WDS With these changes any device able to tx/rx 3Kbyte+ frames can use fast-frames. Reviewed by: thompsa, rpaulo, avatar, imp, sephe
# 190571	30-Mar-2009	sam	Remove ATH_SUPPORT_TDMA and use IEEE80211_SUPPORT_TDMA instead. It doesn't make much sense to configure driver support w/o net80211. Note this means ath now depends on opt_wlan.h.
# 190096	19-Mar-2009	sam	purge hal abi support; now that the hal is merged w/ the driver we cannot be out of sync MFC after: 1 week
# 189605	09-Mar-2009	sam	replace if_watchdog w/ private callout; probably can merge this with the calibration work sometime in the future
# 189380	04-Mar-2009	sam	add a sysctl to ena/dis frobbing cca
# 188974	23-Feb-2009	sam	5416 and later parts mux the gpio outputs; extend the api to include a signal type that's used to select the appropriate mux
# 188783	19-Feb-2009	sam	remove private support for IEEE80211_MODE_HALF and IEEE80211_MODE_QUARTER now that net80211 has them
# 187831	28-Jan-2009	sam	Overhaul regulatory support: o remove HAL_CHANNEL; convert the hal to use net80211 channels; this mostly involves mechanical changes to variable names and channel attribute macros o gut HAL_CHANNEL_PRIVATE as most of the contents are now redundant with the net80211 channel available o change api for ath_hal_init_channels: no more reglass id's, no more outdoor indication (was a noop), anM contents o add ath_hal_getchannels to have the hal construct a channel list without altering runtime state; this is used to retrieve the calibration list for the device in ath_getradiocaps o add ath_hal_set_channels to take a channel list and regulatory data from above and construct internal state to match (maps frequencies for 900MHz cards, setup for CTL lookups, etc) o compact the private channel table: we keep one private channel per frequency instead of one per HAL_CHANNEL; this gives a big space savings and potentially improves ani and calibration by sharing state (to be seen; didn't see anything in testing); a new config option AH_MAXCHAN controls the table size (default to 96 which was chosen to be ~3x the largest expected size) o shrink ani state and change to mirror private channel table (one entry per frequency indexed by ic_devdata) o move ani state flags to private channel state o remove country codes; use net80211 definitions instead o remove GSM regulatory support; it's no longer needed now that we pass in channel lists from above o consolidate ADHOC_NO_11A attribute with DISALLOW_ADHOC_11A o simplify initial channel list construction based on the EEPROM contents; we preserve country code support for now but may want to just fallback to a WWR sku and dispatch the discovered country code up to user space so the channel list can be constructed using the master regdomain tables o defer to net80211 for max antenna gain o eliminate sorting of internal channel table; now that we use ic_devdata as an index, table lookups are O(1) o remove internal copy of the country code; the public one is sufficient o remove AH_SUPPORT_11D conditional compilation; we always support 11d o remove ath_hal_ispublicsafetysku; not needed any more o remove ath_hal_isgsmsku; no more GSM stuff o move Conformance Test Limit (CTL) state from private channel to a lookup using per-band pointers cached in the private state block o remove regulatory class id support; was unused and belongs in net80211 o fix channel list construction to set IEEE80211_CHAN_NOADHOC, IEEE80211_CHAN_NOHOSTAP, and IEEE80211_CHAN_4MSXMIT o remove private channel flags CHANNEL_DFS and CHANNEL_4MS_LIMIT; these are now set in the constructed net80211 channel o store CHANNEL_NFCREQUIRED (Noise Floor Required) channel attribute in one of the driver-private flag bits of the net80211 channel o move 900MHz frequency mapping into the hal; the mapped frequency is stored in the private channel and used throughout the hal (no more mapping in the driver and/or net80211) o remove ath_hal_mhz2ieee; it's no longer needed as net80211 does the calculation and available in the net80211 channel o change noise floor calibration logic to work with compacted private channel table setup; this may require revisiting as we no longer can distinguish channel attributes (e.g. 11b vs 11g vs turbo) but since the data is used only to calculate status data we can live with it for now o change ah_getChipPowerLimits internal method to operate on a single channel instead of all channels in the private channel table o add ath_hal_gethwchannel to map a net80211 channel to a h/w frequency (always the same except for 900MHz channels) o add HAL_EEBADREG and HAL_EEBADCC status codes to better identify regulatory problems o remove CTRY_DEBUG and CTRY_DEFAULT enum's; these come from net80211 now o change ath_hal_getwirelessmodes to really return wireless modes supported by the hardware (was previously applying regulatory constraints) o return channel interference status with IEEE80211_CHANSTATE_CWINT (should change to a callback so hal api's can take const pointers) o remove some #define's no longer needed with the inclusion of <net80211/_ieee80211.h> Sponsored by: Carlson Wireless
# 186904	08-Jan-2009	sam	TDMA support for long distance point-to-point links using ath devices: o add net80211 support for a tdma vap that is built on top of the existing adhoc-demo support o add tdma scheduling of frame transmission to the ath driver; it's conceivable other devices might be capable of this too in which case they can make use of the 802.11 protocol additions etc. o add minor bits to user tools that need to know: ifconfig to setup and configure, new statistics in athstats, and new debug mask bits While the architecture can support >2 slots in a TDMA BSS the current design is intended (and tested) for only 2 slots. Sponsored by: Intel
# 185744	07-Dec-2008	sam	New periodic calibration scheme needed for 11n parts that have multiple algorithms and potentially collect multiple samples. Instead of a single calibration interval we now have short and long intervals; the long interval roughly corresponds to the previous single interval. The short interval is used to speedup collection of samples and happens much quicker. We make calls using the short interval until we're told the calibration work is complete at which point we fallback to the long interval. In addition there is a much longer reset interval used to flush all calibration state and cause everthing to start anew. With these changes you can also disable calibration entirely by setting the long interval to zero.
# 185522	01-Dec-2008	sam	Switch to ath hal source code. Note this removes the ath_hal module; the ath module now brings in the hal support. Kernel config files are almost backwards compatible; supplying device ath_hal gives you the same chip support that the binary hal did but you must also include options AH_SUPPORT_AR5416 to enable the extended format descriptors used by 11n parts. It is now possible to control the chip support included in a build by specifying exactly which chips are to be supported in the config file; consult ath_hal(4) for information.
# 185242	23-Nov-2008	sam	nuke special handling of RXORN interrupt; the hal marks the FATAL bit in the interrupt status when RXORN is hit and the chip requires a reset so our special handling was causing useless resets
# 184369	27-Oct-2008	sam	prepare for a new hal
# 184368	27-Oct-2008	sam	o With the addition of HT rates the set of h/w codes has a much wider range making the use of sc_hwmap to do direct mapping impractical. Switch to indexing by the rate index instead of the rate code and adjust associated state and logic appropriately. This has several benefits including simplification of the led code. o fix radiotap capture of HT rates o fix conditional compilation of HT radiotap support to be based on the hal having 5416 support; not the ABI version as hal builds may or may not include 5416 support
# 184358	27-Oct-2008	sam	Fixup statistics: o update tx rssi data only when an ACK was received o return tx rssi from sampled data instead of the last frame o track noise floor o return rx rssi and noise floor (was broken)
# 184354	27-Oct-2008	sam	add sys.dev.ath.X.intmit knob to enable/disable ANI (the intmit name is historical)
# 184351	27-Oct-2008	sam	rename bf_flags to bf_txflags in preparation for the addition of flags separate from the tx descriptor flags currently recorded
# 184347	27-Oct-2008	sam	remove driver-private equivalent of ni_txparms; it's now superfluous
# 183221	20-Sep-2008	sam	fix memory smash on lp64 platforms; mostly noticeable in user mode as being unable to associate
# 182893	09-Sep-2008	rpaulo	Update for new HAL. Reviewed by: sam
# 179401	28-May-2008	sam	Cleanup power handling and fix suspend/resume: o do not put the chip into full sleep in ath_stop as it gains nothing and causes many parts to hang in ath_detach because we may touch the chip during vap teardown; this may also fix issues with unloading the module o add a note in ath_detach to explain ath_hal_detach puts the chip in low power mode; this is useful to know as it means unloading the module will place a pci device in the lowest possible power state o leave an #ifdef notyet marker for powering down the chip when a device is marked down; we can't do that until we handle all the ways the driver may be entered and touch the chip o fix resume by reloading the h/w key cache as it's been clobbered (for pci) by the socket being powered off; for station mode we directly stop+init the chip and then simulate a beacon miss to get the upper layers sync'd up; for other configs we must brute force stop+start the vaps so they go through the state machine
# 178751	03-May-2008	sam	add back sysctl's to display the regdomain and country code from eeprom; useful for debugging
# 178354	20-Apr-2008	sam	Multi-bss (aka vap) support for 802.11 devices. Note this includes changes to all drivers and moves some device firmware loading to use firmware(9) and a separate module (e.g. ral). Also there no longer are separate wlan_scan* modules; this functionality is now bundled into the wlan module. Supported by: Hobnob and Marvell Reviewed by: many Obtained from: Atheros (some bits)
# 170530	11-Jun-2007	sam	Update 802.11 wireless support: o major overhaul of the way channels are handled: channels are now fully enumerated and uniquely identify the operating characteristics; these changes are visible to user applications which require changes o make scanning support independent of the state machine to enable background scanning and roaming o move scanning support into loadable modules based on the operating mode to enable different policies and reduce the memory footprint on systems w/ constrained resources o add background scanning in station mode (no support for adhoc/ibss mode yet) o significantly speedup sta mode scanning with a variety of techniques o add roaming support when background scanning is supported; for now we use a simple algorithm to trigger a roam: we threshold the rssi and tx rate, if either drops too low we try to roam to a new ap o add tx fragmentation support o add first cut at 802.11n support: this code works with forthcoming drivers but is incomplete; it's included now to establish a baseline for other drivers to be developed and for user applications o adjust max_linkhdr et. al. to reflect 802.11 requirements; this eliminates prepending mbufs for traffic generated locally o add support for Atheros protocol extensions; mainly the fast frames encapsulation (note this can be used with any card that can tx+rx large frames correctly) o add sta support for ap's that beacon both WPA1+2 support o change all data types from bsd-style to posix-style o propagate noise floor data from drivers to net80211 and on to user apps o correct various issues in the sta mode state machine related to handling authentication and association failures o enable the addition of sta mode power save support for drivers that need net80211 support (not in this commit) o remove old WI compatibility ioctls (wicontrol is officially dead) o change the data structures returned for get sta info and get scan results so future additions will not break user apps o fixed tx rate is now maintained internally as an ieee rate and not an index into the rate set; this needs to be extended to deal with multi-mode operation o add extended channel specifications to radiotap to enable 11n sniffing Drivers: o ath: add support for bg scanning, tx fragmentation, fast frames, dynamic turbo (lightly tested), 11n (sniffing only and needs new hal) o awi: compile tested only o ndis: lightly tested o ipw: lightly tested o iwi: add support for bg scanning (well tested but may have some rough edges) o ral, ural, rum: add suppoort for bg scanning, calibrate rssi data o wi: lightly tested This work is based on contributions by Atheros, kmacy, sephe, thompsa, mlaier, kevlo, and others. Much of the scanning work was supported by Atheros. The 11n work was supported by Marvell.
# 170375	06-Jun-2007	sam	update copyrights to 2007 and convert to be 2-clause bsd-only
# 167252	05-Mar-2007	sam	Change mtx's to use the formulated name as type so witness does not complain on nested tx q lock acquisitions when processing the cab q. MFC after: 2 weeks
# 166954	24-Feb-2007	sam	set the antenna switch when fixing the tx antenna using the dev.ath.X.txantenna sysctl; this is typically what folks want but beware this has the side effect of disabling rx diversity MFC after: 2 weeks
# 166016	15-Jan-2007	sam	add compat shim for ath_hal_isgsmsku until the new hal gets committed
# 166013	14-Jan-2007	sam	Add initial support for 900MHz cards like the Ubiquiti SR9: o eliminate assumptions that half/quarter rate channels on exist in 11a o handle frequency mapping between hal and net80211; hal gives us freq's in the range 2422..2437 that we remap MFC after: 1 month
# 165571	27-Dec-2006	sam	Add half/quarter rate 11a channel support: o change handling of regdomain-related mib knobs so they can be set post-attach: regdomain, countrycode, outdoor, and xchanmode; the hal will not permit changing the regdomain but we expose it for now o on regdomain/countrycode change recalculate the channel list and push it to the net80211 layer (NB: looks to need more tweaking) o setup rate tables for half/quarter rate channels o honor half/quarter rate channel configs when changing channels o honor half/quarter rate channel configs when setting the slot time o use hack/nonstandard channel numbering scheme for the public safety band to avoid overlapping 2.4G channels on dual-band cards o remove setup of ic_sup_rates; the net80211 layer can do this for us and it simplifies handling of half/quarter rate channels Tested only in Public Safety Band with cards that have RF5112.
# 165185	13-Dec-2006	sam	Track v0.9.20.3 hal: o no more ds_vdata in tx/rx descriptors o split h/w tx/rx descriptor from s/w status o as part of the descriptor split change the rate control module api so the ath_buf is passed in to the module so it can fetch both descriptor and status information as needed o add some const poisoning Also for sample rate control algorithm: o split debug msgs (node, rate, any) o uniformly bounds check rate indices (and in some cases correct checks) o move array index ops to after bounds checking o use final tsi from the status block instead of the h/w descriptor o replace h/w descriptor struct's with proper mask+shift defs (this doesn't belong here; everything is known by the driver and should just be sent down so there's no h/w-specific knowledge) MFC after: 1 month
# 163187	09-Oct-2006	sam	correct diag request to fetch isr state on fatal interrupts MFC after: 1 week
# 162410	18-Sep-2006	sam	Add support for newer parts that do not require separate keycache entries for tx+rx mic keys. This requires a newer hal, but works fine with the current hal in cvs. MFC after: 2 weeks
# 162409	18-Sep-2006	sam	remove stub radar support; it's never been used and future hal's will not include the calls (due to redesign) MFC after: 1 week
# 161425	17-Aug-2006	imp	while (0); -> while (0) in multi-line macros
# 159938	26-Jun-2006	sam	Close race in handling mcast traffic when operating as an ap with stations in power save: add a new q where mcast frames are stashed and on beacon update (at DTIM) move frames from the mcast q to the cabq and start it. This ensures the cabq is only manipulated in one place. Sponsored by: Hobnob MFC after: 2 weeks
# 159290	05-Jun-2006	sam	move hal bus+tag externalization to the bus glue code where it belongs; this is a noop on all current freebsd architectures MFC after: 1 month
# 158298	05-May-2006	sam	correct type MFC after: 2 weeks
# 156073	27-Feb-2006	sam	backout 1.136 until we can resolve report that it causes output to stall
# 155991	24-Feb-2006	sam	fix a race whereby a tx descriptor might get reused before the hardware is finished with it; this may only occur when the tx queue is setup as dba-gated but since the fix is cheap apply it to all queues while here make the queue depth signed for use in assertions Reviewed by: apatti MFC after: 2 weeks
# 155732	15-Feb-2006	sam	o handle fatal errors directly instead of via the task queue o temporarily dump some h/w state for diagnosis; this will be removed once some issues are resolved MFC after: 2 weeks
# 155515	10-Feb-2006	sam	Update for rev 0.9.16.16 hal: o add dfs+radar hooks; DFS is presently disabled in the hal o channel and mode handling changes o various api changes o be more aggressive about iq calibration settling so ap mode operation is better immediately after startup o rfkill/rfsilent sysctl support o tpc ack/cts sysctl support MFC after: 2 weeks
# 155496	09-Feb-2006	sam	Beacon timer setup fixes: o pull nexttbtt forward in adhoc mode too o resync beacon timers on joining a bss or ibss as the tstamp we collected while scanning is almost certainly out of date Note we may need to refine the ibss mode check in ath_recv_mgmt. Reviewed by: avatar, dyoung Obtained from: atheros MFC after: 2 weeks
# 155492	09-Feb-2006	sam	Phantom beacon miss workaround: track the tsf of the last received frame and if we get a beacon miss interrupt ignore it if we've received a frame within the beacon miss interval. This should never trigger and the handling at the net80211 layer should likewise deal with this but it doesn't hurt and can suppress extranous probe request frames. Note that we can legtimately get a bmiss when under heavy load. MFC after: 2 weeks
# 155491	09-Feb-2006	sam	use a private task queue thread MFC after: 2 weeks
# 155490	09-Feb-2006	sam	add adhoc demo mode support MFC after: 2 weeks
# 155489	09-Feb-2006	sam	make regdomain sysctl r/w in case it's possible to do this in the future MFC after: 2 weeks
# 155486	09-Feb-2006	sam	add tx99 hooks MFC after: 2 weeks
# 155485	09-Feb-2006	sam	move hal statistics to softc; the per-node stats are overkill, they're only used when operating in station mode MFC after: 2 weeks
# 155483	09-Feb-2006	sam	honor net80211 mcast tx rate MFC after: 2 weeks
# 155482	09-Feb-2006	sam	craft unique names for tx q + buffer mtx's to help with interpreting ktr data MFC after: 2 weeks
# 155481	09-Feb-2006	sam	allow the size of tx+rx buffer pools to be tuned MFC after: 2 weeks
# 155480	09-Feb-2006	sam	lower try count on mgt (and ctl) frames to avoid clogging the tx queue and loading the bss when operating in ap mode under load; adjust recognition of multi-rate retry to match MFC after: 2 weeks
# 155477	09-Feb-2006	sam	move mgt frame tx rate responsibility from the rate control modules to the driver; this avoids redundant logic and will be necessary for future additions MFC after: 2 weeks
# 154140	09-Jan-2006	sam	Update monitoring support: o record tsf in tx+rx frames o switch from raw rssi to dbm for signal data and record both signal and noise floor data (hacked for now to assume a fixed noise floor; is correct with new hal) o add monpass sysctl to control which rx'd frames are passed up with errors; especially useful to see frames with CRC errors o mark 'd packets w/ a CRC error with radiotap's BADFCS flag Also add placeholder code for calibrating the noise floor when using newer hals. Reviewed by: avatar MFC after: 1 week
# 152448	15-Nov-2005	sam	nuke special handling to extend cts when bursting; it was race prone MFC after: 7 days
# 148863	08-Aug-2005	sam	Split crypto tx+rx key indices and add a key index -> node mapping table: Crypto changes: o change driver/net80211 key_alloc api to return tx+rx key indices; a driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set it to be the same as the tx key index (the former disables use of the key index in building the keyix->node mapping table and is the default setup for naive drivers by null_key_alloc) o add cs_max_keyid to crypto state to specify the max h/w key index a driver will return; this is used to allocate the key index mapping table and to bounds check table loookups o while here introduce ieee80211_keyix (finally) for the type of a h/w key index o change crypto notifiers for rx failures to pass the rx key index up as appropriate (michael failure, replay, etc.) Node table changes: o optionally allocate a h/w key index to node mapping table for the station table using the max key index setting supplied by drivers (note the scan table does not get a map) o defer node table allocation to lateattach so the driver has a chance to set the max key id to size the key index map o while here also defer the aid bitmap allocation o add new ieee80211_find_rxnode_withkey api to find a sta/node entry on frame receive with an optional h/w key index to use in checking mapping table; also updates the map if it does a hash lookup and the found node has a rx key index set in the unicast key; note this work is separated from the old ieee80211_find_rxnode call so drivers do not need to be aware of the new mechanism o move some node table manipulation under the node table lock to close a race on node delete o add ieee80211_node_delucastkey to do the dirty work of deleting unicast key state for a node (deletes any key and handles key map references) Ath driver: o nuke private sc_keyixmap mechansim in favor of net80211 support o update key alloc api These changes close several race conditions for the ath driver operating in ap mode. Other drivers should see no change. Station mode operation for ath no longer uses the key index map but performance tests show no noticeable change and this will be fixed when the scan table is eliminated with the new scanning support. Tested by: Michal Mertl, avatar, others Reviewed by: avatar, others MFC after: 2 weeks
# 148362	24-Jul-2005	sam	o fix setup of sc_diversity; the hal does not give us reliable status after attach, only after a reset o when setting diversity via the sysctl don't update sc_diversity until we know the hal requested worked o while here eliminate sc_hasdiversity and sc_hastpc; just query the hal each time since these are the only places we need to know MFC after: 3 days
# 147803	06-Jul-2005	sam	only invoke ath_rate_tx_complete to update rate control state when the frame being sent is to be ack'd and hasn't been filtered by the h/w; this insures we don't pass in tx descriptors that have no meaningful state (e.g. mcast/bcast frames are not acked and so have no tx retry counts) Approved by: re (scottl) Obtained from: Atheros
# 147256	10-Jun-2005	brooks	Stop embedding struct ifnet at the top of driver softcs. Instead the struct ifnet or the layer 2 common structure it was embedded in have been replaced with a struct ifnet pointer to be filled by a call to the new function, if_alloc(). The layer 2 common structure is also allocated via if_alloc() based on the interface type. It is hung off the new struct ifnet member, if_l2com. This change removes the size of these structures from the kernel ABI and will allow us to better manage them as interfaces come and go. Other changes of note: - Struct arpcom is no longer referenced in normal interface code. Instead the Ethernet address is accessed via the IFP2ENADDR() macro. To enforce this ac_enaddr has been renamed to _ac_enaddr. - The second argument to ether_ifattach is now always the mac address from driver private storage rather than sometimes being ac_enaddr. Reviewed by: sobomax, sam
# 147067	06-Jun-2005	sam	Set the correct IFS parameters for the beacon tx queue when operating in ap and adhoc modes.
# 147057	06-Jun-2005	sam	Misc keycache changes: o purge ath_initkeytable; it's not needed o add multicast key search support for supporting multiple group keys (disabled for now; requires updated hal) o create keycache entry for stations using open auth so they get h/w antenna management support o add keycache -> node mapping table; eliminates mac-based lookup in the net80211 layer
# 144961	12-Apr-2005	sam	honor new IEEE80211_KEY_GROUP key flag Reviewed by: Tai-hwa Liang
# 144617	04-Apr-2005	sam	use frame type returned by ieee80211_input to drive softled code instead of monitoring the input packet count
# 144346	30-Mar-2005	sam	o extend cts to cover packet burst when operating in 11g w/ protection o check current channel parameters, not shadow state, for acm policy on data frames
# 140761	24-Jan-2005	sam	Fixup radiotap handling of FCS and QoS frames per discussion with David Young: o mark rx frames including FCS in the payload with the IEEE80211_RADIOTAP_F_FCS flag o remove hack to copy 802.11 headers with padding out of line; instead mark the frames with IEEE80211_RADIOTAP_F_DATAPAD and require applications to do the work o split precalculated radiotap flags into tx+rx now that they can be different Note the full usefulness of these changes depends on updates to applications that process radiotap data.
# 140438	18-Jan-2005	sam	adjust tx buffer allocation based on empirical testing: o increase the max per-frame tx descriptor count and the number of tx buffers for forthcoming fast frame support o correct the max scatter/gather count; it cannot be larger than the max(tx,rx,beacon) descriptor counts
# 140432	18-Jan-2005	sam	better led blinking
# 139530	31-Dec-2004	sam	bump copyright for 2005
# 139500	31-Dec-2004	sam	Radiotap fixups: o catch one place where we were not using ath_chan_change to switch channels; this fixes a problem where the channel settings were not being correctly reported in captured packets o return unique channel identification in the channel flags; ethereal gets confused if you return merged flags (e.g. ofdm, cck, and 2Ghz) (this is workaround and should be removed if we can ever cleanup radiotap consumers) o correct short/long preamble flag state for rx and treat tx the same--use a new hwflags array that gives us the data based on the h/w rate index/cookie o add gross hack to handle radiotap capture of frames that come in with hardware padding; should be replaced by a flag in the radiotap header and more smarts in the apps that decode radiotap data
# 138570	08-Dec-2004	sam	Update with last year of work.
# 127784	03-Apr-2004	sam	do proper subclassing of node free+copy; the previous hack falls apart when the 802.11 layer does useful work Obtained from: madwifi
# 127781	02-Apr-2004	sam	transmit beacon frames directly instead of defering them to a swi; there was too much delay Obtained from: madwifi
# 127780	02-Apr-2004	sam	update copyright notice for 2004
# 127698	31-Mar-2004	sam	radiotap updates: o force little-endian byte order for header o pad header to 32-bit boundary to guard against applications that assume packet data alignment
# 123044	28-Nov-2003	sam	o track API change for HAL v0.9.6.1 o fix race condition when processing rx descriptors: because we use a self-linked descriptor at the end of the rx descriptor list to avoid rx overruns (which can easily happen for 5212 parts that enable PHY errors) we must carefully check that a descriptor is "done" by looking ahead to the next descriptor before believing the done bit in the current descriptor (this is all handled in the HAL since the rx descriptor format is chip-specific so we need to pass in two additional parameters--the physical address of the current descriptor and the virtual address of the next descriptor in the list) o check copyout return status for SIOCGATHSTATS ioctl Approved by: re (scottl)
# 121100	14-Oct-2003	sam	o convert mutex calls to #defines for portability, etc. o destroy mutex's on detach (was missing)
# 120105	15-Sep-2003	sam	Maintain a history of data associated with received frames and use this to calculate smoothed signal quality data for each node. o add a 16-deep history buffer to each driver-private node storage that holds rssi and antenna info for received frames o override the default per-node "get rssi" method to return an average rssi value based on samples collected over the last second o enable beacon reception so even idle systems maintain a running history of signal quality This data may also be useful for improving the rate control algorithm. Based on work by Tom Marshall <tommy@home.tig-grr.com> for MADWIFI.
# 120071	14-Sep-2003	sam	o mark the device capable of short preamble (meaningless for the 5210 but safe since the 802.11 layer does the right thing for 11a operation) o select short preamble operation based on the negotiated capabilities; not just the local state/capability o fillin the duration field in the 802.11 header as appropriate o remove detection of 11g support; no longer needed Obtained from: MADWIFI (with modifications)
# 119783	05-Sep-2003	sam	Add support for the experimental radiotap capture format. With this we no longer need the debugging code to dump packets.
# 119150	19-Aug-2003	sam	MFp4 changes to fix locking issues and correct reference count handling of station entries in hostap mode: Input path: o driver is now expected to find the node associated with the sender of a received frame; use ic_bss if none is located o driver passes the (referenced) node into ieee80211_input for use within the wlan module and is responsible for cleaning up on return o the antenna state is no longer passed up with each frame; this is now considered driver-private state and drivers are responsible for keeping it in the driver-private part of a node Output path: Revamp output path for management frames to eliminate redundant locking that causes problems and to correct reference counting bogosity that occurs when stations are timed out due to inactivity (in AP mode). On output the refcnt'd node is stashed in the pkthdr's recvif field (yech) and retrieved by the driver. This eliminates an unref/ref scenario and related node table unlock/lock due to the driver looking up the node. This is particularly important when stations are timed out as this causes a lock order reversal that can result in a deadlock. As a byproduct we also reduce the overhead for sending management frames (minimal). Additional fallout from this is a change to ieee80211_encap to return a refcn't node for tieing to the outbound frame. Node refcnts are not reclaimed until after a frame is completely processed (e.g. in the tx interrupt handler). This is especially important for timed out stations as this deref will be the final one causing the node entry to be reclaimed. Additional semi-related changes: o replace m_copym use with m_copypacket (optimization) o add assert to verify ic_bss is never free'd during normal operation o add comments explaining calling conventions by drivers for frames going in each direction o remove extraneous code that "cannot be executed" (e.g. because pointers may never be null)
# 119144	19-Aug-2003	sam	maintain a table for mapping hardware rate codes to 802.11 rates for calculating the rate for each rx'd frame
# 117812	20-Jul-2003	sam	track changes to 802.11 code: o override new_state method per new model o use ieee80211_state_name instead of private copy
# 117516	13-Jul-2003	sam	o add read-only sysctls to view regulatory domain, country code, and outdoor use controls o use sysctl-visible values in setting up channel list
# 116743	23-Jun-2003	sam	Atheros 802.11 driver. Requires Atheros Hardware Access Lay (HAL). Supported by: Atheros Comunications