History log of /freebsd-9.3-release/sys/dev/cxgbe/tom/t4_tom.c
Revision Date Author Comments
(<<< Hide modified files)
(Show modified files >>>)
# 267654 19-Jun-2014 gjb

Copy stable/9 to releng/9.3 as part of the 9.3-RELEASE cycle.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation

# 265482 07-May-2014 np

MFC r255198, r255410, and r255411.

r255198:
For TOE connections, the window scale factor in CPL_PASS_ACCEPT_REQ is
set to 15 to indicate that the peer did not send a window scale option
with its SYN. Do not send a window scale option in the SYN|ACK reply
in that case.

r255410:
Fix a miscalculation that caused cxgbe/tom to auto-increment
a TOE socket's tx buffer size too aggressively.

r255411:
Rework the tx credit mechanism between the cxgbe/tom driver
and the card. This helps smooth out some burstiness in the
exchange.


# 252814 05-Jul-2013 np

- MFC r252661, r252705, r252711, r252715, r252716, r252724, r252728,
r252747.
- Connect t3_tom and t4_tom to the build (r252555 enables them).

r252661:
- Include the T5 firmware with the driver.
- Update the T4 firmware to the latest.
- Minor reorganization and updates to the version macros, etc.

r252705:
- Read all TP parameters in one place.
- Read the filter mode, calculate various shifts, and use them
properly during active open (in select_ntuple).

r252711:
The T5 allows the driver to specify the ISS. Do so; use the ISS picked
by the kernel.

r252715:
Ring the egress queue's doorbell as soon as there are 8 or more
descriptors ready to be processed.

r252716:
Pay attention to TCP_NODELAY when it's set/unset after the connection
is established.

r252724:
On-the-fly changes to the interrupt coalescing timer should apply to the
TOE rx queues too.

r252728:
- Make note of interface MTU change if the rx queues exist, and not just
when the interface is up.
- Add a tunable to control the TOE's rx coalesce feature (enabled by
default as it always has been). Consider the interface MTU or the
coalesce size when deciding which cluster zone to use to fill the
offload rx queue's free list. The tunable is:
dev.{t4nex,t5nex}.<N>.toe.rx_coalesce

r252747:
- Show the reason why link is down if this information is available.
- Display the temperature and PHY firmware version of the BT PHY.


# 252495 02-Jul-2013 np

MFC all cxgbe(4) changes missing from stable/9:
r248925, r249368, r249370, r249376, r249382, r249383, r249385, r249391,
r249392, r249393, r249627, r249629, r250090, r250092, r250093, r250117,
r250218, r250221, r250614, r251213, r251317, r251358, r251434, r251518,
r251638, r252312, r252469, r252470, r250697(kib).

r248925:
Support for Chelsio's 40G Terminator 5 (aka T5) ASIC.
...

r249368:
Set and display the IP fragment bit correctly when dealing with
the filter mode.

r249370:
cxgbe(4): Ensure that the MOD_LOAD handler runs before either t4nex or
t5nex attach to their devices.

r249376:
- Explain clearly why a different firmware is being installed (if/when
it is being installed). Improve other error messages while here.

- Select special FPGA specific configuration profile when appropriate.

r249382:
There is no need for elaborate queries and error checking when trying to
set FW4MSG_ENCAP.

r249383:
Get rid of a couple of stray \n's.

r249385:
cxgbe/tom: Slight simplification of code that calculates options2.

r249391:
Auto-reduce the holdoff timers that are greater than the maximum value
allowed by the hardware.

r249392:
Cosmetic change (s/wrwc/wcwr/;s/WRWC/WCWR/).

r249393:
Add pciids of the T5 based cards. The ones that I haven't tested with
cxgbe(4) are disabled for now. This will change.

r249627:
cxgbe/tom: Update the CLIP table on the chip when there are changes
to the list of IPv6 addresses on the system. The table is used for
TOE+IPv6 only.

r249629:
cxgbe(4): Refuse to install T5 firmwares on a T4 card (and vice versa).

r250090:
cxgbe(4): Some updates to shared code.

r250092:
- Provide accurate ifmedia information so that 40G ports/transceivers are
displayed properly in ifconfig, etc.

- Use the same number of tx and rx queues for a 40G port as for a 10G port.

r250093:
Attach to the T580 (2 x 40G) card.

r250117:
Fix DDP breakage introduced in r248925. Bitwise OR has higher
precedence than ternary conditional.

r250218:
cxgbe/tom: Do not use M_PROTO1 to mark rx zero-copy mbufs as special.
All the M_PROTOn flags are clobbered when an mbuf is appended to the
socket buffer.

r250221:
cxgbe: Switch to a better way to install firmware.

r250614:
Deal correctly with 40G ports that don't have any transceiver plugged
in. Do not claim that they have unknown tranceivers.

r251213:
cxgbe(4): Some more debug sysctls. These work on both T4 and T5 based
cards.

r251317:
cxgbe(4): t4fw_cfg must be explicitly loaded if the driver is being
loaded via loader.conf.

r251358:
cxgbe(4): Provide accurate hit count for filters on T5 cards. The
location within the TCB and the size have both changed.

r251434:
cxgbe(4): Never install a firmware if hw.cxgbe.fw_install is 0.

r251518:
cxgbe/tom: Fix bad signed/unsigned mixup in the stid allocator. This
fixes a panic when allocating a mixture of IPv6 and IPv4 stids.

r251638:
cxgbe/tom: Allow caller to select the queue (control or data) used to
send the CPL_SET_TCB_FIELD request in t4_set_tcb_field().

r252312:
Update T5 register ranges. This is so that regdump skips over registers
with read side-effects.

r252469:
Add a sysctl to get the number of filters available.

sysctl dev.t4nex.<N>.nfilters
sysctl dev.t5nex.<N>.nfilters

r252470:
Count the number of hits for a filter by default.

r250697:
Add dependencies on the firmware, which allows the loading of the cxgb
and cxgbe modules.


# 247434 27-Feb-2013 np

MFC r245243, r245274, r245276, r245434, r245441, r245448, r245467,
r245468, r245517, r245518, r245520, r245567, r245933, r245935, r245936,
r245937, r246093, r246385, r246575, r247062, r247122, r247289, r247291,
r247347, r247355, and r241733.

Note that TCP_OFFLOAD is not enabled in 9 yet and so some of these MFCs
don't really affect functionality. But they do help future MFCs
(related to TCP_OFFLOAD or not) by minimizing diffs with the driver in
head.

r245243:
cxgbe(4): updates to the configuration file that controls how hardware
resources are partitioned.

- Reduce the number of virtual interfaces reserved for PF4. This leaves
spare room in the source MAC table and allows the driver to setup
filters that rewrite the source MAC address.

- Reduce the number of filters and use the freed up space for the CLIP
(Compressed Local IPv6 addresses) table. This is a prerequisite for
IPv6 TOE support which will follow separately in a series of commits.

r245274:
cxgbe(4): Add functions to help synchronize "slow" operations (those not
on the fast data path) and use them instead of frobbing the adapter lock
and busy flag directly.

Other changes made while reworking all slow operations:
- Wait for the reply to a filter request (add/delete). This guarantees
that the operation is complete by the time the ioctl returns.
- Tidy up the tid_info structure.
- Do not allow the tx queue size to be set to something that's not a
power of 2.

r245276:
Overhaul the stid allocator so that it can be used for IPv6 servers
too. The entry for an IPv6 server in the TCAM takes up the equivalent
of two ordinary stids and must be properly aligned too.

r245434:
cxgbe(4): Updates to the hardware L2 table management code.

- Add full support for IPv6 addresses.

- Read the size of the L2 table during attach. Do not assume that PCIe
physical function 4 of the card has all of the table to itself.

- Use FNV instead of Jenkins to hash L3 addresses and drop the private
copy of jhash.h from the driver.

r245441:
cxgbe/tom: Miscellaneous updates for TOE+IPv6 support (more to follow).

- Teach find_best_mtu_idx() to deal with IPv6 endpoints.

- Install correct protosw in offloaded TCP/IPv6 sockets when DDP is
enabled.

- Move set_tcp_ddp_ulp_mode to t4_tom.c so that t4_tom.h can be included
without having to drag in t4_msg.h too. This was bothering the iWARP
driver for some reason.

r245448:
cxgbe/tom: Basic CLIP table management.

This is the Compressed Local IPv6 table on the chip. To save space, the
chip uses an index into this table instead of a full IPv6 address in
some of its hardware data structures.

For now the driver fills this table with all the local IPv6 addresses
that it sees at the time the table is initialized. I'll improve this
later so that the table is updated whenever new IPv6 addresses are
configured or existing ones deleted.

r245467:
cxgbe/tom: Add support for fully offloaded TCP/IPv6 connections (active open).

r245468:
cxgbe/tom: Add support for fully offloaded TCP/IPv6 connections (passive open).

r245517:
cxgbe: Fix the for_each_foo macros -- the last argument should not share
its name with any member of struct sge.

r245518:
cxgbe: Do a more thorough job in the CLEAR_STATS ioctl.

r245520:
Allow "ivlan" (inner VLAN) to be used as an alias for "vlan" when
specifying match criteria. "vlan" continues to be valid here, and it
continues to be valid when deleting, rewriting, inserting, or stacking
an 802.1q tag to a matching packet.

r245567:
cxgbe: Make the for_each macros safer to use by turning them
into a single statement each.

r245933:
cxgbe/tom: List IFCAP_TOE6 as supported now that all the required pieces
are in place. You still have to enable it explicitly, after loading the
t4_tom KLD.

r245935:
Add a couple of missing error codes. Treat CPL_ERR_KEEPALV_NEG_ADVICE as
negative advice and not a fatal error.

r245936:
Force the 404-BT card (4 x 1G) to use the "uwire" configuration file.

r245937:
Install an extra hold on the newly allocated synq entry so that it
cannot be freed while do_pass_accept_req is running. This closes a race
where do_pass_establish on another CPU (the driver chose a different
queue for the new tid) expands the synq entry into a full PCB and then
releases the only hold on it, all while do_pass_accept_req is still
running.

r246093:
Provide a statistic to track the number of drops in each of the port's
txq's buf_ring. The aggregate for all the queues of a port is already
provided in ifnet->if_snd.ifq_drops.

r246385:
Busy-wait when cold.

r246575:
Do not hold locks around hardware context reads.

r247062:
cxgbe(4): Assume that CSUM_TSO in the transmit path implies CSUM_IP and
CSUM_TCP too. They are all set explicitly by the kernel usually.

r247122:
cxgbe(4): Add sysctls to extract debug information from the chip:

dev.t4nex.X.misc.cim_la logic analyzer dump
dev.t4nex.X.misc.cim_qcfg queue configuration
dev.t4nex.X.misc.cim_ibq_xxx inbound queues
dev.t4nex.X.misc.cim_obq_xxx outbound queues

r247289:
cxgbe(4): Update firmware to 1.8.4.0.

r247291:
cxgbe(4): Ask the card's firmware to pad up tiny CPLs by encapsulating
them in a firmware message if it is able to do so. This works out
better for one of the FIFOs in the chip.

r247347:
cxgbe(4): Consider all the API versions of the interfaces exported by
the firmware (instead of just the main firmware version) when evaluating
firmware compatibility. Document the new "hw.cxgbe.fw_install" knob
being introduced here.

This should fix kern/173584 too. Setting hw.cxgbe.fw_install=2 will
mostly do what was requested in the PR but it's a bit more intelligent
in that it won't reinstall the same firmware repeatedly if the knob is
left set.

r247355:
cxgbe(4): Report unusual out of band errors from the firmware.

r241733 (by ed@):
Prefer __containerof() over __member2struct().

The former works better with qualifiers, but also properly type checks
the input pointer.


# 240169 06-Sep-2012 np

MFC many cxgb and cxgbe features and fixes (r239258, r239259, r239264,
r239266, r239336, r239338, r239339, r239341, r239344, r239514, r239527,
r239528, r239544.

r239258:
Convert some fixed parameters to tunables (with reasonable default
values).

- cong_drop specifies what to do on congestion: nothing, backpressure,
or drop.
- fl_pktshift specifies the padding before Ethernet payload.
- fl_pad specifies the boundary upto which to pad Ethernet payload.
- spg_len controls the length of the status page.

r239259:
if_iqdrops should include frames truncated within the chip.

r239264:
Assume INET, INET6, and TCP_OFFLOAD when the driver is built out of tree and
KERNBUILDDIR is not set.

r239266:
The size of the buffers in an Ethernet freelist has to be higher than the
interface's MTU. Initialize such freelists with correct values.

This wasn't a problem for common MTUs (1500 and 9000) as the buffers (2048
and 9216 in size) happened to have enough spare room. I ran into it when
playing around with unusual MTUs.

r239336:
Allow for a different handler for each type of firmware message.

r239338:
Add a routine (t4_set_tcb_field) to update arbitrary parts of a hardware
TCB. Filters are programmed by modifying the TCB too (via a different
routine) and the reply to any TCB update is delivered via a
CPL_SET_TCB_RPL. Figure out whether the reply is for a filter-write or
something else and route it appropriately.

r239339:
Make room for DDP page pods in the default configuration profile. While
here, bump up the L2 table's size to 4K entries.

r239341:
Initialize various DDP parameters in the main cxgbe(4) driver:

- Setup multiple DDP page sizes. When the driver attempts DDP it will
try to combine physically contiguous pages into regions of these sizes.

- Set the indicate size such that the payload carried in the indicate can
be copied in the header mbuf (and the 16K rx buffer can be recycled).

- Set DDP threshold to the max payload that the chip will coalesce and
deliver to the driver (this is ~16K by default, which is also why the
offload rx queue is backed by 16K buffers). If the chip is able to
coalesce up to the max it's allowed to, it's a good sign that the peer
is transmitting in bulk without any TCP PSH.

r239344:
Support for TCP DDP (Direct Data Placement) in the T4 TOE module.

Basically, this is automatic rx zero copy when feasible. TCP payload is
DMA'd directly into the userspace buffer described by the uio submitted
in soreceive by an application.

- Works with sockets that are being handled by the TCP offload engine
of a T4 chip (you need t4_tom.ko module loaded after cxgbe, and an
"ifconfig +toe" on the cxgbe interface).
- Does not require any modification to the application.
- Not enabled by default. Use hw.t4nex.<X>.toe.ddp="1" to enable it.

r239514:
Minor cleanup: use bitwise ops instead of pointless wrappers around
setbit/clrbit.

r239527:
Cannot hold a mutex around vm_fault_quick_hold_pages, so don't. Tweak
some comments while here.

r239528:
Avoid a NULL pointer dereference.

r239544:
Deal with the case where a syncache entry added by the TOE driver is
evicted from the syncache but a later syncache_expand succeeds because
of syncookies. The TOE driver has to resort to more direct means to
install its hooks in the socket in this case.


# 237920 01-Jul-2012 np

Backport just the sys/{dev,modules}/cxgb{,e}/ parts of r237263, and then
disable the TOE and iWARP modules in the Makefiles (they won't compile
without the rest of r237263).

This reduces diffs between the cxgb/cxgbe drivers in head and 9 and
makes it easy to MFC other fixes to 9.


# 237263 19-Jun-2012 np

- Updated TOE support in the kernel.

- Stateful TCP offload drivers for Terminator 3 and 4 (T3 and T4) ASICs.
These are available as t3_tom and t4_tom modules that augment cxgb(4)
and cxgbe(4) respectively. The cxgb/cxgbe drivers continue to work as
usual with or without these extra features.

- iWARP driver for Terminator 3 ASIC (kernel verbs). T4 iWARP in the
works and will follow soon.

Build-tested with make universe.

30s overview
============
What interfaces support TCP offload? Look for TOE4 and/or TOE6 in the
capabilities of an interface:
# ifconfig -m | grep TOE

Enable/disable TCP offload on an interface (just like any other ifnet
capability):
# ifconfig cxgbe0 toe
# ifconfig cxgbe0 -toe

Which connections are offloaded? Look for toe4 and/or toe6 in the
output of netstat and sockstat:
# netstat -np tcp | grep toe
# sockstat -46c | grep toe

Reviewed by: bz, gnn
Sponsored by: Chelsio communications.
MFC after: ~3 months (after 9.1, and after ensuring MFC is feasible)