History log of /freebsd-9.3-release/sys/dev/cxgb/
Revision Date Author Comments
267654 20-Jun-2014 gjb

Copy stable/9 to releng/9.3 as part of the 9.3-RELEASE cycle.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation


259993 28-Dec-2013 dim

MFC r259897:

In sys/dev/cxgb/ulp/tom/cxgb_cpl_io.c, remove static functions
mk_cpl_barrier_ulp(), mk_get_tcb_ulp() and mk_set_tcb_field_ulp(), which
are all unused since r237263.


259992 28-Dec-2013 dim

MFC r259896:

In sys/dev/cxgb/common/cxgb_mc5.c, remove static function
dbgi_wr_addr3(), which is unused since r167514.


252555 03-Jul-2013 np

MFC/backport core kernel and userspace parts of r237263 (TCP_OFFLOAD
rework). MFC r237563, r239511, r243603, r245915, r245916, r245919,
r245921, r245922, r245924, r245925, r245932, r245934 too.

Build tested with make universe.

r237263:
- Updated TOE support in the kernel.
...

r237563:
Fix clang warning when compiling iw_cxgb.

r239511:
Correctly handle the case where an inp has already been dropped by the time
the TOE driver reports that an active open failed. toe_connect_failed is
supposed to handle this but it should be provided the inpcb instead of the
tcpcb which may no longer be around.

r243603:
Make sure that tcp_timer_activate() correctly sees TCP_OFFLOAD (or not).

r245915:
Heed SO_NO_OFFLOAD.

r245916:
Teach toe_4tuple_check() to deal with IPv6 4-tuples too.

r245919:
Add TCP_OFFLOAD hook in syncache_respond for IPv6 too, just like the one
that exists for IPv4.

r245921:
There is no need to call into the TOE driver twice in pru_rcvd (tod_rcvd
and then tod_output right after that).

r245922:
Avoid NULL dereference in nd6_storelladdr when no mbuf is provided. It
is called this way from a couple of places in the OFED code. (toecore
calls it too but that's going to change shortly).

r245924:
Move lle_event to if_llatbl.h

lle_event replaced arp_update_event after the ARP rewrite and ended up
in if_ether.h simply because arp_update_event used to be there too.
IPv6 neighbor discovery is going to grow lle_event support and this is a
good time to move it to if_llatbl.h.

The two in-tree consumers of this event - OFED and toecore - are not
affected.

r245925:
Generate lle_event in the IPv6 neighbor discovery code too.

r245932:
Teach toe_l2_resolve to resolve IPv6 destinations too.

r245934:
Add checks for SO_NO_OFFLOAD in a couple of places that I missed earlier
in r245915.


252495 02-Jul-2013 np

MFC all cxgbe(4) changes missing from stable/9:
r248925, r249368, r249370, r249376, r249382, r249383, r249385, r249391,
r249392, r249393, r249627, r249629, r250090, r250092, r250093, r250117,
r250218, r250221, r250614, r251213, r251317, r251358, r251434, r251518,
r251638, r252312, r252469, r252470, r250697(kib).

r248925:
Support for Chelsio's 40G Terminator 5 (aka T5) ASIC.
...

r249368:
Set and display the IP fragment bit correctly when dealing with
the filter mode.

r249370:
cxgbe(4): Ensure that the MOD_LOAD handler runs before either t4nex or
t5nex attach to their devices.

r249376:
- Explain clearly why a different firmware is being installed (if/when
it is being installed). Improve other error messages while here.

- Select special FPGA specific configuration profile when appropriate.

r249382:
There is no need for elaborate queries and error checking when trying to
set FW4MSG_ENCAP.

r249383:
Get rid of a couple of stray \n's.

r249385:
cxgbe/tom: Slight simplification of code that calculates options2.

r249391:
Auto-reduce the holdoff timers that are greater than the maximum value
allowed by the hardware.

r249392:
Cosmetic change (s/wrwc/wcwr/;s/WRWC/WCWR/).

r249393:
Add pciids of the T5 based cards. The ones that I haven't tested with
cxgbe(4) are disabled for now. This will change.

r249627:
cxgbe/tom: Update the CLIP table on the chip when there are changes
to the list of IPv6 addresses on the system. The table is used for
TOE+IPv6 only.

r249629:
cxgbe(4): Refuse to install T5 firmwares on a T4 card (and vice versa).

r250090:
cxgbe(4): Some updates to shared code.

r250092:
- Provide accurate ifmedia information so that 40G ports/transceivers are
displayed properly in ifconfig, etc.

- Use the same number of tx and rx queues for a 40G port as for a 10G port.

r250093:
Attach to the T580 (2 x 40G) card.

r250117:
Fix DDP breakage introduced in r248925. Bitwise OR has higher
precedence than ternary conditional.
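
The precedence pitfall named in r250117 can be reproduced in a few lines. This is a minimal sketch, not the driver's DDP code: the flag names and values are hypothetical, and the first function spells out with explicit parentheses how the compiler groups an unparenthesized `F_BASE | use_extra ? F_EXTRA : 0`.

```c
#include <stdint.h>

/* Hypothetical flag bits, for illustration only. */
#define F_BASE   0x1u
#define F_EXTRA  0x4u

/*
 * How "F_BASE | use_extra ? F_EXTRA : 0" is actually grouped: the OR is
 * evaluated first and becomes the condition, so F_BASE never reaches the
 * result.
 */
static uint32_t
flags_as_parsed(int use_extra)
{
	return ((F_BASE | (uint32_t)use_extra) ? F_EXTRA : 0);
}

/* What was presumably intended: OR the conditional's result into F_BASE. */
static uint32_t
flags_intended(int use_extra)
{
	return (F_BASE | (use_extra ? F_EXTRA : 0));
}
```

Because F_BASE is nonzero, the first form yields F_EXTRA regardless of use_extra, which is the kind of silent breakage the commit message points at.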

r250218:
cxgbe/tom: Do not use M_PROTO1 to mark rx zero-copy mbufs as special.
All the M_PROTOn flags are clobbered when an mbuf is appended to the
socket buffer.

r250221:
cxgbe: Switch to a better way to install firmware.

r250614:
Deal correctly with 40G ports that don't have any transceiver plugged
in. Do not claim that they have unknown transceivers.

r251213:
cxgbe(4): Some more debug sysctls. These work on both T4 and T5 based
cards.

r251317:
cxgbe(4): t4fw_cfg must be explicitly loaded if the driver is being
loaded via loader.conf.

r251358:
cxgbe(4): Provide accurate hit count for filters on T5 cards. The
location within the TCB and the size have both changed.

r251434:
cxgbe(4): Never install a firmware if hw.cxgbe.fw_install is 0.

r251518:
cxgbe/tom: Fix bad signed/unsigned mixup in the stid allocator. This
fixes a panic when allocating a mixture of IPv6 and IPv4 stids.

r251638:
cxgbe/tom: Allow caller to select the queue (control or data) used to
send the CPL_SET_TCB_FIELD request in t4_set_tcb_field().

r252312:
Update T5 register ranges. This is so that regdump skips over registers
with read side-effects.

r252469:
Add a sysctl to get the number of filters available.

sysctl dev.t4nex.<N>.nfilters
sysctl dev.t5nex.<N>.nfilters

r252470:
Count the number of hits for a filter by default.

r250697:
Add dependencies on the firmware, which allows the loading of the cxgb
and cxgbe modules.


248078 09-Mar-2013 marius

MFC: r243857 (partial)

Mechanically substitute flags from historic mbuf allocator with
malloc(9) flags in sys/dev.


247434 28-Feb-2013 np

MFC r245243, r245274, r245276, r245434, r245441, r245448, r245467,
r245468, r245517, r245518, r245520, r245567, r245933, r245935, r245936,
r245937, r246093, r246385, r246575, r247062, r247122, r247289, r247291,
r247347, r247355, and r241733.

Note that TCP_OFFLOAD is not enabled in 9 yet and so some of these MFCs
don't really affect functionality. But they do help future MFCs
(related to TCP_OFFLOAD or not) by minimizing diffs with the driver in
head.

r245243:
cxgbe(4): updates to the configuration file that controls how hardware
resources are partitioned.

- Reduce the number of virtual interfaces reserved for PF4. This leaves
spare room in the source MAC table and allows the driver to setup
filters that rewrite the source MAC address.

- Reduce the number of filters and use the freed up space for the CLIP
(Compressed Local IPv6 addresses) table. This is a prerequisite for
IPv6 TOE support which will follow separately in a series of commits.

r245274:
cxgbe(4): Add functions to help synchronize "slow" operations (those not
on the fast data path) and use them instead of frobbing the adapter lock
and busy flag directly.

Other changes made while reworking all slow operations:
- Wait for the reply to a filter request (add/delete). This guarantees
that the operation is complete by the time the ioctl returns.
- Tidy up the tid_info structure.
- Do not allow the tx queue size to be set to something that's not a
power of 2.
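
The power-of-2 restriction on the tx queue size can be checked with a single bit trick. A minimal sketch follows; the helper name is illustrative and is not taken from the driver.

```c
#include <stdbool.h>

/*
 * True when n is a nonzero power of 2. Such values have exactly one bit
 * set, so clearing the lowest set bit with n & (n - 1) leaves zero.
 */
static bool
is_power_of_2(unsigned int n)
{
	return (n != 0 && (n & (n - 1)) == 0);
}
```

A request for a queue size of 1000 would be rejected by such a check, while 1024 would be accepted.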

r245276:
Overhaul the stid allocator so that it can be used for IPv6 servers
too. The entry for an IPv6 server in the TCAM takes up the equivalent
of two ordinary stids and must be properly aligned too.
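
The constraint described in r245276 can be modelled with a toy allocator. This is a sketch under stated assumptions, not the driver's allocator: the table size is hypothetical, a plain array stands in for the TCAM region, and "properly aligned" is assumed here to mean that a two-slot IPv6 entry starts at an even index.

```c
#include <stdbool.h>

#define NSTIDS 16	/* hypothetical table size */

/*
 * Allocate n consecutive slots (n = 1 for IPv4, n = 2 for IPv6) and mark
 * them used. The search advances in steps of n, so a two-slot entry can
 * only start at an even index. Returns the first index, or -1 if the
 * request cannot be satisfied.
 */
static int
stid_alloc(bool used[NSTIDS], int n)
{
	int i, j;

	if (n != 1 && n != 2)
		return (-1);
	for (i = 0; i + n <= NSTIDS; i += n) {
		for (j = 0; j < n && !used[i + j]; j++)
			;
		if (j == n) {
			for (j = 0; j < n; j++)
				used[i + j] = true;
			return (i);
		}
	}
	return (-1);
}
```

Mixing the two sizes shows why alignment matters: after one IPv4 allocation at index 0, the next IPv6 entry skips index 1 and lands on 2 and 3, leaving index 1 for a later IPv4 server.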

r245434:
cxgbe(4): Updates to the hardware L2 table management code.

- Add full support for IPv6 addresses.

- Read the size of the L2 table during attach. Do not assume that PCIe
physical function 4 of the card has all of the table to itself.

- Use FNV instead of Jenkins to hash L3 addresses and drop the private
copy of jhash.h from the driver.
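
For readers unfamiliar with FNV, a standalone 32-bit FNV-1 hash is sketched below. FreeBSD ships its own helpers in sys/fnv_hash.h; this version only illustrates the algorithm, and the l3_bucket() wrapper with its nbuckets parameter is a hypothetical example of turning the hash into a table index, not the driver's code.

```c
#include <stddef.h>
#include <stdint.h>

#define FNV1_32_INIT  0x811c9dc5u	/* standard 32-bit FNV offset basis */
#define FNV_32_PRIME  0x01000193u	/* standard 32-bit FNV prime */

/* FNV-1: multiply by the prime, then XOR in the next byte. */
static uint32_t
fnv1_32(const void *buf, size_t len, uint32_t hval)
{
	const uint8_t *p = buf;

	while (len-- != 0) {
		hval *= FNV_32_PRIME;
		hval ^= *p++;
	}
	return (hval);
}

/* Hypothetical: map an L3 address (4 or 16 bytes) to a hash bucket. */
static uint32_t
l3_bucket(const uint8_t *addr, size_t addrlen, uint32_t nbuckets)
{
	return (fnv1_32(addr, addrlen, FNV1_32_INIT) % nbuckets);
}
```

The same routine handles IPv4 and IPv6 addresses because it hashes an arbitrary byte string; only the length differs.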

r245441:
cxgbe/tom: Miscellaneous updates for TOE+IPv6 support (more to follow).

- Teach find_best_mtu_idx() to deal with IPv6 endpoints.

- Install correct protosw in offloaded TCP/IPv6 sockets when DDP is
enabled.

- Move set_tcp_ddp_ulp_mode to t4_tom.c so that t4_tom.h can be included
without having to drag in t4_msg.h too. This was bothering the iWARP
driver for some reason.

r245448:
cxgbe/tom: Basic CLIP table management.

This is the Compressed Local IPv6 table on the chip. To save space, the
chip uses an index into this table instead of a full IPv6 address in
some of its hardware data structures.

For now the driver fills this table with all the local IPv6 addresses
that it sees at the time the table is initialized. I'll improve this
later so that the table is updated whenever new IPv6 addresses are
configured or existing ones deleted.

r245467:
cxgbe/tom: Add support for fully offloaded TCP/IPv6 connections (active open).

r245468:
cxgbe/tom: Add support for fully offloaded TCP/IPv6 connections (passive open).

r245517:
cxgbe: Fix the for_each_foo macros -- the last argument should not share
its name with any member of struct sge.

r245518:
cxgbe: Do a more thorough job in the CLEAR_STATS ioctl.

r245520:
Allow "ivlan" (inner VLAN) to be used as an alias for "vlan" when
specifying match criteria. "vlan" continues to be valid here, and it
continues to be valid when deleting, rewriting, inserting, or stacking
an 802.1q tag to a matching packet.

r245567:
cxgbe: Make the for_each macros safer to use by turning them
into a single statement each.
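
Why a single-statement macro is safer can be shown with a miniature example. The log does not reproduce the driver's macros, so everything here is an assumption: the structures are hypothetical and the "older shape" in the comment is a guess at the kind of two-statement macro the commit replaced.

```c
/* Hypothetical miniature of a port and its queues; not the driver's types. */
struct queue {
	int id;
};
struct port {
	struct queue *queues;
	int nqueues;
};

/*
 * Assumed older shape (two statements, shown without line continuations):
 *	q = (p)->queues;
 *	for (i = 0; i < (p)->nqueues; i++, q++)
 * Placed after an unbraced "if", only the assignment would be conditional.
 *
 * Safer shape: one "for" statement that initializes both iterators.
 */
#define for_each_queue(p, i, q) \
	for ((q) = (p)->queues, (i) = 0; (i) < (p)->nqueues; (i)++, (q)++)

/* The macro now behaves correctly as the body of an unbraced "if". */
static int
sum_ids(struct port *p, int enabled)
{
	struct queue *q;
	int i, sum = 0;

	if (enabled)
		for_each_queue(p, i, q)
			sum += q->id;
	return (sum);
}
```

With the single-statement form, the whole loop is skipped when enabled is zero, which is what a reader of the calling code would expect.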

r245933:
cxgbe/tom: List IFCAP_TOE6 as supported now that all the required pieces
are in place. You still have to enable it explicitly, after loading the
t4_tom KLD.

r245935:
Add a couple of missing error codes. Treat CPL_ERR_KEEPALV_NEG_ADVICE as
negative advice and not a fatal error.

r245936:
Force the 404-BT card (4 x 1G) to use the "uwire" configuration file.

r245937:
Install an extra hold on the newly allocated synq entry so that it
cannot be freed while do_pass_accept_req is running. This closes a race
where do_pass_establish on another CPU (the driver chose a different
queue for the new tid) expands the synq entry into a full PCB and then
releases the only hold on it, all while do_pass_accept_req is still
running.

r246093:
Provide a statistic to track the number of drops in each of the port's
txq's buf_ring. The aggregate for all the queues of a port is already
provided in ifnet->if_snd.ifq_drops.

r246385:
Busy-wait when cold.

r246575:
Do not hold locks around hardware context reads.

r247062:
cxgbe(4): Assume that CSUM_TSO in the transmit path implies CSUM_IP and
CSUM_TCP too. They are all set explicitly by the kernel usually.

r247122:
cxgbe(4): Add sysctls to extract debug information from the chip:

dev.t4nex.X.misc.cim_la       logic analyzer dump
dev.t4nex.X.misc.cim_qcfg     queue configuration
dev.t4nex.X.misc.cim_ibq_xxx  inbound queues
dev.t4nex.X.misc.cim_obq_xxx  outbound queues

r247289:
cxgbe(4): Update firmware to 1.8.4.0.

r247291:
cxgbe(4): Ask the card's firmware to pad up tiny CPLs by encapsulating
them in a firmware message if it is able to do so. This works out
better for one of the FIFOs in the chip.

r247347:
cxgbe(4): Consider all the API versions of the interfaces exported by
the firmware (instead of just the main firmware version) when evaluating
firmware compatibility. Document the new "hw.cxgbe.fw_install" knob
being introduced here.

This should fix kern/173584 too. Setting hw.cxgbe.fw_install=2 will
mostly do what was requested in the PR but it's a bit more intelligent
in that it won't reinstall the same firmware repeatedly if the knob is
left set.

r247355:
cxgbe(4): Report unusual out of band errors from the firmware.

r241733 (by ed@):
Prefer __containerof() over __member2struct().

The former works better with qualifiers, but also properly type checks
the input pointer.
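
The idea behind both macros is to recover a pointer to a structure from a pointer to one of its members. The sketch below uses a portable offsetof()-based macro with hypothetical structures; unlike the __containerof() described above, this simplified version does not type check the input pointer.

```c
#include <stddef.h>

/* Hypothetical structures for illustration. */
struct link {
	struct link *next;
};
struct item {
	int value;
	struct link node;
};

/* Step back from a member's address to the start of its container. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Given only the embedded list node, read a field of the enclosing item. */
static int
value_from_node(struct link *n)
{
	return (container_of(n, struct item, node)->value);
}
```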


242544 04-Nov-2012 eadler

MFC r241844:
remove duplicate semicolons where possible.

Approved by: cperciva (implicit)


242369 30-Oct-2012 np

MFC r242087:

Initialize the response queue mutex a bit earlier to avoid a panic that
occurs if t3_sge_alloc_qset fails and then t3_free_qset attempts to
destroy an uninitialized mutex.


242015 24-Oct-2012 gavin

Merge r240680 from head:

Align the PCI Express #defines with the style used for the PCI-X
#defines. This has the advantage that it makes the names more
compact, and also allows us to correct the non-uniform naming of
the PCIM_LINK_* defines, making them all consistent amongst themselves.

This is a mostly mechanical rename:
s/PCIR_EXPRESS_/PCIER_/g
s/PCIM_EXP_/PCIEM_/g
s/PCIM_LINK_/PCIEM_LINK_/g

In this MFC, #defines have been added for the old names to assist
out-of-tree drivers.


241314 07-Oct-2012 jhb

MFC 239913:
Attach interrupt handlers during attach instead of during the first time
the interface is brought up. Without this, the boot time interrupt
round-robin assignment does not think the allocated interrupt resources
are active and leaves them assigned to CPU 0.

While here, add descriptive tags to each interrupt handler when MSI-X
is used.


240169 06-Sep-2012 np

MFC many cxgb and cxgbe features and fixes (r239258, r239259, r239264,
r239266, r239336, r239338, r239339, r239341, r239344, r239514, r239527,
r239528, r239544).

r239258:
Convert some fixed parameters to tunables (with reasonable default
values).

- cong_drop specifies what to do on congestion: nothing, backpressure,
or drop.
- fl_pktshift specifies the padding before Ethernet payload.
- fl_pad specifies the boundary up to which to pad Ethernet payload.
- spg_len controls the length of the status page.

r239259:
if_iqdrops should include frames truncated within the chip.

r239264:
Assume INET, INET6, and TCP_OFFLOAD when the driver is built out of tree and
KERNBUILDDIR is not set.

r239266:
The size of the buffers in an Ethernet freelist has to be higher than the
interface's MTU. Initialize such freelists with correct values.

This wasn't a problem for common MTUs (1500 and 9000) as the buffers (2048
and 9216 in size) happened to have enough spare room. I ran into it when
playing around with unusual MTUs.

r239336:
Allow for a different handler for each type of firmware message.

r239338:
Add a routine (t4_set_tcb_field) to update arbitrary parts of a hardware
TCB. Filters are programmed by modifying the TCB too (via a different
routine) and the reply to any TCB update is delivered via a
CPL_SET_TCB_RPL. Figure out whether the reply is for a filter-write or
something else and route it appropriately.

r239339:
Make room for DDP page pods in the default configuration profile. While
here, bump up the L2 table's size to 4K entries.

r239341:
Initialize various DDP parameters in the main cxgbe(4) driver:

- Setup multiple DDP page sizes. When the driver attempts DDP it will
try to combine physically contiguous pages into regions of these sizes.

- Set the indicate size such that the payload carried in the indicate can
be copied in the header mbuf (and the 16K rx buffer can be recycled).

- Set DDP threshold to the max payload that the chip will coalesce and
deliver to the driver (this is ~16K by default, which is also why the
offload rx queue is backed by 16K buffers). If the chip is able to
coalesce up to the max it's allowed to, it's a good sign that the peer
is transmitting in bulk without any TCP PSH.

r239344:
Support for TCP DDP (Direct Data Placement) in the T4 TOE module.

Basically, this is automatic rx zero copy when feasible. TCP payload is
DMA'd directly into the userspace buffer described by the uio submitted
in soreceive by an application.

- Works with sockets that are being handled by the TCP offload engine
of a T4 chip (you need t4_tom.ko module loaded after cxgbe, and an
"ifconfig +toe" on the cxgbe interface).
- Does not require any modification to the application.
- Not enabled by default. Use hw.t4nex.<X>.toe.ddp="1" to enable it.

r239514:
Minor cleanup: use bitwise ops instead of pointless wrappers around
setbit/clrbit.

r239527:
Cannot hold a mutex around vm_fault_quick_hold_pages, so don't. Tweak
some comments while here.

r239528:
Avoid a NULL pointer dereference.

r239544:
Deal with the case where a syncache entry added by the TOE driver is
evicted from the syncache but a later syncache_expand succeeds because
of syncookies. The TOE driver has to resort to more direct means to
install its hooks in the socket in this case.


239522 21-Aug-2012 dim

MFC r239101:

In cxgb(4), in function iwch_reregister_phys_mem(), initialize the
'npages' variable to zero, to avoid using it uninitialized in certain
cases.

Found by: clang 3.2
Reviewed by: np


238302 09-Jul-2012 np

Re-enable IFCAP_TSO6 in cxgb(4) and cxgbe(4) in stable/9. The kernel
changes needed for all this to work have now been MFC'd to 9 by bz@.

This is a direct commit to stable/9 that removes earlier changes made to
drivers in this branch only.

Approved by: re (kib)


238230 08-Jul-2012 bz

MFC r235944:

Significantly update tcp_lro for mostly two things:
1) introduce basic support for IPv6 without extension headers.
2) try hard to also get the incremental checksum updates right,
especially also in the IPv4 case for the IP and TCP header.

Move variables around for better locality, factor things out into
functions, allow checksum updates to be compiled out, ...

Leave a few comments on further things to look at in the future,
though that is not the full list.

Update drivers with appropriate #includes as needed for IPv6 data
type in LRO.

Approved by: re


238088 03-Jul-2012 np

Do not enable IFCAP_TSO6 in cxgb(4) and cxgbe(4) in stable/9. The
kernel code in 9 isn't quite ready for TSO6 yet.

This is a direct commit to stable/9. IFCAP_TSO6 works properly in head
and there is no need to disable it over there.

Approved by: re (kib)


237925 01-Jul-2012 np

MFC r237832, r237436, r237439, r237463, r237512, r237587, r237799,
r237819, r237831.

r237832:
cxgb(4): IPv6 rx/tx hw checksum, IPv6 TSO and LRO too.

r237436:
cxgbe(4): update to firmware interface 1.5.2.0; updates to shared code.

r237439:
Do not read registers with read side effects while performing a register
dump for cxgbetool.

r237463:
Do not allocate extra vectors when adapter is not TOE
capable (or toecaps have been disallowed by the user).

r237512:
Better way to determine the status page length and rx pad boundary.

r237587:
Allow cxgbe(4) running within a VM to attach to its devices that have been
exported via PCI passthrough.

r237799:
cxgbe(4): support for IPv6 hardware checksumming (rx and tx).

r237819:
cxgbe(4): support for IPv6 TSO and LRO.

r237831:
- Assign (don't OR) the CSUM_XXX bits to csum_flags in the rx checksum code.
- Fix TSO/TSO4 mixup.
- Add IFCAP_LINKSTATE to the available/enabled capabilities.


237920 01-Jul-2012 np

Backport just the sys/{dev,modules}/cxgb{,e}/ parts of r237263, and then
disable the TOE and iWARP modules in the Makefiles (they won't compile
without the rest of r237263).

This reduces diffs between the cxgb/cxgbe drivers in head and 9 and
makes it easy to MFC other fixes to 9.


237916 01-Jul-2012 np

MFC r231317, r235963 (bz@), r234831, r234833.

r231317:
Add IPv6 TSO (including TSO+VLAN) support to cxgb(4).

r235963 (bz@):
Allow LRO to work on IPv6 as well.
Fix the module Makefile to at least properly include opt_inet6.h
and allow builds without INET or INET6.

r234831:
Make sure that the firmware version is available in
dev.t4nex.X.firmware_version even if the driver fails to attach
properly. At least it'll be easy to tell what we're dealing with.

r234833:
Change the default to not use packet counters to generate rx interrupts.
Rely solely on the timer based mechanism.

Update man page to reflect this change.


235743 21-May-2012 jhb

Toss bogus mergeinfo.


235738 21-May-2012 sbruno

MFC r235634

Fix and update battery status bits according to linux driver


233024 16-Mar-2012 scottl

MFC 232854,232874,232882,232883,232886 for bus_get_dma_tag()


231604 13-Feb-2012 np

MFC r231175:
Allocate the BAR for userspace doorbells after the is_offload check
is functional.


231597 13-Feb-2012 np

MFC r231116:
Remove if_start from cxgb and cxgbe.


231104 07-Feb-2012 np

MFC r228825:
Fix return value of function.


229093 31-Dec-2011 hselasky

MFC r226173, r227843, r227848 and r227908:
Use DEVMETHOD_END to mark end of device methods.
Remove superfluous device methods.
Add some missing __FBSDID() macros.


225736 23-Sep-2011 kensmith

Copy head to stable/9 as part of 9.0-RELEASE release cycle.

Approved by: re (implicit)


220009 25-Mar-2011 np

Update T3 firmware to 7.11.0

Changes since 7.8.0 (from the official changelog):

- Fixed sporadic interrupt generation for associated CQ when processing
a local invalidate work request
- Changes to core scheduling to avoid starving requests from the host
under heavy RDMA Read Request load (e.g. packets to the wire)

- Programmed the tp tx resource limiter as a function of the traffic (only
affects iWARP)

- Increased the egress NIC gather list length from 36 to 46 entries

MFC after: 1 week


219946 24-Mar-2011 np

t3_free_sge_resources should be given the number of qsets it needs to free.

MFC after: 1 week


219945 24-Mar-2011 np

T3C initialization should setup the parity fence too.

MFC after: 1 week


219902 23-Mar-2011 jhb

Do a sweep of the tree replacing calls to pci_find_extcap() with calls to
pci_find_cap() instead.


218909 21-Feb-2011 brucec

Fix typos - remove duplicate "the".

PR: bin/154928
Submitted by: Eitan Adler <lists at eitanadler.com>
MFC after: 3 days


217916 27-Jan-2011 mdf

Explicitly wire the user buffer rather than doing it implicitly in
sbuf_new_for_sysctl(9). This allows using an sbuf with a SYSCTL_OUT
drain for extremely large amounts of data where the caller knows that
appropriate references are held, and sleeping is not an issue.

Inspired by: rwatson


217616 19-Jan-2011 mdf

Introduce signed and unsigned version of CTLTYPE_QUAD, renaming
existing uses. Rename sysctl_handle_quad() to sysctl_handle_64().


217586 19-Jan-2011 mdf

sysctl(8) should use the CTLTYPE to determine the type of data when
reading. (This was already done for writing to a sysctl). This
requires all SYSCTL setups to specify a type. Most of them are now
checked at compile-time.

Remove SYSCTL_*X* sysctl additions as the print being in hex should be
controlled by the -x flag to sysctl(8).

Suggested by: bde


217321 12-Jan-2011 mdf

sysctl(9) cleanup checkpoint: amd64 GENERIC builds cleanly.

Commit the cxgb driver piece.


216699 25-Dec-2010 alc

Introduce and use a new VM interface for temporarily pinning pages. This
new interface replaces the combined use of vm_fault_quick() and
pmap_extract_and_hold() throughout the kernel.

In collaboration with: kib@


216607 20-Dec-2010 alc

The local variable "rv" is still required by vm_fault_hold_user_pages().


216604 20-Dec-2010 alc

Introduce vm_fault_hold() and use it to (1) eliminate a long-standing race
condition in proc_rwmem() and to (2) simplify the implementation of the
cxgb driver's vm_fault_hold_user_pages(). Specifically, in proc_rwmem()
the requested read or write could fail because the targeted page could be
reclaimed between the calls to vm_fault() and vm_page_hold().

In collaboration with: kib@
MFC after: 6 weeks


216511 17-Dec-2010 alc

Implement and use a single optimized function for unholding a set of pages.

Reviewed by: kib@


216373 11-Dec-2010 avg

fix incorrect use of atomic_set_xxx in cxgb

There is no need to use an atomic operation at structure initialization
time.
Note that the file changed is not connected to the build at this time.

Reviewed by: jhb (general issue)
Approved by: np
MFC after: 2 weeks


212750 16-Sep-2010 mdf

Re-add r212370 now that the LOR in powerpc64 has been resolved:

Add a drain function for struct sysctl_req, and use it for a variety
of handlers, some of which had to do awkward things to get a large
enough SBUF_FIXEDLEN buffer.

Note that some sysctl handlers were explicitly outputting a trailing
NUL byte. This behaviour was preserved, though it should not be
necessary.

Reviewed by: phk (original patch)


212710 15-Sep-2010 np

Fix t3_gate_rx_traffic and t3_open_rx_traffic. Parts of them always operated
on XGMAC0 instead of the specified XGMAC.

MFC after: 3 days


212572 13-Sep-2010 mdf

Revert r212370, as it causes a LOR on powerpc. powerpc does a few
unexpected things in copyout(9) and so wiring the user buffer is not
sufficient to perform a copyout(9) while holding a random mutex.

Requested by: nwhitehorn


212370 09-Sep-2010 mdf

Add a drain function for struct sysctl_req, and use it for a variety of
handlers, some of which had to do awkward things to get a large enough
FIXEDLEN buffer.

Note that some sysctl handlers were explicitly outputting a trailing NUL
byte. This behaviour was preserved, though it should not be necessary.

Reviewed by: phk


211347 15-Aug-2010 np

Fix tx pause quanta and timer calculations.

MFC after: 3 days


211346 15-Aug-2010 np

Always reset the XGMAC's XAUI PCS on a link up.

MFC after: 3 days


211345 15-Aug-2010 np

wakeup is required if the adapter lock is released anywhere during
init and not just for the may_sleep case.

Pointed out by: Isilon
MFC after: 3 days


210505 26-Jul-2010 jhb

- Change the warning about PCI-e links narrower than x8 to only apply to
10G cards. 1G cards are x4 only.
- Use constants from pcireg.h for reading the current link width.
- Use pci_set_max_read_req() rather than implementing it by hand.

Reviewed by: np
MFC after: 1 week


209841 09-Jul-2010 np

Improve cxgb(4)'s behaviour when faced with temporarily "bouncy" links:
- Run the adapter's tick at 1Hz and remove link state checks from it.
Instead, have each port check its link state. Delay the check so that
it takes place slightly after the driver is notified of a change in
link state. This is a cheap way to debounce these notifications if
many are received in rapid succession. POLL_LINK_1ST_TIME flag can
also be eliminated as a side effect of these changes.
- Do not reset the PHY when link goes down.
- Clear port's link_fault flag if the PHY indicates link is down.
- get_link_status_r should leave speed and duplex alone when link is down.

MFC after: 1 month


209840 09-Jul-2010 np

Eliminate ext_intr_task. The "slow" interrupt handler is already
running on the adapter's task queue. Just do what the task does
instead of enqueueing it.

MFC after: 3 days


209839 09-Jul-2010 np

Fix bufsize calculation so that cxgbtool can display information for the
last I/O queue too.

MFC after: 3 days


209321 18-Jun-2010 alc

Catch up with the page and page queues locking changes.


209116 12-Jun-2010 np

cxgb(4): add knob to get packet timestamps from the hardware.

The T3 ASIC can provide an incoming packet's timestamp instead of its RSS hash.
The timestamp is just a counter running off the card's clock. With a 175MHz
clock an increment represents ~5.7ns and the 32 bit value wraps around in ~25s.

# sysctl -d dev.cxgbc.0.pkt_timestamp
dev.cxgbc.0.pkt_timestamp: provide packet timestamp instead of connection hash

# sysctl -d dev.cxgbc.0.core_clock
dev.cxgbc.0.core_clock: core clock frequency (in KHz)
# sysctl dev.cxgbc.0.core_clock
dev.cxgbc.0.core_clock: 175000
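
The figures quoted above follow directly from the clock frequency. A minimal sketch of the arithmetic, with hypothetical helper names, taking the frequency in kHz as dev.cxgbc.0.core_clock reports it:

```c
#include <stdint.h>

/*
 * Nanoseconds represented by `ticks` increments of a counter clocked at
 * core_clock_khz. One tick lasts 1e6 / kHz nanoseconds; multiplying
 * first in 64 bits avoids losing the fractional part too early.
 */
static uint64_t
ticks_to_ns(uint32_t ticks, uint32_t core_clock_khz)
{
	return ((uint64_t)ticks * 1000000u / core_clock_khz);
}

/* Whole seconds before a 32-bit tick counter wraps around. */
static uint32_t
wrap_seconds(uint32_t core_clock_khz)
{
	return ((uint32_t)(((uint64_t)1 << 32) /
	    ((uint64_t)core_clock_khz * 1000u)));
}
```

At 175000 kHz, 175 ticks come to exactly 1000 ns (about 5.7 ns per tick), and the counter wraps after 24 whole seconds, in line with the ~25s quoted in the commit message.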


209115 12-Jun-2010 np

make format string a string literal.

Reported by: clang


208887 07-Jun-2010 np

cxgb(4): add an 'nfilters' tunable that lets the user place an upper
limit on the number of hardware filters (and thus the amount of TCAM
reserved for filtering).


208356 20-May-2010 np

Remove invalid assertion.

Holding the adapter lock while changing the LRO settings is sufficient.

PR: kern/146759
MFC after: 3 days


207688 05-May-2010 np

Don't ring the tx doorbell for every frame when we know more frames
will follow. Adjust the freelist and response queue doorbells too.

Discussed with: kmacy


207687 05-May-2010 np

Do not hold the T3 firmware in memory all the time. firmware(9) can
load/unload it as needed.


207673 05-May-2010 joel

Switch to our preferred 2-clause BSD license.

Approved by: kmacy


207643 05-May-2010 np

Add support for hardware filters to cxgb(4). The T3 chip can inspect
L2/3/4 headers and can drop or steer packets as instructed. Filtering
based on src ip, dst ip, src port, dst port, 802.1q, udp/tcp, and mac
addr is possible. Add support in cxgbtool to program these filters.
Some simple examples:

Drop all tcp/80 traffic coming from the subnet specified.
# cxgbtool cxgb2 filter 0 sip 192.168.1.0/24 dport 80 type tcp action drop

Steer all incoming UDP traffic to qset 0.
# cxgbtool cxgb2 filter 1 type udp queue 0 action pass

Steer all tcp traffic from 192.168.1.1 to qset 1.
# cxgbtool cxgb2 filter 2 sip 192.168.1.1 type tcp queue 1 action pass

Drop fragments.
# cxgbtool cxgb2 filter 3 type frag action drop

List all filters.
# cxgbtool cxgb2 filter list
index  SIP             DIP      sport  dport  VLAN  PRI  P/MAC  type  Q
    0  192.168.1.0/24  0.0.0.0  *      80     0     0/1  */*    tcp   -
    1  0.0.0.0/0       0.0.0.0  *      *      0     0/1  */*    udp   0
    2  192.168.1.1/32  0.0.0.0  *      *      0     0/1  */*    tcp   1
    3  0.0.0.0/0       0.0.0.0  *      *      0     0/1  */*    frag  -
16367  0.0.0.0/0       0.0.0.0  *      *      0     0/1  */*    *     *
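
The meaning of a source prefix such as 192.168.1.0/24 in these filters can be modelled in software. The sketch below is only an illustration of prefix matching on host-order IPv4 addresses; the chip performs the comparison in hardware, and the macro and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Build a host-order IPv4 address from its dotted-quad components. */
#define IPV4(a, b, c, d) \
	(((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
	 ((uint32_t)(c) << 8) | (uint32_t)(d))

/* True when addr falls inside net/prefixlen. */
static bool
sip_matches(uint32_t addr, uint32_t net, unsigned int prefixlen)
{
	uint32_t mask;

	if (prefixlen > 32)
		return (false);
	/* A /0 prefix matches everything; avoid shifting by 32 bits. */
	mask = (prefixlen == 0) ? 0 : (~(uint32_t)0 << (32 - prefixlen));
	return ((addr & mask) == (net & mask));
}
```

Under this model, a packet from 192.168.1.77 would hit filter 0 above, while one from 192.168.2.1 would not; the 0.0.0.0/0 entries match any source.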

MFC after: 2 weeks


207639 04-May-2010 np

Add IFCAP_LINKSTATE to cxgb's capabilities.

MFC after: 3 days


207554 03-May-2010 sobomax

Add new tunable 'net.link.ifqmaxlen' to set default send interface
queue length. The default value for this parameter is 50, which is
quite low for many of today's uses and the only way to modify this
parameter right now is to edit if_var.h file. Also add read-only
sysctl with the same name, so that it's possible to retrieve the
current value.

MFC after: 1 month


206109 02-Apr-2010 np

Increase response queue size to avoid starvation, add a counter
to track it when it does occur.


205950 31-Mar-2010 np

Multiple fixes related to queue set sizing and resources:

- Only the tunnelq (TXQ_ETH) requires a buf_ring, an ifq, and the watchdog/timer
callouts. Do not allocate these for the other tx queues.

- Use 16k jumbo clusters only on offload capable cards by default.

- Do not allocate a full tx ring for the offload queue if the card is not
offload capable.

- Slightly better freelist size calculation.

- Fix nmbjumbo4 typo, remove unneeded global variables.

MFC after: 3 days


205949 31-Mar-2010 np

Fix signed/unsigned mix-up that allowed txq->in_use to grow beyond txq->size.


205948 31-Mar-2010 np

Fix tx drop statistics.

MFC after: 3 days


205947 31-Mar-2010 np

Fix build with "nooptions INET"

Requested by: bz
MFC after: 3 days


205946 31-Mar-2010 np

Do not attempt to retrieve interrupt information before it is available.

MFC after: 3 days


205945 31-Mar-2010 np

Improved PHY EDC settings.

MFC after: 3 days


205944 31-Mar-2010 np

Refresh the firmware version immediately after it is upgraded (or downgraded).

MFC after: 3 days


204921 09-Mar-2010 np

Better TwinAx transceiver detection.

Originally submitted by: <Bruno dot Bittner at isilon dot com>
(This is a rewritten, corrected version of that patch)

MFC after: 1 week


204348 26-Feb-2010 np

Support IFCAP_VLANHWTSO in cxgb(4). It works with or without vlanhwtag.
While here, remove old DPRINTFs and tidy up the capability code a bit.


204274 24-Feb-2010 np

There is no need to test __FreeBSD_version for features that have
been around for a long time now (7.1-ish or even earlier); assume
they are present. These includes MSI, TSO, LRO, VLAN, INTR_FILTERS,
FIRMWARE, etc.

Also, eliminate some dead code and clean up in other places as part
of this quick once-over.

MFC after: 1 week


204271 24-Feb-2010 np

Accessing an mbuf after it has been handed off to the hardware is a bad
race as it could already have been tx'd and freed by that time. Place
the bpf tap just _before_ writing the gen bit.

This fixes a panic when running tcpdump on a cxgb interface.


204111 20-Feb-2010 uqs

Fix common misspelling of hierarchy

Pointed out by: bf1783 at gmail
Approved by: np (cxgb), kientzle (tar, etc.), philip (mentor)


203834 13-Feb-2010 mlaier

Fix drbr and altq interaction:
- introduce drbr_needs_enqueue that returns whether the interface/br needs
an enqueue operation: returns true if altq is enabled or there are
already packets in the ring (as we need to maintain packet order)
- update all drbr consumers
- fix drbr_flush
- avoid using the driver queue (IFQ_DRV_*) in the altq case as the
multiqueue consumer does not provide enough protection, serialize altq
interaction with the main queue lock
- make drbr_dequeue_cond work with altq
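
The decision that drbr_needs_enqueue is described as making can be sketched as follows. This is a user-space approximation with hypothetical types, not the kernel implementation; only the rule itself (enqueue if altq is enabled or the ring already holds packets) comes from the message above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the interface and its buffer ring. */
struct ifnet_sketch {
	bool altq_enabled;	/* true when ALTQ is active on the interface */
};

struct ring_sketch {
	size_t count;		/* packets currently waiting in the ring */
};

/*
 * Returns true when a packet must go through the queue rather than
 * straight to the hardware: either ALTQ wants to see it, or earlier
 * packets are still queued and ordering has to be preserved.
 */
static bool
needs_enqueue_sketch(const struct ifnet_sketch *ifp,
    const struct ring_sketch *br)
{
	if (ifp->altq_enabled)
		return (true);
	return (br->count != 0);
}
```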

Discussed with: kmacy, yongari, jfv
MFC after: 4 weeks


202863 23-Jan-2010 np

Don't forget to release the adapter lock for a no-op.


202678 20-Jan-2010 np

Complain if freelist queue sizes are significantly less than desired.

MFC after: 1 day


202671 20-Jan-2010 np

Fix for a cxgb(4) panic. cxgb_ioctl can be called by the IP and IPv6
layers with non-sleepable locks held. Don't (potentially) sleep in
those situations.


201907 09-Jan-2010 np

Extra parentheses to keep certain compilers happy.

Submitted by: trasz@


201758 07-Jan-2010 mbr

Remove extraneous semicolons, no functional changes.

Submitted by: Marc Balmer <marc@msys.ch>
MFC after: 1 week


200847 22-Dec-2009 jhb

- Rename the __tcpi_(snd|rcv)_mss fields of the tcp_info structure to remove
the leading underscores since they are now implemented.
- Implement the tcpi_rto and tcpi_last_data_recv fields in the tcp_info
structure.

Reviewed by: rwatson
MFC after: 2 weeks


200003 01-Dec-2009 np

T3 firmware 7.8.0 for cxgb(4)

Obtained from: Chelsio
MFC after: 3 days


199868 27-Nov-2009 alc

Simplify the invocation of vm_fault(). Specifically, eliminate the flag
VM_FAULT_DIRTY. The information provided by this flag can be trivially
inferred by vm_fault().

Discussed with: kib


199240 13-Nov-2009 np

Don't disable the XGMAC's tx on ifconfig down. It is unnecessary
and can cause false backpressure in the chip. Fix a us/ms mixup
while here.


199239 13-Nov-2009 np

The 10GBASE-T card should use an IPG of 1. Also enable the check
for low power startup on this card.


199238 13-Nov-2009 np

Make sure *some* edc is setup even for an unknown transceiver (assume
it is optical).


199237 13-Nov-2009 np

sc->rev and is_offload(sc) will always be 0 during probe. Wait till
attach to get correct values.


198988 06-Nov-2009 jhb

Take a step towards removing if_watchdog/if_timer. Don't explicitly set
if_watchdog/if_timer to NULL/0 when initializing an ifnet. if_alloc()
sets those members to NULL/0 already.


197791 05-Oct-2009 np

cxgb(4) updates, including:
- support for the new Gen-2, BT, and LP-CR cards.
- T3 firmware 7.7.0
- shared "common code" updates.

Approved by: gnn (mentor)
Obtained from: Chelsio
MFC after: 1 month


197043 09-Sep-2009 np

There is no need to log anything for a ctrlq stall or restart. These are
normal events.

Approved by: gnn (mentor)
MFC after: 1 month


196840 04-Sep-2009 jhb

Fill the reverse RSS map with 0xff's so that the subsequent loop to
calculate the values will work properly.
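
A minimal sketch of why the 0xff fill matters, assuming the loop records the first forward-table index seen for each queue and uses 0xff as the "not yet set" marker. The table sizes and names are hypothetical; the driver's real tables differ.

```c
#include <stdint.h>
#include <string.h>

#define RSS_TABLE_SIZE_SKETCH	8	/* assumed forward table size */
#define NQUEUES_SKETCH		4	/* assumed number of response queues */
#define RRSS_UNSET_SKETCH	0xff	/* marker for "no index recorded yet" */

/*
 * Build the reverse map (queue -> first forward-table index that points
 * at it). Without the initial fill, a zeroed entry would be
 * indistinguishable from a legitimate index 0.
 */
static void
build_reverse_map_sketch(const uint8_t fwd[RSS_TABLE_SIZE_SKETCH],
    uint8_t rev[NQUEUES_SKETCH])
{
	memset(rev, RRSS_UNSET_SKETCH, NQUEUES_SKETCH);
	for (int i = 0; i < RSS_TABLE_SIZE_SKETCH; i++) {
		if (rev[fwd[i]] == RRSS_UNSET_SKETCH)
			rev[fwd[i]] = (uint8_t)i;
	}
}
```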

Reviewed by: np
MFC after: 1 month


196322 17-Aug-2009 jhb

Purge mergeinfo in sys/ that is either empty or a subset of the parent
mergeinfo on sys/ itself.

Approved by: re (mergeinfo blanket)


196039 02-Aug-2009 rwatson

Many network stack subsystems use a single global data structure to hold
all pertinent statistics for the subsystem. These structures are
sometimes "borrowed" by kernel modules that require a place to store
statistics for similar events.

Add KPI accessor functions for statistics structures referenced by kernel
modules so that they no longer encode certain specifics of how the data
structures are named and stored. This change is intended to make it
easier to move to per-CPU network stats following 8.0-RELEASE.

The following modules are affected by this change:

if_bridge
if_cxgb
if_gif
ip_mroute
ipdivert
pf

In practice, most of these statistics consumers should, in fact, maintain
their own statistics data structures rather than borrowing structures
from the base network stack. However, that change is too aggressive for
this point in the release cycle.

Reviewed by: bz
Approved by: re (kib)


196019 01-Aug-2009 rwatson

Merge the remainder of kern_vimage.c and vimage.h into vnet.c and
vnet.h, we now use jails (rather than vimages) as the abstraction
for virtualization management, and what remained was specific to
virtual network stacks. Minor cleanups are done in the process,
and comments updated to reflect these changes.

Reviewed by: bz
Approved by: re (vimage blanket)


195699 14-Jul-2009 rwatson

Build on Jeff Roberson's linker-set based dynamic per-CPU allocator
(DPCPU), as suggested by Peter Wemm, and implement a new per-virtual
network stack memory allocator. Modify vnet to use the allocator
instead of monolithic global container structures (vinet, ...). This
change solves many binary compatibility problems associated with
VIMAGE, and restores ELF symbols for virtualized global variables.

Each virtualized global variable exists as a "reference copy", and also
once per virtual network stack. Virtualized global variables are
tagged at compile-time, placing them in a special linker set, which is
loaded into a contiguous region of kernel memory. Virtualized global
variables in the base kernel are linked as normal, but those in modules
are copied and relocated to a reserved portion of the kernel's vnet
region with the help of the kernel linker.

Virtualized global variables exist in per-vnet memory set up when the
network stack instance is created, and are initialized statically from
the reference copy. Run-time access occurs via an accessor macro, which
converts from the current vnet and requested symbol to a per-vnet
address. When "options VIMAGE" is not compiled into the kernel, normal
global ELF symbols will be used instead and indirection is avoided.

This change restores static initialization for network stack global
variables, restores support for non-global symbols and types, eliminates
the need for many subsystem constructors, eliminates large per-subsystem
structures that caused many binary compatibility issues both for
monitoring applications (netstat) and kernel modules, removes the
per-function INIT_VNET_*() macros throughout the stack, eliminates the
need for vnet_symmap ksym(2) munging, and eliminates duplicate
definitions of virtualized globals under VIMAGE_GLOBALS.

Bump __FreeBSD_version and update UPDATING.

Portions submitted by: bz
Reviewed by: bz, zec
Discussed with: gnn, jamie, jeff, jhb, julian, sam
Suggested by: peter
Approved by: re (kensmith)


195677 14-Jul-2009 lstewart

Fix a buglet that slipped into r195654. My buildworld/buildkernel sanity
check missed this because cxgb's TOM is currently commented out of the build
system.

Submitted by: Navdeep Parhar <np at FreeBSD dot org>
Approved by: re (kensmith), kensmith (mentor temporarily unavailable)


195654 13-Jul-2009 lstewart

Replace struct tcpopt with a proxy toeopt struct in the TOE driver interface to
the TCP syncache. This returns struct tcpopt to being private within the TCP
implementation, thus allowing it to be modified without ABI concerns.

The patch breaks the ABI. Bump __FreeBSD_version to 800103 accordingly. The cxgb
driver is the only TOE consumer affected by this change, and needs to be
recompiled along with the kernel.

Suggested by: rwatson
Reviewed by: rwatson, kmacy
Approved by: re (kensmith), kensmith (mentor temporarily unavailable)


195512 09-Jul-2009 np

Fix cxgb(4) panic with jumbo frames.

Reviewed by: kmacy
Approved by: re (kib), gnn (mentor)


195071 26-Jun-2009 rwatson

Use if_maddr_rlock() instead of IF_ADDR_LOCK() to protect access to
if_multiaddrs in if_cxgb.

Approved by: re (kib)
MFC after: 6 weeks


195006 25-Jun-2009 np

mvec routines should have no knowledge of the SG engine.

Reviewed by: kmacy
Approved by: gnn (mentor)


194921 24-Jun-2009 np

Various ifmedia related fixes in cxgb(4), including:

- build ifmedia list based on phy->caps, not string comparisons.
- rebuild media list when a transceiver change is detected.
- return EOPNOTSUPP instead of ENXIO in cxgb_media_status.

Approved by: gnn (mentor)
MFC after: 2 weeks.


194739 23-Jun-2009 bz

After cleaning up rt_tables from vnet.h and cleaning up opt_route.h
a lot of files no longer need route.h either. Garbage collect them.
While here remove now unneeded vnet.h #includes as well.


194661 22-Jun-2009 np

Fix cxgb's ifmedia ioctl handling. Also fixed a comment.

Reviewed by: kmacy
Approved by: gnn (mentor)


194622 22-Jun-2009 rwatson

Add a new function, ifa_ifwithaddr_check(), which rather than returning
a pointer to an ifaddr matching the passed socket address, returns a
boolean indicating whether one was present. In the (near) future,
ifa_ifwithaddr() will return a referenced ifaddr rather than a raw
ifaddr pointer, and the new wrapper will allow callers that care only
about the boolean condition to avoid having to free that reference.
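
The motivation for the boolean wrapper can be sketched in user space as below. The table, the reference counting, and every name are hypothetical; the point is only that a caller who wants a yes/no answer never has to hold or free a reference.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical address entry with a reference count. */
struct ifaddr_sketch {
	int addr;
	int refcount;
};

static struct ifaddr_sketch addr_table_sketch[] = { { 10, 1 }, { 20, 1 } };

/* Returns a referenced entry, or NULL; the caller must release it. */
static struct ifaddr_sketch *
lookup_referenced_sketch(int addr)
{
	size_t n = sizeof(addr_table_sketch) / sizeof(addr_table_sketch[0]);

	for (size_t i = 0; i < n; i++) {
		if (addr_table_sketch[i].addr == addr) {
			addr_table_sketch[i].refcount++;
			return (&addr_table_sketch[i]);
		}
	}
	return (NULL);
}

static void
release_sketch(struct ifaddr_sketch *ifa)
{
	ifa->refcount--;
}

/* Boolean wrapper: takes and drops the reference internally. */
static bool
lookup_check_sketch(int addr)
{
	struct ifaddr_sketch *ifa = lookup_referenced_sketch(addr);

	if (ifa == NULL)
		return (false);
	release_sketch(ifa);
	return (true);
}
```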

MFC after: 3 weeks


194563 21-Jun-2009 kmacy

fix !x86 cxgb compile


194554 20-Jun-2009 kmacy

fix typo in conditional


194553 20-Jun-2009 kmacy

- fix dma map handling for !x86 case
- fix allocation failure handling in refill_fl


194521 19-Jun-2009 kmacy

Greatly simplify cxgb by removing almost all of the custom mbuf management logic

- remove mbuf iovec - useful, but adds too much complexity when isolated to
the driver

- remove driver private caching - insufficient benefit over UMA to justify
the added complexity and maintenance overhead

- remove separate logic for managing multiple transmit queues, with the
new drbr routines the control flow can be made to much more closely resemble
legacy drivers

- remove dedicated service threads, with per-cpu callouts one can get the same
benefit much more simply by registering a callout 1 tick in the future if there
are still buffered packets

- remove embedded mbuf usage - Jeffr's changes will (I hope) soon be integrated
greatly reducing the overhead of using kernel APIs for reference counting
clusters

- add hysteresis to descriptor coalescing logic

- add coalesce threshold sysctls to allow users to decide at run-time
between optimizing for forwarding / UDP or optimizing for TCP

- add once per second watchdog to effectively close the very rare races
occurring from coalescing

- incorporate Navdeep's changes to the initialization path required to
convert port and adapter locks back to ordinary mutexes (silencing BPF
LOR complaints)

- enable prefetches in get_packet and tx cleaning
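
The "hysteresis" item in the list above can be illustrated generically. This sketch assumes two thresholds on queue occupancy (start above one, stop below the other); the struct, the names, and the threshold semantics are assumptions, not the driver's actual coalescing logic.

```c
#include <stdbool.h>

/* Hypothetical coalescing state with separate start and stop thresholds. */
struct coalesce_sketch {
	unsigned int start_thresh;	/* begin coalescing at or above this */
	unsigned int stop_thresh;	/* stop coalescing at or below this */
	bool coalescing;		/* current mode */
};

/*
 * Update the mode from the current queue occupancy. Between the two
 * thresholds the previous mode is kept, which avoids flapping when the
 * occupancy hovers around a single cut-off value.
 */
static bool
coalesce_update_sketch(struct coalesce_sketch *c, unsigned int in_use)
{
	if (!c->coalescing && in_use >= c->start_thresh)
		c->coalescing = true;
	else if (c->coalescing && in_use <= c->stop_thresh)
		c->coalescing = false;
	return (c->coalescing);
}
```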

Reviewed by: navdeep@
MFC after: 2 weeks


194259 15-Jun-2009 sam

r193336 moved ifq_detach to if_free which broke if_alloc followed
by if_free (w/o doing if_attach); move ifq_attach to if_alloc and
rename ifq_attach/detach to ifq_init/ifq_delete to better identify
their purpose

Reviewed by: jhb, kmacy


194039 11-Jun-2009 gnn

Re-add the send queue tunable for people who do not use buffering.

Reviewed by: jhb
MFC after: 3 days


193925 10-Jun-2009 gnn

Add a missing error statistic, the number of FCS errors on receive.

Reviewed by: jhb
MFC after: 1 day


193848 09-Jun-2009 kmacy

- add drbr routines for accessing #qentries and conditionally dequeueing
- track bytes enqueued in buf_ring


193744 08-Jun-2009 bz

After r193232 rt_tables in vnet.h are no longer indirectly dependent on
the ROUTETABLES kernel option thus there is no need to include opt_route.h
anymore in all consumers of vnet.h and no longer depend on it for module
builds.

Remove the hidden include in flowtable.h as well and leave the two
explicit #includes in ip_input.c and ip_output.c.


193272 01-Jun-2009 jhb

Rework socket upcalls to close some races with setup/teardown of upcalls.
- Each socket upcall is now invoked with the appropriate socket buffer
locked. It is not permissible to call soisconnected() with this lock
held, however, so socket upcalls now return an integer value. The two
possible values are SU_OK and SU_ISCONNECTED. If an upcall returns
SU_ISCONNECTED, then the soisconnected() will be invoked on the
socket after the socket buffer lock is dropped.
- A new API is provided for setting and clearing socket upcalls. The
API consists of soupcall_set() and soupcall_clear().
- To simplify locking, each socket buffer now has a separate upcall.
- When a socket upcall returns SU_ISCONNECTED, the upcall is cleared from
the receive socket buffer automatically. Note that a SO_SND upcall
should never return SU_ISCONNECTED.
- All this means that accept filters should now return SU_ISCONNECTED
instead of calling soisconnected() directly. They also no longer need
to explicitly clear the upcall on the new socket.
- The HTTP accept filter still uses soupcall_set() to manage its internal
state machine, but other accept filters no longer have any explicit
knowledge of socket upcall internals aside from their return value.
- The various RPC client upcalls currently drop the socket buffer lock
while invoking soreceive() as a temporary band-aid. The plan for
the future is to add a new flag to allow soreceive() to be called with
the socket buffer locked.
- The AIO callback for socket I/O is now also invoked with the socket
buffer locked. Previously sowakeup() would drop the socket buffer
lock only to call aio_swake() which immediately re-acquired the socket
buffer lock for the duration of the function call.
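
The return-value protocol described above can be sketched in user space as follows. Everything here is a simplified stand-in: the struct, the lock flag, and the numeric values of the two return codes are assumptions; only the names SU_OK and SU_ISCONNECTED and the ordering (upcall under the lock, soisconnected() after it is dropped, upcall cleared automatically) come from the message.

```c
#include <stdbool.h>
#include <stddef.h>

#define SU_OK_SKETCH		0	/* assumed value */
#define SU_ISCONNECTED_SKETCH	1	/* assumed value */

struct socket_sketch;
typedef int (*upcall_fn_sketch)(struct socket_sketch *, void *);

/* Hypothetical socket with one receive-side upcall. */
struct socket_sketch {
	bool		 rcv_locked;	/* models the receive buffer lock */
	bool		 connected;	/* set by the soisconnected() stand-in */
	size_t		 rcv_bytes;	/* bytes waiting in the receive buffer */
	upcall_fn_sketch rcv_upcall;
	void		*rcv_upcall_arg;
};

/* Accept-filter style upcall: report "connected" once enough data arrived. */
static int
dataready_upcall_sketch(struct socket_sketch *so, void *arg)
{
	size_t wanted = *(const size_t *)arg;

	return (so->rcv_bytes >= wanted ?
	    SU_ISCONNECTED_SKETCH : SU_OK_SKETCH);
}

/* Wakeup path: run the upcall locked, act on its result unlocked. */
static void
rcv_wakeup_sketch(struct socket_sketch *so)
{
	int ret = SU_OK_SKETCH;

	so->rcv_locked = true;
	if (so->rcv_upcall != NULL) {
		ret = so->rcv_upcall(so, so->rcv_upcall_arg);
		if (ret == SU_ISCONNECTED_SKETCH) {
			/* The upcall is cleared automatically. */
			so->rcv_upcall = NULL;
			so->rcv_upcall_arg = NULL;
		}
	}
	so->rcv_locked = false;
	if (ret == SU_ISCONNECTED_SKETCH)
		so->connected = true;	/* stands in for soisconnected() */
}
```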

Discussed with: rwatson, rmacklem


193270 01-Jun-2009 zec

Update VNET base pointer setting macro to use a correct source of
vnet context.

Approved by: julian (mentor)


192933 27-May-2009 gnn

Rework interrupt bringup and teardown.

Calculate the exact number of vectors we'll use before calling
pci_alloc_msix. Don't grab nine all the time.
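
A rough sketch of such a calculation, assuming one vector per queue set plus one extra vector for everything else. That formula and the function name are assumptions for illustration; the log does not state the exact rule.

```c
/*
 * Hypothetical helper: number of MSI-X vectors to request, assuming one
 * per queue set plus one shared vector for non-queue events.
 */
static int
msix_vectors_needed_sketch(int nports, int qsets_per_port)
{
	return (nports * qsets_per_port + 1);
}
```

Under this assumption a two-port card with four queue sets per port still asks for nine vectors, while a single-port, single-queue configuration asks for only two.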

Call cxgb_setup_interrupts once per T3, not once per port. Ditto
for cxgb_teardown_interrupts.

Don't leak resources when interrupt setup fails in the middle.

Obtained from: Navdeep Parhar
MFC after: 10 days


192593 22-May-2009 gnn

Partial reversion of previous commit. The CXGB_SHUTDOWN flag does NOT
need to be inverted when doing an ifconfig down of an interface.

Pointed out by: Navdeep Parhar
MFC after: 1 week


192584 22-May-2009 gnn

Fix a possible panic in the cxgb_controller_attach() routine that would occur
only if prepping the adapter failed.

Slight adjustment to comments.

Fix a bug whereby downing the interface didn't prevent it from
processing packets.

Submitted by: Navdeep Parhar
MFC after: 1 week


192540 21-May-2009 gnn

Integrate three changes from Chelsio.

1) Add a sysctl that will say what type of PHYs exist on the card.
2) Fix a bug that occurs when an AEL 2005 PHY resets without a transceiver
in the card.
3) Unify the PHY link detection code.

Obtained from: Navdeep Parhar
MFC after: 10 days


192537 21-May-2009 gnn

Modified the attach and detach routines to handle bringing ports up
and down more cleanly. This addresses a problem where if we have the
link flap during boot the driver would lock up the system.

Reviewed by: jhb
MFC after: 1 week


192450 20-May-2009 imp

We no longer need to use d_thread_t, migrate to struct thread *.


192009 12-May-2009 kmacy

fix bug introduced by last change

Submitted by: Navdeep Parhar


191816 05-May-2009 zec

Change the curvnet variable from a global const struct vnet *,
previously always pointing to the default vnet context, to a
dynamically changing thread-local one. The currvnet context
should be set on entry to networking code via CURVNET_SET() macros,
and reverted to previous state via CURVNET_RESTORE(). Recursions
on curvnet are permitted, though strongly discuouraged.

This change should have no functional impact on nooptions VIMAGE
kernel builds, where CURVNET_* macros expand to whitespace.

The curthread->td_vnet (aka curvnet) variable's purpose is to be an
indicator of the vnet context in which the current network-related
operation takes place, in case we cannot deduce the current vnet
context from any other source, such as by looking at mbuf's
m->m_pkthdr.rcvif->if_vnet, socket's so->so_vnet etc. Moreover, so
far curvnet has turned out to be an invaluable consistency checking
aid: it helps to catch cases when sockets, ifnets or any other
vnet-aware structures may have leaked from one vnet to another.

The exact placement of the CURVNET_SET() / CURVNET_RESTORE() macros
was a result of an empirical iterative process, with an aim to
reduce recursions on CURVNET_SET() to a minimum, while still reducing
the scope of CURVNET_SET() to networking only operations - the
alternative would be calling CURVNET_SET() on each system call entry.
In general, curvnet has to be set in three typical cases: when
processing socket-related requests from userspace or from within the
kernel; when processing inbound traffic flowing from device drivers
to upper layers of the networking stack, and when executing
timer-driven networking functions.

This change also introduces a DDB subcommand to show the list of all
vnet instances.
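
The save/restore discipline described above can be approximated in user space like this. The macros below are simplified stand-ins with hypothetical names, not the kernel's definitions; they only show how a thread-local "current context" pointer can be set on entry, nested, and restored on exit.

```c
#include <stddef.h>

/* Hypothetical vnet handle. */
struct vnet_sketch {
	int id;
};

/* Thread-local "current vnet", NULL when outside networking code. */
static _Thread_local struct vnet_sketch *curvnet_sketch = NULL;

/* Save the previous context in a local, then switch to the new one. */
#define CURVNET_SET_SKETCH(v)						\
	struct vnet_sketch *saved_vnet_sketch = curvnet_sketch;	\
	curvnet_sketch = (v)

/* Put back whatever was current when the matching SET ran. */
#define CURVNET_RESTORE_SKETCH()					\
	curvnet_sketch = saved_vnet_sketch

static int
inner_op_sketch(struct vnet_sketch *v)
{
	CURVNET_SET_SKETCH(v);
	int id = curvnet_sketch->id;
	CURVNET_RESTORE_SKETCH();
	return (id);
}

/* Nested use: the inner SET/RESTORE pair leaves the outer context intact. */
static int
outer_op_sketch(struct vnet_sketch *a, struct vnet_sketch *b)
{
	CURVNET_SET_SKETCH(a);
	int before = curvnet_sketch->id;
	int nested = inner_op_sketch(b);
	int after = curvnet_sketch->id;
	CURVNET_RESTORE_SKETCH();
	return (before * 100 + nested * 10 + after);
}
```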

Approved by: julian (mentor)


191610 27-Apr-2009 kmacy

simplify by removing dead code


190948 11-Apr-2009 rwatson

Update stats in struct tcpstat using two new macros, TCPSTAT_ADD() and
TCPSTAT_INC(), rather than directly manipulating the fields across the
kernel. This will make it easier to change the implementation of
these statistics, such as using per-CPU versions of the data structures.
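
A minimal sketch of the idea, using hypothetical macro and struct names: callers name only the counter, so the storage behind the macros could later be swapped (for example to per-CPU copies) without touching the call sites.

```c
#include <stdint.h>

/* Hypothetical statistics structure with two counters. */
struct tcpstat_sketch {
	uint64_t tcps_sndtotal;
	uint64_t tcps_rcvtotal;
};

static struct tcpstat_sketch tcpstat_sketch;

/* Callers use these instead of touching the structure directly. */
#define TCPSTAT_ADD_SKETCH(name, val)	(tcpstat_sketch.name += (val))
#define TCPSTAT_INC_SKETCH(name)	TCPSTAT_ADD_SKETCH(name, 1)

/* Example call site: account for one segment sent. */
static void
count_segment_sent_sketch(void)
{
	TCPSTAT_INC_SKETCH(tcps_sndtotal);
}
```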

MFC after: 3 days


190880 10-Apr-2009 kmacy

Import "flowid" support for serializing flows across transmit queues
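
One simple way a flow identifier can keep each flow on a single transmit queue is a modulo mapping, sketched below. The mapping and the function name are assumptions for illustration; the log does not say how the driver derives the queue index.

```c
#include <stdint.h>

/*
 * Hypothetical queue selection: packets carrying the same flowid always
 * land on the same queue, so a flow is never reordered across queues.
 * ntxq must be non-zero.
 */
static unsigned int
select_txq_sketch(uint32_t flowid, unsigned int ntxq)
{
	return (flowid % ntxq);
}
```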

Reviewed by: rwatson and jeli


190633 01-Apr-2009 piso

Implement an ipfw action to reassemble ip packets: reass.


190581 30-Mar-2009 mav

Integrate user/mav/ata branch:

Add ch_suspend/ch_resume methods for PCI controllers and implement them
for AHCI. Refactor AHCI channel initialization according to it.

Fix Port Multipliers operation. It is far from perfect yet, but works now.
Tested with JMicron JMB363 AHCI + SiI 3726 PMP pair.
Previous version was also tested with SiI 4726 PMP.

Hardware sponsored by: Vitsch Electronics / VEHosting.nl


190330 23-Mar-2009 gnn

Minor updates to the Chelsio driver, including removing an LOR.

Submitted by: Navdeep Parhar at Chelsio
Reviewed by: gnn
MFC after: 3 weeks


190206 21-Mar-2009 gnn

Fix a bug in the recent update to the Chelsio driver.
The tick routine was not being restarted in the init_locked routine
which could result in loss of carrier when updating the MTU.

Submitted by: Navdeep Parhar at Chelsio
MFC after: 3 weeks


189699 11-Mar-2009 dfr

Merge in support for Xen HVM on amd64 architecture.


189655 10-Mar-2009 rwatson

Prefer ENETDOWN to ENXIO when returning queuing errors due to a link
down, interface down, etc, with if_cxgb's if_transmit routine.

MFC after: 3 days
Reviewed by: kmacy


189643 10-Mar-2009 gnn

Update the Chelsio driver to the latest bits from Chelsio

Firmware upgraded to 7.1.0 (from 5.0.0).
T3C EEPROM and SRAM added; Code to update eeprom/sram fixed.
fl_empty and rx_fifo_ovfl counters can be observed via sysctl.
Two new cxgbtool commands to get uP logic analyzer info and uP IOQs
Synced up with Chelsio's "common code" (as of 03/03/09)

Submitted by: Navdeep Parhar at Chelsio
Reviewed by: gnn
MFC after: 2 weeks


189106 27-Feb-2009 bz

For all files including net/vnet.h directly include opt_route.h and
net/route.h.

Remove the hidden include of opt_route.h and net/route.h from net/vnet.h.

We need to make sure that both opt_route.h and net/route.h are included
before net/vnet.h because of the way MRT figures out the number of FIBs
from the kernel option. If we do not, we end up with the default number
of 1 when including net/vnet.h and array sizes are wrong.

This does not change the list of files which depend on opt_route.h
but we can identify them now more easily.


186282 18-Dec-2008 gnn

Check in the actual module recognition code for the Chelsio
driver.

Obtained from: Chelsio Inc.


186222 17-Dec-2008 bz

Use inc_flags instead of the inc_isipv6 alias which so far
had been the only flag with random usage patterns.
Switch inc_flags to be used as a real bit field by using
INC_ISIPV6 with bitops to check for the 'isipv6' condition.

While here fix a place or two where in case of v4 inc_flags
were not properly initialized before.[1]
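
Treating the field as a real bit field can be sketched as below. The struct, the second flag, and the numeric values are hypothetical; only the INC_ISIPV6 name and the use of bit operations come from the message.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define INC_ISIPV6_SKETCH	0x01	/* assumed bit value */
#define INC_OTHER_SKETCH	0x02	/* hypothetical unrelated flag */

/* Hypothetical connection info carrying a flags byte. */
struct conninfo_sketch {
	uint8_t inc_flags;
};

/* Test the bit rather than the whole field, so other flags can coexist. */
static bool
conn_is_ipv6_sketch(const struct conninfo_sketch *inc)
{
	return ((inc->inc_flags & INC_ISIPV6_SKETCH) != 0);
}

/* IPv4 setup: clear everything so stale bits cannot mark it as IPv6. */
static void
conn_init_v4_sketch(struct conninfo_sketch *inc)
{
	memset(inc, 0, sizeof(*inc));
}
```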

Found by: rwatson during review [1]
Discussed with: rwatson
Reviewed by: rwatson
MFC after: 4 weeks


186119 15-Dec-2008 qingli

The main goals of this project are:
1. separating L2 tables (ARP, NDP) from the L3 routing tables
2. removing as many locking dependencies among these layers as
possible to allow for some parallelism in the search operations
3. simplifying the logic in the routing code.

The most notable end result is the obsolescence of the route
cloning (RTF_CLONING) concept, which translated into code reduction
in both IPv4 ARP and IPv6 NDP related modules, and size reduction in
struct rtentry{}. The change in design obsoletes the semantics of
RTF_CLONING, RTF_WASCLONE and RTF_LLINFO routing flags. The userland
applications such as "arp" and "ndp" have been modified to reflect
those changes. The output from "netstat -r" shows only the routing
entries.

Quite a few developers have contributed to this project in the
past: Glebius Smirnoff, Luigi Rizzo, Alessandro Cerri, and
Andre Oppermann. And most recently:

- Kip Macy revised the locking code completely, thus completing
the last piece of the puzzle, Kip has also been conducting
active functional testing
- Sam Leffler has helped me improve/refactor the code, and
provided valuable reviews
- Julian Elischer set up the perforce tree for me and has helped
me maintain that branch before the svn conversion


185662 06-Dec-2008 gnn

Bug fix to support N310 version of Chelsio cards (board ID 1088).

Obtained from: Chelsio Inc.
MFC after: 3 days


185655 05-Dec-2008 gnn

Resubmit code to print the part and serial number for Chelsio cards.
The original code was accidentally removed in another commit.

MFC after: 1 day


185620 04-Dec-2008 gnn

Fix a bug with the ael1006 PHY. The bug shows up as persistent but incomplete
packet loss, of between 10-30%. The fix is to put the PHY into
and take it out of local loopback mode when resetting the interface.

Obtained from: Chelsio Inc.
MFC after: 3 days


185571 02-Dec-2008 bz

Rather than using hidden includes (with circular dependencies),
directly include only the header files needed. This reduces the
unneeded spamming of various headers into lots of files.

For now, this leaves us with very few modules including vnet.h
and thus needing to depend on opt_route.h.

Reviewed by: brooks, gnn, des, zec, imp
Sponsored by: The FreeBSD Foundation


185564 02-Dec-2008 gnn

Bug fix from Chelsio which addresses the issue of the device resetting
when it sees only received packets. In some cases where a device only
receives data it mistakenly thinks that its transmitting side is broken
and resets the device.

Obtained from: Chelsio Inc.
MFC after: 3 days


185549 02-Dec-2008 kmacy

- fix bug where dnsperf would stop transmitting after a few seconds
- break complex conditionals in to multiple lines to avoid wrapping
- remove copious unused debug statements
- be more aggressive about cleaning in the calling thread
- eliminate usage of ENOSPC
- increase number of iterations that cxgbsp can do
- eliminate "initerr" usage to simplify ENOBUFS handling
- when coalescing pass all packets to BPF
- always set overrun if hardware queue is full


185537 02-Dec-2008 kmacy

The pkthdr field is flowid not rss_hash


185536 02-Dec-2008 kmacy

- fix multiqueue conditional
- don't leak mbuf tags in the non-conditional case

Found by: Navdeep Parhar


185535 02-Dec-2008 kmacy

integrate use after free fixes from private branch

Found by: kkenn@


185509 01-Dec-2008 kmacy

null out m_next when marshalling a packet


185508 01-Dec-2008 kmacy

Update internal mac stats every time the tick task is called.
If we don't do this, "netstat -w 1" will frequently see negative
differences in packets sent


185507 01-Dec-2008 kmacy

don't manually track statistics


185506 01-Dec-2008 kmacy

Proper fix for tracking ifnet statistics


185199 23-Nov-2008 kmacy

Add backward compatibility ifdefs for non-multiq kernels


185194 23-Nov-2008 kmacy

work around periodic leak on queue overrun by enabling coalescing of packets in to
work requests by default


185191 23-Nov-2008 kmacy

intr_machdep.h breaks build on some arches and is not needed


185165 22-Nov-2008 kmacy

- enable multiple transmit queues
- invert sense of hw.cxgb.singleq tunable to hw.cxgb.multiq
- don't wake up transmitting thread by default
- add per tx queue ifaltq to handle ALTQ
- remove several unused functions in cxgb_multiq.c
- add several sysctls: multiq_tx_enable, coalesce_tx_enable,
and wakeup_tx_thread
- this obsoletes the hw.cxgb.snd_queue_len as ifq is replaced
by a buf_ring


185162 22-Nov-2008 kmacy

- bump __FreeBSD version to reflect added buf_ring, memory barriers,
and ifnet functions

- add memory barriers to <machine/atomic.h>
- update drivers to only conditionally define their own

- add lockless producer / consumer ring buffer
- remove ring buffer implementation from cxgb and update its callers

- add if_transmit(struct ifnet *ifp, struct mbuf *m) to ifnet to
allow drivers to efficiently manage multiple hardware queues
(i.e. not serialize all packets through one ifq)
- expose if_qflush to allow drivers to flush any driver managed queues

This work was supported by Bitgravity Inc. and Chelsio Inc.


185157 21-Nov-2008 gnn

Several small additions to the Chelsio 10G driver.

1) Fix a bug in dealing with the Aeluros 1006 PHY which prevented the
device from ever coming back up once it had been set to down.

2) Add a kernel tunable (hw.cxgb.snd_queue_len) which makes it possible
to give the device more than IFQ_MAXLEN entries in its send queue. The
default remains 50.

3) Add code to place the card's identification and serial number into
its description (%desc) so that users can tell which card they have
installed.


185088 19-Nov-2008 zec

Change the initialization methodology for global variables scheduled
for virtualization.

Instead of initializing the affected global variables at instantiation,
assign initial values to them in initializer functions. As a rule,
initialization at instantiation for such variables should never be
introduced again from now on. Furthermore, enclose all instantiations
of such global variables in #ifdef VIMAGE_GLOBALS blocks.

Essentially, this change should have zero functional impact. In the next
phase of merging network stack virtualization infrastructure from
p4/vimage branch, the new initialization methodology will allow us to
switch between using global variables and their counterparts residing in
virtualization containers with minimum code churn, and in the long run
allow us to initialize multiple instances of such container structures.

Discussed at: devsummit Strassburg
Reviewed by: bz, julian
Approved by: julian (mentor)
Obtained from: //depot/projects/vimage-commit2/...
X-MFC after: never
Sponsored by: NLnet Foundation, The FreeBSD Foundation


184861 12-Nov-2008 kmacy

Update firmware version check
make ddp a tunable

Obtained from: Chelsio Inc.
MFC after: 3 days


184715 06-Nov-2008 bz

For now our LRO code (tcp_lro.c) only supports IPv4 properly thus
only enable if INET is on.

Reviewed by: kmacy
MFC after: 2 months


184714 06-Nov-2008 bz

Hide AF_INET specific ioctl handling under #ifdef INET.

Reviewed by: kmacy
MFC after: 2 months


183967 17-Oct-2008 kmacy

Track number of packets transmitted and number of packets received

PR: 125806
MFC after: 3 days


183559 03-Oct-2008 kmacy

Fix bug in LRO on T304 whereby a packet could be sent to the wrong interface's ifp.

Submitted by: Chelsio Inc.
MFC after: 1 day


183550 02-Oct-2008 zec

Step 1.5 of importing the network stack virtualization infrastructure
from the vimage project, as per plan established at devsummit 08/08:
http://wiki.freebsd.org/Image/Notes200808DevSummit

Introduce INIT_VNET_*() initializer macros, VNET_FOREACH() iterator
macros, and CURVNET_SET() context setting macros, all currently
resolving to NOPs.

Prepare for virtualization of selected SYSCTL objects by introducing a
family of SYSCTL_V_*() macros, currently resolving to their global
counterparts, i.e. SYSCTL_V_INT() == SYSCTL_INT().

Move selected #defines from sys/sys/vimage.h to newly introduced header
files specific to virtualized subsystems (sys/net/vnet.h,
sys/netinet/vinet.h etc.).

All the changes are verified to have zero functional impact at this
point in time by doing MD5 comparison between pre- and post-change
object files(*).

(*) netipsec/keysock.c did not validate depending on compile time options.

Implemented by: julian, bz, brooks, zec
Reviewed by: julian, bz, brooks, kris, rwatson, ...
Approved by: julian (mentor)
Obtained from: //depot/projects/vimage-commit2/...
X-MFC after: never
Sponsored by: NLnet Foundation, The FreeBSD Foundation


183508 30-Sep-2008 kmacy

update callers of vm_fault_hold_user_pages

MFC after: 1 week


183507 30-Sep-2008 kmacy

Refactor vm_fault_hold_user_pages:
- simplify page hold logic
- allow pages for processes other than that of curthread to
have pages held
- normalize the interface to more closely resemble the functions in
sys/vm

MFC after: 1 week


183506 30-Sep-2008 kmacy

Make sure that optical PHYs work ...

Submitted by: Chelsio Inc.
MFC after: 1 day


183478 29-Sep-2008 kmacy

vm_fault_hold_user_pages will not return if an address in the range passed in is mapped RO
but an RW mapping exists for the underlying page. This change fixes the bug by using the
page / NULL returned from pmap_extract_and_hold to determine whether or not vm_fault needs
to be called.

The bug was pointed out by alc.

MFC after: 3 days


183339 25-Sep-2008 kmacy

fix insta-panic:
- determine which ext_arg offsets to use based on the version number

Submitted by: Chelsio Inc.
MFC after: 1 day


183321 24-Sep-2008 kmacy

- Remove default NIC dependency on ulp headers
- make toe module build dependent on kernel support

Submitted by: Chelsio Inc.
MFC after: 1 week


183292 23-Sep-2008 kmacy

Update cxgb include paths to not require prefixing with dev/cxgb

Submitted by: Chelsio Inc.


183289 23-Sep-2008 kmacy

Allow cxgb to be unified across versions by making newer features conditional

Submitted by: Chelsio Inc
MFC after: 3 days


183286 23-Sep-2008 kmacy

- Fix flag check
- Fix adaptive thread sleep
- set oactive when queue is full


183285 23-Sep-2008 kmacy

- Track number of times that the transmit queue overflowed
- Trivial whitespace cleanup

MFC after: 3 days


183199 19-Sep-2008 kmacy

Fix issue with tom loading by moving cxgb_log_tcb in to tom

MFC after: 3 days


183163 18-Sep-2008 kmacy

Fix two panics:

1. panic: rtalloc1_fib: bad fibnum

2. panic: Lock tcpinp not exclusively locked
@ /usr/src/sys/netinet/in_pcb.c:1284

Submitted by: Chelsio Inc.
MFC after: 3 days


183113 17-Sep-2008 attilio

Remove the suser(9) interface from the kernel. It has been replaced for
years by the priv_check(9) interface and only a few places are left.
Note that compatibility stubs for older FreeBSD versions
(all above the 8 limit though) are left in order to reduce diffs against
old versions. It is the responsibility of the maintainers of any module, if
they think it is the case, to axe out such cases.

This patch breaks KPI so __FreeBSD_version will be bumped in a later
commit.

This patch needs to be credited 50-50 with rwatson@ as he found time to
explain to me how priv_check() works in detail and to review patches.

Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
Reviewed by: rwatson


183063 16-Sep-2008 kmacy

Further whitespace and copyright cleanups to minimize the
delta with RELENG_7.


183062 16-Sep-2008 kmacy

White space cleanups to bring closer to RELENG_7


183059 16-Sep-2008 kmacy

Remove some dead code along with gratuitous differences between HEAD and 7


182882 09-Sep-2008 kmacy

Fix issue with recovering from transient jumbo mbuf shortage.

Submitted by: Chelsio Inc.
MFC after: 3 days


182741 03-Sep-2008 julian

New file missed vimagification.


182695 02-Sep-2008 kmacy

Indicate at probe time if device can do offload and which revision it is

MFC after: 3 days


182679 02-Sep-2008 kmacy

Import ioctl updates for latest rev of cxgbtool

Obtained from: Chelsio Inc.
MFC after: 3 days


182591 01-Sep-2008 kmacy

Don't check if an interface can do tcp offload if there are no offload devices registered on the system.

Suggested by: rwatson
MFC after: 3 days


181803 17-Aug-2008 bz

Commit step 1 of the vimage project, (network stack)
virtualization work done by Marko Zec (zec@).

This is the first in a series of commits over the course
of the next few weeks.

Mark all uses of global variables to be virtualized
with a V_ prefix.
Use macros to map them back to their global names for
now, so this is a NOP change only.

We hope to have caught at least 85-90% of what is needed
so we do not invalidate a lot of outstanding patches again.

Obtained from: //depot/projects/vimage-commit2/...
Reviewed by: brooks, des, ed, mav, julian,
jamie, kris, rwatson, zec, ...
(various people I forgot, different versions)
md5 (with a bit of help)
Sponsored by: NLnet Foundation, The FreeBSD Foundation
X-MFC after: never
V_Commit_Message_Reviewed_By: more people than the patch


181653 13-Aug-2008 kmacy

Fix runt TSO packet issue.

Obtained from: Chelsio Inc.
MFC after: 1 week


181652 13-Aug-2008 kmacy

Add LRO and MAC statistics to exported sysctls.

Obtained from: Chelsio Inc.
MFC after: 1 week


181616 12-Aug-2008 kmacy

Remove cxgb private lro implementation and switch to using system implementation.

Obtained from: Chelsio Inc.
MFC after: 1 week


181614 11-Aug-2008 kmacy

Vendor fix for PHY problem.

Obtained from: Chelsio Inc.
MFC after: 3 days


181067 31-Jul-2008 kmacy

remove socketvar.h, add more selective includes


181039 31-Jul-2008 ps

Unbreak the build by including sys/socketvar.h


181011 30-Jul-2008 kmacy

fix includes for post sockbuf re-factor


180675 21-Jul-2008 kmacy

remove call to unsafe tcp_twstart function


180651 21-Jul-2008 kmacy

remove unneeded declarations


180650 21-Jul-2008 kmacy

remove local version of tcp_offload_* functions


180649 21-Jul-2008 kmacy

update syncache function names


180647 21-Jul-2008 kmacy

remove cxgb local definition of locked syncache_expand


180644 21-Jul-2008 kmacy

remove cxgb local definitions of socket accessor functions


180586 18-Jul-2008 kmacy

new vendor PHY support


180583 18-Jul-2008 kmacy

import vendor fixes to cxgb


178800 05-May-2008 kmacy

conditionally define PANIC_IF, remove 'unlikely'


178796 05-May-2008 kmacy

LINT fixes


178786 05-May-2008 kmacy

import support for iwarp on Chelsio T3 card

Supported by Chelsio Inc.


178767 05-May-2008 kmacy

MFSVN:
- add / remove clients from cxgb_main.c now
- change ifdef TOE_ENABLED to TCP_OFFLOAD_DISABLE
- update copyrights
- fix transmit data mismatch bug caused by not setting SB_NOCOALESCE
on tx sockbuf on passive connections
- fix receive sequence mismatch bug caused by not setting SB_NOCOALESCE
on rx sockbuf on passive connections
- don't sleep without checking SBS_CANTRCVMORE first
- various ddp ordering fixes

Supported by: Chelsio Inc.


178304 19-Apr-2008 kmacy

remove kdb_backtrace() call


178302 19-Apr-2008 kmacy

move cxgb_lt2.[ch] from NIC to TOE
move most offload functionality from NIC to TOE
factor out all socket and inpcb direct access
factor out access to locking in inpcb, pcbinfo, and sockbuf


178285 17-Apr-2008 rwatson

Convert pcbinfo and inpcb mutexes to rwlocks, and modify macros to
explicitly select write locking for all use of the inpcb mutex.
Update some pcbinfo lock assertions to assert locked rather than
write-locked, although in practice almost all uses of the pcbinfo
rwlock remain exclusive, and all instances of inpcb lock acquisition
are exclusive.

This change should introduce (ideally) little functional change.
However, it lays the groundwork for significantly increased
parallelism in the TCP/IP code.

MFC after: 3 months
Tested by: kris (superset of committed patch)


177807 31-Mar-2008 kmacy

reduce the size of the jumbo ring on i386 and disable pcpu cluster caching


177575 24-Mar-2008 kmacy

change inp_wlock_assert to inp_lock_assert


177540 24-Mar-2008 kmacy

remove unnecessary tcbinfo lock acquisitions - set tp to null after calling enter_timewait as we no longer own the inpcb


177530 23-Mar-2008 kmacy

Insulate inpcb consumers outside the stack from the lock type and offset within the pcb by adding accessor functions.

Reviewed by: rwatson
MFC after: 3 weeks


177464 20-Mar-2008 kmacy

pay attention to default cluster limits when sizing receive queues


177415 19-Mar-2008 kmacy

fix link management bug and conditionally allow the PHY to be kept on at all times for allowing non-conformant link state checks


177340 18-Mar-2008 kmacy

- Integrate 1.133 vendor driver changes
- update some copyrights
- add improved support for delayed ack
- fix issue with fec


176615 26-Feb-2008 kmacy

Parameterize for module name


176614 26-Feb-2008 kmacy

Remove unused files


176613 26-Feb-2008 kmacy

move remaining binaries into blob headers


176572 26-Feb-2008 kmacy

Move firmware into a separate module that can be compiled statically into the kernel
Add utility for converting future firmware revs to a C header file


176563 25-Feb-2008 keramida

Spell 'overwriting' correctly in a KASSERT() message.


176507 24-Feb-2008 kmacy

Fix namespace collision with sparc macro


176494 23-Feb-2008 kmacy

remove call to kdb_backtrace()


176475 23-Feb-2008 kmacy

Fix tinderbox by removing call to kdb_backtrace

MFC after: 3 days


176472 23-Feb-2008 kmacy

- update firmware to 5.0
- add support for T3C
- add DDP support (zero-copy receive)
- fix TOE transmit of large requests
- fix shutdown so that sockets don't remain in CLOSING state indefinitely
- register listeners when an interface is brought up after tom is loaded
- fix setting of multicast filter
- enable link at device attach
- exit tick handler if shutdown is in progress
- add helper for logging TCB
- add sysctls for dumping transmit queues

- note that TOE will not be MFC'd until after 7.0 has been finalized

MFC after: 3 days


175872 01-Feb-2008 phk

Give MEXTADD() another argument to make both void pointers to the
free function controllable, instead of passing the KVA of the buffer
storage as the first argument.

Fix all conventional users of the API to pass the KVA of the buffer
as the first argument, to make this a no-op commit.

Likely break the only non-conventional user of the API, after informing
the relevant committer.

Update the mbuf(9) manual page, which was already out of sync on
this point.

Bump __FreeBSD_version to 800016 as there is no way to tell how
many arguments a CPP macro needs any other way.

This paves the way for giving sendfile(9) a way to wait for the
passed storage to have been accessed before returning.

This does not affect the memory layout or size of mbufs.

Parental oversight by: sam and rwatson.

No MFC is anticipated.


175712 27-Jan-2008 kmacy

Fix loading for case where we don't overload tcp_usrreqs by calling tcp_drop directly


175711 27-Jan-2008 kmacy

fix DISABLE_MBUF_IOVEC case by initializing mbuf header completely


175504 19-Jan-2008 kmacy

Re-enable pcpu caching by default; make sysctl R/W


175415 17-Jan-2008 kmacy

- remove bogus_imm counter
- disable pcpu cluster cache by default until reference counting is handled
correctly for held clusters - can be re-enabled by sysctl


175414 17-Jan-2008 sam

promote ath_defrag to m_collapse (and retire private+unused
m_collapse from cxgb)

Reviewed by: pyun, jhb, kmacy
MFC after: 2 weeks


175389 16-Jan-2008 kmacy

Fix lock ordering panic by not calling ether_ioctl with port lock held

Reported by: rrs


175378 16-Jan-2008 kmacy

remove superfluous debug printfs


175375 16-Jan-2008 kmacy

Fix mbuf leak caused by freeing packet zone clusters but not their associated mbufs

- Track packet zone mbufs separately from other mbufs
- free packet zone buffers via m_free rather than trying to manage the refcount
as with clusters - its refcount and management seems to be "special"


175374 16-Jan-2008 kmacy

put tx queue size back to 1024


175369 15-Jan-2008 jhb

Use '%zd' to print PIO_LEN since it involves a size_t (via sizeof()) to
appease the tinderbox on 32-bit platforms.

Tested on: amd64, i386


175347 15-Jan-2008 kmacy

- Simplify mb_free_ext_fast
- increase asserts for mbuf accounting
- track outstanding mbufs (maps very closely to leaked)
- actually only create one thread per port if !multiq
Oddly enough this fixes the use after free

- move txq_segs to stack in t3_encap
- add checks that pidx doesn't move past cidx
- simplify mbuf free logic in collapse mbufs routine


175340 15-Jan-2008 kmacy

- move WR_LEN into cxgb_adapter.h; add PIO_LEN to make intent clearer
- move cxgb_tx_common into cxgb_multiq.c and rename to cxgb_tx
- move cxgb_tx_common dependencies
- further simplify cxgb_dequeue_packet for the non-multiqueue case
- only launch one service thread per port in the non-multiq case
- remove dead cleaning code from cxgb_sge.c
- simplify PIO case substantially by returning directly from mbuf collapse
and just using m_copydata
- remove gratuitous m_gethdr in the rx path
- clarify freeing of mbufs in collapse


175339 15-Jan-2008 kmacy

remove superfluous locking from dequeue


175316 14-Jan-2008 kmacy

- Assert that immpkt is not set
- convert %lx to 32-bit safe %jx


175313 14-Jan-2008 kmacy

- Add more extensive sanity checks
- remove initial dequeue from cxgb_start as it was causing an mbuf to be referenced twice


175312 14-Jan-2008 kmacy

Make back pressure visible more quickly, particularly now that we maintain a queue internally


175311 14-Jan-2008 kmacy

Add extensive sanity checking to buf_ring


175305 13-Jan-2008 kmacy

Convert over to using the multiqueue infrastructure although all calls going
through cxgb_start still end up using queue 0


175304 13-Jan-2008 kmacy

Add buf_ring_full utility function, make sure dequeue/enqueue see the latest
indexes


175303 13-Jan-2008 kmacy

remove unused code


175302 13-Jan-2008 kmacy

style nit


175249 12-Jan-2008 kmacy

MFp4 multiple queue support


175224 11-Jan-2008 kmacy

Be more aggressive about tx cleaning - when multiple streams were running the tx
queue could fill up and stop getting cleaned.


175223 10-Jan-2008 kmacy

If we're not running with multiqueue enabled we need to wait to acquire the
rspq lock. Not doing so was causing us to skip re-enabling the interrupt.

- remove duplicate credits sysctl
- add support for dumping hardware context of the txq
- decrement budget_left when we break out of the process_responses loop


175209 10-Jan-2008 kmacy

Add support for selectively dumping the state of the hardware response queue.
Change ordering of a couple of types.


175208 10-Jan-2008 kmacy

should always free when refcount is 1


175200 10-Jan-2008 kmacy

- make 9k clusters the default unless a tunable is set
- return the error from cxgb_tx_common so that when an error is hit we don't
spin forever in the taskq thread
- remove unused rxsd_ref
- simplify header_offset calculation for embedded mbuf headers
- fix memory leak by making sure that mbuf header initialization took place
- disable printf's for stalled queue, don't do offload/ctrl queue restart
when tunnel queue is restarted
- add more diagnostic information about the txq state
- add facility to dump the actual contents of the hardware queue using sysctl


175174 09-Jan-2008 kmacy

make nqsets a uint32_t so that sysctl will work
add 2 fields for allowing queue dumping


175172 09-Jan-2008 kmacy

don't decrement ref count below 1 for EXT_PACKET


175171 09-Jan-2008 kmacy

EXT_PACKET is one of the valid mbuf types


175121 07-Jan-2008 kmacy

Fix mvec code to handle the case of the packet zone
this was missed in the initial import


175025 31-Dec-2007 julian

Don't duplicate the whole of arpresolve to arpresolve 2 for the sake
of two compares against 0. The negative effect of cache flushing
is probably more than the gain by not doing the two compares (the
value is almost certainly in register or at worst, cache).
Note that the uses of m_freem() are in error cases and m_freem()
handles NULL anyhow. So fast-path really isn't changed much at all.


174758 18-Dec-2007 kmacy

Don't overload tcp_usrreqs unless the kernel doesn't provide offload support.


174726 17-Dec-2007 kmacy

only include intr_machdep.h when it is needed for intr_bind
ia64 doesn't have an intr_machdep.h


174712 17-Dec-2007 kmacy

disable update in place on transmit


174708 17-Dec-2007 kmacy

Make TCP offload work on HEAD (modulo negative interaction between sbcompress
and t3_push_frames).
- Import latest changes to cxgb_main.c and cxgb_sge.c from toestack p4 branch
- make driver local copy of tcp_subr.c and tcp_usrreq.c and override tcp_usrreqs so
TOE can also function on versions with unmodified TCP

- add cxgb back to the build


174686 16-Dec-2007 kmacy

Include cdefs.h and param.h for architectures with less header pollution


174672 16-Dec-2007 kmacy

Use the vm include convention of busdma


174671 16-Dec-2007 kmacy

need M_IOVEC define


174670 16-Dec-2007 kmacy

Don't globally include mvec.h; it's only needed by cxgb_sge.c


174652 16-Dec-2007 kmacy

Don't use old-style mbuf iovecs


174641 16-Dec-2007 kmacy

Add driver for TCP offload

Sponsored by: Chelsio Inc.


174640 16-Dec-2007 kmacy

Update the buffer management support code needed by the tcp offload module


174639 16-Dec-2007 kmacy

Sanitize of a routine that is going away


174638 16-Dec-2007 kmacy

overload mbuf fields for use by toe


174637 16-Dec-2007 kmacy

Add system includes for mvec.h


174626 15-Dec-2007 kmacy

Import updated support code for the TOM (tcp offload module).


172147 11-Sep-2007 kmacy

Evidently setup_rss needs to happen whenever bind_qsets is done. This fixes
a problem with jumbo frames when not using msi-x interrupts.

Supported by: Chelsio
Approved by: re (blanket)


172109 10-Sep-2007 kmacy

pull in changes made to RELENG_6 version in the process of doing the MFC

Supported by: Chelsio
Approved by: re (blanket)


172105 09-Sep-2007 kmacy

- Remove filter support

Supported by: Chelsio
Approved by: re(blanket)


172101 09-Sep-2007 kmacy

Add back in support for normal mbuf chaining on RX under DISABLE_MBUF_IOVEC

Approved by: re(blanket)
Supported by: Chelsio


172100 09-Sep-2007 kmacy

Fix last-minute typo in last commit caused by pre-commit scripts

Approved by: re(blanket)


172096 09-Sep-2007 kmacy

- fix qset to port binding as a proper fix for the problems encountered on the 4-port
- fix the use after free seen when sending packets small enough to fit as an immediate
and bpf peers are present
- update to firmware rev 4.7 along with various small vendor fixes

Supported by: Chelsio
Approved by: re (blanket)
MFC after: 3 days


171978 25-Aug-2007 kmacy

Fixes for 4 port and small packet optimization

- remove cpl->iff panic - we can't know the port number from the rspq on the 4-port
- pick the ifnet based on the interface in the CPL header
- switch to using qset 0 for egress on the 4-port for now - may change
when we start using RSS
- move ether_ifdetach to before the port lock gets deinitialized to avoid
hang in the case where there are BPF peers (cxgb_ioctl is called indirectly
when BPF peers are present)
- don't call t3_mac_reset if multiport is set, this was causing tx errors
by misconfiguring the MAC on the 4-port
- change V_TXPKT_INTF to use txpkt_intf as the interfaces are not contiguous
- free the mbuf immediately in the case where the payload is small enough to be copied
into the rspq
- only update the coalesce timer for a queue if packets were taken off of it
- add in missed 20ms DELAY in vsc8211 initialization

- prompt MFC as this only applies to the 4-port which is currently completely
broken - OK'd by kensmith

Supported by: Chelsio
Approved by: re (blanket)
MFC after: 0 days


171868 17-Aug-2007 kmacy

forward port signedness fixes from RELENG_6
fix compile error for case where MSI_SUPPORTED not defined

Approved by: re (blanket)


171804 10-Aug-2007 kmacy

White space cleanups

Approved by: re (blanket)


171803 10-Aug-2007 kmacy

- In all structures other than port info, port is a pointer to a port info;
make the code less confusing by renaming the port number to port_id

Approved by: re (blanket)


171471 17-Jul-2007 kmacy

- integrate most recent changes from vendor branch and upgrade to firmware revision 4.5.5
- add filter support
- further improvements for T304
- recover gracefully from spurious immediate packets

Approved by: re(blanket)
Supported by: Chelsio
MFC after: 3 days


171469 17-Jul-2007 kmacy

- Increase descriptors per call to start
- enqueue per-txq task
- fix per-txq task initialization

Approved by: re (blanket)


171335 10-Jul-2007 kmacy

MFp4 122896
- reduce cpu usage by as much as 25% (40% -> 30%) by doing txq reclaim more efficiently
- use mtx_trylock when trying to grab the lock to avoid spinning during long encap loop
- add per-txq reclaim task
- if mbufs were successfully re-claimed try another pass
- track txq overruns with sysctl

Approved by: re (blanket)


170869 17-Jun-2007 kmacy

- switch adapter and port lock over to using sx so that resources
can be allocated atomically
- add debug macros for printing lock initialization / teardown
- add buffers to port_info and adapter to allow each lock to have a
unique name
- destroy mutexes initialized by cxgb_offload_init
- remove recursive calls to ADAPTER_LOCK
- move callout_drain calls so that they don't occur with the lock held
- ensure that only as many qsets as are needed are initialized and
destroyed

MFC after: 3 days
Sponsored by: Chelsio Inc.


170789 15-Jun-2007 kmacy

Fix build warnings
Submitted by: mjacob@


170654 13-Jun-2007 kmacy

- import new common code for the T304
- update to firmware version 4.1.0

- switch over to standard method for initializing cdevs (contributed by scottl@)
- break out timer_reclaim_task to be per-port
- move msix teardown into separate function
- fix bus_setup_intr for msi-x for the multi-port case so that msi-x resources
are not corrupted on unload
- handle 10/100/1000 base-T media and auto negotiation
- bind qset to cpu even for singleq case
- white space cleanups
- remove recursive PORT_LOCK
- move mtu setting to separate function
- stop and re-init port when changing mtu
- replace all direct references to m_data with calls to mtod
- handle attach failure better by not trying to de-initialize
taskqueues when they have not been allocated
- no longer default to jumbo frames

Sponsored by: Chelsio
MFC after: 3 days


170197 02-Jun-2007 kmacy

remove pointless recursive acquisition of port lock in cxgb_init_locked


170083 29-May-2007 kmacy

Fix case of setting OACTIVE erroneously


170081 29-May-2007 kmacy

Fix interrupt setup for the non-MSI-X case


170076 28-May-2007 kmacy

When building cxgb as a module make include paths relative to the driver's root.
This will make it possible to build the module out of tree against an older src tree.

MFC after: 3 days


170038 27-May-2007 kmacy

Tuning for small packet handling
- Double the number of descriptors that a single call to send can use
- Quadruple the number of descriptors that can be reclaimed per pass
- only run reclaim twice per second
- increase coalesce timer from 3.5us to 5us

fix printf warning on 64-bit platforms


170037 27-May-2007 kmacy

Don't bind queue to cpus if only one queue is in use


170008 27-May-2007 kmacy

fix compile warning by removing redundant LOG_ERR define


170007 27-May-2007 kmacy

set IFF_OACTIVE to avoid hangs when the tx ring fills up


169994 25-May-2007 kmacy

add missed header


169990 25-May-2007 kmacy

update license headers


169988 25-May-2007 kmacy

add toe device header missed by previous commit


169978 25-May-2007 kmacy

(MFp4)
- upgrade to reflect state of 1.0.0.86
- move from firmware rev 3.2 to 4.0.0
- import driver bits for offload functionality
- remove binary distribution clause from top level files as it
runs counter to the intent of purely supporting the hardware

MFC after: 3 days


169053 26-Apr-2007 kmacy

Default to using a single queue as this is currently the only way to achieve
line rate


169052 26-Apr-2007 kmacy

Disable mbuf chain collapsing - it is currently causing an mbuf leak


168890 20-Apr-2007 kmacy

Free cluster if we fail to create the dmamap.

Fixes CID 1829
Found by: Coverity


168888 20-Apr-2007 kmacy

Eliminate CID 1842 by comparing against (type != EXT_MBUF) => refcnt != NULL


168886 20-Apr-2007 kmacy

Fix memory leak in m_collapse (CID 1843)

Found by: Coverity
Submitted by: jhb


168770 15-Apr-2007 kmacy

PHYS_TO_VM_PAGE requires explicit vm_page.h include on sparc64


168767 15-Apr-2007 mjacob

Use %j and args cast to uintmax_t to print bus_addr_t && length args.


168760 15-Apr-2007 kmacy

Add pmap includes needed by i386


168750 15-Apr-2007 kmacy

suck in more of busdma to enable more efficient mappings
kill redundant INVARIANTS check


168749 15-Apr-2007 kmacy

Add sysctl for disabling/enabling mbuf chain collapsing
remove map creation before calling bus_dmamap_load_mvec_sg


168748 15-Apr-2007 kmacy

Implement ZERO_COPY_SOCKETS check in a way that doesn't make LINT unhappy


168737 14-Apr-2007 kmacy

Add support for mbuf iovec in the TX path


168736 14-Apr-2007 kmacy

add reference count pointer to mbuf iovec
implement robust version of m_collapse
add support for sf_buf
add fix for m_iovappend
add calls to m_sanity under INVARIANTS
fix m_freem_vec to correctly traverse the mbuf iovec chain


168650 12-Apr-2007 kmacy

restore sense to get_imm_packet

MFC after: 3 days


168646 12-Apr-2007 kmacy

switch over to per-txq dma tag to facilitate parallelism on TX

MFC after: 3 days


168644 12-Apr-2007 kmacy

explicitly check TSO flag
don't clear and then set M_PKTHDR, m_gethdr sets it correctly
improve error handling on m_gethdr failure

MFC after: 3 days


168642 12-Apr-2007 kmacy

Add ETHER_HDR_LEN to hardware accepted mtu

MFC after: 3 days


168620 11-Apr-2007 jhb

Fix m_freem_vec() to actually traverse the mbuf chain. This avoids
double frees and an infinite loop.

CID: 1834
Found by: Coverity Prevent (tm)


168540 09-Apr-2007 kmacy

throw sun4v into the check while we're at it


168539 09-Apr-2007 kmacy

busdma tags are opaque on all architectures except sparc64
for now simply don't compile/use on sparc64


168505 08-Apr-2007 kmacy

Add missing paren


168499 08-Apr-2007 kmacy

remove stale variable reference


168496 08-Apr-2007 kmacy

add busdma function for mapping mbuf iovecs
change m_collapse to return an error code


168491 08-Apr-2007 kmacy

Convert driver RX path over to using mbuf iovec


168490 08-Apr-2007 kmacy

Add driver private mbuf iovec support routines


168351 04-Apr-2007 kmacy

Make DMA tags per-queue to facilitate parallel mappings
Defer mbuf allocation and initialization until after data has already been
received in a cluster

This reduces cpu utilization somewhat, but it only improves the rx path.
Recent changes to TCP appear to make us rate limited by the TX path.

This is the first step in reducing mbuf management overhead for manipulating
clusters.

MFC after: 3 days


167862 24-Mar-2007 kmacy

bus_size_t is a bad cross-architectural type with respect to printf, use uint32_t instead


167848 23-Mar-2007 kmacy

- Increase coalesce_nsecs
- commit fixes for the following coverity warnings: 1765, 1760, 1758, 1756


167847 23-Mar-2007 kmacy

commit missed change


167840 23-Mar-2007 kmacy

Check PCI-e link width to avoid foot shooting with 4x links

MFC after: 3 days


167769 21-Mar-2007 kmacy

move call to t3_prep_adapter earlier in attach before msi-x setup occurs

this works around the fact that pci_config_{save,restore} doesn't adequately
restore state for msi-x

MFC after: 3 days


167762 21-Mar-2007 kmacy

allocate 9 messages in all cases


167760 21-Mar-2007 kmacy

make MSI-X the default and allocate up to mp_ncpus queues per port

MFC after: 3 days


167746 20-Mar-2007 kmacy

Synchronize with version 1.0.071 of Chelsio's common code
(with the notable exception of improvements for using multiple TX queues)

This adds support for the T3B2 ASIC rev

Obtained from: Chelsio
MFC after: 3 days


167734 20-Mar-2007 kmacy

cxgb_stop is only called from cxgb_ioctl so:
- don't acquire port lock, already held in ioctl
- rename to cxgb_stop_locked
- switch callout_drain to callout_stop to avoid a hang from having the port lock held


167655 17-Mar-2007 kmacy

move inline function above use so that -O works


167561 14-Mar-2007 kmacy

#define L1_CACHE_BYTES for non-x86


167560 14-Mar-2007 kmacy

define prefetch as a no-op macro for non-x86 arches


167538 14-Mar-2007 kmacy

play it safe for now and go back to kicking off tx cleaning from the tx path


167528 14-Mar-2007 kmacy

#define memory barrier macros for the non-i386 && non-amd64 case


167527 14-Mar-2007 kmacy

remove unused code for recycling descriptors
kick tx cleaner from credit update function


167526 14-Mar-2007 kmacy

add cxgb_config.h to define values that are defined in the Makefile when compiled as a
module

move prefetch out of cxgb_sge.c into header under arch conditional compilation


167525 14-Mar-2007 kmacy

move taskqueue_enqueue of tx clean operation out of the start path


167524 14-Mar-2007 kmacy

make desc_reclaimable macro safe to arbitrary arguments


167515 14-Mar-2007 kmacy

Add firmware for cxgb


167514 14-Mar-2007 kmacy

First of several commits for driver support for the Chelsio T3B 10 Gigabit
Ethernet adapter.

Reviewed by: scottl, sam

For those interested in the preliminary performance work see below.

Plots of mxge vs. cxgb running netpipe:

blocksize vs. bandwidth:
http://www.fsmware.com/chelsio.random/bsvsbw.gif

blocksize vs. RTT:
http://www.fsmware.com/chelsio.random/bsvstime.gif

blocksize vs. RTT for block sizes <= 10kb:
http://www.fsmware.com/chelsio.random/bsvstime_10kb.gif
http://www.fsmware.com/chelsio.random/bsvstime_10kb3.gif