History log of /freebsd-11-stable/sys/dev/cxgbe/t4_netmap.c
Revision Date Author Comments
(<<< Hide modified files)
(Show modified files >>>)
# 355250 30-Nov-2019 np

MFC r354106:

cxgbe(4): Use correct FetchBurstMin values for T6.

Sponsored by: Chelsio Communications


# 346882 29-Apr-2019 np

MFC r338156, r338158-r338161, r338166.

r338156:
cxgbe(4): Avoid overflow while calculating channel rate.

Reported by: Coverity (CID 1008352)

r338158:
cxgbe(4): Check the RO bit properly before disabling relaxed ordering.

Reported by: Coverity (CID 1384286)

r338159:
cxgbe(4): Make it clear that VI_INIT_DONE implies vi->ntxq > 0, and so
rc will never be returned uninitialized.

Reported by: Coverity (CID 1394884). This is a false positive though.

r338160:
cxgbe(4): Do not leak memory in case of errors during VI initialization.

Reported by: Coverity (CID 1392026)

r338161:
cxgbe/tom: Make sure 'matched' is always initialized before use.

Reported by: Coverity (CID 1390894)

r338166:
cxgbe(4): Be explicit about ignoring the return value of cmpset in some
cases.

Reported by: Coverity (CIDs 1009398, 1009400, 1009401, 1357325, 1394783). All false positives.


# 346875 29-Apr-2019 np

MFC r337609:

cxgbe(4): Create two variants of service_iq, one for queues with
freelists and one for those without.

MFH: 3 weeks
Sponsored by: Chelsio Communications


# 344858 06-Mar-2019 jhb

MFC 341098: Add read-only sysctls for all tunables in the cxgbe(4) driver.


# 341481 04-Dec-2018 vmaffione

MFC r341145

cxgbe: revert r309725

After the fix contained in r341144, cxgbe does not need anymore
to set the IFCAP_NETMAP flag manually.

Reviewed by: np
Approved by: gnn (mentor)
Differential Revision: https://reviews.freebsd.org/D17987


# 341477 04-Dec-2018 vmaffione

MFC r339639

netmap: align codebase to the current upstream (sha 8374e1a7e6941)

Changelist:
- Move large parts of VALE code to a new file and header netmap_bdg.[ch].
This is useful to reuse the code within upcoming projects.
- Improvements and bug fixes to pipes and monitors.
- Introduce nm_os_onattach(), nm_os_onenter() and nm_os_onexit() to
handle differences between FreeBSD and Linux.
- Introduce some new helper functions to handle more host rings and fake
rings (netmap_all_rings(), netmap_real_rings(), ...)
- Added new sysctl to enable/disable hw checksum in emulated netmap mode.
- nm_inject: add support for NS_MOREFRAG

Approved by: gnn (mentor)
Differential Revision: https://reviews.freebsd.org/D17364


# 330307 03-Mar-2018 np

MFC r319506, r319872, r321063, r321103, r321179, r321390, r321435,
r321582, r321671, r322014, r322034, r322055, r322123, r322167, r322425,
r322549, r322914, r322960, r322962, r322964, r322985, r322990, r323006,
r323026, r323041, r323069, r323078, r323343, r323514, r323520, r324296,
r324379, r324386, r324443, r324945, r325596, r325680, r325880,
r325883-r325884, r325961, r326026, r326042, r327062, r327093, r327332,
r327528, r328420, and r328423.

r319506:
cxgbe(4): Update the statistics for compound tx work requests once per
work request, not once per frame.

r319872:
cxgbe(4): Do not request an FEC setting that the port does not support.

r321063:
cxgbe(4): Various link/media related improvements.

- Deal with changes to port_type, and not just port_mod when a
transceiver is changed. This fixes hot swapping of transceivers of
different types (QSFP+ or QSA or QSFP28 in a QSFP28 port, SFP+ or
SFP28 in a SFP28 port, etc.).

- Always refresh media information for ifconfig if the port is down.
The firmware does not generate tranceiver-change interrupts unless at
least one VI is enabled on the physical port. Before this change
ifconfig diplayed potentially stale information for ports that were
administratively down.

- Always recalculate and reapply L1 config on a transceiver change.

- Display PAUSE settings in ifconfig. The driver sysctls for this
continue to work as well.

r321103:
cxgbe(4): New ioctls to flash bootrom and boot config to the card.

r321179:
cxgbe/t4_tom: Log more details about the newly ESTABLISHED tid to the
trace buffer.

r321390:
cxgbe(4): Install the firmware bundled with the driver to the card if it
doesn't seem to have one. This lets the driver recover automatically
from incomplete firmware upgrades (panic, reboot, power loss, etc. in
the middle of an upgrade).

r321435:
cxgbe(4): Display some more TOE parameters related to retransmission
and keepalive in the sysctl MIB. Provide tunables to change some of
these parameters. These are supposed to be setup by the firmware so
these tunables are for experimentation only.

r321582:
cxgbe(4): Some updates to the common code.

- Updated register ranges.
- Helper routines for access to TP registers.
- Updated routine to read flash parameters.

r321671:
cxgbe/iw_cxgbe: Log the end point's history and flags to the trace
buffer just before it's freed.

r322014:
cxgbe(4): Initial import of the "collect" component of Chelsio unified
debug (cudbg) code, hooked up to the main driver via an ioctl.

The ioctl can be used to collect the chip's internal state in a
compressed dump file. These dumps can be decoded with the "view"
component of cudbg.

r322034:
cxgbe(4): Always use the first and not the last virtual interface
associated with a port in begin_synchronized_op.

r322055:
cxgbe(4): Allow the TOE timer tunables to be set with microsecond
precision. These timers are already displayed in microseconds in the
sysctl MIB. Add variables to track these tunables while here.

r322123:
cxgbe(4): Avoid a NULL dereference that would occur during module unload
if there were problems earlier during attach.

r322167:
cxgbe(4): Add the T6 and T5 Unified Wire configuration files to the
kernel, just like for T4, when the driver is compiled into the kernel.

r322425:
cxgbe(4): Save the last reported link parameters and compare them with
the current state to determine whether to generate a link-state change
notification. This fixes a bug introduced in r321063 that caused the
driver to sometimes skip these notifications.

r322549:
cxgbe/t4_tom: Use correct name for the ISS-valid bit in options2.

r322914:
cxgbe(4): Dump the mailbox contents in the same format as CH_DUMP_MBOX.

r322960:
cxgbe(4): Verify that the driver accesses the firmware mailbox in a
thread-safe manner.

r322962:
cxgbe(4): Remove write only variable from t4_port_init.

r322964:
cxgbe(4): vi_mac_funcs should include the base Ethernet function. It is
already used in the driver as if it does.

r322985:
cxgbe(4): Maintain one ifmedia per physical port instead of one per
Virtual Interface (VI). All autonomous VIs that share a port share the
same media.

r322990:
cxgbe(4): Do not access the mailbox without appropriate locks while
creating hardware VIs.

This fixes a bad race on systems with hw.cxgbe.num_vis > 1.

r323006:
cxgbe(4): Update T6/T5/T4 firmwares to 1.16.59.0.

r323026:
cxgbe(4): Zero out the memory allocated for the debug dump.
cudbg_collect seems to expect it this way.

r323041:
cxgbe(4): Add two new debug flags -- one to allow manual firmware
install after full initialization, and another to disable the TCB
cache (T6+). The latter works as a tunable only.

Note that debug_flags are for debugging only and should not be set
normally.

r323069:
cxgbe/t4_tom: Add a knob to select the congestion control algorigthm
used by the TOE hardware for fully offloaded connections. The knob
affects new connections only.

r323078:
cxgbe/t4_tom: There may not be a tid to update if the connection isn't
established.

r323343:
cxgbe(4): Fix a couple of problems in the sge_wrq data path.

- start_wrq_wr must not drain the wr_list if there are incomplete_wrs
pending. This can happen when a t4_wrq_tx runs between two
start_wrq_wr.

- commit_wrq_wr must examine the cookie's pidx and ndesc with the
queue's lock held. Otherwise there is a bad race when incomplete WRs
are being completed and commit_wrq_wr for the WR that is ahead in the
queue updates the next incomplete WR's cookie's pidx/ndesc but the
commit_wrq_wr for the second one is using stale values that it read
without the lock.

r323514:
cxgbetool(8): mode must be specified when creating the dump file.

r323520:
cxgbe(4): Ignore capabilities that depend on TOE when the firmware
reports TOE is not available.

r324296:
cxgbe(4): Provide knobs to set the holdoff parameters of TOE rx queues
separately from NIC rx queues instead of using the same parameters for
both types of queues.

r324379:
cxgbetool(8): Do not create a large file devoid of useful content when
the dumpstate ioctl fails. Make the file world-readable while here.

r324386:
cxgbe(4): Update T6, T5, and T4 firmwares to 1.16.63.0.

r324443:
cxgbetool(8): Do not close uninitialized fd on malloc failure.

r324945:
cxgbe(4): Read the MPS buffer group map from the firmware as it could be
different from hardware defaults. The congestion channel map, which is
still fixed, needs to be tracked separately now. Change the congestion
setting for TOE rx queues to match the drivers on other OSes while here.

r325596:
cxgbe(4): Do not request settings not supported by the port.

r325680:
cxgbe(4): Excluce mdi from the check against port capabilities.

r325880:
cxgbe(4): Combine all _10g and _1g tunables and drop the suffix from
their names. The finer-grained knobs weren't practically useful.

r325883:
cxgbe(4): Sanitize t4_num_vis during MOD_LOAD like all other t4_*
tunables. Add num_vis to the intrs_and_queues structure as it affects
the number of interrupts requested and queues created. In future
cfg_itype_and_nqueues might lower it incrementally instead of going
straight to 1 when enough interrupts aren't available.

r325884:
cxgbe(4): Remove rsrv_noflowq from intrs_and_queues structure as it does
not influence or get affected by the number of interrupts or queues.

r325961:
cxgbe(4): Add core Vdd to the sysctl MIB.

r326026:
cxgbe(4): Add a custom board to the device id list.

r326042:
cxgbe(4): Fix unsafe mailbox access in cudbg.

r327062:
cxgbe(4): Read the MFG diags version from the VPD and make it available
in the sysctl MIB.

r327093:
cxgbe(4): Do not forward interrupts to queues with freelists. This
leaves the firmware event queue (fwq) as the only queue that can take
interrupts for others.

This simplifies cfg_itype_and_nqueues and queue allocation in the driver
at the cost of a little (never?) used configuration. It also allows
service_iq to be split into two specialized variants in the future.

r327332:
cxgbe(4): Reduce duplication by consolidating minor variations of the
same code into a single routine.

r327528:
cxgbe(4): Add a knob to enable/disable PCIe relaxed ordering. Disable it by
default when running on Intel CPUs.

r328420:
cxgbe(4): Do not display harmless warning in non-debug builds.

r328423:
cxgbe(4): Accept old names of a couple of tunables.

Sponsored by: Chelsio Communications


# 318825 24-May-2017 np

MFC r309725:

cxgbe(4): netmap does not set IFCAP_NETMAP in an ifnet's if_capabilities
any more (since r307394). Do it in the driver instead.

Sponsored by: Chelsio Communications


# 309560 05-Dec-2016 jhb

MFC 305695,305696,305699,305702,305703,305713,305715,305827,305852,305906,
305908,306062,306063,306137,306138,306206,306216,306273,306295,306301,
306465,309302:
Add support for adapters using the Terminator T6 ASIC.

305695:
cxgbe(4): Set up fl_starve_threshold2 accurately for T6.

305696:
cxgbe(4): Use correct macro for header length with T6 ASICs. This
affects the transmit of the VF driver only.

305699:
cxgbe(4): Update the pad_boundary calculation for T6, which has a
different range of boundaries.

305702:
cxgbe(4): Use smaller min/max bursts for fl descriptors with a T6.

305703:
cxgbe(4): Deal with the slightly different SGE_STAT_CFG in T6.

305713:
cxgbe(4): Add support for additional port types and link speeds.

305715:
cxgbe(4): Catch up with the rename of tlscaps -> cryptocaps. TLS is one
of the capabilities of the crypto engine in T6.

305827:
cxgbe(4): Use the interface's viid to calculate the PF/VF/VFValid fields
to use in tx work requests.

305852:
cxgbe(4): Attach to cards with the Terminator 6 ASIC. T6 cards will
come up as 't6nex' nexus devices with 'cc' ports hanging off them.

The T6 firmware and configuration files will be added as soon as they
are released. For now the driver will try to work with whatever
firmware and configuration is on the card's flash.

305906:
cxgbe/t4_tom: The SMAC entry for a VI is at a different location in the T6.

305908:
cxgbe/t4_tom: Update the active/passive open code to support T6. Data
path works as-is.

306062:
cxgbe(4): Show wcwr_stats for T6 cards.

306063:
cxgbe(4): Setup congestion response for T6 rx queues.

306137:
cxgbetool: Add T6 support to the SGE context decoder.

306138:
Fix typo.

306206:
cxgbe(4): Catch up with the different layout of WHOAMI in T6.

Note that the code moved below t4_prep_adapter() as part of this change
because now it needs a working chip_id().

306216:
cxgbe(4): Fix the output of the "tids" sysctl on T6.

306273:
cxgbe(4): Fix netmap with T6, which doesn't encapsulate SGE_EGR_UPDATE
message inside a FW_MSG. The base NIC already deals with updates in
either form.

306295:
cxgbe(4): Support SIOGIFXMEDIA so that ifconfig displays correct media
for 25Gbps and 100Gbps ports. This should have been part of r305713,
which is when the driver first started reporting extended media types.

306301:
cxgbe(4): Use the port's top speed to figure out whether it is "high
speed" or not (for the purpose of calculating the number of queues etc.)
This does the right thing for 25Gbps and 100Gbps ports.

306465:
cxgbe(4): Claim the T6 -DBG card.

309302:
cxgbe(4): Include firmware for T6 cards in the driver. Update all
firmwares to 1.16.12.0.

Sponsored by: Chelsio Communications


# 306664 03-Oct-2016 jhb

MFC 303522,303647,303860,303880,304168,304169,304170,304479,304485,305549:
Chelsio T4/T5 VF driver.

303522:
Various fixes to the t4/5nex character device.

- Remove null open/close methods.
- Don't set d_flags to 0 explicitly.
- Remove t5_cdevsw as the .d_name member isn't really used and doesn't
warrant a separate cdevsw just for the name.
- Use ENOTTY as the error value for an unknown ioctl request.
- Use make_dev_s() to close race with setting si_drv1.

303647:
Store the offset of the KDOORBELL and GTS registers in the softc.

VF devices use a different register layout than PF devices. Storing
the offset in a value in the softc allows code to be shared between the
PF and VF drivers.

303860:
Reserve an adapter flag IS_VF to mark VF devices vs PF devices.

303880:
Track the base absolute ID of ingress and egress queues.

Use this to map an absolute queue ID to a logical queue ID in interrupt
handlers. For the regular cxgbe/cxl drivers this should be a no-op as
the base absolute ID should be zero. VF devices have a non-zero base
absolute ID and require this change. While here, export the absolute ID
of egress queues via a sysctl.

304168:
Make SGE parameter handling more VF-friendly.

Add fields to hold the SGE control register and free list buffer sizes to
the sge_params structure. Populate these new fields in
t4_init_sge_params() for PF devices and change t4_read_chip_settings() to
pull these values out of the params structure instead of reading
registers directly. This will permit t4_read_chip_settings() to be reused
for VF devices which cannot read SGE registers directly.

While here, move the call to t4_init_sge_params() to
get_params__post_init(). The VF driver will populate the SGE parameters
structure via a different method before calling t4_read_chip_settings().

304169:
Update mailbox writes to work with VF devices.

- Use alternate register locations for the data and control registers for
VFs.
- Do a dummy read to force the writes to the mailbox data registers to
post before the write to the control register on VFs.
- Do not check the PCI-e firmware register for errors on VFs.

304170:
Add support for register dumps on VF devices.

- Add handling of VF register sets to t4_get_regs_len() and t4_get_regs().
- While here, use t4_get_regs_len() in the ioctl handler for regdump
instead of inlining it.

304479:
Add structures for VF-specific adapter parameters.

While here, mark which parameters are PF-specific and which are
VF-specific.

304485:
Reorder sysctls so that nodes shared with the VF driver are added first.

This permits a single early return for VF devices in the routines that
add sysctl nodes.

305549:
Chelsio T4/T5 VF driver.

The cxgbev/cxlv driver supports Virtual Function devices for Chelsio
T4 and T4 adapters. The VF devices share most of their code with the
existing PF4 driver (cxgbe/cxl) and as such the VF device driver
currently depends on the PF4 driver.

Similar to the cxgbe/cxl drivers, the VF driver includes a t4vf/t5vf
PCI device driver that attaches to the VF device. It then creates
child cxgbev/cxlv devices representing ports assigned to the VF.
By default, the PF driver assigns a single port to each VF.

t4vf_hw.c contains VF-specific routines from the shared code used to
fetch VF-specific parameters from the firmware.

t4_vf.c contains the VF-specific PCI device driver and includes its
own attach routine.

VF devices are required to use a different firmware request when
transmitting packets (which in turn requires a different CPL message
to encapsulate messages). This alternate firmware request does not
permit chaining multiple packets in a single message, so each packet
results in a firmware request. In addition, the different CPL message
requires more detailed information when enabling hardware checksums,
so parse_pkt() on VF devices must examine L2 and L3 headers for all
packets (not just TSO packets) for VF devices. Finally, L2 checksums
on non-UDP/non-TCP packets do not work reliably (the firmware trashes
the IPv4 fragment field), so IPv4 checksums for such packets are
calculated in software.

Most of the other changes in the non-VF-specific code are to expose
various variables and functions private to the PF driver so that they
can be used by the VF driver.

Note that a limited subset of cxgbetool functions are supported on VF
devices including register dumps, scheduler classes, and clearing of
statistics. In addition, TOE is not supported on VF devices, only for
the PF interfaces.

Sponsored by: Chelsio Communications


# 302408 07-Jul-2016 gjb

Copy head@r302406 to stable/11 as part of the 11.0-RELEASE cycle.
Prune svn:mergeinfo from the new branch, as nothing has been merged
here.

Additional commits post-branch will follow.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation


/freebsd-11-stable/MAINTAINERS
/freebsd-11-stable/cddl
/freebsd-11-stable/cddl/contrib/opensolaris
/freebsd-11-stable/cddl/contrib/opensolaris/cmd/dtrace/test/tst/common/print
/freebsd-11-stable/cddl/contrib/opensolaris/cmd/zfs
/freebsd-11-stable/cddl/contrib/opensolaris/lib/libzfs
/freebsd-11-stable/contrib/amd
/freebsd-11-stable/contrib/apr
/freebsd-11-stable/contrib/apr-util
/freebsd-11-stable/contrib/atf
/freebsd-11-stable/contrib/binutils
/freebsd-11-stable/contrib/bmake
/freebsd-11-stable/contrib/byacc
/freebsd-11-stable/contrib/bzip2
/freebsd-11-stable/contrib/com_err
/freebsd-11-stable/contrib/compiler-rt
/freebsd-11-stable/contrib/dialog
/freebsd-11-stable/contrib/dma
/freebsd-11-stable/contrib/dtc
/freebsd-11-stable/contrib/ee
/freebsd-11-stable/contrib/elftoolchain
/freebsd-11-stable/contrib/elftoolchain/ar
/freebsd-11-stable/contrib/elftoolchain/brandelf
/freebsd-11-stable/contrib/elftoolchain/elfdump
/freebsd-11-stable/contrib/expat
/freebsd-11-stable/contrib/file
/freebsd-11-stable/contrib/gcc
/freebsd-11-stable/contrib/gcclibs/libgomp
/freebsd-11-stable/contrib/gdb
/freebsd-11-stable/contrib/gdtoa
/freebsd-11-stable/contrib/groff
/freebsd-11-stable/contrib/ipfilter
/freebsd-11-stable/contrib/ldns
/freebsd-11-stable/contrib/ldns-host
/freebsd-11-stable/contrib/less
/freebsd-11-stable/contrib/libarchive
/freebsd-11-stable/contrib/libarchive/cpio
/freebsd-11-stable/contrib/libarchive/libarchive
/freebsd-11-stable/contrib/libarchive/libarchive_fe
/freebsd-11-stable/contrib/libarchive/tar
/freebsd-11-stable/contrib/libc++
/freebsd-11-stable/contrib/libc-vis
/freebsd-11-stable/contrib/libcxxrt
/freebsd-11-stable/contrib/libexecinfo
/freebsd-11-stable/contrib/libpcap
/freebsd-11-stable/contrib/libstdc++
/freebsd-11-stable/contrib/libucl
/freebsd-11-stable/contrib/libxo
/freebsd-11-stable/contrib/llvm
/freebsd-11-stable/contrib/llvm/projects/libunwind
/freebsd-11-stable/contrib/llvm/tools/clang
/freebsd-11-stable/contrib/llvm/tools/lldb
/freebsd-11-stable/contrib/llvm/tools/llvm-dwarfdump
/freebsd-11-stable/contrib/llvm/tools/llvm-lto
/freebsd-11-stable/contrib/mdocml
/freebsd-11-stable/contrib/mtree
/freebsd-11-stable/contrib/ncurses
/freebsd-11-stable/contrib/netcat
/freebsd-11-stable/contrib/ntp
/freebsd-11-stable/contrib/nvi
/freebsd-11-stable/contrib/one-true-awk
/freebsd-11-stable/contrib/openbsm
/freebsd-11-stable/contrib/openpam
/freebsd-11-stable/contrib/openresolv
/freebsd-11-stable/contrib/pf
/freebsd-11-stable/contrib/sendmail
/freebsd-11-stable/contrib/serf
/freebsd-11-stable/contrib/sqlite3
/freebsd-11-stable/contrib/subversion
/freebsd-11-stable/contrib/tcpdump
/freebsd-11-stable/contrib/tcsh
/freebsd-11-stable/contrib/tnftp
/freebsd-11-stable/contrib/top
/freebsd-11-stable/contrib/top/install-sh
/freebsd-11-stable/contrib/tzcode/stdtime
/freebsd-11-stable/contrib/tzcode/zic
/freebsd-11-stable/contrib/tzdata
/freebsd-11-stable/contrib/unbound
/freebsd-11-stable/contrib/vis
/freebsd-11-stable/contrib/wpa
/freebsd-11-stable/contrib/xz
/freebsd-11-stable/crypto/heimdal
/freebsd-11-stable/crypto/openssh
/freebsd-11-stable/crypto/openssl
/freebsd-11-stable/gnu/lib
/freebsd-11-stable/gnu/usr.bin/binutils
/freebsd-11-stable/gnu/usr.bin/cc/cc_tools
/freebsd-11-stable/gnu/usr.bin/gdb
/freebsd-11-stable/lib/libc/locale/ascii.c
/freebsd-11-stable/sys/cddl/contrib/opensolaris
/freebsd-11-stable/sys/contrib/dev/acpica
/freebsd-11-stable/sys/contrib/ipfilter
/freebsd-11-stable/sys/contrib/libfdt
/freebsd-11-stable/sys/contrib/octeon-sdk
/freebsd-11-stable/sys/contrib/x86emu
/freebsd-11-stable/sys/contrib/xz-embedded
/freebsd-11-stable/usr.sbin/bhyve/atkbdc.h
/freebsd-11-stable/usr.sbin/bhyve/bhyvegc.c
/freebsd-11-stable/usr.sbin/bhyve/bhyvegc.h
/freebsd-11-stable/usr.sbin/bhyve/console.c
/freebsd-11-stable/usr.sbin/bhyve/console.h
/freebsd-11-stable/usr.sbin/bhyve/pci_fbuf.c
/freebsd-11-stable/usr.sbin/bhyve/pci_xhci.c
/freebsd-11-stable/usr.sbin/bhyve/pci_xhci.h
/freebsd-11-stable/usr.sbin/bhyve/ps2kbd.c
/freebsd-11-stable/usr.sbin/bhyve/ps2kbd.h
/freebsd-11-stable/usr.sbin/bhyve/ps2mouse.c
/freebsd-11-stable/usr.sbin/bhyve/ps2mouse.h
/freebsd-11-stable/usr.sbin/bhyve/rfb.c
/freebsd-11-stable/usr.sbin/bhyve/rfb.h
/freebsd-11-stable/usr.sbin/bhyve/sockstream.c
/freebsd-11-stable/usr.sbin/bhyve/sockstream.h
/freebsd-11-stable/usr.sbin/bhyve/usb_emul.c
/freebsd-11-stable/usr.sbin/bhyve/usb_emul.h
/freebsd-11-stable/usr.sbin/bhyve/usb_mouse.c
/freebsd-11-stable/usr.sbin/bhyve/vga.c
/freebsd-11-stable/usr.sbin/bhyve/vga.h
# 302110 23-Jun-2016 np

cxgbe(4): Merge netmap support from the ncxgbe/ncxl interfaces to the
vcxgbe/vcxl interfaces and retire the 'n' interfaces. The main
cxgbe/cxl interfaces and tunables related to them are not affected by
any of this and will continue to operate as usual.

The driver used to create an additional 'n' interface for every
cxgbe/cxl interface if "device netmap" was in the kernel. The 'n'
interface shared the wire with the main interface but was otherwise
autonomous (with its own MAC address, etc.). It did not have normal
tx/rx but had a specialized netmap-only data path. r291665 added
another set of virtual interfaces (the 'v' interfaces) to the driver.
These had normal tx/rx but no netmap support.

This revision consolidates the features of both the interfaces into the
'v' interface which now has a normal data path, TOE support, and native
netmap support. The 'v' interfaces need to be created explicitly with
the hw.cxgbe.num_vis tunable. This means "device netmap" will not
result in the automatic creation of any virtual interfaces.

The following tunables can be used to override the default number of
queues allocated for each 'v' interface. nofld* = 0 will disable TOE on
the virtual interface and nnm* = 0 to will disable native netmap
support.

# number of normal NIC queues
hw.cxgbe.ntxq_vi
hw.cxgbe.nrxq_vi

# number of TOE queues
hw.cxgbe.nofldtxq_vi
hw.cxgbe.nofldrxq_vi

# number of netmap queues
hw.cxgbe.nnmtxq_vi
hw.cxgbe.nnmrxq_vi

hw.cxgbe.nnm{t,r}xq{10,1}g tunables have been removed.

--- tl;dr version ---
The workflow for netmap on cxgbe starting with FreeBSD 11 is:
1) "device netmap" in the kernel config.
2) "hw.cxgbe.num_vis=2" in loader.conf. num_vis > 2 is ok too, you'll
end up with multiple autonomous netmap-capable interfaces for every
port.
3) "dmesg | grep vcxl | grep netmap" to verify that the interface has
netmap queues.
4) Use any of the 'v' interfaces for netmap. pkt-gen -i vcxl<n>... .
One major improvement is that the netmap interface has a normal data
path as expected.
5) Just ignore the cxl interfaces if you want to use netmap only. No
need to bring them up. The vcxl interfaces are completely independent
and everything should just work.
---------------------

Approved by: re@ (gjb@)
Relnotes: Yes
Sponsored by: Chelsio Communications


# 298848 30-Apr-2016 pfg

sys: Make use of our rounddown() macro when sys/param.h is available.

No functional change.


# 296478 07-Mar-2016 np

cxgbe(4): Add a struct sge_params to store per-adapter SGE parameters.
Move the code that reads all the parameters to t4_init_sge_params in the
shared code. Use these per-adapter values instead of globals.

Sponsored by: Chelsio Communications


# 296383 04-Mar-2016 np

cxgbe(4): Very basic T6 awareness. This is part of ongoing work to
update to the latest internal shared code.

- Add a chip_params structure to keep track of hardware constants for
all generations of Terminators handled by cxgbe.
- Update t4_hw_pci_read_cfg4 to work with T6.
- Update the hardware debug sysctls (hidden within dev.<tNnex>.<n>.misc.*) to
work with T6. Most of the changes are in the decoders for the CIM
logic analyzer and the MPS TCAM.
- Acquire the regwin lock around indirect register accesses.

Obtained from: Chelsio Communications
Sponsored by: Chelsio Communications


# 291665 02-Dec-2015 jhb

Add support for configuring additional virtual interfaces (VIs) on a port.

Each virtual interface has its own MAC address, queues, and statistics.
The dedicated netmap interfaces (ncxgbeX / ncxlX) were already implemented
as additional VIs on each port. This change allows additional non-netmap
interfaces to be configured on each port. Additional virtual interfaces
use the naming scheme vcxgbeX or vcxlX.

Additional VIs are enabled by setting the hw.cxgbe.num_vis tunable to a
value greater than 1 before loading the cxgbe(4) or cxl(4) driver.
NB: The first VI on each port is the "main" interface (cxgbeX or cxlX).

T4/T5 NICs provide a limited number of MAC addresses for each physical port.
As a result, a maximum of six VIs can be configured on each port (including
the "main" interface and the netmap interface when netmap is enabled).

One user-visible result is that when netmap is enabled, packets received
or transmitted via the netmap interface are no longer counted in the stats
for the "main" interface, but are not accounted to the netmap interface.

The netmap interfaces now also have a new-bus device and export various
information sysctl nodes via dev.n(cxgbe|cxl).X.

The cxgbetool 'clearstats' command clears the stats for all VIs on the
specified port along with the port's stats. There is currently no way to
clear the stats of an individual VI.

Reviewed by: np
MFC after: 1 month
Sponsored by: Chelsio


# 285349 10-Jul-2015 luigi

Sync netmap sources with the version in our private tree.
This commit contains large contributions from Giuseppe Lettieri and
Stefano Garzarella, is partly supported by grants from Verisign and Cisco,
and brings in the following:

- fix zerocopy monitor ports and introduce copying monitor ports
(the latter are lower performance but give access to all traffic
in parallel with the application)

- exclusive open mode, useful to implement solutions that recover
from crashes of the main netmap client (suggested by Patrick Kelsey)

- revised memory allocator in preparation for the 'passthrough mode'
(ptnetmap) recently presented at bsdcan. ptnetmap is described in
S. Garzarella, G. Lettieri, L. Rizzo;
Virtual device passthrough for high speed VM networking,
ACM/IEEE ANCS 2015, Oakland (CA) May 2015
http://info.iet.unipi.it/~luigi/research.html

- fix rx CRC handing on ixl

- add module dependencies for netmap when building drivers as modules

- minor simplifications to device-specific routines (*txsync, *rxsync)

- general code cleanup (remove unused variables, introduce macros
to access rings and remove duplicate code,

Applications do not need to be recompiled, unless of course
they want to use the new features (monitors and exclusive open).

Those willing to try this code on stable/10 can just update the
sys/dev/netmap/*, sys/net/netmap* with the version in HEAD
and apply the small patches to individual device drivers.

MFC after: 1 month
Sponsored by: (partly) Verisign, Cisco


# 285221 06-Jul-2015 np

cxgbe(4): Add a new knob that controls the congestion response of netmap
rx queues. The default is to drop rather than backpressure.

This decouples the congestion settings of NIC and netmap rx queues.

MFC after: 3 days


# 284988 30-Jun-2015 np

cxgbe(4): request an automatic tx update when a netmap tx queue idles.
The NIC tx queues already do this.

MFC after: 1 week
Differential Revision:


# 284007 04-Jun-2015 np

cxgbe: set the minimum burst size when fetching fl buffers to 128B for
netmap rx queues too. This should have gone in as part of r283858.


# 279701 06-Mar-2015 np

cxgbe(4): experimental rx packet sink for netmap queues. This is not
intended for general use.

MFC after: 1 month


# 279700 06-Mar-2015 np

cxgbe(4): knobs to experiment with the interrupt coalescing timer for
netmap rx queues, and the "batchiness" of rx updates sent to the chip.

These knobs will probably become per-rxq in the near future and will be
documented only after their final form is decided.

MFC after: 1 month


# 279691 06-Mar-2015 np

cxgbe(4): provide the correct size of freelists associated with netmap
rx queues to the chip. This will fix many problems with native netmap
rx on ncxl/ncxgbe interfaces.

MFC after: 1 week


# 279251 24-Feb-2015 np

cxgbe(4): allow tx hardware checksumming on the netmap interface.

It is disabled by default but users can set IFCAP_TXCSUM on the
netmap ifnet (ifconfig ncxl0 txcsum) to override netmap and force
the hardware to calculate and insert proper IP and L4 checksums in
outbound frames.

MFC after: 2 weeks


# 279246 24-Feb-2015 np

cxgbe(4): set up congestion management for netmap rx queues.

The hw.cxgbe.cong_drop knob controls the response of the chip when
netmap queues are congested.


# 279245 24-Feb-2015 np

cxgbe(4): do not set the netmap rxq interrupts on a hair-trigger.

MFC after: 2 weeks


# 279244 24-Feb-2015 np

cxgbe(4): wait for the hardware to catch up before destroying a netmap txq.

MFC after: 2 weeks


# 279243 24-Feb-2015 np

cxgbe(4): request an automatic tx update when a netmap txq idles.

MFC after: 2 weeks


# 271328 09-Sep-2014 np

Whitespace nit.

MFC after: 1 week


# 270063 16-Aug-2014 luigi

Update to the current version of netmap.
Mostly bugfixes or features developed in the past 6 months,
so this is a 10.1 candidate.

Basically no user API changes (some bugfixes in sys/net/netmap_user.h).

In detail:

1. netmap support for virtio-net, including in netmap mode.
Under bhyve and with a netmap backend [2] we reach over 1Mpps
with standard APIs (e.g. libpcap), and 5-8 Mpps in netmap mode.

2. (kernel) add support for multiple memory allocators, so we can
better partition physical and virtual interfaces giving access
to separate users. The most visible effect is one additional
argument to the various kernel functions to compute buffer
addresses. All netmap-supported drivers are affected, but changes
are mechanical and trivial

3. (kernel) simplify the prototype for *txsync() and *rxsync()
driver methods. All netmap drivers affected, changes mostly mechanical.

4. add support for netmap-monitor ports. Think of it as a mirroring
port on a physical switch: a netmap monitor port replicates traffic
present on the main port. Restrictions apply. Drive carefully.

5. if_lem.c: support for various paravirtualization features,
experimental and disabled by default.
Most of these are described in our ANCS'13 paper [1].
Paravirtualized support in netmap mode is new, and beats the
numbers in the paper by a large factor (under qemu-kvm,
we measured gues-host throughput up to 10-12 Mpps).

A lot of refactoring and additional documentation in the files
in sys/dev/netmap, but apart from #2 and #3 above, almost nothing
of this stuff is visible to other kernel parts.

Example programs in tools/tools/netmap have been updated with bugfixes
and to support more of the existing features.

This is meant to go into 10.1 so we plan an MFC before the Aug.22 deadline.

A lot of this code has been contributed by my colleagues at UNIPI,
including Giuseppe Lettieri, Vincenzo Maffione, Stefano Garzarella.

MFC after: 3 days.


# 269413 02-Aug-2014 np

cxgbe(4): Fix an off by one error when looking for the BAR2 doorbell
address of an egress queue.

MFC after: 2 weeks


# 269411 01-Aug-2014 np

cxgbe(4): minor optimizations in ingress queue processing.

Reorganize struct sge_iq. Make the iq entry size a compile time
constant. While here, eliminate RX_FL_ESIZE and use EQ_ESIZE directly.

MFC after: 2 weeks


# 267757 22-Jun-2014 np

cxgbe(4): Update the bundled T4 and T5 firmwares to versions 1.11.27.0.

Obtained from: Chelsio
MFC after: 3 days


# 266757 27-May-2014 np

cxgbe(4): netmap support for Terminator 5 (T5) based 10G/40G cards.
Netmap gets its own hardware-assisted virtual interface and won't take
over or disrupt the "normal" interface in any way. You can use both
simultaneously.

For kernels with DEV_NETMAP, cxgbe(4) carves out an ncxl<N> interface
(note the 'n' prefix) in the hardware to accompany each cxl<N>
interface. These two ifnet's per port share the same wire but really
are separate interfaces in the hardware and software. Each gets its own
L2 MAC addresses (unicast and multicast), MTU, checksum caps, etc. You
should run netmap on the 'n' interfaces only, that's what they are for.

With this, pkt-gen is able to transmit > 45Mpps out of a single 40G port
of a T580 card. 2 port tx is at ~56Mpps total (28M + 28M) as of now.
Single port receive is at 33Mpps but this is very much a work in
progress. I expect it to be closer to 40Mpps once done. In any case
the current effort can already saturate multiple 10G ports of a T5 card
at the smallest legal packet size. T4 gear is totally untested.

trantor:~# ./pkt-gen -i ncxl0 -f tx -D 00:07:43:ab:cd:ef
881.952141 main [1621] interface is ncxl0
881.952250 extract_ip_range [275] range is 10.0.0.1:0 to 10.0.0.1:0
881.952253 extract_ip_range [275] range is 10.1.0.1:0 to 10.1.0.1:0
881.962540 main [1804] mapped 334980KB at 0x801dff000
Sending on netmap:ncxl0: 4 queues, 1 threads and 1 cpus.
10.0.0.1 -> 10.1.0.1 (00:00:00:00:00:00 -> 00:07:43:ab:cd:ef)
881.962562 main [1882] Sending 512 packets every 0.000000000 s
881.962563 main [1884] Wait 2 secs for phy reset
884.088516 main [1886] Ready...
884.088535 nm_open [457] overriding ifname ncxl0 ringid 0x0 flags 0x1
884.088607 sender_body [996] start
884.093246 sender_body [1064] drop copy
885.090435 main_thread [1418] 45206353 pps (45289533 pkts in 1001840 usec)
886.091600 main_thread [1418] 45322792 pps (45375593 pkts in 1001165 usec)
887.092435 main_thread [1418] 45313992 pps (45351784 pkts in 1000834 usec)
888.094434 main_thread [1418] 45315765 pps (45406397 pkts in 1002000 usec)
889.095434 main_thread [1418] 45333218 pps (45378551 pkts in 1001000 usec)
890.097434 main_thread [1418] 45315247 pps (45405877 pkts in 1002000 usec)
891.099434 main_thread [1418] 45326515 pps (45417168 pkts in 1002000 usec)
892.101434 main_thread [1418] 45333039 pps (45423705 pkts in 1002000 usec)
893.103434 main_thread [1418] 45324105 pps (45414708 pkts in 1001999 usec)
894.105434 main_thread [1418] 45318042 pps (45408723 pkts in 1002001 usec)
895.106434 main_thread [1418] 45332430 pps (45377762 pkts in 1001000 usec)
896.107434 main_thread [1418] 45338072 pps (45383410 pkts in 1001000 usec)
...

Relnotes: Yes
Sponsored by: Chelsio Communications.