362511 |
22-Jun-2020 |
freqlabs |
MFC r362201:
Avoid trying to toggle TSO twice
Remove TSO from the toggle mask when automatically disabled by TXCKSUM* in various NIC drivers.
Reviewed by: hselasky, np, gallatin, jpaetzel Approved by: mav (mentor) Sponsored by: iXsystems, Inc. Differential Revision: https://reviews.freebsd.org/D25120 |
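The toggle-mask fix described in r362201 can be modelled roughly as follows. This is a toy Python sketch, not the driver code: the bit values and the function name `set_capabilities` are hypothetical, and only the idea (drop TSO from the toggle mask once disabling TXCSUM has already turned it off) comes from the commit message.

```python
# Hypothetical capability bits; the real IFCAP_* values live in kernel headers.
TXCSUM = 0x1
TSO4 = 0x2

def set_capabilities(enabled, requested):
    """Return the new enabled-capability set after a toggle request."""
    mask = enabled ^ requested           # bits the caller wants flipped
    if mask & TXCSUM:
        enabled ^= TXCSUM
        # TSO needs tx checksum offload, so disabling TXCSUM disables TSO too.
        if (enabled & TSO4) and not (enabled & TXCSUM):
            enabled &= ~TSO4
            mask &= ~TSO4                # already handled; do not toggle again
    if mask & TSO4:
        enabled ^= TSO4
    return enabled
```

Without the `mask &= ~TSO4` step, a request that clears both bits would disable TSO automatically and then flip it back on in the second branch.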
355253 |
30-Nov-2019 |
np |
MFC r354742:
cxgbev(4): Catch up with the pciids in the PF driver.
Sponsored by: Chelsio Communications |
355252 |
30-Nov-2019 |
np |
MFC r354522:
cxgbe(4): Query Vdd from the firmware if its last known value is 0.
TVSENSE may not be ready by the time t4_fw_initialize returns and the firmware returns 0 if the driver asks for the Vdd before the sensor is ready.
Sponsored by: Chelsio Communications |
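A minimal sketch of the retry-on-zero idea in r354522, written in Python for illustration. The names `read_vdd`, `state`, and `query_firmware` are hypothetical; the commit only states that the driver asks the firmware again when the last known Vdd is 0.

```python
def read_vdd(state, query_firmware):
    """Return Vdd, asking the firmware again only while the cached value is 0."""
    if state["vdd"] == 0:
        # The sensor may not have been ready the last time we asked.
        state["vdd"] = query_firmware()
    return state["vdd"]
```

Once a non-zero reading has been cached, the firmware is not queried again in this model.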
355250 |
30-Nov-2019 |
np |
MFC r354106:
cxgbe(4): Use correct FetchBurstMin values for T6.
Sponsored by: Chelsio Communications |
355249 |
30-Nov-2019 |
np |
MFC r351524:
cxgbe/t4_tom: Limit work requests with immediate payload to a single descriptor. The per-tid tx credits are in demand during active Tx and it's best not to use too many just for payload.
Sponsored by: Chelsio Communications |
355246 |
30-Nov-2019 |
np |
MFC r351446:
cxgbe/t4_tom: Any invalid scaling factor in the hardware's wsf field implies that window scaling is not in use.
Sponsored by: Chelsio Communications |
355245 |
30-Nov-2019 |
np |
MFC r351445:
whitespace nit. |
355244 |
30-Nov-2019 |
np |
MFC r349956:
cxgbe(4): Completely ignore all top level interrupts that are not enabled.
The driver used to log any non-zero cause and when running with a single line interrupt it would spam the console/logs with reports of interrupts that are of no interest to anyone.
Sponsored by: Chelsio Communications |
355243 |
30-Nov-2019 |
np |
MFC r349865:
cxgbe(4): Use the simplest configuration possible when falling back from the default configuration.
Sponsored by: Chelsio Communications |
355242 |
30-Nov-2019 |
np |
MFC r349500:
cxgbe/t4_tom: Fix regression in t_maxseg usage within t4_tom.
t_maxseg was changed in r293284 to not have any adjustment for TCP timestamps. t4_tom inadvertently went back to pre-r293284 semantics in r332506.
Sponsored by: Chelsio Communications |
355240 |
30-Nov-2019 |
np |
MFC r349499:
cxgbe/iw_cxgbe: Remove unused field from the endpoint structure. |
355238 |
30-Nov-2019 |
np |
MFC r349242:
cxgbe/t4_tom: DDP_DEAD is a ddp flag and not a toepcb flag.
The driver was in effect setting TPF_ABORT_SHUTDOWN on the toepcb instead of what was intended.
Sponsored by: Chelsio Communications |
354099 |
25-Oct-2019 |
jhb |
MFC 353369: Remove adapters from t4_list earlier during detach.
This ensures the clip task won't race with t4_destroy_clip_table.
While here, make some mutex destroys unconditional since attach always initializes them.
Sponsored by: Chelsio Communications |
354098 |
25-Oct-2019 |
jhb |
MFC 353323: Set the FID field in lookaside crypto requests to the rx queue ID.
The PCI block in the adapter requires this field to be set to a valid queue ID. It is not clear why it did not fail on all machines, but the effect was that crypto operations reading input data via DMA failed with an internal PCI read error on machines with 128G or more of RAM. |
351236 |
19-Aug-2019 |
jhb |
MFC 349467: Hold an explicit reference on the socket for the aiotx task.
Previously, the aiotx task relied on the aio jobs in the queue to hold a reference on the socket. However, when the last job is completed, there is nothing left to hold a reference to the socket buffer lock used to check if the queue is empty. In addition, if the last job on the queue is cancelled, the task can run with no queued jobs holding a reference to the socket buffer lock the task uses to notice the queue is empty.
Fix these races by holding an explicit reference on the socket when the task is queued and dropping that reference when the task completes. |
351228 |
19-Aug-2019 |
jhb |
MFC 348791: Fix debug trace after removal of pdu_overhead. |
348704 |
05-Jun-2019 |
np |
MFC r348491:
cxgbe/t4_tom: adjust the hardware receive window to match changes to the receive sockbuf's high water mark.
Calculate rx credits on the spot instead of tracking sbused/sb_cc and rx_credits in the toepcb. The previous method worked when the high water mark changed due to SB_AUTOSIZE but not when it was adjusted directly (for example, by the soreserve in nfsrvd_addsock).
This fixes a connection hang while running iozone over an NFS mounted share where nfsd's TCP sockets are being handled by t4_tom.
Sponsored by: Chelsio Communications
Approved by: re@ (gjb@) |
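The "calculate rx credits on the spot" approach in r348491 might look something like the sketch below. It is an assumption-laden Python model, not the driver's code: the function name and parameters are hypothetical, and the formula (free sockbuf space minus the window the hardware currently holds) is one plausible reading of the commit message.

```python
def rx_credits(sb_hiwat, sb_used, hw_rcv_wnd):
    """Credits to return to the hardware, derived from current sockbuf state."""
    space = max(sb_hiwat - sb_used, 0)       # free room in the receive sockbuf
    return max(space - hw_rcv_wnd, 0)        # only grow the hardware window
```

Because the value is derived from the high water mark at call time, a direct adjustment such as the soreserve mentioned above is picked up without any tracked counters.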
346970 |
30-Apr-2019 |
np |
MFC r342208:
cxgbe/t4_tom: fixes for issues on the passive open side.
- Fix PR 227760 by getting the TOE to respond to the SYN after the call to toe_syncache_add, not during it. The kernel syncache code calls syncache_respond just before syncache_insert. If the ACK to the syncache_respond is processed in another thread it may run before the syncache_insert and won't find the entry. Note that this affects only t4_tom because it's the only driver trying to insert and expand syncache entries from different threads.
- Do not leak resources if an embryonic connection terminates at SYN_RCVD because of L2 lookup failures.
- Retire lctx->synq and associated code because there is never a need to walk the list of embryonic connections associated with a listener. The per-tid state is still called a synq entry in the driver even though the synq itself is now gone.
PR: 227760 Sponsored by: Chelsio Communications |
346967 |
30-Apr-2019 |
np |
MFC r345334:
cxgbe(4): Treat the viid as an opaque identifier.
Recent firmwares prefer to use a different format for viid internally and this change allows them to do so.
Sponsored by: Chelsio Communications |
346966 |
30-Apr-2019 |
np |
MFC r344654:
cxgbe(4): Request high priority filter support explicitly, as required by recent firmwares.
Sponsored by: Chelsio Communications |
346964 |
30-Apr-2019 |
np |
MFC r343889, r344519, r344682, r344719
r343889: cxgbev(4): Initialize debug_flags from the environment like in the PF driver.
r344519: cxgbe(4): Use correct port_info in the call to is_bt().
This fixes a panic during configuration if the tx channel of a port isn't the same as its port id.
Reported by: Fabrice Bruel Sponsored by: Chelsio Communications
r344682: cxgbe(4): Don't forget to report link state to the kernel if the link is already up at attach.
Reported by: Fabrice Bruel @ Orange Business Service Sponsored by: Chelsio Communications
r344719: cxgbev(4): Enable 32b port capabilities in the VF driver.
Sponsored by: Chelsio Communications |
346963 |
30-Apr-2019 |
np |
MFC r343666, r343861-r343862, r343923, r343968, r345660, r345810
r343666: cxgbe(4): Improved error reporting and diagnostics.
"slow" interrupt handler:
- Expand the list of INT_CAUSE registers known to the driver.
- Add decode information for many more bits but decouple it from the rest of intr_info so that it is entirely optional.
- Call t4_fatal_err exactly once, and from the top level PL intr handler.
t4_fatal_err:
- Use t4_shutdown_adapter from the common code to stop the adapter.
- Stop servicing slow interrupts after the first fatal one.
Driver/firmware interaction:
- CH_DUMP_MBOX: note whether the mailbox being dumped is a command or a reply or something else.
- Log the raw value of pcie_fw for some errors.
- Use correct log levels (debug vs. error).
Sponsored by: Chelsio Communications
r343861: cxgbe(4): Auto-dump the device log on a mailbox timeout or when the firmware reports an error in pcie_fw.
Sponsored by: Chelsio Communications
r343862: cxgbe(4): Auto-dump the CIM block's logic analyzer on a TIMER0 interrupt.
Sponsored by: Chelsio Communications
r343923: cxgbe(4): Delay the panic due to a fatal error by 30s.
This lets information logged by the interrupt handler reach the system log before the system goes down.
r343968: cxgbe(4): Ignore unused interrupts.
Sponsored by: Chelsio Communications
r345660: cxgbe(4): Count and clear interrupts generated at the software's request.
An interrupt can be requested by setting the F_SWINT bit in PL_PF_CTL.
Sponsored by: Chelsio Communications
r345810: cxgbe(4): Add a flag to indicate that bits in interrupt cause but not in interrupt enable are not fatal.
The firmware sets up all the interrupt enables based on run time configuration, which means the information in the enables is more accurate than what's compiled into the driver. This change also allows the fatal bits to be updated without any changes in the driver in some cases.
Sponsored by: Chelsio Communications |
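The flag introduced in r345810 can be pictured with a small Python model. The names `is_fatal`, `fatal_bits`, and `ignore_if_disabled` are hypothetical stand-ins; the sketch only illustrates the rule that cause bits which are not also set in the enable register are not treated as fatal when the flag is given.

```python
def is_fatal(cause, enable, fatal_bits, ignore_if_disabled):
    """Decide whether an interrupt cause register value is fatal."""
    fatal = cause & fatal_bits
    if ignore_if_disabled:
        # Trust the firmware-programmed enables over the compiled-in fatal mask.
        fatal &= enable
    return fatal != 0
```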
346962 |
30-Apr-2019 |
np |
MFC r343539:
cxgbe(4): Add adapter information to messages logged by the OS-agnostic code in t4_hw.c.
Sponsored by: Chelsio Communications |
346954 |
30-Apr-2019 |
np |
MFC r343269, r346567
r343269: cxgbe(4): Allow negative values in hw.cxgbe.fw_install and take them to mean that the driver should taste the firmware in the KLD and use that firmware's version for all its fw_install checks.
The driver gets firmware version information from compiled-in values by default and this change allows custom (or older/newer) firmware modules to be used with the stock driver.
There is no change in default behavior.
Sponsored by: Chelsio Communications
r346567: cxgbe(4): Make sure bundled_fw is always initialized before use.
This fixes a bug that prevented the driver from auto-flashing the firmware when it didn't see one on the card. This feature was introduced in r321390 and this bug was introduced in r343269.
Reported by: gallatin@ Sponsored by: Chelsio Communications |
346952 |
30-Apr-2019 |
np |
MFC r343264:
cxgbe(4): Use a truncated firmware header for version checks. All the version numbers are towards the beginning of the header.
Sponsored by: Chelsio Communications |
346951 |
30-Apr-2019 |
np |
MFC r343233:
cxgbe(4): Clear the reply-pending status of a hashfilter when the reply indicates an error. Also, do not remove it twice from the hf list in this case.
Submitted by: Krishnamraju Eraparaju @ Chelsio Sponsored by: Chelsio Communications |
346950 |
30-Apr-2019 |
np |
MFC r343569, r345307
r343569: cxgbe/iw_cxgbe: Fix an address calculation in the memory registration code that was added in r342266.
Submitted by: Krishnamraju Eraparaju @ Chelsio Sponsored by: Chelsio Communications
r345307: iw_cxgbe: Remove unused smac_idx from the ep structure.
Submitted by: Krishnamraju Eraparaju @ Chelsio |
346949 |
30-Apr-2019 |
np |
MFC r342954:
cxgbe(4): Move some INTx specific code to a more appropriate place. |
346948 |
30-Apr-2019 |
np |
MFC r342758:
cxgbe(4): Clear FW_OK if the firmware reports an error.
Sponsored by: Chelsio Communications |
346947 |
30-Apr-2019 |
np |
MFC r342356:
Remove unused macros from t4_tom.h. |
346946 |
30-Apr-2019 |
np |
MFC r342234:
cxgbe(4): Do not issue mbox commands after t4_fw_bye.
Sponsored by: Chelsio Communications |
346945 |
30-Apr-2019 |
np |
MFC r341654:
cxgbe(4): Get Linux cxgb4vf working in bhyve VMs with VFs passed through.
cxgb4vf doesn't own the buffer size list but still expects the first two entries to be 4K and some power of 2 respectively. The BSD cxgbe doesn't care where its preferred buffer sizes are as long as they're in the list somewhere, so just move its entries towards the end as a workaround.
Sponsored by: Chelsio Communications |
346942 |
30-Apr-2019 |
np |
MFC r341620:
cxgbe(4): Fall back to a basic configuration in case of any error during card initialization. This is an expanded version of r333682.
Break up prep_firmware into simpler routines while here. Load the firmware/config KLD only if needed.
Sponsored by: Chelsio Communications |
346940 |
30-Apr-2019 |
np |
MFC r338954, r340651, r344524, r345083.
r338954: cxgbe(4): Enable support for per-connection rate limiting in the default firmware configuration files.
Approved by: re@ (gjb@) Sponsored by: Chelsio Communications
r340651: cxgbe(4): Update T4/5/6 firmwares to 1.22.0.3.
Obtained from: Chelsio Communications Sponsored by: Chelsio Communications
r344524: cxgbe(4): Updates to the default and hashfilter configurations.
- Do not use nvf = 4 as it is not really supported by the firmware. Firmwares 1.23.3.0 and above will ignore it silently.
- Increase PF4's share of the VIs and let it use all of the RSS table.
Sponsored by: Chelsio Communications
r345083: cxgbe(4): Update T4/5/6 firmwares to 1.23.0.0.
Obtained from: Chelsio Communications Sponsored by: Chelsio Communications |
346934 |
29-Apr-2019 |
np |
MFC r341172, r341270. t4_clip.c had to be manually adjusted because Concurrency Kit is not available in stable/11.
r341172: Move CLIP table handling out of TOM and into the base driver.
- Store the clip table in 'struct adapter' instead of in the TOM softc.
- Init the clip table during attach and teardown during detach.
- While here, add a dev.<nexus>.<unit>.misc.clip sysctl to dump the CLIP table.
This does mean that we update the clip table even if TOE is not enabled, but non-TOE things need the CLIP table anyway.
Reviewed by: np, Krishnamraju Eraparaju @ Chelsio Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D18010
r341270: Make most of the CLIP code conditional on #ifdef INET6.
This fixes builds of kernels without INET6 such as LINT-NOINET6.
Reported by: arybchik Reviewed by: np Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D18384 |
346930 |
29-Apr-2019 |
np |
MFC r339700:
cxgbe(4): new sysctl to display the start of the RSS region for a VI.
dev.<ifname>.<inst>.rss_base
For example:
dev.cc.0.rss_base: 0
dev.cc.1.rss_base: 128
dev.vcc.0.rss_base: 256
dev.vcc.1.rss_base: 384
Sponsored by: Chelsio Communications |
346928 |
29-Apr-2019 |
np |
MFC r339628, r339965
r339628: cxgbe(4): improve the accuracy of various TSO limits reported to the kernel.
Sponsored by: Chelsio Communications
r339965: cxgbe(4): Report a reasonable non-zero if_hw_tsomaxsegsize to the kernel.
This reverts an accidental change that snuck in with r339628.
Sponsored by: Chelsio Communications |
346923 |
29-Apr-2019 |
np |
MFC r339891, r340063, r342266, r342270, r342272, r342288-r342289
r339891: cxgbe/iw_cxgbe: Install the socket upcall before calling soconnect to ensure that it always runs when soisconnected does.
Submitted by: Krishnamraju Eraparaju @ Chelsio Sponsored by: Chelsio Communications
r340063: cxgbe/iw_cxgbe: Suppress spurious "Unexpected streaming data ..." messages.
Submitted by: Krishnamraju Eraparaju @ Chelsio Sponsored by: Chelsio Communications
r342266: cxgbe/iw_cxgbe: Use DSGLs to write to card's memory when appropriate.
Submitted by: Krishnamraju Eraparaju @ Chelsio Sponsored by: Chelsio Communications
r342270: cxgbe/iw_cxgbe: Add a knob for testing that lets iWARP connections cycle through 4-tuples quickly.
Submitted by: Krishnamraju Eraparaju @ Chelsio Sponsored by: Chelsio Communications
r342272: cxgbe/iw_cxgbe: Use -ve errno when interfacing with linuxkpi/OFED.
Submitted by: Krishnamraju Eraparaju @ Chelsio Sponsored by: Chelsio Communications
r342288: cxgbe/iw_cxgbe: Do not terminate CTRx messages with \n.
r342289: cxgbe/iw_cxgbe: Remove redundant CTRs from c4iw_alloc/c4iw_rdev_open. This information is readily available elsewhere.
Sponsored by: Chelsio Communications |
346922 |
29-Apr-2019 |
np |
MFC r339667:
cxgbe/iw_cxgbe: save the ep in the driver-private provider_data field.
Submitted By: Lily Wang @ Netapp |
346921 |
29-Apr-2019 |
np |
MFC r339626:
cxgbe(4): Use automatic cidx updates with ofld and ctrl queues.
The bits that explicitly request cidx updates do not work reliably with all possible WRs that can be sent over the queue. The F_FW_WR_EQUIQ requests that still remain may also have to be replaced with explicit credit flush WRs in the future.
Sponsored by: Chelsio Communications |
346916 |
29-Apr-2019 |
np |
MFC r338940:
cxgbe(4): Treat base/end of firmware parameters as signed integers when figuring out whether the range is valid or not. |
346915 |
29-Apr-2019 |
np |
MFC r338874:
cxgbe(4): Reuse existing "switching" L2T entries when possible. |
346914 |
29-Apr-2019 |
np |
MFC r338669:
cxgbe(4): Use the correct number of parameters when querying the tid range for hashfilters. |
346913 |
29-Apr-2019 |
np |
MFC r338355:
cxgbe/tom: Unregister shared CPL handlers on module unload. This fixes a panic with INVARIANTS that occurs when t4_tom is unloaded and reloaded. |
346883 |
29-Apr-2019 |
np |
MFC r338218:
cxgbev(4): Updates to the VF driver to cope with recent ifmedia and ctrlq changes in the base driver.
Sponsored by: Chelsio Communications |
346882 |
29-Apr-2019 |
np |
MFC r338156, r338158-r338161, r338166.
r338156: cxgbe(4): Avoid overflow while calculating channel rate.
Reported by: Coverity (CID 1008352)
r338158: cxgbe(4): Check the RO bit properly before disabling relaxed ordering.
Reported by: Coverity (CID 1384286)
r338159: cxgbe(4): Make it clear that VI_INIT_DONE implies vi->ntxq > 0, and so rc will never be returned uninitialized.
Reported by: Coverity (CID 1394884). This is a false positive though.
r338160: cxgbe(4): Do not leak memory in case of errors during VI initialization.
Reported by: Coverity (CID 1392026)
r338161: cxgbe/tom: Make sure 'matched' is always initialized before use.
Reported by: Coverity (CID 1390894)
r338166: cxgbe(4): Be explicit about ignoring the return value of cmpset in some cases.
Reported by: Coverity (CIDs 1009398, 1009400, 1009401, 1357325, 1394783). All false positives. |
346878 |
29-Apr-2019 |
np |
MFC r337873:
cxgbe(4): Use VLAN_TRUNKDEV instead of private cookie to figure out the parent of a VLAN ifnet.
Sponsored by: Chelsio Communications |
346877 |
29-Apr-2019 |
np |
MFC r337830:
cxgbe(4): Use two hashes instead of a table to keep track of hashfilters. Two because the driver needs to look up a hashfilter by its 4-tuple or tid.
A couple of fixes while here:
- Reject attempts to add duplicate hashfilters.
- Do not assume that any part of the 4-tuple that isn't specified is 0. This makes it consistent with all other mandatory parameters that already require explicit user input. |
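The two-index bookkeeping from r337830 can be sketched in Python as below. The class and method names are hypothetical and dictionaries stand in for the driver's hash tables; the sketch only shows why two lookups (by 4-tuple and by tid) share one entry and how duplicates are rejected.

```python
class HashFilterTable:
    """Toy model: one entry reachable through two indexes."""

    def __init__(self):
        self.by_tuple = {}   # 4-tuple -> entry
        self.by_tid = {}     # tid -> entry

    def add(self, four_tuple, tid):
        if four_tuple in self.by_tuple:
            raise ValueError("duplicate hashfilter")
        entry = {"tuple": four_tuple, "tid": tid}
        self.by_tuple[four_tuple] = entry
        self.by_tid[tid] = entry
        return entry

    def remove_by_tid(self, tid):
        # Removal must clear both indexes so they never disagree.
        entry = self.by_tid.pop(tid)
        del self.by_tuple[entry["tuple"]]
        return entry
```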
346876 |
29-Apr-2019 |
np |
MFC r337659:
cxgbe(4): Move all control queues to the adapter.
There used to be one control queue per adapter (the mgmtq) that was initialized during adapter init and one per port that was initialized later during port init. This change moves all the control queues (one per port/channel) to the adapter so that they are initialized during adapter init and are available before any port is up. This allows the driver to issue ctrlq work requests over any channel without having to bring up any port. |
346875 |
29-Apr-2019 |
np |
MFC r337609:
cxgbe(4): Create two variants of service_iq, one for queues with freelists and one for those without.
MFH: 3 weeks Sponsored by: Chelsio Communications |
346874 |
29-Apr-2019 |
np |
MFC r337538, r337987
r337538: cxgbe(4): Add support for high priority filters on T6+. They have their own region in the TCAM starting with T6, unlike previous chips where they were in the same region as normal filters.
These filters "hit" before anything else in the LE's lookup. The exact order is:
a) High priority filters
b) TOE's active region (TCAM and/or hash)
c) Servers (TOE hw listeners)
d) Normal filters
Sponsored by: Chelsio Communications
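The lookup precedence listed for r337538 amounts to "first region that hits wins". A tiny Python illustration, with hypothetical region names standing in for the four categories:

```python
# Order taken from the list above; the string labels are illustrative only.
LOOKUP_ORDER = ("hpfilter", "toe_active", "server", "filter")

def first_hit(regions_that_match):
    """Return the region whose entry wins, or None if nothing matches."""
    for region in LOOKUP_ORDER:
        if region in regions_that_match:
            return region
    return None
```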
r337987: cxgbe(4): Adjust ntids to account for nhptids in the TOE case too. This should have been part of r337538. |
346872 |
29-Apr-2019 |
np |
MFC r337192:
cxgbe(4): Improvements in TID management.
- Ignore any type of TID where the start/end values are not in the correct order. There are situations where the firmware isn't able to reserve room for the number requested in the config file but doesn't report a failure during configuration and instead sets end <= start.
- Track start/end in tid_tab and remove some redundant copies from adapter->params.
- Move all the start/end and other read-only parameters to a quiet part of tid_tab, away from the tid locks.
Sponsored by: Chelsio Communications |
346871 |
29-Apr-2019 |
np |
MFC r336718, r336720, r336734-r336735, r337398, r337439, and r337540. These are all related to tx rate limiting in cxgbe.
r336718: cxgbe(4): Validate only those parameters that are relevant to the type of rate limiter being programmed. Skip the ones that are not applicable.
Sponsored by: Chelsio Communications
r336720: cxgbe(4): Remove useless code that crept in with r336718.
X-MFC With: 336718
r336734: cxgbe(4): Better defaults for all cl-rl rate limiters.
Start in "class" instead of "flow" mode. This eliminates the need to specify an MTU, which is not available that early anyway. It also allows the user to manually configure ch-rl rate limiting after attach. This used to fail because ch-rl isn't supported if cl-rl "flow" mode is configured.
Set all traffic classes to 1Gbps during initialization. The goal is to start off with _any_ valid configuration and 1Gbps works even for gigabit cards.
Sponsored by: Chelsio Communications
r336735: cxgbe(4): Consider rateunit before ratemode when displaying information about a traffic class. This matches the order in which the firmware evaluates unit and mode internally.
Sponsored by: Chelsio Communications
r337398: cxgbe(4): Allow user-configured and driver-configured traffic classes to be used simultaneously. Move sysctl_tc and sysctl_tc_params to t4_sched.c while here.
Sponsored by: Chelsio Communications
r337439: cxgbe(4): Allow the driver to specify a burst size when configuring a traffic class for rate limiting.
Add experimental knobs that allow the user to specify a default pktsize and burstsize for traffic classes associated with a port:
dev.<ifname>.<instance>.tc.pktsize
dev.<ifname>.<instance>.tc.burstsize
Sponsored by: Chelsio Communications
r337540: cxgbe(4): Display pkt-size and burst-size in traffic class parameters. |
346870 |
29-Apr-2019 |
np |
MFC r337397:
cxgbe(4): Break up sysctl_bitfield into 8 bit and 16 bit variants. Have them display the current value of the bitfield rather than the fixed value that was provided when the sysctl node was created.
Sponsored by: Chelsio Communications |
346869 |
29-Apr-2019 |
np |
MFC r335223:
cxgbe(4): sysctls to display the local and intr CPUs for the adapter.
The driver assumes the list can change (even though it doesn't right now) and queries it every time the sysctl runs.
sysctl dev.<nexus>.<inst>.local_cpus
sysctl dev.<nexus>.<inst>.intr_cpus
sysctl dev.t6nex.0.local_cpus
sysctl dev.t6nex.0.intr_cpus
Sponsored by: Chelsio Communications |
346867 |
29-Apr-2019 |
np |
MFC r335701:
cxgbe/cxgbei: Fix harmful typo in the iSCSI offload driver.
Reported by: gcc8 (via mmacy@) Sponsored by: Chelsio Communications |
346866 |
29-Apr-2019 |
np |
MFC r334467:
cxgbe(4): Retire an old check. |
346864 |
29-Apr-2019 |
np |
MFC r334139:
cxgbe/t4_tom: ABORT_RPL_RSS is a shared CPL and t4_tom shouldn't remove the global handler when it's being unloaded. |
346863 |
29-Apr-2019 |
np |
MFC r334138:
r334138: cxgbe(4): Make FW4_ACK a shared CPL. |
346862 |
29-Apr-2019 |
np |
MFC r334137:
cxgbe(4): Fix range checks in is_etid. |
346861 |
29-Apr-2019 |
np |
MFC r334136:
cxgbe(4): Slightly simpler needs_<foo> functions. |
346860 |
29-Apr-2019 |
np |
MFC r334132:
cxgbe(4): Make sure that the egress queue's cidx is updated periodically when the driver is writing WRs using start_wrq_wr/commit_wrq_wr all the time.
Sponsored by: Chelsio Communications |
346859 |
29-Apr-2019 |
np |
MFC r333141 (by gallatin@):
Optionally panic when cxgbe encounters a fatal error
Sometimes it is better to panic than to leave a machine unreachable.
Reviewed by: np Sponsored by: Netflix |
346855 |
28-Apr-2019 |
np |
MFC r333153, r333394, r333442, r333472, r333620, r334058, r334447, r334452, and r335684. These revisions added hashfilters, NAT offload, and SMAC/DMAC swapping filters to cxgbe.
r333153: cxgbe(4): Move all TCAM filter code into a separate file.
Sponsored by: Chelsio Communications
r333394: cxgbe(4): Add support for hash filters.
These filters reside in the card's memory instead of its TCAM and can be configured via a new "hashfilter" subcommand in cxgbetool. Hash and normal TCAM filters can be used together. The hardware does an exact-match of packet fields for hash filters, unlike the masked match performed for TCAM filters. Any T5/T6 card with memory can support at least half a million hash filters. The sample config file with the driver configures 512K of these; it is possible to double this to 1 million+ in some cases.
The chip does an exact-match of fields of incoming datagrams with hash filters and performs the action configured for the filter if it matches. The fields to match are specified in a "filter mask" in the firmware config file. The filter mask always includes the 5-tuple (sip, dip, sport, dport, ipproto). It can, optionally, also include any subset of the filter mode (see filterMode and filterMask in the firmware config file).
For example:
filterMode = fragmentation, mpshittype, protocol, vlan, port, fcoe
filterMask = protocol, port, vlan
Exact values of the 5-tuple, the physical port, and VLAN tag would have to be provided while setting up a hash filter with the chip configuration above.
Hash filters support all actions supported by TCAM filters. A packet that hits a hash filter can be dropped, let through (with optional steering to a specific queue or RSS region), switched out of another port (with optional L2 rewrite of DMAC, SMAC, VLAN tag), or get NAT'ed. (Support for some of these will show up in the driver in a follow-up commit very shortly).
Sponsored by: Chelsio Communications
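The difference between the masked TCAM match and the exact hash-filter match described for r333394 can be illustrated with a short Python model. Everything here is a sketch under assumptions: the field names, `KEY_FIELDS`, and the helper functions are hypothetical, matching is done per field rather than per bit, and a dictionary stands in for the hardware hash.

```python
FIVE_TUPLE = ("sip", "dip", "sport", "dport", "ipproto")
# Extra key fields follow the filterMask example; the 5-tuple is always included.
KEY_FIELDS = FIVE_TUPLE + ("port", "vlan")

def tcam_match(packet, value, mask):
    """Masked match: only the fields named in the mask take part."""
    return all(packet[f] == value[f] for f in mask)

def hash_key(fields):
    """Exact match needs a value for every key field; nothing defaults to 0."""
    missing = [f for f in KEY_FIELDS if f not in fields]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    return tuple(fields[f] for f in KEY_FIELDS)

def add_hashfilter(table, fields, action):
    table[hash_key(fields)] = action

def lookup(table, packet):
    """Return the action for an exact hit, or None on a miss."""
    return table.get(hash_key(packet))
```

In this model a packet that differs in any key field misses the hash filter, while a TCAM-style rule can still match it as long as the masked fields agree.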
r333442: cxgbe(4): Determine whether the firmware supports the FILTER2 work request, which can be used to configure hardware NAT and swapmac.
All firmwares released after Jan 2017 support this work request.
Sponsored by: Chelsio Communications
r333472: cxgbe(4): Add fields to support configuration of hardware NAT and swapmac (SMAC/DMAC switcheroo) from userspace.
Sponsored by: Chelsio Communications
r333620: cxgbe(4): Filtering related features and fixes.
- Driver support for hardware NAT.
- Driver support for swapmac action.
- Validate a request to create a hashfilter against the filter mask.
- Add a hashfilter config file for T5.
Sponsored by: Chelsio Communications
r334058: cxgbe(4): Only valid filters are expected to have a valid tid.
r334447: cxgbe(4): Add code to deal with the chip's source MAC table (aka SMT).
Submitted by: Krishnamraju Eraparaju @ Chelsio Sponsored by: Chelsio Communications
r334452: cxgbe(4): Add support for SMAC-rewriting filters.
Submitted by: Krishnamraju Eraparaju @ Chelsio Sponsored by: Chelsio Communications
r335684: cxgbe(4): Do not leak the filters in the hashfilter table on module unload.
Sponsored by: Chelsio Communications
Relnotes: Yes |
346853 |
28-Apr-2019 |
np |
MFC r333121, r333128
r333121: cxgbe/t4_tom: Use appropriate macros instead of magic math while constructing the atid of an active open work request.
Sponsored by: Chelsio Communications
r333128: cxgbe(4): Convert ACT_OPEN_RPL to a shared CPL.
Reserve 3b in the 14b atid to identify the owner and use it to dispatch the CPL. This allows all CPLs that use an atid to be used as shared CPLs, although ACT_OPEN_RPL is the only one being converted in this revision.
Sponsored by: Chelsio Communications |
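Reserving 3 bits of the 14-bit atid for an owner tag, as r333128 describes, could be modelled like this in Python. The placement of the owner bits at the top of the atid is an assumption made for the sketch, and `make_atid`, `split_atid`, and `dispatch` are hypothetical names.

```python
ATID_BITS = 14
OWNER_BITS = 3
INDEX_BITS = ATID_BITS - OWNER_BITS      # 11 bits remain for the index
INDEX_MASK = (1 << INDEX_BITS) - 1

def make_atid(owner, index):
    """Pack an owner tag and an index into one 14-bit atid."""
    if not 0 <= owner < (1 << OWNER_BITS):
        raise ValueError("owner out of range")
    if not 0 <= index <= INDEX_MASK:
        raise ValueError("index out of range")
    return (owner << INDEX_BITS) | index

def split_atid(atid):
    """Recover (owner, index) from a packed atid."""
    return atid >> INDEX_BITS, atid & INDEX_MASK

def dispatch(handlers, atid, cpl):
    """Route a CPL to the handler registered for the atid's owner."""
    owner, index = split_atid(atid)
    return handlers[owner](index, cpl)
```

The cost of the tag in this model is that only 2^11 indexes remain available per owner.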
346852 |
28-Apr-2019 |
np |
MFC r333114:
cxgbe(4): Use opaque cookies or tid range-checks to determine the intended recipient of a CPL when it can't be determined solely from the opcode. Retire the per-queue handlers for such CPLs in favor of the new scheme.
Sponsored by: Chelsio Communications |
346850 |
28-Apr-2019 |
np |
MFC r333043:
cxgbe(4): Move release_tid to the base NIC driver for future consumers.
Sponsored by: Chelsio Communications. |
346849 |
28-Apr-2019 |
np |
MFC r333030:
cxgbe(4): Break up alloc_tid_tabs and move the atid routines to the base NIC driver. The atid services will be used by new features (hashfilters and inline TLS) that do not involve TOE.
Sponsored by: Chelsio Communications |
346848 |
28-Apr-2019 |
np |
MFC r331902:
r331902: cxgbe: Implement tcp_info handler for connections handled by t4_tom. |
346805 |
28-Apr-2019 |
np |
MFC r317849 (partial), r332506, and r332787.
r317849 (partial, required by r332506): cxgbe/t4_tom: Per-connection rate limiting for TCP sockets handled by the TOE.
Sponsored by: Chelsio Communications
r332506: cxgbe(4): Add support for Connection Offload Policy (aka COP).
COP allows fine-grained control on whether to offload a TCP connection using t4_tom, and what settings to apply to a connection selected for offload. t4_tom must still be loaded and IFCAP_TOE must still be enabled for full TCP offload to take place on an interface. The difference is that IFCAP_TOE used to be the only knob and would enable TOE for all new connections on the interface, but now the driver will also consult the COP, if any, before offloading to the hardware TOE.
A policy is a plain text file with any number of rules, one per line. Each rule has a "match" part consisting of a socket-type (L = listen, A = active open, P = passive open, D = don't care) and a pcap-filter(7) expression, and a "settings" part that specifies whether to offload the connection or not and the parameters to use if so. The general format of a rule is:
[socket-type] expr => settings
Example. See cxgbetool(8) for more information.
[L] ip && port http => offload
[L] port 443 => !offload
[L] port ssh => offload
[P] src net 192.168/16 && dst port ssh => offload !nagle !timestamp cong newreno
[P] dst port ssh => offload !nagle ecn cong tahoe
[P] dst port http => offload
[A] dst port 443 => offload tls
[A] dst net 192.168/16 => offload !timestamp cong highspeed
The driver processes the rules for each new listen, active open, or passive open and stops at the first match. There is an implicit rule at the end of every policy that prohibits offload when no rule in the policy matches:
[D] all => !offload
This is a reworked and expanded version of a patch submitted by Krishnamraju Eraparaju @ Chelsio.
Sponsored by: Chelsio Communications
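The first-match evaluation of a policy described for r332506 can be sketched in Python as follows. This is not how the driver or cxgbetool parses policies: the regular expression, `parse_policy`, and `settings_for` are hypothetical, and the pcap-filter expression is not evaluated here but handed to a caller-supplied `matches` predicate.

```python
import re

# "[socket-type] expr => settings", with socket-type one of L, A, P, D.
RULE_RE = re.compile(r"^\[([LAPD])\]\s*(.*?)\s*=>\s*(.*)$")

def parse_policy(text):
    """Turn policy text into a list of (socket_type, expr, settings) rules."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        m = RULE_RE.match(line)
        if m is None:
            raise ValueError("malformed rule: " + line)
        rules.append((m.group(1), m.group(2), m.group(3).split()))
    return rules

def settings_for(rules, sock_type, conn, matches):
    """Return the settings of the first matching rule for this connection."""
    for rule_type, expr, settings in rules:
        if rule_type in ("D", sock_type) and matches(expr, conn):
            return settings
    # Implicit final rule: [D] all => !offload
    return ["!offload"]
```

Rule order matters in this model exactly as in the description: a broad rule placed early shadows any narrower rule that follows it.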
r332787: cxgbe(4): Fix bugs in the handling of COP rules that match on VLAN tag.
Retrieve the tag from the correct ifnet and use the provided tag (instead of hardcoded 0xffff, implying no tag) in the routines that process offload policy.
Submitted by: Krishnamraju Eraparaju @ Chelsio Sponsored by: Chelsio Communications |
345664 |
28-Mar-2019 |
jhb |
MFC 330040,330041,330079,330884,330946,330947,331649,333068,333810,337722, 340466,340468,340469,340473: Add TOE-based TLS offload.
Note that this requires a modified OpenSSL library.
330040: Fetch TLS key parameters from the firmware.
The parameters describe how much of the adapter's memory is reserved for storing TLS keys. The 'meminfo' sysctl now lists this region of adapter memory as 'TLS keys' if present.
330041: Move ccr_aes_getdeckey() from ccr(4) to the cxgbe(4) driver.
This routine will also be used by the TOE module to manage TLS keys.
330079: Move #include for rijndael.h out of x86-specific region.
The #include was added inside of the conditional by accident and the lack of it broke non-x86 builds.
330884: Support for TLS offload of TOE connections on T6 adapters.
The TOE engine in Chelsio T6 adapters supports offloading of TLS encryption and TCP segmentation for offloaded connections. Sockets using TLS are required to use a set of custom socket options to upload RX and TX keys to the NIC and to enable RX processing. Currently these socket options are implemented as TCP options in the vendor specific range. A patched OpenSSL library will be made available in a port / package for use with the TLS TOE support.
TOE sockets can either offload both transmit and reception of TLS records or just transmit. TLS offload (both RX and TX) is enabled by setting the dev.t6nex.<x>.tls sysctl to 1 and requires TOE to be enabled on the relevant interface. Transmit offload can be used on any "normal" or TLS TOE socket by using the custom socket option to program a transmit key. This permits most TOE sockets to transparently offload TLS when applications use a patched SSL library (e.g. using LD_LIBRARY_PATH to request use of a patched OpenSSL library). Receive offload can only be used with TOE sockets using the TLS mode. The dev.t6nex.0.toe.tls_rx_ports sysctl can be set to a list of TCP port numbers. Any connection with either a local or remote port number in that list will be created as a TLS socket rather than a plain TOE socket. Note that although this sysctl accepts an arbitrary list of port numbers, the sysctl(8) tool is only able to set sysctl nodes to a single value. A TLS socket will hang without receiving data if used by an application that is not using a patched SSL library. Thus, the tls_rx_ports node should be used with care. For a server mostly concerned with offloading TLS transmit, this node is not needed as plain TOE sockets will fall back to software crypto when using an unpatched SSL library.
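The tls_rx_ports decision described above reduces to a membership test on either end of the connection. A one-function Python illustration, with a hypothetical name and a set standing in for the sysctl's port list:

```python
def is_tls_socket(local_port, remote_port, tls_rx_ports):
    """True if the connection should be created as a TLS socket, not plain TOE."""
    return local_port in tls_rx_ports or remote_port in tls_rx_ports
```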
New per-interface statistics nodes are added giving counts of TLS packets and payload bytes (payload bytes do not include TLS headers or authentication tags/MACs) offloaded via the TOE engine, e.g.:
dev.cc.0.stats.rx_tls_octets: 149
dev.cc.0.stats.rx_tls_records: 13
dev.cc.0.stats.tx_tls_octets: 26501823
dev.cc.0.stats.tx_tls_records: 1620
TLS transmit work requests are constructed by a new variant of t4_push_frames() called t4_push_tls_records() in tom/t4_tls.c.
TLS transmit work requests require a buffer containing IVs. If the IVs are too large to fit into the work request, a separate buffer is allocated when constructing a work request. This buffer is associated with the transmit descriptor and freed when the descriptor is ACKed by the adapter.
Received TLS frames use two new CPL messages. The first message is a CPL_TLS_DATA containing the decrypted payload of a single TLS record. The handler places the mbuf containing the received payload on an mbufq in the TOE pcb. The second message is a CPL_RX_TLS_CMP message which includes a copy of the TLS header and indicates if there were any errors. The handler for this message places the TLS header into the socket buffer followed by the saved mbuf with the payload data. Both of these handlers are contained in tom/t4_tls.c.
A few routines were exposed from t4_cpl_io.c for use by t4_tls.c including send_rx_credits(), a new send_rx_modulate(), and t4_close_conn().
TLS keys for both transmit and receive are stored in onboard memory in the NIC in the "TLS keys" memory region.
In some cases a TLS socket can hang with pending data available in the NIC that is not delivered to the host. As a workaround, TLS sockets are more aggressive about sending CPL_RX_DATA_ACK messages anytime that any data is read from a TLS socket. In addition, a fallback timer will periodically send CPL_RX_DATA_ACK messages to the NIC for connections that are still in the handshake phase. Once the connection has finished the handshake and programmed RX keys via the socket option, the timer is stopped.
A new function select_ulp_mode() is used to determine what sub-mode a given TOE socket should use (plain TOE, DDP, or TLS). The existing set_tcpddp_ulp_mode() function has been renamed to set_ulp_mode() and handles initialization of TLS-specific state when necessary in addition to DDP-specific state.
Since TLS sockets do not receive individual TCP segments but always receive full TLS records, they can receive more data than is available in the current window (e.g. if a 16k TLS record is received but the socket buffer is itself 16k). To cope with this, just drop the window to 0 when this happens, but track the overage and "eat" the overage as it is read from the socket buffer, without opening the window (or adding rx_credits) for the overage bytes.
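The overage accounting described above can be sketched as follows. This is a minimal model, not the t4_tom code: struct tls_rx_window, rx_record() and rx_read() are hypothetical names, and the real driver tracks this state in the toepcb and socket buffer.

```c
#include <stddef.h>

/* Hypothetical receive-window bookkeeping for a TLS TOE socket. */
struct tls_rx_window {
	size_t window;	/* bytes the peer may still send */
	size_t overage;	/* bytes received beyond the advertised window */
};

/* Account for a full TLS record of 'len' bytes delivered by the NIC. */
static void
rx_record(struct tls_rx_window *w, size_t len)
{
	if (len > w->window) {
		/* Record exceeds the window: clamp to 0, remember the excess. */
		w->overage += len - w->window;
		w->window = 0;
	} else
		w->window -= len;
}

/*
 * Account for 'len' bytes read by the application.  Returns the number of
 * bytes by which the window may be reopened (the rx credits to return).
 */
static size_t
rx_read(struct tls_rx_window *w, size_t len)
{
	size_t eaten = len < w->overage ? len : w->overage;

	/* Bytes that pay off the overage do not reopen the window. */
	w->overage -= eaten;
	w->window += len - eaten;
	return (len - eaten);
}
```

With an 8k window and a 16k record, the window drops to 0 with 8k of overage; the first 8k read by the application returns no credits, and only later reads reopen the window.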
330946: Remove TLS-related inlines from t4_tom.h to fix iw_cxgbe(4) build.
- Remove the one use of is_tls_offload() and the function. AIO special handling only needs to be disabled when a TOE socket is actively doing TLS offload on transmit. The TOE socket's mode (which affects receive operation) doesn't matter, so remove the check for the socket's mode and only check if a TOE socket has TLS transmit keys configured to determine if an AIO write request should fall back to the normal socket handling instead of the TOE fast path.
- Move can_tls_offload() into t4_tls.c. It is not used in critical paths, so inlining isn't that important. Change return type to bool while here.
330947: Fix the check for an empty send socket buffer on a TOE TLS socket.
Compare sbavail() with the cached sb_off of already-sent data instead of always comparing with zero. This will correctly close the connection and send the FIN if the socket buffer contains some previously-sent data but no unsent data.
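A minimal sketch of the corrected check, assuming 'avail' stands in for the result of sbavail() on the send buffer and 'sb_off' for the cached offset of already-sent data. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model of the check: data already handed to the NIC can stay
 * in the send buffer until it is ACKed, so "nothing left to send" means
 * avail == sb_off rather than avail == 0.
 */
static bool
send_buffer_drained(size_t avail, size_t sb_off)
{
	return (avail == sb_off);
}
```

A buffer holding 4096 bytes that have all been sent counts as drained, so the FIN can go out; comparing against zero would have left the connection open.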
331649: Use the offload transmit queue to set flags on TLS connections.
Requests to modify the state of TLS connections need to be sent on the same queue as TLS record transmit requests to ensure ordering.
However, in order to use the offload transmit queue in t4_set_tcb_field(), the function needs to be updated to do proper flow control / credit management when queueing a request to an offload queue. This required passing a pointer to the toepcb itself to this function, so while here remove the 'tid' and 'iqid' parameters and obtain those values from the toepcb in t4_set_tcb_field() itself.
333068: Use the correct key address when renegotiating the transmit key.
Previously, get_keyid() was returning the address of the receive key instead of the transmit key when renegotiating the transmit key. This could either hang the card (if a connection was only offloading TLS TX and thus had a receive key address of -1) or cause the connection to fail by overwriting the wrong key (if both RX and TX TLS were offloaded).
333810: Be more robust against garbage input on a TOE TLS TX socket.
If a socket is closed or shutdown and a partial record (or what appears to be a partial record) is waiting in the socket buffer, discard the partial record and close the connection rather than waiting forever for the rest of the record.
337722: Whitespace nit in t4_tom.h
340466: Move the TLS key map into the adapter softc so non-TOE code can use it.
340468: Change the quantum for TLS key addresses to 32 bytes.
The addresses passed when reading and writing keys are always shifted right by 5 as the memory locations are addressed in 32-byte chunks, so the quantum needs to be 32, not 8.
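The effect of the 5-bit shift can be sketched as below. The macro and function names are hypothetical; only the shift by 5 and the 32-byte chunk size come from the commit message.

```c
#include <stdint.h>

#define TLS_KEY_QUANTUM	32u	/* key memory is addressed in 32-byte chunks */
#define TLS_KEY_SHIFT	5	/* log2(TLS_KEY_QUANTUM) */

/* Convert a byte address in the key region to the chunk index that is sent. */
static uint32_t
key_addr_to_index(uint32_t addr)
{
	return (addr >> TLS_KEY_SHIFT);
}

/* True when the shift discards no low-order address bits. */
static int
key_addr_is_aligned(uint32_t addr)
{
	return ((addr & (TLS_KEY_QUANTUM - 1)) == 0);
}
```

With an 8-byte quantum the allocator could hand out addresses 0 and 8, which both shift to index 0 and would alias the same key slot.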
340469: Remove bogus roundup2() of the key programming work request header.
The key context is always placed immediately after the work request header. The total work request length has to be rounded up to a multiple of 16, however.
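As a sketch of the length calculation, assuming hypothetical header and key context sizes: only the total is padded, never the header on its own. ROUNDUP2 mirrors the idea of the kernel's roundup2() macro and key_wr_len() is an illustrative name.

```c
#include <stddef.h>

/* Same idea as the kernel's roundup2(); 'y' must be a power of two. */
#define ROUNDUP2(x, y)	(((x) + ((y) - 1)) & ~((size_t)(y) - 1))

/* Total length of a key programming work request. */
static size_t
key_wr_len(size_t hdr_len, size_t key_ctx_len)
{
	/* No padding between header and key context; pad only the total. */
	return (ROUNDUP2(hdr_len + key_ctx_len, 16));
}
```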
340473: Restore the <sys/vmem.h> header to fix build of cxgbe(4) TOM.
vmem's are not just used for TLS memory in TOM and the #include actually predates the TLS code so should not have been removed when the TLS vmem moved in r340466.
Sponsored by: Chelsio Communications |
345629 |
28-Mar-2019 |
np |
MFC r329788 (by jhb@):
Bring in additional constants and message fields for TLS-related messages.
Sponsored by: Chelsio Communications |
345040 |
11-Mar-2019 |
jhb |
MFC 318429,318967,319721,319723,323600,323724,328353-328361,330042,343056: Add a driver for the Chelsio T6 crypto accelerator engine.
Note that with the set of commits in this batch, no additional tunables are needed to use the driver once it is loaded.
318429: Add a driver for the Chelsio T6 crypto accelerator engine.
The ccr(4) driver supports use of the crypto accelerator engine on Chelsio T6 NICs in "lookaside" mode via the opencrypto framework.
Currently, the driver supports AES-CBC, AES-CTR, AES-GCM, and AES-XTS cipher algorithms as well as the SHA1-HMAC, SHA2-256-HMAC, SHA2-384-HMAC, and SHA2-512-HMAC authentication algorithms. The driver also supports chaining one of AES-CBC, AES-CTR, or AES-XTS with an authentication algorithm for encrypt-then-authenticate operations.
Note that this driver is still under active development and testing and may not yet be ready for production use. It does pass the tests in tests/sys/opencrypto with the exception that the AES-GCM implementation in the driver does not yet support requests with a zero byte payload.
To use this driver currently, the "uwire" configuration must be used along with explicitly enabling support for lookaside crypto capabilities in the cxgbe(4) driver. These can be done by setting the following tunables before loading the cxgbe(4) driver:
hw.cxgbe.config_file=uwire
hw.cxgbe.cryptocaps_allowed=-1
318967: Fail large requests with EFBIG.
The adapter firmware in general does not accept PDUs larger than 64k - 1 bytes in size. Sending crypto requests larger than this size results in hangs or incorrect output, so reject them with EFBIG. For requests chaining an AES cipher with an HMAC, the firmware appears to require slightly smaller requests (around 512 bytes).
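A minimal sketch of the size check, assuming a single hypothetical limit constant. The tighter limit for chained cipher plus HMAC requests is not modeled here because the log does not give its exact value.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_REQUEST_SIZE	65535	/* 64k - 1, per the commit message */

/* Returns 0 if the request may be submitted, EFBIG if it is too large. */
static int
check_request_size(size_t len)
{
	if (len > MAX_REQUEST_SIZE)
		return (EFBIG);
	return (0);
}
```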
319721: Add explicit handling for requests with an empty payload.
- For HMAC requests, construct a special input buffer to request an empty hash result.
- For plain cipher requests and requests that chain an AES cipher with an HMAC, fail with EINVAL if there is no cipher payload. If needed in the future, chained requests that only contain AAD could be serviced as HMAC-only requests.
- For GCM requests, the hardware does not support generating the tag for an AAD-only request. Instead, complete these requests synchronously in software on the assumption that such requests are rare.
319723: Fix the software fallback for GCM to validate the existing tag for decrypts.
323600: Fix some incorrect sysctl pointers for some error stats.
The bad_session, sglist_error, and process_error sysctl nodes were returning the value of the pad_error node instead of the appropriate error counters.
323724: Enable support for lookaside crypto operations by default.
This permits ccr(4) to be used with the default firmware configuration file.
328353: Always store the IV in the immediate portion of a work request.
Combined authentication-encryption and GCM requests already stored the IV in the immediate explicitly. This extends this behavior to block cipher requests to work around a firmware bug. While here, simplify the AEAD and GCM handlers to not include always-true conditions.
328354: Always set the IV location to IV_NOP.
The firmware ignores this field in the FW_CRYPTO_LOOKASIDE_WR work request.
328355: Reject requests with AAD and IV larger than 511 bytes.
The T6 crypto engine's control messages only support a total AAD length (including the prefixed IV) of 511 bytes. Reject requests with large AAD rather than returning incorrect results.
328356: Don't discard AAD and IV output data for AEAD requests.
The T6 can hang when processing certain AEAD requests if the request sets a flag asking the crypto engine to discard the input IV and AAD rather than copying them into the output buffer. The existing driver always discards the IV and AAD as we do not need it. As a workaround, allocate a single "dummy" buffer when the ccr driver attaches and change all AEAD requests to write the IV and AAD to this scratch buffer. The contents of the scratch buffer are never used (similar to "bogus_page"), and it is ok for multiple in-flight requests to share this dummy buffer.
328357: Fail crypto requests when the resulting work request is too large.
Most crypto requests will not trigger this condition, but a request with a highly-fragmented data buffer (and a resulting "large" S/G list) could trigger it.
328358: Clamp DSGL entries to a length of 2KB.
This works around an issue in the T6 that can result in DMA engine stalls if an error occurs while processing a DSGL entry with a length larger than 2KB.
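Clamping can be sketched as splitting one contiguous segment into entries of at most 2KB. The function and constant names are hypothetical and the real DSGL entry layout is not shown.

```c
#include <stddef.h>

#define DSGL_MAX_ENTRY_LEN	2048	/* 2KB clamp from the commit message */

/* Number of DSGL entries needed for one contiguous segment. */
static size_t
dsgl_entries_for(size_t seg_len)
{
	return ((seg_len + DSGL_MAX_ENTRY_LEN - 1) / DSGL_MAX_ENTRY_LEN);
}

/* Fill lens[] with clamped entry lengths; returns the number written. */
static size_t
dsgl_split(size_t seg_len, size_t *lens, size_t max_entries)
{
	size_t n = 0;

	while (seg_len > 0 && n < max_entries) {
		size_t chunk = seg_len < DSGL_MAX_ENTRY_LEN ?
		    seg_len : DSGL_MAX_ENTRY_LEN;

		lens[n++] = chunk;
		seg_len -= chunk;
	}
	return (n);
}
```

A 5000-byte segment becomes three entries of 2048, 2048 and 904 bytes. The extra entries make the destination S/G list longer, which is one way a request can grow.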
328359: Expand the software fallback for GCM to cover more cases.
- Extend ccr_gcm_soft() to handle requests with a non-empty payload. While here, switch to allocating the GMAC context instead of placing it on the stack since it is over 1KB in size.
- Allow ccr_gcm() to return a special error value (EMSGSIZE) which triggers a fallback to ccr_gcm_soft(). Move the existing empty payload check into ccr_gcm() and change a few other cases (e.g. large AAD) to fallback to software via EMSGSIZE as well.
- Add a new 'sw_fallback' stat to count the number of requests processed via the software fallback.
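The EMSGSIZE fallback pattern can be sketched as below. Everything here is a hypothetical stand-in: gcm_hw() and gcm_soft() only model the control flow of ccr_gcm() and ccr_gcm_soft(), and the AAD threshold is an illustrative value borrowed from the 511-byte limit mentioned for r328355.

```c
#include <errno.h>
#include <stddef.h>

#define HYPOTHETICAL_MAX_AAD	511	/* illustrative threshold only */

struct gcm_stats {
	unsigned long hw;		/* requests handled by the engine */
	unsigned long sw_fallback;	/* requests completed in software */
};

/* Stand-in for the hardware path: declines cases it cannot handle. */
static int
gcm_hw(size_t payload_len, size_t aad_len)
{
	if (payload_len == 0 || aad_len > HYPOTHETICAL_MAX_AAD)
		return (EMSGSIZE);
	return (0);
}

/* Stand-in for the software path: always able to complete the request. */
static int
gcm_soft(size_t payload_len, size_t aad_len)
{
	(void)payload_len;
	(void)aad_len;
	return (0);
}

/* Try hardware first; EMSGSIZE is the private signal to fall back. */
static int
gcm_dispatch(struct gcm_stats *st, size_t payload_len, size_t aad_len)
{
	int error = gcm_hw(payload_len, aad_len);

	if (error == EMSGSIZE) {
		st->sw_fallback++;
		return (gcm_soft(payload_len, aad_len));
	}
	if (error == 0)
		st->hw++;
	return (error);
}
```

Using a dedicated errno value keeps the fallback decision in one place, so the hardware path only has to report that it declined a request.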
328360: Don't read or generate an IV until all error checking is complete.
In particular, this avoids edge cases where a generated IV might be written into the output buffer even though the request is failed with an error.
328361: Store IV in output buffer in GCM software fallback when requested.
Properly honor the lack of the CRD_F_IV_PRESENT flag in the GCM software fallback case for encryption requests.
330042: Don't overflow the ipad[] array when clearing the remainder.
After the auth key is copied into the ipad[] array, any remaining bytes are cleared to zero (in case the key is shorter than one block size). The full block size was used as the length of the zero rather than the size of the remaining ipad[]. In practice this overflow was harmless as it could only clear bytes in the following opad[] array which is initialized with a copy of ipad[] in the next statement.
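The corrected clearing step can be sketched as follows, assuming a 64-byte block size and a hypothetical struct that places opad[] directly after ipad[] as the commit message describes. The XOR constants are the standard HMAC ones.

```c
#include <stddef.h>
#include <string.h>

#define HMAC_BLOCK_LEN	64	/* block size of SHA-1 and SHA2-256 */

struct hmac_pads {
	unsigned char ipad[HMAC_BLOCK_LEN];
	unsigned char opad[HMAC_BLOCK_LEN];
};

/* Caller guarantees klen <= HMAC_BLOCK_LEN (longer keys are hashed first). */
static void
init_pads(struct hmac_pads *p, const unsigned char *key, size_t klen)
{
	size_t i;

	memcpy(p->ipad, key, klen);
	/* Clear only the rest of ipad[], not a full block past ipad + klen. */
	memset(p->ipad + klen, 0, HMAC_BLOCK_LEN - klen);
	memcpy(p->opad, p->ipad, HMAC_BLOCK_LEN);
	for (i = 0; i < HMAC_BLOCK_LEN; i++) {
		p->ipad[i] ^= 0x36;
		p->opad[i] ^= 0x5c;
	}
}
```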
343056: Reject new sessions if the necessary queues aren't initialized.
ccr reuses the control queue and first rx queue from the first port on each adapter. The driver cannot send requests until those queues are initialized. Refuse to create sessions for now if the queues aren't ready. This is a workaround until cxgbe allocates one or more dedicated queues for ccr.
Relnotes: yes Sponsored by: Chelsio Communications |
344933 |
08-Mar-2019 |
jhb |
MFC 344671: Don't assume all children of a nexus are ports.
Specifically, ccr(4) devices are also children of cxgbe nexus devices. Rather than making assumptions about the child device's softc, walk the list of ports from the nexus' softc to determine if a child is a port in t4_child_location_str(). This fixes a panic when detaching a ccr device. |
344858 |
06-Mar-2019 |
jhb |
MFC 341098: Add read-only sysctls for all tunables in the cxgbe(4) driver. |
344856 |
06-Mar-2019 |
jhb |
MFC 330882: Simplify error handling in t4_tom.ko module loading.
- Change t4_ddp_mod_load() to return void instead of always returning success. This avoids having to pretend to have proper support for unloading when only part of t4_tom_mod_load() has run.
- If t4_register_uld() fails, don't invoke t4_tom_mod_unload() directly. The module handling code in the kernel invokes MOD_UNLOAD on a module whose MOD_LOAD fails with an error already. |
343404 |
24-Jan-2019 |
np |
MFC r342603: cxgbe(4): Attach to two T540 variants.
Sponsored by: Chelsio Communications |
343061 |
15-Jan-2019 |
jhb |
MFC 340206: Treat the memory lengths for CHELSIO_T4_GET_MEM as unsigned.
Previously attempts to read the MC region were failing since the length was greater than 2^31. |
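The log does not show the driver code, so the following is only a hypothetical model of the failure mode: a length above 2^31 is negative when held in a signed 32-bit variable, so a sanity check rejects it, while the unsigned form accepts it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Signed form: a length above 2^31 arrives here as a negative number. */
static bool
range_ok_signed(int32_t off, int32_t len, int64_t region_size)
{
	return (off >= 0 && len > 0 && (int64_t)off + len <= region_size);
}

/* Unsigned form: widen before adding so the sum cannot wrap. */
static bool
range_ok_unsigned(uint32_t off, uint32_t len, uint64_t region_size)
{
	return (len > 0 && (uint64_t)off + len <= region_size);
}
```

The value 0x90000000 reads as -1879048192 when interpreted as a signed 32-bit integer, which is why the signed check fails for a region larger than 2GB.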
342751 |
04-Jan-2019 |
jhb |
MFC 340022: Add support for port unit wiring to cxgbe(4).
- Add a bus_child_location_str method to the nexus drivers that prints out 'port=N' as the location string exported via devinfo and the '%location' sysctl node.
- We can't use a bus_hint_device_unit to wire the unit numbers of devices with a fixed devclass as the device gets assigned a unit in make_device() before the device creator can set softc, etc. Instead, when adding a child device, use a helper function much like a bus_hint_device_unit method to look for wiring hints or to return -1 to let the system choose a unit number. This function requires an "at" hint for the port pointing to the nexus device and a "port" hint listing the port number. For example:
hint.cxl.4.at="t5nex0"
hint.cxl.4.port="0"
wires cxl4 to the first port on the t5nex0 adapter. |
342750 |
04-Jan-2019 |
jhb |
MFC 340021: Assert that reclaim_tx_descs() is always making forward progress. |
342583 |
29-Dec-2018 |
jhb |
MFC 340304: Use tcp_state_change() in the cxgbe(4) TOE module.
r254889 added tcp_state_change() as a centralized place to log state changes in TCP connections for DTrace. r294869 and r296881 took advantage of this central location to manage per-state counters. However, TOE sockets were still performing some (but not all) state change updates via direct assignments to t_state. This resulted in state counters underflowing when TOE was in use. Fix by using tcp_state_change() when changing a TOE connection's state. |
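Why direct assignments make the counters underflow can be seen in a small model. The names below are hypothetical; conn_state_change() only mimics the bookkeeping role that the log attributes to tcp_state_change().

```c
enum { ST_CLOSED, ST_SYN_SENT, ST_ESTABLISHED, ST_NSTATES };

/* One counter per state, as maintained for the real per-state statistics. */
static unsigned long state_count[ST_NSTATES];

struct conn {
	int t_state;
};

/* A new connection starts in CLOSED and is counted there. */
static void
conn_attach(struct conn *c)
{
	c->t_state = ST_CLOSED;
	state_count[ST_CLOSED]++;
}

/* Centralized transition: leave the old state, enter the new one. */
static void
conn_state_change(struct conn *c, int newstate)
{
	state_count[c->t_state]--;
	c->t_state = newstate;
	state_count[newstate]++;
}
```

If some code path instead wrote c->t_state directly, the next call to conn_state_change() would decrement a counter that was never incremented, and the unsigned counter would wrap.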
341481 |
04-Dec-2018 |
vmaffione |
MFC r341145
cxgbe: revert r309725
After the fix contained in r341144, cxgbe no longer needs to set the IFCAP_NETMAP flag manually.
Reviewed by: np Approved by: gnn (mentor) Differential Revision: https://reviews.freebsd.org/D17987 |
341477 |
04-Dec-2018 |
vmaffione |
MFC r339639
netmap: align codebase to the current upstream (sha 8374e1a7e6941)
Changelist:
- Move large parts of VALE code to a new file and header netmap_bdg.[ch]. This is useful to reuse the code within upcoming projects.
- Improvements and bug fixes to pipes and monitors.
- Introduce nm_os_onattach(), nm_os_onenter() and nm_os_onexit() to handle differences between FreeBSD and Linux.
- Introduce some new helper functions to handle more host rings and fake rings (netmap_all_rings(), netmap_real_rings(), ...)
- Added new sysctl to enable/disable hw checksum in emulated netmap mode.
- nm_inject: add support for NS_MOREFRAG
Approved by: gnn (mentor) Differential Revision: https://reviews.freebsd.org/D17364 |
339404 |
17-Oct-2018 |
np |
MFC r336159:
cxgbe(4): Add a sysctl to report the chip's microprocessor's load averages. This works with debug or custom firmwares only.
sysctl dev.<nexus>.<instance>.loadavg
sysctl dev.t6nex.0.loadavg |
339403 |
17-Oct-2018 |
np |
MFC r335352:
cxgbe(4): Some mailbox commands require access to the Tx pipeline and can time out if it's backed up due to a non-stop deluge of PAUSE frames from a misbehaving peer. Detect this situation and toggle MPS TxEn to allow forward progress. |
339402 |
17-Oct-2018 |
np |
MFC r334987:
cxgbe(4): Remove homemade version of htobe32 from the driver.
It was needed only for ia64 where it was implemented as a call to bswapXX, which was always a real function. htobeXX with a constant argument is calculated at compile-time everywhere else. |
339401 |
17-Oct-2018 |
np |
MFC r320426:
cxgbe/t4_tom: Do not include space taken by the TCP timestamp option in the "effective MSS" for the connection. The chip expects it this way. |
339400 |
17-Oct-2018 |
np |
MFC r338254:
cxgbe(4): Use fcmpset instead of cmpset when appropriate. |
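The difference matters in retry loops. FreeBSD's atomic_fcmpset family writes the observed value back into the caller's expected value on failure, so the loop does not need to reload it. The kernel primitives are not usable in a standalone program, so this sketch substitutes C11's atomic_compare_exchange_weak(), which has the same write-back behavior; saturating_add() is a hypothetical example and not driver code.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Add 'delta' to *p, saturating at 'limit'.  Assumes *p never exceeds
 * 'limit'.  Returns the value that was stored.
 */
static uint32_t
saturating_add(_Atomic uint32_t *p, uint32_t delta, uint32_t limit)
{
	uint32_t old = atomic_load(p);
	uint32_t next;

	do {
		next = (limit - old < delta) ? limit : old + delta;
		/* On failure 'old' is refreshed with the current value of *p. */
	} while (!atomic_compare_exchange_weak(p, &old, next));
	return (next);
}
```

With a plain cmpset-style primitive the loop body would need an explicit reload of *p before every retry.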
339399 |
17-Oct-2018 |
np |
MFC r338924:
cxgbe(4): Link related changes.
- Switch to using 32b port/link capabilities in the driver. The 32b format is used internally by firmwares > 1.16.45.0 and the driver will now interact with the firmware in its native format, whether it's 16b or 32b. Note that the 16b format doesn't have room for 50G, 200G, or 400G speeds.
- Add a bit in the pause_settings knobs to allow negotiated PAUSE settings to override manual settings.
- Ensure that manual link settings persist across an administrative down/up as well as transceiver unplug/replug.
- Remove unused is_*G_port() functions.
Sponsored by: Chelsio Communications |
339398 |
17-Oct-2018 |
np |
MFC r336042:
cxgbe(4): Assume that any unknown flash on the card is 4MB and has 64KB sectors, instead of refusing to attach to the card.
Submitted by: Casey Leedom @ Chelsio Sponsored by: Chelsio Communications |
339397 |
17-Oct-2018 |
np |
MFC r333139:
cxgbe(4): Destroy the cdev before disabling interrupts in driver detach.
Filter work requests are submitted in the nexus cdev's ioctl which then blocks waiting for a reply. If driver detach runs in this state and disables interrupts the ioctl will never complete and detach will hang in destroy_cdev. |
339396 |
17-Oct-2018 |
np |
MFC r325840, r327811, and r329701.
r325840: CXGBE: fix big-endian behaviour
The setbit/clearbit pair casts the bitfield pointer to uint8_t*, which effectively treats its contents as a little-endian variable. The ffs() function accepts an int as its parameter, which is big-endian. Use uint8_t here to avoid the mismatch, as we have only 4 doorbells.
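The mismatch and the fix can be sketched as below. SETBIT and CLRBIT are local macros modeled on the byte-addressed helpers in <sys/param.h>, and first_doorbell() is a hypothetical name. With an int bitfield on a big-endian machine, setting bit 2 through a byte pointer would touch the most significant byte of the int, so ffs() would report a position far from 3.

```c
#include <stdint.h>
#include <strings.h>	/* ffs() */

/* Byte-addressed bit helpers: bit i lives in byte i / 8. */
#define SETBIT(a, i)	(((uint8_t *)(a))[(i) / 8] |= 1 << ((i) % 8))
#define CLRBIT(a, i)	(((uint8_t *)(a))[(i) / 8] &= ~(1 << ((i) % 8)))

/*
 * With a uint8_t bitfield, the byte modified by SETBIT is exactly the value
 * that ffs() examines, regardless of host byte order.
 * Returns the lowest set doorbell index, or -1 if none is set.
 */
static int
first_doorbell(uint8_t doorbells)
{
	return (ffs(doorbells) - 1);
}
```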
Submitted by: Wojciech Macek <wma@freebsd.org> Reviewed by: np Obtained from: Semihalf Sponsored by: QCM Technologies Differential revision: https://reviews.freebsd.org/D13084
r327811: CXGBE: fix get_filt to be endianness-aware
Unconditional 32-bit shift is not endianness-safe. Modify the logic to work both on LE and BE.
Submitted by: Wojciech Macek <wma@freebsd.org> Reviewed by: np Obtained from: Semihalf Sponsored by: IBM, QCM Technologies Differential revision: https://reviews.freebsd.org/D13102
r329701: CXGBE: implement prefetch on non-Intel architectures
Submitted by: Michal Stanek <mst@semihalf.com> Obtained from: Semihalf Reviewed by: np, pdk@semihalf.com Sponsored by: IBM, QCM Technologies Differential revision: https://reviews.freebsd.org/D14452 |
339395 |
17-Oct-2018 |
np |
MFC r320419, r337679, r338366, and r338652.
r320419: cxgbe/iw_cxgbe: Disable debug output by default. The help text for the sysctl already says that the default is 0.
r337679: Remove unused stuff from iw_cxgbe.h
r338366: cxgbe/iw_cxgbe: Fix iWARP RDMA + VIMAGE operation by setting the VNET properly in a couple of places in the driver.
r338652: cxgbe/iw_cxgbe: Fix reported build breakage when the kernel configuration has "device cxgbe" but no VIMAGE.
Sponsored by: Chelsio Communications |
339389 |
16-Oct-2018 |
np |
MFC r327254, r327904, and r328994.
r327254: cxgbe/iw_cxgbe: Fix iWARP over VLANs (catch up with r326169).
r327904: cxgbe/iw_cxgbe: Remove duplicates to fix compilation with recent gcc.
r328994: iw_cxgbe: Remove declaration of a function that no longer exists.
Sponsored by: Chelsio Communications |
336667 |
24-Jul-2018 |
np |
cxgbe/iw_cxgbe: Do not call soaccept twice on the same socket.
This is a direct commit to stable/11.
Reported by: Sai Tallamraju @ Netapp Sponsored by: Chelsio Communications |
335561 |
22-Jun-2018 |
np |
cxgbe(4): Determine early in the ioctl whether it is allowed to sleep or not, instead of always starting a non-sleepable operation and re-adjusting later. This ensures that an operation that is allowed to sleep (ifconfig up/down) never fails with EBUSY on the initial attempt to start a synchronized operation.
This is a direct commit to stable/11. The driver ioctl is always allowed to sleep in head.
Sponsored by: Chelsio Communications |
334562 |
03-Jun-2018 |
np |
MFC r333650, r333652, r333682, r334406, r334409-r334410, and r334489.
r333650: cxgbe(4): Claim some more T5 and T6 boards.
r333652: cxgbe(4): Add support for two more flash parts.
r333682: cxgbe(4): Fall back to a failsafe configuration built into the firmware if an error is reported while pre-processing the configuration file that the driver attempted to use.
Also, allow the user to explicitly use the built-in configuration with hw.cxgbe.config_file="built-in"
r334406: cxgbe(4): Consider all supported speeds when building the ifmedia list for a port. Fix other related issues while here:
- Require port lock for access to link_config.
- Allow 100Mbps operation by tracking the speed in Mbps. Yes, really.
- New port flag to indicate that the media list is immutable. It will be used in future refinements.
This also fixes a bug where the driver reports incorrect media with recent firmwares.
r334409: cxgbe(4): Implement ifm_change callback.
r334410: cxgbe(4): Use ifm for ifmedia just like the rest of the kernel.
No functional change.
r334489: cxgbe(4): Include full duplex mediaopt in media that can be reported as active. Always report full duplex in active media.
Sponsored by: Chelsio Communications |
333642 |
15-May-2018 |
np |
MFC r331340, r331342, r331472, r332050, r333276, r333448:
r331340: cxgbe(4): Tunnel congestion drops on a port should be cleared when the stats for that port are cleared.
r331342: cxgbe(4): Do not read MFG diags information from custom boards.
r331472: cxgbe(4): Always initialize requested_speed to a valid value.
This fixes an avoidable EINVAL when the user tries to disable AN after the port is initialized but l1cfg doesn't have a valid speed to use.
r332050: cxgbe(4): Always display an error message if SIOCSIFFLAGS will leave IFF_UP and IFF_DRV_RUNNING out of sync. ifhwioctl in the kernel pays no attention to the return code from the driver ioctl during SIOCSIFFLAGS so these messages are the only indication that the ioctl was called but failed.
r333276: cxgbe(4): Update all firmwares to 1.19.1.0.
r333448: cxgbe(4): Disable write-combined doorbells by default.
This had been the default behavior but was changed accidentally as part of the recent iw_cxgbe+OFED overhaul. Fix another bug in that change while here: the global knob affects all the adapters in the system and should be left alone by per-adapter code.
Approved by: re@ (marius@) Sponsored by: Chelsio Communications |
332288 |
08-Apr-2018 |
brooks |
MFC r331797:
Use an accessor function to access ifr_data.
This fixes 32-bit compat (no ioctl command definitions are required as struct ifreq is the same size).
Reviewed by: kib Obtained from: CheriBSD Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D14900 |
331784 |
30-Mar-2018 |
hselasky |
MFC r330508: Optimize ibcore RoCE address handle creation from user-space.
Creating a UD address handle from user-space or from the kernel-space, when the link layer is ethernet, requires resolving the remote L3 address into a L2 address. Doing this from the kernel is easy because the required ARP(IPv4) and ND6(IPv6) address resolving APIs are readily available. In userspace such an interface does not exist and kernel help is required.
It should be noted that in an IP-based GID environment, the GID itself does not contain all the information needed to resolve the destination IP address. For example, information like VLAN ID and SCOPE ID is not part of the GID and must be fetched from the GID attributes. Therefore a source GID should always be referred to as a GID index. Instead of going through various racy steps to obtain information about the GID attributes from user-space, this is now all done by the kernel.
This patch optimises the L3 to L2 address resolving using the existing create address handle uverbs interface, retrieving back the L2 address as an additional user-space information structure.
This commit combines the following Linux upstream commits:
IB/core: Let create_ah return extended response to user
IB/core: Change ib_resolve_eth_dmac to use it in create AH
IB/mlx5: Make create/destroy_ah available to userspace
IB/mlx5: Use kernel driver to help userspace create ah
IB/mlx5: Report that device has udata response in create_ah
Sponsored by: Mellanox Technologies |
331769 |
30-Mar-2018 |
hselasky |
MFC r303505, r303506, r303512, r303513, r303646, r320418, r323082, r326169, r326563, r326649, r326716, r326764, r326765 and r329222:
RoCE/infiniband/iWarp upgrade to Linux 4.9 for kernel and userspace. This commit merges projects/bsd_rdma_4_9 to 11-stable.
Compatibility wrappers have been made for existing 11-stable ibcore APIs, including ib_reg_phys_mr(). Refer to "sys/ofed/include/rdma/ib_verbs_compat.h" for more information.
The iw_cxgb driver has not been updated and has been disconnected from the build.
Sponsored by: Mellanox Technologies
MFC r326169 and r326563: RoCE/infiniband upgrade to Linux v4.9 for kernel and userspace.
List of kernel sources used:
============================
1) kernel sources were cloned from git://github.com/torvalds/linux.git Top commit 69973b830859bc6529a7a0468ba0d80ee5117826 - tag: v4.9, linux-4.9
2) krping was cloned from https://github.com/larrystevenwise/krping Top commit 292a2f1abf0348285e678a82264740d52e4dcfe4
List of userspace sources used:
===============================
1) rdma-core was cloned from https://github.com/linux-rdma/rdma-core.git Top commit d65138ef93af30b3ea249f3a84aa6a24ba7f8a75
2) OpenSM was cloned from git://git.openfabrics.org/~halr/opensm.git Top commit 85f841cf209f791c89a075048a907020e924528d
3) libibmad was cloned from git://git.openfabrics.org/~iraweiny/libibmad.git Tag 1.3.13 with some additional patches from Mellanox.
4) infiniband-diags was cloned from git://git.openfabrics.org/~iraweiny/infiniband-diags.git Tag 1.6.7 with some additional patches from Mellanox.
NOTES:
======
1) The mthca driver has been removed from userspace.
2) All GPLv2 only sources have been removed and where applicable rewritten from scratch under a BSD license.
3) List of fully supported drivers in userspace and kernel:
a) iw_cxgbe (Chelsio)
b) mlx4ib (Mellanox)
c) mlx5ib (Mellanox)
4) WITH_OFED=YES is still required by make in order to build OFED userspace and kernel code.
5) Full support has been added for routable RoCE, RoCE v2.
MFC r326649: Disconnect OFED after r326169 broke all DIRDEPS support for it.
MFC r326716: Correctly define the unordered_map namespace in ofed/libibnetdisc.
This should fix ofed/libibnetdisc compilation with C compilers other than clang and GCC v4.2.1.
Submitted by: kib Sponsored by: Mellanox Technologies
MFC r326764: ofed: Remove duplicated symbols from the version file.
ld.bfd accepts multiple listing of the same symbol in the version script. lld is stricter and errors out. Since arm64 and sometimes amd64 use lld, we should correct this cosmetic issue.
Sponsored by: Mellanox Technologies Reviewed by: hselasky Differential revision: https://reviews.freebsd.org/D13329
MFC r326765: ofed: Define barriers for mips and arm.
I used the strongest barriers available on the architectures, so if future analysis shows that they are excessive, the barriers could be relaxed. Still, it is unlikely that it is meaningful to run IB on 32bit ARM or current MIPS machines, so the change is to make WITH_OFED pass tinderbox.
Sponsored by: Mellanox Technologies Reviewed by: hselasky Differential revision: https://reviews.freebsd.org/D13329
MFC r303505: sdp: Use an mbufq for received control packets.
This is simpler than the hand-rolled queue, and fixes a use-after-free.
Sponsored by: EMC / Isilon Storage Division
MFC r303506: sdp: Destroy the PCB lock before freeing to the zone.
Sponsored by: EMC / Isilon Storage Division
MFC r303512: sdp: Use malloc(9) instead of the Linux compat layer.
SDP transmit and receive rings are always created in a sleepable context, so we can use M_WAITOK and remove error checks.
Sponsored by: EMC / Isilon Storage Division
MFC r303513: sdp: Destroy the RDMA ID after destroying the connection's queue pair.
This is the ordering documented by rdma_destroy_qp(). Also add a useful KASSERT to sdp_pcbfree().
Sponsored by: EMC / Isilon Storage Division
MFC r303646: ipoib: Bound the number of egress mbufs buffered during pathrec lookups.
In pathological situations where the master subnet manager becomes unresponsive for an extended period, we may otherwise end up queuing all of the system's mbufs while waiting for a response to a path record lookup.
This addresses the same issue as commit 1e85b806f9 in Linux.
Reviewed by: cem, ngie Sponsored by: EMC / Isilon Storage Division
MFC r329222: Import the mthca kernel side infiniband driver from Linux 4.9 and fix compilation under FreeBSD. The mthca driver was temporarily removed as part of the Linux 4.9 RoCE/infiniband upgrade.
Top commit in Linux source tree: 69973b830859bc6529a7a0468ba0d80ee5117826
Sponsored by: Mellanox Technologies
MFC r320418. Note that the socket lock _is_ the same as so_rcv's lock in 11 and this is a no-op in this branch.
Sponsored by: Chelsio Communications
MFC r323082: cxgbe/iw_cxgbe: Set TCP_NODELAY before initiating connection so that t4_tom picks it up right away. This is less work than waiting for the connection to be established before applying the setting.
Sponsored by: Chelsio Communications |
331722 |
29-Mar-2018 |
eadler |
Revert r330897:
This was intended to be a non-functional change. It wasn't. The commit message was thus wrong. In addition it broke arm, and merged crypto related code.
Revert with prejudice.
This revert skips files touched in r316370 because that commit has since been MFCed. This revert also skips files that require $FreeBSD$ property changes.
Thank you to those who helped me get out of this mess including but not limited to gonzo, kevans, rgrimes.
Requested by: gjb (re) |
331647 |
27-Mar-2018 |
jhb |
MFC 318387: Add support for child devices that aren't ports.
Invoke any identify routines of child drivers during attach before attaching children, and delete any remaining devices after deleting ports.
Sponsored by: Chelsio Communications |
331645 |
27-Mar-2018 |
jhb |
MFC 329785: Move DDP PCB state into a helper structure.
This consolidates all of the DDP state in one place. Also, the code has now been fixed to ensure that DDP state is only accessed for DDP connections. This should not be a functional change but makes it cleaner and easier to add state for other TOE socket modes in the future.
Sponsored by: Chelsio Communications |
330897 |
14-Mar-2018 |
eadler |
Partial merge of the SPDX changes
These changes are incomplete but are making it difficult to determine what other changes can/should be merged.
No objections from: pfg |
330307 |
03-Mar-2018 |
np |
MFC r319506, r319872, r321063, r321103, r321179, r321390, r321435, r321582, r321671, r322014, r322034, r322055, r322123, r322167, r322425, r322549, r322914, r322960, r322962, r322964, r322985, r322990, r323006, r323026, r323041, r323069, r323078, r323343, r323514, r323520, r324296, r324379, r324386, r324443, r324945, r325596, r325680, r325880, r325883-r325884, r325961, r326026, r326042, r327062, r327093, r327332, r327528, r328420, and r328423.
r319506: cxgbe(4): Update the statistics for compound tx work requests once per work request, not once per frame.
r319872: cxgbe(4): Do not request an FEC setting that the port does not support.
r321063: cxgbe(4): Various link/media related improvements.
- Deal with changes to port_type, and not just port_mod when a transceiver is changed. This fixes hot swapping of transceivers of different types (QSFP+ or QSA or QSFP28 in a QSFP28 port, SFP+ or SFP28 in a SFP28 port, etc.).
- Always refresh media information for ifconfig if the port is down. The firmware does not generate transceiver-change interrupts unless at least one VI is enabled on the physical port. Before this change ifconfig displayed potentially stale information for ports that were administratively down.
- Always recalculate and reapply L1 config on a transceiver change.
- Display PAUSE settings in ifconfig. The driver sysctls for this continue to work as well.
r321103: cxgbe(4): New ioctls to flash bootrom and boot config to the card.
r321179: cxgbe/t4_tom: Log more details about the newly ESTABLISHED tid to the trace buffer.
r321390: cxgbe(4): Install the firmware bundled with the driver to the card if it doesn't seem to have one. This lets the driver recover automatically from incomplete firmware upgrades (panic, reboot, power loss, etc. in the middle of an upgrade).
r321435: cxgbe(4): Display some more TOE parameters related to retransmission and keepalive in the sysctl MIB. Provide tunables to change some of these parameters. These are supposed to be setup by the firmware so these tunables are for experimentation only.
r321582: cxgbe(4): Some updates to the common code.
- Updated register ranges.
- Helper routines for access to TP registers.
- Updated routine to read flash parameters.
r321671: cxgbe/iw_cxgbe: Log the end point's history and flags to the trace buffer just before it's freed.
r322014: cxgbe(4): Initial import of the "collect" component of Chelsio unified debug (cudbg) code, hooked up to the main driver via an ioctl.
The ioctl can be used to collect the chip's internal state in a compressed dump file. These dumps can be decoded with the "view" component of cudbg.
r322034: cxgbe(4): Always use the first and not the last virtual interface associated with a port in begin_synchronized_op.
r322055: cxgbe(4): Allow the TOE timer tunables to be set with microsecond precision. These timers are already displayed in microseconds in the sysctl MIB. Add variables to track these tunables while here.
r322123: cxgbe(4): Avoid a NULL dereference that would occur during module unload if there were problems earlier during attach.
r322167: cxgbe(4): Add the T6 and T5 Unified Wire configuration files to the kernel, just like for T4, when the driver is compiled into the kernel.
r322425: cxgbe(4): Save the last reported link parameters and compare them with the current state to determine whether to generate a link-state change notification. This fixes a bug introduced in r321063 that caused the driver to sometimes skip these notifications.
r322549: cxgbe/t4_tom: Use correct name for the ISS-valid bit in options2.
r322914: cxgbe(4): Dump the mailbox contents in the same format as CH_DUMP_MBOX.
r322960: cxgbe(4): Verify that the driver accesses the firmware mailbox in a thread-safe manner.
r322962: cxgbe(4): Remove write only variable from t4_port_init.
r322964: cxgbe(4): vi_mac_funcs should include the base Ethernet function. It is already used in the driver as if it does.
r322985: cxgbe(4): Maintain one ifmedia per physical port instead of one per Virtual Interface (VI). All autonomous VIs that share a port share the same media.
r322990: cxgbe(4): Do not access the mailbox without appropriate locks while creating hardware VIs.
This fixes a bad race on systems with hw.cxgbe.num_vis > 1.
r323006: cxgbe(4): Update T6/T5/T4 firmwares to 1.16.59.0.
r323026: cxgbe(4): Zero out the memory allocated for the debug dump. cudbg_collect seems to expect it this way.
r323041: cxgbe(4): Add two new debug flags -- one to allow manual firmware install after full initialization, and another to disable the TCB cache (T6+). The latter works as a tunable only.
Note that debug_flags are for debugging only and should not be set normally.
r323069: cxgbe/t4_tom: Add a knob to select the congestion control algorithm used by the TOE hardware for fully offloaded connections. The knob affects new connections only.
r323078: cxgbe/t4_tom: There may not be a tid to update if the connection isn't established.
r323343: cxgbe(4): Fix a couple of problems in the sge_wrq data path.
- start_wrq_wr must not drain the wr_list if there are incomplete_wrs pending. This can happen when a t4_wrq_tx runs between two start_wrq_wr.
- commit_wrq_wr must examine the cookie's pidx and ndesc with the queue's lock held. Otherwise there is a bad race when incomplete WRs are being completed and commit_wrq_wr for the WR that is ahead in the queue updates the next incomplete WR's cookie's pidx/ndesc but the commit_wrq_wr for the second one is using stale values that it read without the lock.
r323514: cxgbetool(8): mode must be specified when creating the dump file.
r323520: cxgbe(4): Ignore capabilities that depend on TOE when the firmware reports TOE is not available.
r324296: cxgbe(4): Provide knobs to set the holdoff parameters of TOE rx queues separately from NIC rx queues instead of using the same parameters for both types of queues.
r324379: cxgbetool(8): Do not create a large file devoid of useful content when the dumpstate ioctl fails. Make the file world-readable while here.
r324386: cxgbe(4): Update T6, T5, and T4 firmwares to 1.16.63.0.
r324443: cxgbetool(8): Do not close uninitialized fd on malloc failure.
r324945: cxgbe(4): Read the MPS buffer group map from the firmware as it could be different from hardware defaults. The congestion channel map, which is still fixed, needs to be tracked separately now. Change the congestion setting for TOE rx queues to match the drivers on other OSes while here.
r325596: cxgbe(4): Do not request settings not supported by the port.
r325680: cxgbe(4): Exclude mdi from the check against port capabilities.
r325880: cxgbe(4): Combine all _10g and _1g tunables and drop the suffix from their names. The finer-grained knobs weren't practically useful.
r325883: cxgbe(4): Sanitize t4_num_vis during MOD_LOAD like all other t4_* tunables. Add num_vis to the intrs_and_queues structure as it affects the number of interrupts requested and queues created. In future cfg_itype_and_nqueues might lower it incrementally instead of going straight to 1 when enough interrupts aren't available.
r325884: cxgbe(4): Remove rsrv_noflowq from intrs_and_queues structure as it does not influence or get affected by the number of interrupts or queues.
r325961: cxgbe(4): Add core Vdd to the sysctl MIB.
r326026: cxgbe(4): Add a custom board to the device id list.
r326042: cxgbe(4): Fix unsafe mailbox access in cudbg.
r327062: cxgbe(4): Read the MFG diags version from the VPD and make it available in the sysctl MIB.
r327093: cxgbe(4): Do not forward interrupts to queues with freelists. This leaves the firmware event queue (fwq) as the only queue that can take interrupts for others.
This simplifies cfg_itype_and_nqueues and queue allocation in the driver at the cost of a little (never?) used configuration. It also allows service_iq to be split into two specialized variants in the future.
r327332: cxgbe(4): Reduce duplication by consolidating minor variations of the same code into a single routine.
r327528: cxgbe(4): Add a knob to enable/disable PCIe relaxed ordering. Disable it by default when running on Intel CPUs.
r328420: cxgbe(4): Do not display harmless warning in non-debug builds.
r328423: cxgbe(4): Accept old names of a couple of tunables.
Sponsored by: Chelsio Communications |
330303 |
03-Mar-2018 |
jhb |
MFC 328608: Export tcp_always_keepalive for use by the Chelsio TOM module.
This used to work by accident with ld.bfd even though always_keepalive was marked as static. LLD honors static more correctly, so export this variable properly (including moving it into the tcp_* namespace).
Relative to HEAD the MFC includes two additional changes:
- The t3_tom module used for cxgb(4) is also patched.
- A strong reference from the new name (tcp_always_keepalive) to the old name (always_keepalive) has been added to preserve the KBI for existing modules.
Suggested by: kib (strong reference) Sponsored by: Chelsio Communications |
329391 |
16-Feb-2018 |
np |
iw_cxgbe: Follow-up fix to r329017, which updated the code associated with QP flush.
This is a direct commit to stable/11.
Sponsored by: Chelsio Communications |
329017 |
08-Feb-2018 |
np |
iw_cxgbe: Manually backport changes related to QP flush. This fixes a panic where poll_cq sees an empty RQ while processing an incoming SEND for a QP that is being taken down.
This is a direct commit to stable/11.
Sponsored by: Chelsio Communications |
325604 |
09-Nov-2017 |
hselasky |
MFC r324792: The remote DMA TCP portspace selector, RDMA_PS_TCP, is used for both iWarp and RoCE in ibcore. The selection of RDMA_PS_TCP cannot be used to indicate iWarp protocol use. Backport the proper IB device capabilities from Linux upstream to distinguish between iWarp and RoCE. Only allocate the additional socket required for iWarp for RDMA IDs when at least one iWarp device is present. This resolves interoperability issues between iWarp and RoCE in ibcore.
Reviewed by: np @ Differential Revision: https://reviews.freebsd.org/D12563 Sponsored by: Mellanox Technologies |
323884 |
21-Sep-2017 |
jhb |
MFC 323630: Avoid reusing the wrong buffer for a DDP AIO request.
To optimize the case of ping-ponging between two buffers, the DDP code caches the last two buffers used keeping the pages wired and page pods stored in the NIC's RAM. If a new aio_read() request uses one of the same buffers, then the work of holding pages, etc. can be avoided. However, the starting virtual address of an aio buffer was not saved, only the page count, length, and initial page offset. Thus, an aio_read() request could match a different buffer in the address space. (Earlier during development vm_fault_hold_quick_pages() was always called and the vm_page_t values were compared, but that was eventually removed without being adequately replaced.) Fix by storing the starting virtual address and comparing that (along with other fields) to determine if a buffer can be reused.
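The matching problem described above can be modelled in a few lines of Python. This is only an illustrative sketch: `CachedBuffer`, `describe`, `matches_old`, and `matches_new` are hypothetical names, the 4 KB page size is an assumption, and the real driver compares fields of its own C structures.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed page size for this model


@dataclass(frozen=True)
class CachedBuffer:
    """Hypothetical stand-in for a cached DDP buffer descriptor."""
    start: int    # starting virtual address (the field the fix stores)
    offset: int   # offset of the buffer within its first page
    length: int   # length in bytes
    npages: int   # number of pages spanned


def describe(start: int, length: int) -> CachedBuffer:
    """Build the descriptor for a buffer at virtual address `start`."""
    offset = start % PAGE_SIZE
    npages = (offset + length + PAGE_SIZE - 1) // PAGE_SIZE
    return CachedBuffer(start, offset, length, npages)


def matches_old(cached: CachedBuffer, req: CachedBuffer) -> bool:
    """Pre-fix comparison: page count, length, and initial offset only."""
    return (cached.npages, cached.length, cached.offset) == \
           (req.npages, req.length, req.offset)


def matches_new(cached: CachedBuffer, req: CachedBuffer) -> bool:
    """Post-fix comparison: the starting virtual address must match too."""
    return cached.start == req.start and matches_old(cached, req)


# Two different buffers with the same shape: only the new check tells them apart.
a = describe(0x10000, 8192)
b = describe(0x40000, 8192)
print(matches_old(a, b), matches_new(a, b))
```

In this model two page-aligned 8 KB buffers at different addresses are indistinguishable to the old comparison, which is the mismatch the commit describes.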
Sponsored by: Chelsio Communications |
319271 |
31-May-2017 |
np |
MFC r318774:
cxgbe/iw_cxgbe: sodisconnect failures are harmless and should not be treated as fatal errors.
Sponsored by: Chelsio Communications |
319269 |
31-May-2017 |
np |
MFC r318762:
cxgbe(4): Update the T4, T5, and T6 firmwares to 1.16.45.0.
The latest firmware has a number of link related fixes, support for a new custom card, and the fix for a bug that affected rate limiting on FreeBSD.
Relnotes: Yes Sponsored by: Chelsio Communications |
318854 |
25-May-2017 |
np |
MFC r318014, r318091, r318125, and r318263.
r318014: cxgbe(4): Fixes related to the knob that controls link autonegotiation.
- Do not leak the adapter lock in sysctl_autoneg.
- Accept only 0 or 1 as valid settings for autonegotiation.
- A fixed speed must be requested by the driver when autonegotiation is disabled otherwise the firmware will reject the l1cfg command. Use the top speed supported by the port for now.
r318091: cxgbe(4): Do not assume that if_qflush is always followed by interface-down.
r318125: Adjust whitespace and fix a comment. No functional change.
r318263: cxgbe(4): netmap-only interrupts for a VI do not have an associated rxq or ofld_rxq and should be ignored by vi_intr_iq.
Sponsored by: Chelsio Communications |
318850 |
25-May-2017 |
np |
MFC r317702, r317847, r318307
r317702: cxgbe(4): Support routines for Tx traffic scheduling.
- Create a new file, t4_sched.c, and move all of the code related to traffic management from t4_main.c and t4_sge.c to this file.
- Track both Channel Rate Limiter (ch_rl) and Class Rate Limiter (cl_rl) parameters in the PF driver.
- Initialize all the cl_rl limiters with somewhat arbitrary default rates and provide routines to update them on the fly.
- Provide routines to reserve and release traffic classes.
r317847: cxgbe(4): The Tx scheduler initialization either works or doesn't. It doesn't need a refresh in either case.
r318307: cxgbe(4): Avoid an out of bounds access when an attempt to unbind a tx queue from a traffic class fails.
Sponsored by: Chelsio Communications |
318843 |
25-May-2017 |
np |
MFC r317820 and r317837.
r317820: cxgbe(4): Update the list of PCIe devices claimed by the driver. At this point any board with a T6 should just work.
r317837: cxgbe(4): Update the VF device ids too. This should have been part of r317820.
Sponsored by: Chelsio Communications |
318842 |
25-May-2017 |
np |
MFC r317041:
cxgbe: Add tunables to control the number of LRO entries and the number of rx mbufs that should be presorted before LRO. There is no change in default behavior.
Sponsored by: Chelsio Communications |
318839 |
25-May-2017 |
np |
MFC r316971:
cxgbe: Add a tunable to configure the SGE time scaler, which is available starting with T6. The values in the timer holdoff registers are multiplied by the scaling factor before use.
dev.<nexus>.<n>.holdoff_timers shows the final values of the timers in microseconds.
Sponsored by: Chelsio Communications |
318837 |
24-May-2017 |
np |
MFC r316506:
cxgbe(4): Program the global RSS key once instead of once per ifnet. |
318835 |
24-May-2017 |
np |
MFC r316172:
cxgbe: Don't call t4_edc_err_read for errors not related to the EDCs.
Sponsored by: Chelsio Communications |
318825 |
24-May-2017 |
np |
MFC r309725:
cxgbe(4): netmap does not set IFCAP_NETMAP in an ifnet's if_capabilities any more (since r307394). Do it in the driver instead.
Sponsored by: Chelsio Communications |
318808 |
24-May-2017 |
np |
MFC r313318:
cxgbe(4): Allow tunables that control the number of queues to be set to '-n' to tell the driver to create _up to_ 'n' queues if enough cores are available. For example, setting hw.cxgbe.nrxq10g="-32" will result in 16 queues if the system has 16 cores, 32 if it has 32.
There is no change in the default number of queues of any type.
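A minimal sketch of the "-n means up to n" rule, reproducing the example above. `resolve_queue_count` is a hypothetical helper; passing non-negative values through unchanged is an assumption of this sketch, not a statement about how the driver treats them.

```python
def resolve_queue_count(tunable: int, ncores: int) -> int:
    """Model of the '-n' convention: a negative value asks for up to n queues,
    limited by the number of cores available.

    Non-negative values are returned unchanged (assumption of this sketch).
    """
    if tunable < 0:
        return min(-tunable, ncores)
    return tunable


# The example from the commit message: a setting of -32.
print(resolve_queue_count(-32, 16))  # 16 cores
print(resolve_queue_count(-32, 32))  # 32 cores
```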
Sponsored by: Chelsio Communications |
318803 |
24-May-2017 |
np |
MFC r313346:
cxgbe/t4_tom: Fix CLIP entry refcounting on the passive side. Every IPv6 connection being handled by the TOE should have a reference on its CLIP entry.
Sponsored by: Chelsio Communications |
318798 |
24-May-2017 |
np |
MFC r311880, r314167, r316118, r316571, r316573, r316580, r316936-r316937, r316940, and r317410.
r311880: The iw_cxgb and iw_cxgbe drivers should not use a FreeBSD device_t where a linuxkpi style device is expected. If OFED/linuxkpi actually starts using this field then we'll have to figure out whether to create fake devices for these drivers or have linuxkpi deal with NULL device.
This mismatch was first reported as part of D6585.
r314167: cxgbe/iw_cxgbe: Minor changes for T6.
r316118: cxgbe/iw_cxgbe: T6 has no limit on the amount of memory that can be registered in one ib_reg_phys_mr.
r316571: cxgbe/iw_cxgbe: Remove bad cast that resulted in incorrect length for memory regions larger than 4GB.
r316573: cxgbe/iw_cxgbe: Replace a magic constant with something more readable (and accurate).
T4 and later have an extra bit for page shift so the maximum page size is 8TB (shift of 12 + 31) instead of 128MB (12 + 15). This saves space in the chip's PBL (physical buffer list) when registering very large memory regions.
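The two limits quoted above follow directly from the shift arithmetic. The short check below assumes binary units (1 MB = 2^20 bytes, 1 TB = 2^40 bytes); `BASE_SHIFT` and `max_page_size` are illustrative names only.

```python
BASE_SHIFT = 12  # base page shift quoted in the commit message


def max_page_size(max_extra_shift: int) -> int:
    """Largest page size in bytes for a given maximum additional shift."""
    return 1 << (BASE_SHIFT + max_extra_shift)


MB = 1 << 20
TB = 1 << 40

print(max_page_size(15) // MB, "MB")  # shift of 12 + 15
print(max_page_size(31) // TB, "TB")  # shift of 12 + 31
```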
r316580: cxgbe/iw_cxgbe: Remove another bad cast. This should have been included in r316571.
r316936: cxgbe/iw_cxgbe: hw supports 64K (not 32K) Protection Domains.
r316937: cxgbe/iw_cxgbe: Report accurate page_size_cap in ib_query_device.
r316940: cxgbe/iw_cxgbe: Report the actual values of various parameters as configured by the firmware.
r317410: cxgbe/iw_cxgbe: Pull in some updates to c4iw_wait_for_reply from the iw_cxgb4 Linux driver.
Sponsored by: Chelsio Communications |
318796 |
24-May-2017 |
np |
MFC r316774:
cxgbe: Query some more RDMA related parameters from the firmware.
Sponsored by: Chelsio Communications |
318773 |
24-May-2017 |
np |
MFC r311846: cxgbe(4): Refresh t4_msg.h, mainly for definitions related to the crypto engine. |
316122 |
29-Mar-2017 |
np |
MFC r315201, r315920, r315921, r315922, r316008, and r316062.
r315201: cxgbe(4): Fix an always-true assertion (reported by PVS-Studio).
sys/dev/cxgbe/t4_main.c: PVS-Studio: Expression is Always True (CWE-571) (3)
r315920: cxgbe/iw_cxgbe: c4iw_connect should always return a negative errno on failure.
r315921:
cxgbe/iw_cxgbe: alloc_ep expects a gfp_t, and it's always ok to sleep during alloc_ep.
r315922: cxgbe/iw_cxgbe: allocations that use GFP_KERNEL (which is M_WAITOK on FreeBSD) cannot fail.
r316008: cxgbe/iw_cxgbe: Remove unused code.
r316062: cxgbe/iw_cxgbe: Defer the handling of error CQEs and RDMA_TERMINATE to the thread that deals with socket state changes. This eliminates various bad races with the ithread. |
315865 |
23-Mar-2017 |
np |
MFC r314814 and r315325.
r314814: cxgbe/iw_cxgbe: Abort connection if there is an error during c4iw_modify_qp.
r315325: cxgbe/iw_cxgbe: Use the socket and not the toepcb to reach for the inpcb. t4_tom detaches the inpcb from the toepcb as soon as the hardware is done with the connection (in final_cpl_received) but the socket is around as long as the cm_id and the rest of iWARP state is.
This fixes an intermittent NULL dereference during abort. |
314775 |
06-Mar-2017 |
np |
MFC r314509 and r314578.
r314509: cxgbe/iw_cxgbe: Do not check the size of the memory region being registered. T4/5/6 have no internal limit on this size. This is probably a copy paste from the T3 iw_cxgb driver.
r314578: cxgbe/iw_cxgbe: Implement sq/rq drain operation.
ULPs can set a qp's state to ERROR and then post a work request on the sq and/or rq. When the reply for that work request comes back it is guaranteed that all previous work requests posted on that queue have been drained.
Sponsored by: Chelsio Communications |
314605 |
03-Mar-2017 |
np |
MFC r314400:
cxgbe/iw_cxgbe: fix various double-close panics with iWARP sockets.
Sockets representing the TCP endpoints for iWARP connections are allocated by the ibcore module. Before this revision they were closed either by the ibcore module or the iw_cxgbe hardware driver depending on the state transitions during connection teardown. This is error prone and there were cases where both iw_cxgbe and ibcore closed the socket leading to double-free panics. The fix is to let ibcore close the sockets it creates and never do it in the driver.
- Use sodisconnect instead of soclose (preceded by solinger = 0) in the driver to tear down an RDMA connection abruptly. This does what's intended without releasing the socket's fd reference.
- Close the socket in ibcore when the iWARP iw_cm_id is destroyed. This works for all kinds of sockets: clients that initiate connections, listeners, and sockets accepted off of listeners.
Sponsored by: Chelsio Communications |
313179 |
04-Feb-2017 |
jhb |
MFC 312904: Don't drop a reference to the TOE PCB in undo_offload_socket().
undo_offload_socket() is only called by t4_connect() during a connection setup failure, but t4_connect() still owns the TOE PCB and frees it after undo_offload_socket() returns. Releasing a reference in undo_offload_socket() resulted in a double-free which panicked when t4_connect() performed the second free. The reference release was added to undo_offload_socket() incorrectly in r299210.
Sponsored by: Chelsio Communications |
313178 |
03-Feb-2017 |
jhb |
MFC 312906: Unregister CPL handlers for TOE-related messages when unloading TOM.
Sponsored by: Chelsio Communications |
313175 |
03-Feb-2017 |
jhb |
MFC 313020: Fix a couple of issues with t4iov probe and attach.
- Check for Chelsio vendor ID in probe routines.
- Fail attach instead of faulting if pci_find_dbsf() doesn't find a device.
PR: 216539 Sponsored by: Chelsio Communications |
312524 |
20-Jan-2017 |
np |
MFC r312368: cxgbe/tom: Fix a case where do_pass_accept_req wasn't properly restoring the VNET. |
312187 |
14-Jan-2017 |
np |
MFC r311848: cxgbe(4): Attach to the 2x25 debug card. This is for internal use only. |
312185 |
14-Jan-2017 |
np |
MFC r311831 and r311832.
r311831: cxgbe(4): The wraparound logic in start_wrq_wr() should not get involved in work requests that end at the end of the descriptor ring, even though the pidx wraps around to 0.
r311832: cxgbe(4): Enable automatic cidx flush for all control queues. |
312116 |
14-Jan-2017 |
np |
MFC r311569, r311657, and r311949.
r311569: Fix comment in t4_tom. No functional change.
r311657: cxgbe/t4_tom: Fix tid accounting. An offloaded IPv6 connection uses 2 tids, not 1, in the hardware.
r311949: cxgbe/tom: Add VIMAGE support to the TOE driver.
Active Open:
- Save the socket's vnet at the time of the active open (t4_connect) and switch to it when processing the reply (do_act_open_rpl or do_act_establish).
Passive Open:
- Save the listening socket's vnet in the driver's listen_ctx and switch to it when processing incoming SYNs for the socket.
- Reject SYNs that arrive on an ifnet that's not in the same vnet as the listening socket.
CLIP (Compressed Local IPv6) table:
- Add only those IPv6 addresses to the CLIP that are in a vnet associated with one of the card's ifnets.
Misc:
- Set vnet from the toepcb when processing TCP state transitions.
- The kernel sets the vnet when calling the driver's output routine so t4_push_frames runs in proper vnet context already. One exception is when incoming credits trigger tx within the driver's ithread. Set the vnet explicitly in do_fw4_ack for that case.
Sponsored by: Chelsio Communications |
311506 |
06-Jan-2017 |
np |
MFC r310151 and r311173.
r310151: cxgbe(4): Changes to the default T6 firmware configuration file.
- Disable features that are not supported or not used on FreeBSD.
- Increase the RSS table slice per interface.
- Increase the share of the TCAM reserved for filtering.
r311173: cxgbe(4): Update T4, T5 and T6 firmwares to 1.16.26.0.
Sponsored by: Chelsio Communications |
311260 |
04-Jan-2017 |
np |
MFC r309666, r310033, r310049, r310100, r310152, and r310807.
r309666: cxgbe(4): unsigned short isn't large enough to store link speed (which is in Mbps) for 100Gbps links.
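The overflow behind r309666 is easy to check by hand. The sketch below assumes a 16-bit unsigned short (the case on FreeBSD's supported platforms) and models the store with a bit mask; `store_unsigned` is a hypothetical helper, not driver code.

```python
USHORT_BITS = 16  # assumption: unsigned short is 16 bits wide


def store_unsigned(value: int, bits: int) -> int:
    """Model storing a non-negative integer in an unsigned field of `bits` bits."""
    return value & ((1 << bits) - 1)


LINK_SPEED_100G_MBPS = 100_000  # 100Gbps expressed in Mbps

truncated = store_unsigned(LINK_SPEED_100G_MBPS, USHORT_BITS)
intact = store_unsigned(LINK_SPEED_100G_MBPS, 32)
print(truncated, intact)  # 34464 100000
```

The largest value a 16-bit field can hold is 65535, so any link speed above roughly 65Gbps needs a wider type when stored in Mbps.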
r310033: cxgbe(4): Retire t4_bus_space_read_8 and t4_bus_space_write_8.
r310049: cxgbe(4): Fix the tid range shown for T6 cards in misc.tids.
r310100: cxgbe(4): Deal with compressed error vectors.
r310152: cxgbe(4): Fix typo in an unused macro.
r310807: cxgbe(4): Updates to link configuration.
- Update struct link_settings and associated shared code.
- Add tunables to control FEC and autonegotiation. All ports inherit these values as their initial settings.
hw.cxgbe.fec
hw.cxgbe.autoneg
- Add per-port sysctls to control FEC and autonegotiation. These can be modified at any time.
dev.<port>.<n>.fec
dev.<port>.<n>.autoneg |
309724 |
09-Dec-2016 |
jhb |
MFC 309613: cxgbe(4): Update firmwares from version 1.16.12.0 to 1.16.22.0.
Sponsored by: Chelsio Communications |
309580 |
06-Dec-2016 |
jhb |
MFC 308066: cxgbe(4): Accurate statistics for all chip settings.
There are 4 independent knobs in T5+ chips to include or exclude PAUSE frames from the "total frames" and "multicast frames" counters in either direction. This change lets the driver deal with any combination of these settings. |
309579 |
05-Dec-2016 |
jhb |
MFC 307876: cxgbe(4): Fix bug in the calculation of the number of physically contiguous regions in an mbuf chain.
If the payload of an mbuf ends at a page boundary count_mbuf_nsegs would incorrectly consider the next mbuf's payload physically contiguous based solely on a KVA comparison. |
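The difference between a KVA-only comparison and a physical-address comparison can be shown with a toy model. Everything here is illustrative: buffers are `(kva, length)` pairs, `page_table` is a hypothetical dict from virtual page number to physical page number, and `count_segments_kva_only` is a simplification of the pre-fix behaviour rather than a transcription of the driver.

```python
PAGE_SIZE = 4096  # assumed page size for this model


def phys_ranges(kva, length, page_table):
    """Yield (pa_start, pa_end) pieces of a buffer, split at page boundaries."""
    end = kva + length
    while kva < end:
        page_end = min(end, (kva // PAGE_SIZE + 1) * PAGE_SIZE)
        pa = page_table[kva // PAGE_SIZE] * PAGE_SIZE + kva % PAGE_SIZE
        yield pa, pa + (page_end - kva)
        kva = page_end


def count_segments(bufs, page_table):
    """Count physically contiguous regions across a chain of buffers."""
    nsegs = 0
    last_end = None
    for kva, length in bufs:
        for pa_start, pa_end in phys_ranges(kva, length, page_table):
            if pa_start != last_end:
                nsegs += 1  # a new region starts here
            last_end = pa_end
    return nsegs


def count_segments_kva_only(bufs):
    """Simplified model of the bug: adjacency judged by virtual address only."""
    nsegs = 0
    last_end = None
    for kva, length in bufs:
        if length == 0:
            continue
        if kva != last_end:
            nsegs += 1
        last_end = kva + length
    return nsegs


# The first buffer ends exactly at a page boundary; the next one starts on the
# following virtual page, which is backed by a non-adjacent physical page.
page_table = {16: 100, 17: 205}
bufs = [(16 * PAGE_SIZE + 2048, 2048), (17 * PAGE_SIZE, 1000)]
print(count_segments_kva_only(bufs), count_segments(bufs, page_table))
```

In this model the two payloads are virtually adjacent but physically separate, so the KVA-only count comes out one region short.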
309578 |
05-Dec-2016 |
jhb |
MFC 307759: cxgbe(4): Dump any mailbox command that times out. |
309575 |
05-Dec-2016 |
jhb |
MFC 307233: cxgbe(4): Allow the interface MTU to be set as high as the actual hardware limit. |
309569 |
05-Dec-2016 |
jhb |
MFC 306821,306823: Permit updating firmware config file in flash.
306821: cxgbe(4): Add an ioctl to copy a firmware config file to the card's flash.
306823: cxgbetool: Add a loadcfg subcommand to allow a user to upload a firmware configuration file to the card. |
309564 |
05-Dec-2016 |
jhb |
MFC 306277: cxgbe(4): Make the location/length of all descriptor rings available in the sysctl MIB. |
309560 |
05-Dec-2016 |
jhb |
MFC 305695,305696,305699,305702,305703,305713,305715,305827,305852,305906, 305908,306062,306063,306137,306138,306206,306216,306273,306295,306301, 306465,309302: Add support for adapters using the Terminator T6 ASIC.
305695: cxgbe(4): Set up fl_starve_threshold2 accurately for T6.
305696: cxgbe(4): Use correct macro for header length with T6 ASICs. This affects the transmit of the VF driver only.
305699: cxgbe(4): Update the pad_boundary calculation for T6, which has a different range of boundaries.
305702: cxgbe(4): Use smaller min/max bursts for fl descriptors with a T6.
305703: cxgbe(4): Deal with the slightly different SGE_STAT_CFG in T6.
305713: cxgbe(4): Add support for additional port types and link speeds.
305715: cxgbe(4): Catch up with the rename of tlscaps -> cryptocaps. TLS is one of the capabilities of the crypto engine in T6.
305827: cxgbe(4): Use the interface's viid to calculate the PF/VF/VFValid fields to use in tx work requests.
305852: cxgbe(4): Attach to cards with the Terminator 6 ASIC. T6 cards will come up as 't6nex' nexus devices with 'cc' ports hanging off them.
The T6 firmware and configuration files will be added as soon as they are released. For now the driver will try to work with whatever firmware and configuration is on the card's flash.
305906: cxgbe/t4_tom: The SMAC entry for a VI is at a different location in the T6.
305908: cxgbe/t4_tom: Update the active/passive open code to support T6. Data path works as-is.
306062: cxgbe(4): Show wcwr_stats for T6 cards.
306063: cxgbe(4): Setup congestion response for T6 rx queues.
306137: cxgbetool: Add T6 support to the SGE context decoder.
306138: Fix typo.
306206: cxgbe(4): Catch up with the different layout of WHOAMI in T6.
Note that the code moved below t4_prep_adapter() as part of this change because now it needs a working chip_id().
306216: cxgbe(4): Fix the output of the "tids" sysctl on T6.
306273: cxgbe(4): Fix netmap with T6, which doesn't encapsulate SGE_EGR_UPDATE message inside a FW_MSG. The base NIC already deals with updates in either form.
306295: cxgbe(4): Support SIOCGIFXMEDIA so that ifconfig displays correct media for 25Gbps and 100Gbps ports. This should have been part of r305713, which is when the driver first started reporting extended media types.
306301: cxgbe(4): Use the port's top speed to figure out whether it is "high speed" or not (for the purpose of calculating the number of queues etc.) This does the right thing for 25Gbps and 100Gbps ports.
306465: cxgbe(4): Claim the T6 -DBG card.
309302: cxgbe(4): Include firmware for T6 cards in the driver. Update all firmwares to 1.16.12.0.
Sponsored by: Chelsio Communications |
309559 |
05-Dec-2016 |
jhb |
MFC 305667: cxgbe(4): Avoid a NULL dereference in the clearstats ioctl handler. Port softc's are not initialized when the adapter is in recovery mode. |
309558 |
05-Dec-2016 |
jhb |
MFC 305652: cxgbe(4): Do not prescreen frames before attempting LRO. |
309557 |
05-Dec-2016 |
jhb |
MFC 305433: cxgbe/t4_tom: toepcb should be all-zero on allocation because the code that cleans up on failure assumes that non-NULL values indicate initialized items. |
309555 |
05-Dec-2016 |
jhb |
MFC 303688,303750,305166,305167: Centralize and rework page pod handling.
303688: cxgbe/t4_tom: Read the chip's DDP page sizes and save them in a per-adapter data structure. This replaces a global array with hardcoded page sizes.
303750: cxgbe/t4_tom: The page pod arena allocates from pod address space and not index space. The minimum valid allocation out of this arena is the size of a single page pod.
305166: cxgbe/t4_tom: Add general purpose routines to deal with page pod regions and allocations within them. Switch to these routines to manage the TOE DDP region.
305167: cxgbe/t4_tom: Two new routines to allocate and write page pods for a buffer in the kernel's address space.
Sponsored by: Chelsio Communications |
309459 |
03-Dec-2016 |
jhb |
MFC 303348: cxgbe(4): Initialize the adapter queues (fwq and mgmtq) instead of returning EAGAIN if they aren't available when the user tries to program a filter. Do this after validating the filter so that the driver doesn't bring up the queues if it doesn't have to. |
309458 |
03-Dec-2016 |
jhb |
MFC 302440,304873,305704,305985,306787,307531: Fixes for sysctls.
302440: cxgbe(4): Add sysctl to display the RSS indirection table size for an interface.
dev.cxl.<n>.rss_size
dev.vcxl.<n>.rss_size
304873: cxgbe(4): Provide more details about the card in the sysctl MIB.
dev.t5nex.0.%desc: Chelsio T580-CR
dev.t5nex.0.hw_revision: 1
dev.t5nex.0.sn: PT13140042
dev.t5nex.0.pn: 110117150A0
dev.t5nex.0.ec: 0000000000000000
dev.t5nex.0.na: 0007432AF490
dev.t5nex.0.vpd_version: 3
dev.t5nex.0.scfg_version: 53255
dev.t5nex.0.bs_version: 1.1.0.0
dev.t5nex.0.er_version: 1.0.0.68
dev.t5nex.0.tp_version: 0.1.4.9
dev.t5nex.0.firmware_version: 1.16.2.0
305704: cxgbe(4): Rename the debug_flags driver tunable/sysctl to dflags. Tunables that end with _flags are special.
305985: cxgbe(4): Fixes to wrq stats.
- Increment tx_wrs_copied in the correct place.
- Add tx_wrs_sspace to the sysctl MIB.
306787: cxgbe(4): Fix whitespace in the pm_stats display.
307531: cxgbe(4): Adjust whitespace to line up the column titles in cim_qcfg with the values displayed.
Sponsored by: Chelsio Communications |
309450 |
03-Dec-2016 |
jhb |
MFC 304854: cxgbe/iw_cxgbe: Various fixes to the iWARP driver.
- Return appropriate error code instead of ENOMEM when sosend() fails in send_mpa_req.
- Fix for problematic race during destroy_qp.
- Use abortive close instead of normal close when send_mpa_reject() fails.
- Remove the unnecessary doorbell flowcontrol logic.
Sponsored by: Chelsio Communications |
306770 |
06-Oct-2016 |
jhb |
MFC 303754: Add __printflike() to bus_describe_intr() to enable -Wformat checks.
Fix a few places that were passing a raw string as the format to use a "%s" format string instead. |
306694 |
04-Oct-2016 |
jhb |
MFC 303859,305851: Fix a typo and some whitespace nits. |
306693 |
04-Oct-2016 |
jhb |
MFC 303454: Mark spg_len and fl_pktshift static.
These variables are no longer exported to t4_netmap.c after r296478. |
306692 |
04-Oct-2016 |
jhb |
MFC 304482: Adjust t4_port_init() to work with VF devices.
Specifically, the FW_PORT_CMD may or may not work for a VF (the PF driver can choose whether or not to permit access to this command), so don't attempt to fetch port information on a VF if permission is denied by the PF. |
306690 |
04-Oct-2016 |
jhb |
MFC 305548: Don't break out of the m_advance() loop if len drops to zero.
If a packet contains the Ethernet header (14 bytes) in the first mbuf and the payload (IP + UDP + data) in the second mbuf, then the attempt to fetch the l3hdr will return a NULL pointer. The first loop iteration will drop len to zero and exit the loop without setting 'p'. However, the desired data is at the start of the second mbuf, so the correct behavior is to loop around and let the conditional set 'p' to m_data of the next mbuf (and leave offset as 0). |
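The loop behaviour described above can be modelled with a list of byte strings standing in for the mbuf chain. This is a sketch of the logic only: `advance` and `advance_buggy` are hypothetical Python analogues, and a position is an `(index, offset)` pair rather than a pointer.

```python
def advance(chain, idx, offset, length):
    """Return the (index, offset) position `length` bytes past (idx, offset).

    Mirrors the fixed behaviour: keep looping even when `length` reaches zero,
    so the position lands at the start of the next buffer.
    """
    assert length > 0
    while True:
        if offset + length < len(chain[idx]):
            return idx, offset + length
        length -= len(chain[idx]) - offset
        idx += 1
        offset = 0
        assert idx < len(chain), "ran off the end of the chain"


def advance_buggy(chain, idx, offset, length):
    """Model of the old loop: it stops as soon as `length` drops to zero."""
    pos = None
    while length:
        if offset + length < len(chain[idx]):
            pos = (idx, offset + length)
            break
        length -= len(chain[idx]) - offset
        idx += 1
        offset = 0
    return pos  # stays None when the target is the start of the next buffer


# 14-byte Ethernet header in the first buffer, IP + UDP + data in the second.
chain = [bytes(14), bytes(28)]
print(advance_buggy(chain, 0, 0, 14), advance(chain, 0, 0, 14))
```

Skipping the 14-byte header in this model leaves the old loop with no position at all, while the fixed loop goes around once more and reports offset 0 of the second buffer.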
306664 |
04-Oct-2016 |
jhb |
MFC 303522,303647,303860,303880,304168,304169,304170,304479,304485,305549: Chelsio T4/T5 VF driver.
303522: Various fixes to the t4/5nex character device.
- Remove null open/close methods.
- Don't set d_flags to 0 explicitly.
- Remove t5_cdevsw as the .d_name member isn't really used and doesn't warrant a separate cdevsw just for the name.
- Use ENOTTY as the error value for an unknown ioctl request.
- Use make_dev_s() to close race with setting si_drv1.
303647: Store the offset of the KDOORBELL and GTS registers in the softc.
VF devices use a different register layout than PF devices. Storing the offset in a value in the softc allows code to be shared between the PF and VF drivers.
303860: Reserve an adapter flag IS_VF to mark VF devices vs PF devices.
303880: Track the base absolute ID of ingress and egress queues.
Use this to map an absolute queue ID to a logical queue ID in interrupt handlers. For the regular cxgbe/cxl drivers this should be a no-op as the base absolute ID should be zero. VF devices have a non-zero base absolute ID and require this change. While here, export the absolute ID of egress queues via a sysctl.
304168: Make SGE parameter handling more VF-friendly.
Add fields to hold the SGE control register and free list buffer sizes to the sge_params structure. Populate these new fields in t4_init_sge_params() for PF devices and change t4_read_chip_settings() to pull these values out of the params structure instead of reading registers directly. This will permit t4_read_chip_settings() to be reused for VF devices which cannot read SGE registers directly.
While here, move the call to t4_init_sge_params() to get_params__post_init(). The VF driver will populate the SGE parameters structure via a different method before calling t4_read_chip_settings().
304169: Update mailbox writes to work with VF devices.
- Use alternate register locations for the data and control registers for VFs.
- Do a dummy read to force the writes to the mailbox data registers to post before the write to the control register on VFs.
- Do not check the PCI-e firmware register for errors on VFs.
304170: Add support for register dumps on VF devices.
- Add handling of VF register sets to t4_get_regs_len() and t4_get_regs().
- While here, use t4_get_regs_len() in the ioctl handler for regdump instead of inlining it.
304479: Add structures for VF-specific adapter parameters.
While here, mark which parameters are PF-specific and which are VF-specific.
304485: Reorder sysctls so that nodes shared with the VF driver are added first.
This permits a single early return for VF devices in the routines that add sysctl nodes.
305549: Chelsio T4/T5 VF driver.
The cxgbev/cxlv driver supports Virtual Function devices for Chelsio T4 and T5 adapters. The VF devices share most of their code with the existing PF4 driver (cxgbe/cxl) and as such the VF device driver currently depends on the PF4 driver.
Similar to the cxgbe/cxl drivers, the VF driver includes a t4vf/t5vf PCI device driver that attaches to the VF device. It then creates child cxgbev/cxlv devices representing ports assigned to the VF. By default, the PF driver assigns a single port to each VF.
t4vf_hw.c contains VF-specific routines from the shared code used to fetch VF-specific parameters from the firmware.
t4_vf.c contains the VF-specific PCI device driver and includes its own attach routine.
VF devices are required to use a different firmware request when transmitting packets (which in turn requires a different CPL message to encapsulate messages). This alternate firmware request does not permit chaining multiple packets in a single message, so each packet results in a firmware request. In addition, the different CPL message requires more detailed information when enabling hardware checksums, so parse_pkt() on VF devices must examine L2 and L3 headers for all packets (not just TSO packets). Finally, L3 checksums on non-UDP/non-TCP packets do not work reliably (the firmware trashes the IPv4 fragment field), so IPv4 checksums for such packets are calculated in software.
Most of the other changes in the non-VF-specific code are to expose various variables and functions private to the PF driver so that they can be used by the VF driver.
Note that a limited subset of cxgbetool functions are supported on VF devices including register dumps, scheduler classes, and clearing of statistics. In addition, TOE is not supported on VF devices, only for the PF interfaces.
Sponsored by: Chelsio Communications |
306661 |
03-Oct-2016 |
jhb |
MFC 303405: Add support for zero-copy aio_write() on TOE sockets.
AIO write requests for a TOE socket on a Chelsio T4+ adapter can now DMA directly from the user-supplied buffer. This is implemented by wiring the pages backing the user-supplied buffer and queueing special mbufs backed by raw VM pages to the socket buffer. The TOE code recognizes these special mbufs and builds a sglist from the VM page array associated with the mbuf when queueing a work request to the TOE.
Because these mbufs do not have an associated virtual address, m_data is not valid. Thus, the AIO handler does not invoke sosend() directly for these mbufs but instead inlines portions of sosend_generic() and tcp_usr_send().
An aiotx_buffer structure is used to describe the user buffer (e.g. it holds the array of VM pages and a reference to the AIO job). The special mbufs reference this structure via m_ext. Note that a single job might be split across multiple mbufs (e.g. if it is larger than the socket buffer size). The 'ext_arg2' member of each mbuf gives an offset relative to the backing aiotx_buffer. The AIO job associated with an aiotx_buffer structure is completed when the last reference to the structure is released.
Zero-copy aio_write()'s for connections associated with a given adapter can be enabled/disabled at runtime via the 'dev.t[45]nex.N.toe.tx_zcopy' sysctl.
Sponsored by: Chelsio Communications |
306660 |
03-Oct-2016 |
jhb |
MFC 303205,303722,305032,305752: Create VF devices on Chelsio T4/T5 NICs.
303205: Add a driver to create VF devices on Chelsio T4/T5 NICs.
Chelsio NICs are a bit unique compared to some other NICs in that they expose different functionality on different physical functions. In particular, PF4 is used to manage the NIC interfaces ('t4nex' and 't5nex'). However, PF4 is not able to create VF devices. Instead, VFs are only supported by physical functions 0 through 3. This commit adds 't4iov' and 't5iov' drivers that attach to PF0-3.
One extra wrinkle is that the iov devices cannot enable SR-IOV until the firmware has been initialized by the main PF4 driver. To handle this case, a new t4_if kobj interface has been added to permit cross-calls between the PF drivers. The PF4 driver notifies sibling drivers when it is fully attached. It also requests sibling drivers to detach before it detaches. Sibling drivers query the PF4 driver during their attach routine to see if it is attached. If not, the sibling drivers defer their attach actions until the PF4 driver informs them it is attached.
VF devices are associated with a single port on the NIC. VF devices created from PF0 are associated with the first port on the NIC, VFs from PF1 are associated with the second port, etc. VF devices can only be created from a PF device that has an associated port. Thus, on a 2-port card, VFs are only supported on PF0 and PF1.
303722: Use the port device name for the iov device for Chelsio T4/T5 cards.
Chelsio T4/T5 adapters are multifunction cards. The main driver uses physical function 4 (PF4). However, VF devices for SR-IOV are only supported on physical functions 0 through 3, where PF0 creates VFs tied to port 0, etc. The t4iov/t5iov driver was previously added to create VF devices for ports that are present on each adapter. This change uses the recently added pci_iov_attach_name() function to name the character device in /dev/iov after the associated port on the card (e.g. /dev/iov/cxl0 is used to create VFs that share the cxl0 port). With this in place, mark the t4iov/t5iov devices quiet to prevent them from cluttering dmesg.
305032: Use device_verbose() to undo device_quiet() when detaching from t[45]iovX.
The device quiet flag is not automatically reset on detach, so it is inherited by other device drivers (e.g. when switching a device driver over to ppt for PCI pass through). Cope with this behavior by explicitly marking the device verbose during detach so that the next driver can make its own decision.
305752: Remove explicit device_verbose() from the t4iov driver detach routine now that this case is handled generically.
Sponsored by: Chelsio Communications |
306463 |
29-Sep-2016 |
jhb |
MFC 303204: Install a handler for firmware work request error messages.
If a driver sends a malformed or disallowed work request, the firmware responds with a work request error. Previously the driver treated this as an unexpected message and panicked. Now it decodes the error message to aid in debugging.
Sponsored by: Chelsio Communications |
303686 |
02-Aug-2016 |
ngie |
MFC r302581:
Remove redundant declaration for tcp_dooptions, similar to r302576
netinet/tcp_var.h already defines this function
Approved by: re (gjb) PR: 209920 |
302408 |
08-Jul-2016 |
gjb |
Copy head@r302406 to stable/11 as part of the 11.0-RELEASE cycle. Prune svn:mergeinfo from the new branch, as nothing has been merged here.
Additional commits post-branch will follow.
Approved by: re (implicit) Sponsored by: The FreeBSD Foundation |
302339 |
05-Jul-2016 |
np |
cxgbe(4): Changes to the CPL-handler registration mechanism and code related to "shared" CPLs.
a) Combine t4_set_tcb_field and t4_set_tcb_field_rpl into a single function. Allow callers to direct the response to any iq. Tidy up set_ulp_mode_iscsi while there to use names from t4_tcb.h instead of magic constants.
b) Remove all CPL handler tables from struct adapter. This reduces its size by around 2KB. All handlers are now registered at MOD_LOAD instead of attach or some kind of initialization/activation. The registration functions do not need an adapter parameter any more.
c) Add per-iq handlers to deal with CPLs whose destination cannot be determined solely from the opcode. There are 2 such CPLs in use right now: SET_TCB_RPL and L2T_WRITE_RPL. The base driver continues to send filter and L2T_WRITEs over the mgmtq and solicits the reply on fwq. t4_tom (including the DDP code) now uses the port's ctrlq to send L2T_WRITEs and SET_TCB_FIELDs and solicits the reply on an ofld_rxq. fwq and ofld_rxq have different handlers that know what kind of tid to expect in the reply. Update t4_write_l2e and callers to support any wrq/iq combination.
Approved by: re@ (kib@) Sponsored by: Chelsio Communications
|
302313 |
01-Jul-2016 |
np |
cxgbe(4): Avoid a NULL dereference while dumping the L2 table. Entries used by switching filters that rewrite L2 information do not have any associated ifnet.
Approved by: re@ (gjb@) Sponsored by: Chelsio Communications
|
302263 |
29-Jun-2016 |
np |
cxgbe(4): Do not bring up an interface when IFCAP_TOE is enabled on it. The interface's queues are functional after VI_INIT_DONE (which is short of interface-up) and that's all that's needed for t4_tom to communicate with the chip.
Approved by: re@ (gjb@) Sponsored by: Chelsio Communications
|
302110 |
23-Jun-2016 |
np |
cxgbe(4): Merge netmap support from the ncxgbe/ncxl interfaces to the vcxgbe/vcxl interfaces and retire the 'n' interfaces. The main cxgbe/cxl interfaces and tunables related to them are not affected by any of this and will continue to operate as usual.
The driver used to create an additional 'n' interface for every cxgbe/cxl interface if "device netmap" was in the kernel. The 'n' interface shared the wire with the main interface but was otherwise autonomous (with its own MAC address, etc.). It did not have normal tx/rx but had a specialized netmap-only data path. r291665 added another set of virtual interfaces (the 'v' interfaces) to the driver. These had normal tx/rx but no netmap support.
This revision consolidates the features of both the interfaces into the 'v' interface which now has a normal data path, TOE support, and native netmap support. The 'v' interfaces need to be created explicitly with the hw.cxgbe.num_vis tunable. This means "device netmap" will not result in the automatic creation of any virtual interfaces.
The following tunables can be used to override the default number of queues allocated for each 'v' interface. nofld* = 0 will disable TOE on the virtual interface and nnm* = 0 will disable native netmap support.
# number of normal NIC queues
hw.cxgbe.ntxq_vi
hw.cxgbe.nrxq_vi
# number of TOE queues
hw.cxgbe.nofldtxq_vi
hw.cxgbe.nofldrxq_vi
# number of netmap queues
hw.cxgbe.nnmtxq_vi
hw.cxgbe.nnmrxq_vi
hw.cxgbe.nnm{t,r}xq{10,1}g tunables have been removed.
--- tl;dr version ---
The workflow for netmap on cxgbe starting with FreeBSD 11 is:
1) "device netmap" in the kernel config.
2) "hw.cxgbe.num_vis=2" in loader.conf. num_vis > 2 is ok too, you'll end up with multiple autonomous netmap-capable interfaces for every port.
3) "dmesg | grep vcxl | grep netmap" to verify that the interface has netmap queues.
4) Use any of the 'v' interfaces for netmap. pkt-gen -i vcxl<n>... . One major improvement is that the netmap interface has a normal data path as expected.
5) Just ignore the cxl interfaces if you want to use netmap only. No need to bring them up. The vcxl interfaces are completely independent and everything should just work.
---------------------
Approved by: re@ (gjb@) Relnotes: Yes Sponsored by: Chelsio Communications
|
302074 |
21-Jun-2016 |
jhb |
Account for AIO socket operations in thread/process resource usage.
File and disk-backed I/O requests store counts of read/written disk blocks in each AIO job so that they can be charged to the thread that completes an AIO request via aio_return() or aio_waitcomplete(). This change extends AIO jobs to store counts of received/sent messages and updates socket backends to set these counts accordingly. Note that the socket backends are careful to only charge a single message for each AIO request even though a single request on a blocking socket might invoke sosend or soreceive multiple times. This is to mimic the resource accounting of synchronous read/write.
Adjust the UNIX socketpair AIO test to verify that the message resource usage counts update accordingly for aio_read and aio_write.
Approved by: re (hrs) Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D6911
|
301932 |
15-Jun-2016 |
jhb |
Use sbused() instead of sbspace() to avoid signed issues.
Inserting a full mbuf with an external cluster into the socket buffer resulted in sbspace() returning -MLEN. However, since sb_hiwat is unsigned, the -MLEN value was converted to unsigned in comparisons. As a result, the socket buffer was never autosized. Note that sb_lowat is signed to permit direct comparisons with sbspace(), but sb_hiwat is unsigned. Follow suit with what tcp_output() does and compare the value of sbused() with sb_hiwat instead.
Approved by: re (gjb) Sponsored by: Chelsio Communications
|
301930 |
15-Jun-2016 |
jhb |
Move backend-specific fields of kaiocb into a union.
This reduces the size of kaiocb slightly. I've also added some generic fields that other backends can use in place of the BIO-specific fields.
Change the socket and Chelsio DDP backends to use 'backend3' instead of abusing _aiocb_private.status directly. This confines the use of _aiocb_private to the AIO internals in vfs_aio.c.
Reviewed by: kib (earlier version) Approved by: re (gjb) Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D6547
|
301898 |
14-Jun-2016 |
np |
cxgbe/t4_tom: Fix inverted assertion in r300895. It is RDMA connections and not others that are allowed to fail the receive window check.
Approved by: re (gjb@)
|
301897 |
14-Jun-2016 |
np |
iw_cxgbe: Make sure that send_abort results in a TCP RST and not a FIN. Release the hold on ep->com immediately after sending the RST. This fixes a bug that sometimes leaves userspace iWARP tools hung when the user presses ^C.
Submitted by: Krishnamraju Eraparaju @ Chelsio Approved by: re (gjb@) Sponsored by: Chelsio Communications
|
301628 |
08-Jun-2016 |
np |
cxgbe(4): Add a sysctl to manage the binding of a txq to a traffic class.
Sponsored by: Chelsio Communications
|
301542 |
07-Jun-2016 |
np |
cxgbe(4): A couple of fixes to set_sched_queue.
- Validate the scheduling class against the actual limit (which is chip specific) instead of a magic number.
- Return an error if an attempt is made to manipulate the tx queues of a VI that hasn't been initialized.
Sponsored by: Chelsio Communications
|
301540 |
07-Jun-2016 |
np |
cxgbe(4): Provide information about traffic classes in the sysctl mib.
Sponsored by: Chelsio Communications
|
301535 |
07-Jun-2016 |
np |
cxgbe(4): Track the state of the hardware traffic schedulers in the driver. This works as long as everyone uses set_sched_class_params to program them.
Sponsored by: Chelsio Communications
|
301531 |
06-Jun-2016 |
np |
cxgbe(4): Break up set_sched_class. Validate the channel number and min/max rates against their actual limits (which are chip and port specific) instead of hardcoded constants.
Sponsored by: Chelsio Communications
|
301520 |
06-Jun-2016 |
np |
cxgbe(4): Create a reusable struct type for scheduling class parameters.
Sponsored by: Chelsio Communications
|
301158 |
01-Jun-2016 |
np |
iw_cxgbe: Fix panic that occurs when c4iw_ev_handler tries to acquire comp_handler_lock but c4iw_destroy_cq has already freed the CQ memory (which is where the lock resides).
Submitted by: Krishnamraju Eraparaju @ Chelsio Sponsored by: Chelsio Communications
|
301119 |
01-Jun-2016 |
trasz |
Reduce the priority of cxgbei(4) driver, so it doesn't get chosen by default. This is a workaround for a too simplistic ICL module choosing mechanism. To use it, specify offload in ctl.conf or iscsi.conf.
This fixes a problem where "kldload cxgbei" wedges the iSCSI stack, if you don't have a Chelsio card installed, or the endpoints of the iSCSI session are not reachable through addresses configured on that interface.
Reviewed by: np@ MFC after: 1 month
|
300895 |
28-May-2016 |
np |
cxgbe/t4_tom: Exempt RDMA connections from a TCP sanity test for now, to avoid panicking debug kernels.
t4_tom does not keep track of a connection once it switches to ULP mode iWARP. If the connection falls out of ULP mode the driver/hardware seq# etc. are out of sync. A better fix would be to figure out what the current seq# are, update the driver's state, and perform all sanity checks as usual.
|
300888 |
27-May-2016 |
np |
iw_cxgbe: Plug a lock leak in process_mpa_request().
If the parent is DEAD or connect_request_upcall() fails, the parent mutex is left locked. This leads to a hang when process_mpa_request() is called again for another child of the listening endpoint.
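One common way to avoid this kind of leak is to route every exit through a single unlock. The sketch below shows that pattern only; it is not the actual change to process_mpa_request(), and the toy mutex, states, and function names are all hypothetical.

```c
#include <assert.h>

/* Toy mutex that asserts on double lock or double unlock. */
struct toy_mtx {
	int held;
};

static void
toy_lock(struct toy_mtx *m)
{
	assert(!m->held);	/* a leaked lock would trip here */
	m->held = 1;
}

static void
toy_unlock(struct toy_mtx *m)
{
	assert(m->held);
	m->held = 0;
}

enum toy_state { TOY_LISTEN, TOY_DEAD };

struct toy_parent {
	struct toy_mtx mtx;
	enum toy_state state;
};

/*
 * Every path, including both error paths, leaves through 'out' so the
 * parent's mutex is always released. 'upcall_rc' stands in for the
 * result of the connect-request upcall.
 */
static int
toy_process_request(struct toy_parent *p, int upcall_rc)
{
	int rc = 0;

	toy_lock(&p->mtx);
	if (p->state == TOY_DEAD) {
		rc = -1;
		goto out;
	}
	if (upcall_rc != 0) {
		rc = upcall_rc;
		goto out;
	}
	/* Success path: hand the child to the listener (omitted). */
out:
	toy_unlock(&p->mtx);
	return (rc);
}
```

If either error path returned without unlocking, the next call for another child of the same parent would block, which corresponds to the hang described in the commit message.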
Submitted by: Krishnamraju Eraparaju @ Chelsio Obtained from: upstream iw_cxgb4 Sponsored by: Chelsio Communications
|
300875 |
27-May-2016 |
np |
iw_cxgbe: Use vmem(9) to manage PBL and RQT allocations.
Submitted by: Krishnamraju Eraparaju at Chelsio Reviewed by: Steve Wise Sponsored by: Chelsio Communications
|
300676 |
25-May-2016 |
hselasky |
Prepare for activation of LinuxKPI module parameters as read-only tunable SYSCTL's. Linux module parameters are associated with the module they belong to. FreeBSD does not share this concept of a parent module. Instead add macros which define the prefix to use for the module parameters in the LinuxKPI consumers.
While at it convert all "bool" LinuxKPI module parameters to "byte" type, because we don't have a "bool" type of SYSCTL in FreeBSD.
Sponsored by: Mellanox Technologies MFC after: 1 week
|
300592 |
24-May-2016 |
trasz |
Add mechanism for choosing iSER-capable ICL modules.
MFC after: 1 month Sponsored by: The FreeBSD Foundation
|
300369 |
21-May-2016 |
trasz |
Provide a way for ICL modules to declare they support PIM_UNMAPPED.
MFC after: 1 month Sponsored by: The FreeBSD Foundation
|
300336 |
20-May-2016 |
jhb |
Move the KTR for the update of ddp_active_id on each completion under VERBOSE_TRACES.
Sponsored by: Chelsio Communications
|
300040 |
17-May-2016 |
trasz |
Extend the ICL interface to include the PDU pointer in the task_setup method. This is required for upcoming iSER support.
Obtained from: Mellanox Technologies (earlier version) MFC after: 1 month Sponsored by: The FreeBSD Foundation
|
299685 |
13-May-2016 |
np |
cxgbe(4): Update T5 and T4 firmwares to 1.15.37.0.
These firmwares were obtained from the "Chelsio T5/T4 Unified Wire v2.12.0.3 for Linux" release. Changes since 1.14.4.0 (which is the firmware in -STABLE branches) are in the "Release Notes" accompanying the Unified Wire release and are copy-pasted here as well.
22.1. T5 Firmware +++++++++++++++++++++++++++++++++
Version : 1.15.37.0 Date : 04/27/2016 ================================================================================
FIXES -----
BASE: - Fixed an issue in FW_RSS_VI_CONFIG_CMD handling where the default ingress queue was ignored. - Fixed an issue where adapter failed to load fw by adjusting DRAM frequency. - Fixed an issue in watchdog which was causing VM bring-up failure after reboot. - Fixed 40G link failures with some switches when auto-negotiation enabled. - Fixed to improve on link bring-up time. - Per port buffer groups size doubled to improve performance. - Fixed an issue where bogus d3hot bits were set causing traffic stall. - Fixed an issue where sometimes adapter was not seen after reboot. - Fixed an issue where iWARP was crashing in conjunction with traffic management. - Fixed an issue where link failed to come up after removing twinax cable and inserting optical module.
ETH - Fixed a link flap issue on T580-CR.
OFLD - Fixed a potential iSCSI data corruption issue by disabling RxFragEn flag.
FOiSCSI - Fixed an issue in recovery path where connection was getting closed before recovery processing was done. - Fixed an issue in TCP port reuse. - Fixed an issue in recovery path when large number (>64) of iSCSI connections were in use. - Returned ENETUNREACH if IP had not been provisioned yet and driver tried to use given interface. - Fixed an issue where fw was sending ENETUNREACH event for normal tcp disconnection.
DCBX - Fixed an issue where iscsi tlv is sent incorrectly to host. (DCBX CEE) - Fixed an issue where apply bit set for APP id was affecting the ETS and PFC settings.(DCBX IEEE) - Fixed an issue where app priority values are not handled correctly in fw. (DCBX IEEE) - Fixed an issue where enable/disable dcbx can cause crash. (DCBX CEE,DCBX IEEE)
FOFCoE - Removed BB6 support.
ENHANCEMENTS ------------
BASE: - Added new interface to program DCA settings in SGE contexts; allow 32-byte IQE size - Added PTP interface fw_ptp_ts to support PTP Frequency and Offset adjustment. - Added MPS raw interface.
ETH: - New mailbox command FW_DCB_IEEE_CMD api added for IEEE dcbx.
OFLD: - WR opcode is returned to host in cqe error response.
22.2. T4 Firmware +++++++++++++++++
Version : 1.15.37.0 Date : 04/27/2016 ================================================================================
FIXES -----
BASE: - Fixed an issue in FW_RSS_VI_CONFIG_CMD handling where default ingress queue was ignored. - Fixed an issue in watchdog which was causing VM bring-up failure after reboot. - Per port buffer groups size doubled to improve performance. - Fixed an issue where iWARP was crashing in conjunction with traffic management.
FOiSCSI: - Fixed an issue in recovery path where connection was getting closed before recovery processing was done. - Fixed an issue in TCP port reuse. - Fixed an issue in recovery path when large number (>64) of iSCSI connections were in use. - Returned ENETUNREACH if IP had not been provisioned yet and driver tried to use given interface.
DCBX - Fixed an issue where iscsi tlv is sent incorrectly to host.(DCBX CEE) - Fixed an issue where enable/disable dcbx can cause crash in firmware.(DCBX CEE)
FOiSCSI - Fixes an issue where fw was sending ENETUNREACH event for normal tcp disconnection.
FOFCoE - Removed BB6 support.
ENHANCEMENTS ------------
BASE: - Added MPS raw interface.
ETH: - New mailbox command FW_DCB_IEEE_CMD api added for IEEE dcbx. ================================================================================
Obtained from: Chelsio Communications MFC after: 6 weeks Relnotes: yes Sponsored by: Chelsio Communications
|
299468 |
11-May-2016 |
hselasky |
The idr_for_each() function is now part of the LinuxKPI. Use the LinuxKPI's idr_for_each() function instead of the local one to avoid compilation issues.
Discussed with: np @ MFC after: 1 week
|
299309 |
10-May-2016 |
jhb |
Forward declare 'struct cpl_set_tcb_rpl' before including t4_tom.h.
|
299283 |
09-May-2016 |
jhb |
Forward declare 'struct cpl_set_tcb_rpl' before including t4_tom.h.
Other structures needed by prototypes in t4_tom.h are explicitly declared in this file, so adding the prototype here seems most consistent with existing code.
|
299210 |
07-May-2016 |
jhb |
Use DDP to implement zerocopy TCP receive with aio_read().
Chelsio's TCP offload engine supports direct DMA of received TCP payload into wired user buffers. This feature is known as Direct-Data Placement. However, to scale well the adapter needs to prepare buffers for DDP before data arrives. aio_read() is more amenable to this requirement than read() as applications often call read() only after data is available in the socket buffer.
When DDP is enabled, TOE sockets use the recently added pru_aio_queue protocol hook to claim aio_read(2) requests instead of letting them use the default AIO socket logic. The DDP feature supports scheduling DMA to two buffers at a time so that the second buffer is ready for use after the first buffer is filled. The aio/DDP code optimizes the case of an application ping-ponging between two buffers (similar to the zero-copy bpf(4) code) by keeping the two most recently used AIO buffers wired. If a buffer is reused, the aio/DDP code is able to reuse the vm_page_t array as well as page pod mappings (a kind of MMU mapping the Chelsio NIC uses to describe user buffers). The generation of the vmspace of the calling process is used in conjunction with the user buffer's address and length to determine if a user buffer matches a previously used buffer. If an application queues a buffer for AIO that does not match a previously used buffer then the least recently used buffer is unwired before the new buffer is wired. This ensures that no more than two user buffers per socket are ever wired.
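The buffer-matching policy described above (match on vmspace generation, address, and length, otherwise evict the least recently used of two slots) can be sketched as a tiny two-entry cache. All names are hypothetical, and the 'wires' counter merely stands in for the cost of wiring pages and setting up page pods.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy record of a wired user buffer. */
struct toy_pageset {
	bool valid;
	unsigned int vm_gen;	/* generation of the owning vmspace */
	uintptr_t addr;		/* user buffer address */
	size_t len;		/* user buffer length */
	unsigned long last_use;
};

/* At most two buffers per socket are kept wired. */
struct toy_ddp_cache {
	struct toy_pageset slot[2];
	unsigned long clock;
	unsigned int wires;	/* times a buffer had to be wired anew */
};

/* Return the slot for a buffer, reusing a match or evicting the LRU. */
static int
toy_ddp_get(struct toy_ddp_cache *c, unsigned int vm_gen, uintptr_t addr,
    size_t len)
{
	struct toy_pageset *ps;
	int i, victim = 0;

	assert(c != NULL && len > 0);
	c->clock++;
	for (i = 0; i < 2; i++) {
		ps = &c->slot[i];
		if (ps->valid && ps->vm_gen == vm_gen && ps->addr == addr &&
		    ps->len == len) {
			ps->last_use = c->clock;	/* reuse, no rewiring */
			return (i);
		}
	}
	/* Prefer an empty slot, otherwise the least recently used one. */
	for (i = 0; i < 2; i++) {
		if (!c->slot[i].valid) {
			victim = i;
			break;
		}
		if (c->slot[i].last_use < c->slot[victim].last_use)
			victim = i;
	}
	ps = &c->slot[victim];
	ps->valid = true;
	ps->vm_gen = vm_gen;
	ps->addr = addr;
	ps->len = len;
	ps->last_use = c->clock;
	c->wires++;
	return (victim);
}
```

An application that alternates between two buffers hits the cache on every request after the first two, while a third buffer or a changed vmspace generation forces an eviction.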
Note that this feature is best suited to applications sending a steady stream of data vs short bursts of traffic.
Discussed with: np Relnotes: yes Sponsored by: Chelsio Communications
|
299206 |
06-May-2016 |
jhb |
Set the correct vnet in TOE event handlers.
Differential Revision: https://reviews.freebsd.org/D6152
|
298976 |
03-May-2016 |
pfg |
Revert r298955 for the cxgbe firmware.
These files have checksums that are none of my business.
Requested by: np
|
298955 |
03-May-2016 |
pfg |
sys/dev: minor spelling fixes.
Most affect comments, very few have user-visible effects.
|
298848 |
30-Apr-2016 |
pfg |
sys: Make use of our rounddown() macro when sys/param.h is available.
No functional change.
|
298482 |
22-Apr-2016 |
pfg |
Cleanup redundant parenthesis from existing howmany()/roundup() macro uses.
|
298433 |
21-Apr-2016 |
pfg |
sys: use our roundup2/rounddown2() macros when param.h is available.
rounddown2 tends to produce longer lines than the original code and when the code has a high indentation level it was not really advantageous to do the replacement.
This tries to strike a balance between readability using the macros and flexibility of having the expressions, so not everything is converted.
|
297883 |
12-Apr-2016 |
np |
cxgbe(4): Always dispatch all work requests that have been written to the descriptor ring before leaving drain_wrq_wr_list.
|
297875 |
12-Apr-2016 |
np |
cxgbe(4): Always read the entire mailbox into the reply buffer.
The size of the reply can be different from the size of the command in case a debug firmware asserts. fw_asrt() needs the entire reply in order to decode the location of the assert.
Sponsored by: Chelsio Communications
|
297863 |
12-Apr-2016 |
jhb |
Rename the 'M_B' macro in t4_regs.h to 'CXGBE_M_B'.
This fixes a conflict with the M_B macro in powerpc's <machine/db_machdep.h> exposed by the recent addition of DDB commands to the cxgbe driver.
Discussed with: np Reported by: bz Sponsored by: Chelsio Communications
|
297797 |
11-Apr-2016 |
np |
cxgbe(4): Provide an explicit value for nqpcq in the firmware configuration file.
|
297793 |
10-Apr-2016 |
pfg |
Cleanup unnecessary semicolons from the kernel.
Found with devel/coccinelle.
|
297779 |
10-Apr-2016 |
jhb |
Add a 'show t4 devlog <nexus>' DDB command.
This command displays the adapter's firmware device log similar to the dev.<nexus>.misc.devlog sysctl.
Sponsored by: Chelsio Communications
|
297777 |
10-Apr-2016 |
jhb |
Add a 'show t4 tcb <nexus> <tid>' command to dump a TCB from DDB.
This allows the contents of a TCB to be extracted from a T4/T5 card in DDB after a panic.
|
297482 |
01-Apr-2016 |
sephe |
tcp/lro: Use tcp_lro_flush_all in device drivers to avoid code duplication
And factor out tcp_lro_rx_done, which deduplicates the same logic with netinet/tcp_lro.c
Reviewed by: gallatin (1st version), hps, zbb, np, Dexuan Cui <decui microsoft com> Sponsored by: Microsoft OSTC Differential Revision: https://reviews.freebsd.org/D5725
|
297467 |
31-Mar-2016 |
jhb |
Remove #ifdef's from various structures used in the cxgbe/cxl driver.
This provides a constant ABI and layout for these structures (especially struct adapter) avoiding some foot shooting.
Discussed with: np Sponsored by: Chelsio Communications
|
297368 |
29-Mar-2016 |
np |
cxgbe/iw_cxgbe: Fix for stray "start_ep_timer timer already started!" messages.
Submitted by: Krishnamraju Eraparaju @ Chelsio Sponsored by: Chelsio Communications
|
297194 |
22-Mar-2016 |
np |
cxgbe(4): Be consistent and call ETHER_BPF_MTAP before writing anything to the descriptor ring no matter what path the frame takes within the driver's tx.
|
297124 |
21-Mar-2016 |
np |
iw_cxgbe/libcxgb4: Pull in many applicable fixes from the upstream Linux iWARP driver and userspace library to the FreeBSD iw_cxgbe and libcxgb4.
This commit includes internal changesets 6785 8111 8149 8478 8617 8648 8650 9110 9143 9440 9511 9894 10164 10261 10450 10980 10981 10982 11730 11792 12218 12220 12222 12223 12225 12226 12227 12228 12229 12654.
Submitted by: Krishnamraju Eraparaju @ Chelsio Sponsored by: Chelsio Communications
|
296975 |
17-Mar-2016 |
np |
cxgbe(4): Tidy up PAUSE frame accounting.
Figure out if the chip is counting PAUSE frames in the "normal" stats and take them out if it is. This fixes a bug in the tx stats because the default hardware behavior is different for Tx and Rx but the driver was treating both the same way. The result was that OPACKETS, OBYTES, and OMCASTS were under-reported (if tx_pause > 0) before this change.
Note that the mac_stats sysctl still gives you the raw value of these statistics straight from the device registers.
|
296952 |
16-Mar-2016 |
np |
cxgbe(4): Enable PFs 0-3, and allow creation of SR-IOV VFs on these PFs in the default configuration files.
|
296951 |
16-Mar-2016 |
np |
cxgbe(4): Enable additional capabilities in the default configuration files. All features with FreeBSD drivers of some kind are now in the default configuration.
|
296950 |
16-Mar-2016 |
np |
cxgbe(4): Update some register settings in the default configuration files to match the "uwire" configuration.
|
296949 |
16-Mar-2016 |
np |
cxgbe(4): Remove a couple of pointless assignments in sysctl_meminfo. Do not display range if start = stop (this is a workaround for some unused regions).
|
296735 |
12-Mar-2016 |
dim |
Fix the following gcc warnings on sparc64, when TCP_OFFLOAD is not defined:
sys/dev/cxgbe/t4_main.c:7474: warning: 'sysctl_tp_tick' defined but not used
sys/dev/cxgbe/t4_main.c:7505: warning: 'sysctl_tp_dack_timer' defined but not used
sys/dev/cxgbe/t4_main.c:7519: warning: 'sysctl_tp_timer' defined but not used
This just adds a bunch of #ifdef TCP_OFFLOAD in the right places.
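The pattern behind this kind of fix can be sketched as below. The helper and its caller (`toe_ticks_to_us`, `register_sysctls`) are hypothetical stand-ins, not functions from t4_main.c; the point is only that a static function used solely by TCP_OFFLOAD code must itself be compiled only when TCP_OFFLOAD is defined.

```c
#include <assert.h>

/*
 * TCP_OFFLOAD normally comes from the kernel configuration.  It is left
 * undefined here, which is the configuration that produced the warnings.
 */

#ifdef TCP_OFFLOAD
/*
 * Referenced only from TOE-specific code.  Without the #ifdef, gcc would
 * report "defined but not used" when TCP_OFFLOAD is not defined.
 */
static int
toe_ticks_to_us(int ticks, int us_per_tick)
{
	return (ticks * us_per_tick);
}
#endif

/* Hypothetical caller: returns how many handlers were registered. */
static int
register_sysctls(void)
{
	int registered = 1;	/* the handler that exists in every build */

#ifdef TCP_OFFLOAD
	registered += (toe_ticks_to_us(2, 32) == 64);
#endif
	assert(registered >= 1);
	return (registered);
}
```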
Reviewed by: np Differential Revision: https://reviews.freebsd.org/D5620
|
296711 |
12-Mar-2016 |
np |
cxgbe(4): Fix typo in previous commit.
|
296710 |
12-Mar-2016 |
np |
cxgbe(4): Catch up with the latest list of card capabilities as reported by the firmware.
|
296689 |
11-Mar-2016 |
np |
cxgbe(4): sysctls to display the TOE's TCP timers.
cask:~# sysctl -d dev.t5nex.0.toe
dev.t5nex.0.toe.finwait2_timer: FINWAIT2 timer (us)
dev.t5nex.0.toe.initial_srtt: Initial SRTT (us)
dev.t5nex.0.toe.keepalive_intvl: Keepidle interval (us)
dev.t5nex.0.toe.keepalive_idle: Keepidle idle timer (us)
dev.t5nex.0.toe.persist_max: Persist timer max (us)
dev.t5nex.0.toe.persist_min: Persist timer min (us)
dev.t5nex.0.toe.rexmt_max: Retransmit max (us)
dev.t5nex.0.toe.rexmt_min: Retransmit min (us)
dev.t5nex.0.toe.dack_timer: DACK timer (us)
dev.t5nex.0.toe.dack_tick: DACK tick (us)
dev.t5nex.0.toe.timestamp_tick: TCP timestamp tick (us)
dev.t5nex.0.toe.timer_tick: TP timer tick (us)
...
cask:~# sysctl dev.t5nex.0.toe
dev.t5nex.0.toe.finwait2_timer: 9765440
dev.t5nex.0.toe.initial_srtt: 244128
dev.t5nex.0.toe.keepalive_intvl: 73240800
dev.t5nex.0.toe.keepalive_idle: 7031116800
dev.t5nex.0.toe.persist_max: 9765440
dev.t5nex.0.toe.persist_min: 976544
dev.t5nex.0.toe.rexmt_max: 9765440
dev.t5nex.0.toe.rexmt_min: 244128
dev.t5nex.0.toe.dack_timer: 19520
dev.t5nex.0.toe.dack_tick: 32.768
dev.t5nex.0.toe.timestamp_tick: 1048.576
dev.t5nex.0.toe.timer_tick: 32.768
...
|
296641 |
11-Mar-2016 |
np |
cxgbe(4): Add sysctls to display the TP microcode version and the expansion rom version (if there's one).
trantor:~# sysctl dev.t4nex dev.t5nex | grep _version
dev.t4nex.0.firmware_version: 1.15.28.0
dev.t4nex.0.tp_version: 0.1.9.4
dev.t5nex.0.firmware_version: 1.15.28.0
dev.t5nex.0.exprom_version: 1.0.0.68
dev.t5nex.0.tp_version: 0.1.4.9
|
296640 |
11-Mar-2016 |
np |
cxgbe(4): Add a sysctl for the event capture mask of the TP block's logic analyzer.
dev.t5nex.<n>.misc.tp_la_mask
dev.t4nex.<n>.misc.tp_la_mask
|
296627 |
10-Mar-2016 |
np |
cxgbe(4): Improvements to the code that deals with the firmware's log.
- Query the location of the log very early during attach. Refresh the location later after establishing contact with the firmware.
- Save the log's location as a flat address in devlog_params.
- Use a memory window instead of backdoor access to the EDC/MC to read the log.
|
296624 |
10-Mar-2016 |
np |
cxgbe(4): Fix bug in r296603. The memory window needs to be repositioned if the start address isn't in the window already. One of the bounds checks used the end address instead.
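A rough sketch of the check being described, under stated assumptions: `struct memwin`, `memwin_covers`, `memwin_position`, and the 16-byte base alignment are all hypothetical and are not taken from the driver. The decision to move the window depends on where the access starts, not on where it ends.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical window: exposes adapter addresses [base, base + aperture). */
struct memwin {
	uint32_t base;
	uint32_t aperture;
};

#define MEMWIN_ALIGN	16u	/* assumed alignment of the window base */

/* True if addr is reachable through the window as currently positioned. */
static bool
memwin_covers(const struct memwin *mw, uint32_t addr)
{
	return (addr >= mw->base && addr - mw->base < mw->aperture);
}

/*
 * Reposition the window only if the *start* address of the access is not
 * already inside it.  Returns true if the window was moved.
 */
static bool
memwin_position(struct memwin *mw, uint32_t start)
{
	assert(mw->aperture >= MEMWIN_ALIGN);
	if (memwin_covers(mw, start))
		return (false);
	mw->base = start & ~(MEMWIN_ALIGN - 1);
	return (true);
}
```

An access that runs past the end of the window would then be handled by the caller in pieces, repositioning again when the next piece starts outside the window.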
|
296603 |
10-Mar-2016 |
np |
cxgbe(4): Add general purpose routines that offer safe access to the chip's memory windows. Convert existing users of these windows to the new routines.
|
296596 |
10-Mar-2016 |
np |
cxgbe(4): Allow the addr/len pair that is being validated in validate_mem_range to span multiple memory types. Update validate_mt_off_len to use validate_mem_range.
|
296552 |
08-Mar-2016 |
np |
cxgbe(4): Rename regwin_lock to reg_lock. It is used to protect access to indirect registers only.
|
296544 |
08-Mar-2016 |
np |
cxgbe(4): Reshuffle and rototill t4_hw.c, solely to reduce diffs with the internal shared code.
Obtained from: Chelsio Communications
|
296496 |
08-Mar-2016 |
np |
cxgbe(4): Minor updates to the shared routines that deal with firmware images.
|
296495 |
08-Mar-2016 |
np |
cxgbe(4): Fix t4_tp_get_rdma_stats.
|
296494 |
08-Mar-2016 |
np |
cxgbe(4): Many new functions in the shared code, unused at this time.
Obtained from: Chelsio Communications
|
296493 |
08-Mar-2016 |
np |
cxgbe(4): Use t4_link_down_rc_str in shared code to decode the reason the link is down, instead of doing it in OS specific code.
|
296491 |
08-Mar-2016 |
np |
cxgbe(4): Updates to shared routines that get/set various parameters via the firmware.
Obtained from: Chelsio Communications
|
296490 |
08-Mar-2016 |
np |
cxgbe(4): Remove __devinit and SPEED_<foo> as part of catch up with internal shared code.
Obtained from: Chelsio Communications
|
296489 |
08-Mar-2016 |
np |
cxgbe(4): Updates to the shared routines that deal with the serial EEPROM, flash, and VPD.
Obtained from: Chelsio Communications
|
296488 |
08-Mar-2016 |
np |
cxgbe(4): Updates to mailbox routines in the shared code.
Obtained from: Chelsio Communications
|
296485 |
08-Mar-2016 |
np |
cxgbe(4): Update the interrupt handlers for hardware errors.
Obtained from: Chelsio Communications
|
296481 |
08-Mar-2016 |
np |
cxgbe(4): Overhaul the shared code that deals with the chip's TP block, which is responsible for filtering and RSS.
Add the ability to use filters that match on PF/VF (aka "VNIC id") while here. This is mutually exclusive with filtering on outer VLAN tag with Q-in-Q.
Sponsored by: Chelsio Communications
|
296478 |
08-Mar-2016 |
np |
cxgbe(4): Add a struct sge_params to store per-adapter SGE parameters. Move the code that reads all the parameters to t4_init_sge_params in the shared code. Use these per-adapter values instead of globals.
Sponsored by: Chelsio Communications
|
296471 |
07-Mar-2016 |
np |
cxgbe(4): Updated register dumps.
- Get the list of registers to read during a regdump from the shared code instead of the OS specific code. This follows a similar move internally. The shared code includes the list for T6.
- Update cxgbetool to be able to decode T5 VF, T6, and T6 VF register dumps (and catch up with some updates to T4 and T5 register decode).
Obtained from: Chelsio Communications Sponsored by: Chelsio Communications
|
296383 |
04-Mar-2016 |
np |
cxgbe(4): Very basic T6 awareness. This is part of ongoing work to update to the latest internal shared code.
- Add a chip_params structure to keep track of hardware constants for all generations of Terminators handled by cxgbe.
- Update t4_hw_pci_read_cfg4 to work with T6.
- Update the hardware debug sysctls (hidden within dev.<tNnex>.<n>.misc.*) to work with T6. Most of the changes are in the decoders for the CIM logic analyzer and the MPS TCAM.
- Acquire the regwin lock around indirect register accesses.
Obtained from: Chelsio Communications Sponsored by: Chelsio Communications
|
296333 |
03-Mar-2016 |
np |
cxgbe(4): First of many changes to reduce diffs with internal shared code:
- Rename some CamelCase variables.
- s/t4_link_start/t4_link_l1cfg/g
- Pull in t4_get_port_type_description.
- Flip the order of the RDMA stats.
- Remove unused function t4_iq_start_stop.
- Move t4_wait_op_done and t4_wait_op_done_val to t4_hw.c
Obtained from: Chelsio Communications
|
296249 |
01-Mar-2016 |
np |
cxgbe(4): Update T5 and T4 firmwares to 1.15.28.0.
These firmwares were obtained from the beta "Chelsio T5/T4 Unified Wire v2.12.0.2 for Linux" release. Changes since last release are listed in the "Release Notes" accompanying the beta release and are copy-pasted here as well.
The plan is to have only GA'd firmwares in any -STABLE FreeBSD branch so I'll MFC this (after 2 months) only if it ends up in a GA release.
================================================================================
================================================================================
22.1. T5 Firmware
+++++++++++++++++++++++++++++++++
Version : 1.15.28.0
Date : 02/29/2016
================================================================================
FIXES
-----
BASE:
- Fixed an issue in FW_RSS_VI_CONFIG_CMD handling where the default ingress queue was ignored.
- Fixed an issue where adapter failed to load fw by adjusting DRAM frequency.
- Fixed an issue in watchdog which was causing VM bring-up failure after reboot.
- Fixed 40G link failures with some switches when auto-negotiation enabled.
- Fixed to improve on link bring-up time.
- Per port buffer groups size doubled to improve performance.
- Fixed an issue where bogus d3hot bits were set causing traffic stall.
- Fixed an issue where sometimes adapter was not seen after reboot.
- Fixed an issue where iWARP was crashing in conjunction with traffic management.
- Fixed an issue where link failed to come up after removing twinax cable and inserting optical module.
OFLD:
- Fixed a potential iSCSI data corruption issue by disabling RxFragEn flag.
FOiSCSI:
- Fixed an issue in recovery path where connection was getting closed before recovery processing was done.
- Fixed an issue in TCP port reuse.
- Fixed an issue in recovery path when large number (>64) of iSCSI connections were in use.
- Returned ENETUNREACH if IP had not been provisioned yet and driver tried to use given interface.
ENHANCEMENTS
------------
BASE:
- Added new interface to program DCA settings in SGE contexts; allow 32-byte IQE size
- Added PTP interface fw_ptp_ts to support PTP Frequency and Offset adjustment.
- Added MPS raw interface.
ETH:
- New mailbox command FW_DCB_IEEE_CMD api added for IEEE dcbx.
OFLD:
- WR opcode is returned to host in cqe error response.
================================================================================
================================================================================
22.2. T4 Firmware
+++++++++++++++++
Version : 1.15.28.0
Date : 02/29/2016
================================================================================
FIXES
-----
BASE:
- Fixed an issue in FW_RSS_VI_CONFIG_CMD handling where default ingress queue was ignored.
- Fixed an issue in watchdog which was causing VM bring-up failure after reboot.
- Per port buffer groups size doubled to improve performance.
- Fixed an issue where iWARP was crashing in conjunction with traffic management.
FOiSCSI:
- Fixed an issue in recovery path where connection was getting closed before recovery processing was done.
- Fixed an issue in TCP port reuse.
- Fixed an issue in recovery path when large number (>64) of iSCSI connections were in use.
- Returned ENETUNREACH if IP had not been provisioned yet and driver tried to use given interface.
ENHANCEMENTS
------------
BASE:
- Added MPS raw interface.
ETH:
- New mailbox command FW_DCB_IEEE_CMD api added for IEEE dcbx.
================================================================================
Obtained from: Chelsio Communications MFC after: 2 months Sponsored by: Chelsio Communications
|
296018 |
25-Feb-2016 |
np |
cxgbe(4): Add a sysctl to retrieve the maximum speed/bandwidth supported by a port.
dev.cxgbe.<n>.max_speed
dev.cxl.<n>.max_speed
Sponsored by: Chelsio Communications
|
295778 |
19-Feb-2016 |
np |
cxgbe: catch up with the latest hardware-related definitions.
Obtained from: Chelsio Communications Sponsored by: Chelsio Communications
|
295573 |
12-Feb-2016 |
np |
Remove duplicate definition (CPL_TRACE_PKT_T5).
|
295482 |
10-Feb-2016 |
glebius |
Garbage collect unused arguments of m_init().
|
294889 |
27-Jan-2016 |
glebius |
More fixes to the build.
|
294869 |
27-Jan-2016 |
glebius |
Augment struct tcpstat with tcps_states[], which is used for book-keeping the amount of TCP connections by state. Provides a cheap way to get connection count without traversing the whole pcb list.
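The bookkeeping idea can be illustrated with a small userland sketch. Everything here is hypothetical (`conn_state`, `state_count`, `conn_set_state`, and so on) and only mirrors the concept of a per-state counter array; it is not the kernel's tcpstat code and ignores locking and per-CPU counters.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, reduced set of connection states. */
enum conn_state {
	CS_CLOSED,
	CS_LISTEN,
	CS_SYN_SENT,
	CS_ESTABLISHED,
	CS_NSTATES
};

struct conn {
	enum conn_state state;
};

/* One counter per state: the number of connections currently in it. */
static uint64_t state_count[CS_NSTATES];

static void
conn_init(struct conn *c)
{
	c->state = CS_CLOSED;
	state_count[CS_CLOSED]++;
}

/* Every state change moves one unit from the old counter to the new one. */
static void
conn_set_state(struct conn *c, enum conn_state next)
{
	assert(next < CS_NSTATES);
	state_count[c->state]--;
	state_count[next]++;
	c->state = next;
}

static void
conn_fini(struct conn *c)
{
	state_count[c->state]--;
}
```

Reading `state_count[CS_ESTABLISHED]` then answers "how many established connections are there" in constant time, without walking a list of connections.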
Sponsored by: Netflix
|
294610 |
22-Jan-2016 |
np |
Fix for iWARP servers that listen on INADDR_ANY.
The iWARP Connection Manager (CM) on FreeBSD creates a TCP socket to represent an iWARP endpoint when the connection is over TCP. For servers the current approach is to invoke create_listen callback for each iWARP RNIC registered with the CM. This doesn't work too well for INADDR_ANY because a listen on any TCP socket already notifies all hardware TOEs/RNICs of the new listener. This patch fixes the server side of things for FreeBSD. We've tried to keep all these modifications in the iWARP/TCP specific parts of the OFED infrastructure as much as possible.
Submitted by: Krishnamraju Eraparaju @ Chelsio (with design inputs from Steve Wise) Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D4801
|
294474 |
21-Jan-2016 |
np |
iw_cxgbe: fix a couple of problems in the RDMA_TERMINATE handler.
a) Look for the CPL in the payload buffer instead of the descriptor.
b) Retrieve the socket associated with the tid with the inpcb lock held.
Submitted by: Krishnamraju Eraparaju @ Chelsio
|
294327 |
19-Jan-2016 |
hselasky |
Add optimizing LRO wrapper:
- Add optimizing LRO wrapper which pre-sorts all incoming packets according to the hash type and flowid. This prevents exhaustion of the LRO entries due to too many connections at the same time. Testing using a larger number of higher bandwidth TCP connections showed that the incoming ACK packet aggregation rate increased from ~1.3:1 to almost 3:1. Another test showed that for a number of TCP connections greater than 16 per hardware receive ring, where 8 TCP connections was the LRO active entry limit, there was a significant improvement in throughput due to being able to fully aggregate more than 8 TCP streams. For very few very high bandwidth TCP streams, the optimizing LRO wrapper will add CPU usage instead of reducing CPU usage. This is expected. Network drivers which want to use the optimizing LRO wrapper need to call "tcp_lro_queue_mbuf()" instead of "tcp_lro_rx()" and "tcp_lro_flush_all()" instead of "tcp_lro_flush()". Further, the LRO control structure must be initialized using "tcp_lro_init_args()", passing a non-zero number into the "lro_mbufs" argument.
- Make LRO statistics 64-bit. Previously 32-bit integers were used for statistics, which can be prone to wrap-around. Fix this while at it and update all sysctls which expose LRO statistics.
- Ensure all data is freed when destroying an LRO control structure, especially leftover LRO entries.
- Reduce the number of memory allocations needed when setting up an LRO control structure by precomputing the total amount of memory needed.
- Add own memory allocation counter for LRO.
- Bump the FreeBSD version to force recompilation of all KLDs due to change of the LRO control structure size.
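The pre-sorting step from the first item can be sketched in userland C. This is not the kernel implementation: `struct rx_pkt`, `presort_batch`, and `count_flow_runs` are hypothetical names, and the sketch uses `qsort` with the arrival index as a tie-breaker so that packets of one flow stay in order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-packet metadata for one receive batch. */
struct rx_pkt {
	uint32_t hash_type;
	uint32_t flowid;
	uint32_t seq;		/* arrival order, filled in by presort_batch */
};

/* Order by hash type, then flowid, then arrival order. */
static int
rx_pkt_cmp(const void *pa, const void *pb)
{
	const struct rx_pkt *a = pa;
	const struct rx_pkt *b = pb;

	if (a->hash_type != b->hash_type)
		return (a->hash_type < b->hash_type ? -1 : 1);
	if (a->flowid != b->flowid)
		return (a->flowid < b->flowid ? -1 : 1);
	if (a->seq != b->seq)
		return (a->seq < b->seq ? -1 : 1);
	return (0);
}

/* Group packets of the same flow together, preserving per-flow order. */
static void
presort_batch(struct rx_pkt *batch, size_t n)
{
	assert(batch != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		batch[i].seq = (uint32_t)i;
	qsort(batch, n, sizeof(batch[0]), rx_pkt_cmp);
}

/* Number of times the flow changes while walking the batch in order. */
static size_t
count_flow_runs(const struct rx_pkt *batch, size_t n)
{
	size_t runs = 0;

	for (size_t i = 0; i < n; i++) {
		if (i == 0 || batch[i].hash_type != batch[i - 1].hash_type ||
		    batch[i].flowid != batch[i - 1].flowid)
			runs++;
	}
	return (runs);
}
```

After sorting, each flow occupies one contiguous run, so an aggregator with a small number of active entries sees each flow once per batch instead of being evicted and re-entered as flows interleave.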
Sponsored by: Mellanox Technologies Reviewed by: gallatin, sbruno, rrs, gnn, transport Tested by: Netflix Differential Revision: https://reviews.freebsd.org/D4914
|
293674 |
11-Jan-2016 |
np |
cxgbe: bind the ithreads that handle NIC rx to the correct CPU if the kernel is built with option RSS.
|
293309 |
07-Jan-2016 |
melifaro |
Convert cxgb/cxgbe to the new routing API.
Discussed with: np
|
293284 |
07-Jan-2016 |
glebius |
Historically we have two fields in tcpcb to describe sender MSS: t_maxopd, and t_maxseg. This dualism emerged with T/TCP, but was not properly cleaned up after T/TCP removal. After all permutations over the years the result is that t_maxopd stores a minimum of peer offered MSS and MTU reduced by minimum protocol header. And t_maxseg stores (t_maxopd - TCPOLEN_TSTAMP_APPA) if timestamps are in action, or is equal to t_maxopd otherwise. That's a very rough estimate of MSS reduced by options length. Throughout the code it was used in places, where preciseness was not important, like cwnd or ssthresh calculations.
With this change:
- t_maxopd goes away.
- t_maxseg now stores MSS not adjusted by options.
- new function tcp_maxseg() is provided, that calculates MSS reduced by options length. The function gives a better estimate, since it takes into account SACK state as well.
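To make "MSS reduced by options length" concrete, here is a simplified userland sketch. It is not FreeBSD's tcp_maxseg(): the function name `effective_maxseg` and its parameters are hypothetical, and it models only two options, the timestamp option (10 bytes, padded to 12) and a SACK option (2 bytes of header plus 8 bytes per block), within the 40-byte TCP option limit.

```c
#include <assert.h>
#include <stdbool.h>

#define TSTAMP_APPA_LEN	12u	/* timestamp option plus two NOPs */
#define SACK_HDR_LEN	2u	/* SACK option kind + length */
#define SACK_BLOCK_LEN	8u	/* one SACK block: left and right edge */
#define MAX_TCP_OPTLEN	40u	/* option space in a TCP header */

/* Round up to a multiple of 4, since TCP options are padded to 32 bits. */
static unsigned
pad4(unsigned n)
{
	return ((n + 3u) & ~3u);
}

/*
 * Payload bytes that fit in one segment, given the unadjusted MSS and the
 * options that will be carried on the segment.
 */
static unsigned
effective_maxseg(unsigned maxseg, bool timestamps, unsigned nsack)
{
	unsigned optlen = 0;

	assert(maxseg >= MAX_TCP_OPTLEN);
	if (timestamps)
		optlen += TSTAMP_APPA_LEN;
	if (nsack > 0)
		optlen += SACK_HDR_LEN + nsack * SACK_BLOCK_LEN;
	optlen = pad4(optlen);
	if (optlen > MAX_TCP_OPTLEN)
		optlen = MAX_TCP_OPTLEN;
	return (maxseg - optlen);
}
```

With an MSS of 1460, timestamps alone leave 1448 bytes, while timestamps plus two SACK blocks leave 1428, which is the difference a timestamps-only estimate would miss.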
Reviewed by: jtl Differential Revision: https://reviews.freebsd.org/D3593
|
293185 |
05-Jan-2016 |
np |
iw_cxgbe: Shut down the socket but do not close the fd in case of error. The fd is closed later in this case. This fixes a "SS_NOFDREF on enter" panic.
Submitted by: Krishnamraju Eraparaju @ Chelsio Reviewed by: Steve Wise @ Open Grid Computing
|
292978 |
31-Dec-2015 |
melifaro |
Implement interface link header precomputation API.
Add an if_requestencap() interface method which is capable of calculating various link headers for a given interface. Right now there is support for INET/INET6/ARP llheader calculation (IFENCAP_LL type request). Other types are planned to support more complex calculation (L2 multipath lagg nexthops, tunnel encap nexthops, etc.).
Reshape 'struct route' to be able to pass additional data (with its length) to prepend to mbuf.
These two changes permit the routing code to pass pre-calculated nexthop data (like the L2 header for a route w/gateway) down to the stack, eliminating the need for other lookups. It also brings us closer to more complex scenarios like transparently handling MPLS nexthops and tunnel interfaces. Last, but not least, it removes the layering violation introduced by flowtable code (ro_lle) and simplifies handling of existing if_output consumers.
ARP/ND changes: Make the arp/ndp stack pre-calculate the link header upon installing/updating an lle record. Interface link address changes are handled by re-calculating headers for all lles based on the if_lladdr event. After these changes, arpresolve()/nd6_resolve() return the full pre-calculated header for supported interfaces, thus simplifying if_output(). Move these lookups to a separate ether_resolve_addr() function which either returns an error or a fully-prepared link header. Add <arp|nd6_>resolve_addr() compat versions to return link addresses instead of pre-calculated data.
BPF changes: Raw bpf writes occupied _two_ cases: AF_UNSPEC and pseudo_AF_HDRCMPLT. Despite the naming, both of these have their header "complete". The only difference is that the interface source mac has to be filled in by the OS for AF_UNSPEC (controlled via BIOCGHDRCMPLT). This logic has to stay inside BPF and not pollute if_output() routines. Convert BPF to pass prepend data via the new 'struct route' mechanism. Note that it does not change non-optimized if_output(): ro_prepend handling is purely optional. Side note: hackish pseudo_AF_HDRCMPLT is supported for ethernet and FDDI. It is not needed for ethernet anymore. The only remaining FDDI user is dev/pdq, mostly untouched since 2007. FDDI support was eliminated from OpenBSD in 2013 (sys/net/if_fddisubr.c rev 1.65).
Flowtable changes: Flowtable violates layering by saving (and not correctly managing) rtes/lles. Instead of passing lle pointer, pass pointer to pre-calculated header data from that lle.
Differential Revision: https://reviews.freebsd.org/D4102
|
292740 |
26-Dec-2015 |
np |
cxgbei: Hardware accelerated iSCSI target and initiator for TOE capable cards supported by cxgbe(4).
On the host side this driver interfaces with the storage stack via the ICL (iSCSI Common Layer) in the kernel. On the wire the traffic is standard iSCSI (SCSI over TCP as per RFC 3720/7143 etc.) that interoperates with all other standards compliant implementations. The driver is layered on top of the TOE driver (t4_tom) and promotes connections being handled by t4_tom to iSCSI ULP (Upper Layer Protocol) mode. Hardware assistance in this mode includes:
- Full TCP processing.
- iSCSI PDU identification and recovery within the TCP stream.
- Header and/or data digest insertion (tx) and verification (rx).
- Zero copy (both tx and rx).
Man page will follow in a separate commit in a couple of weeks.
Relnotes: Yes Sponsored by: Chelsio Communications
|
292736 |
26-Dec-2015 |
np |
cxgbe(4): Updates to the base NIC driver and t4_tom to support the iSCSI offload driver. These changes come from projects/cxl_iscsi.
|
291856 |
05-Dec-2015 |
np |
Fix RSS build.
Reported by: arybchik@
|
291685 |
03-Dec-2015 |
kib |
Fix build for !TCP_OFFLOAD case.
|
291665 |
03-Dec-2015 |
jhb |
Add support for configuring additional virtual interfaces (VIs) on a port.
Each virtual interface has its own MAC address, queues, and statistics. The dedicated netmap interfaces (ncxgbeX / ncxlX) were already implemented as additional VIs on each port. This change allows additional non-netmap interfaces to be configured on each port. Additional virtual interfaces use the naming scheme vcxgbeX or vcxlX.
Additional VIs are enabled by setting the hw.cxgbe.num_vis tunable to a value greater than 1 before loading the cxgbe(4) or cxl(4) driver. NB: The first VI on each port is the "main" interface (cxgbeX or cxlX).
T4/T5 NICs provide a limited number of MAC addresses for each physical port. As a result, a maximum of six VIs can be configured on each port (including the "main" interface and the netmap interface when netmap is enabled).
One user-visible result is that when netmap is enabled, packets received or transmitted via the netmap interface are no longer counted in the stats for the "main" interface, but are now accounted to the netmap interface.
The netmap interfaces now also have a new-bus device and export various information sysctl nodes via dev.n(cxgbe|cxl).X.
The cxgbetool 'clearstats' command clears the stats for all VIs on the specified port along with the port's stats. There is currently no way to clear the stats of an individual VI.
Reviewed by: np MFC after: 1 month Sponsored by: Chelsio
|
290633 |
10-Nov-2015 |
np |
cxgbe/t4_tom: add a knob to the default configuration file to tune the TOE for LAN operation. It is possible to set this to other values (cluster for networks with little loss and really tight RTTs, and wan for relatively large RTTs and/or lossy networks) depending on the environment in which the TOE is being used.
None of this affects plain NIC operation in any way.
MFC after: 1 week
|
290416 |
05-Nov-2015 |
jhb |
Chelsio T5 chips do not properly echo the No Snoop and Relaxed Ordering attributes when replying to a TLP from a Root Port. As a workaround, disable No Snoop and Relaxed Ordering in the Root Port of each T5 adapter during attach so that CPU-initiated requests do not contain these flags.
Note that this affects CPU-initiated requests to all devices under this root port.
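In terms of register bits, the workaround amounts to clearing two enable bits in the Root Port's PCIe Device Control register. The sketch below shows only that bit manipulation on a plain 16-bit value; the macro and function names are made up for illustration, and reading and writing the Root Port's configuration space is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Enable bits in the PCIe Device Control register. */
#define PCIE_DEVCTL_RELAXED_ORD_EN	0x0010u	/* bit 4 */
#define PCIE_DEVCTL_NOSNOOP_EN		0x0800u	/* bit 11 */

/*
 * Return the Device Control value with Relaxed Ordering and No Snoop
 * disabled.  All other bits are left untouched.
 */
static uint16_t
devctl_disable_ro_ns(uint16_t devctl)
{
	const uint16_t mask = PCIE_DEVCTL_RELAXED_ORD_EN |
	    PCIE_DEVCTL_NOSNOOP_EN;
	uint16_t v = devctl & (uint16_t)~mask;

	assert((v & mask) == 0);
	return (v);
}
```

A caller would read the register from the Root Port, pass the value through this function, and write it back only if it changed.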
Reviewed by: np MFC after: 1 week Sponsored by: Chelsio
|
290175 |
30-Oct-2015 |
np |
cxgbe/tom: decide whether to shove segments or not only if there is payload to transmit.
MFC after: 1 week
|
289749 |
22-Oct-2015 |
hselasky |
Rename linuxapi[.ko] into linuxkpi[.ko], to reflect that it is a kernel programming interface module, KPI, to avoid confusion with the existing Linux userspace binary compatibility shims. Bump the FreeBSD_version number.
Reviewed by: np @ Suggested by: dumbbell @ Sponsored by: Mellanox Technologies
|
289578 |
19-Oct-2015 |
hselasky |
Merge LinuxKPI changes from DragonflyBSD:
- Define the kref structure identical to the one found in Linux.
- Update clients referring inside the kref structure.
- Implement kref_sub() for FreeBSD.
Reviewed by: np @ Sponsored by: Mellanox Technologies
|
289401 |
16-Oct-2015 |
np |
cxgbe(4): support for the kernel RSS option.
You need PCBGROUP and RSS in the kernel config to use this.
Relnotes: Yes Sponsored by: Chelsio Communications
|
289338 |
14-Oct-2015 |
np |
iw_cxgbe: use correct RFC number.
|
289201 |
13-Oct-2015 |
np |
iw_cxgbe: MPA v2 is always available.
Submitted by: Krishnamraju Eraparaju at chelsio dot com Reviewed by: Steve Wise at opengridcomputing dot com
|
289103 |
10-Oct-2015 |
np |
iw_cxgbe: fix for page fault in cm_close_handler().
This is roughly the iw_cxgbe equivalent of https://github.com/torvalds/linux/commit/be13b2dff8c4e41846477b22cc5c164ea5a6ac2e
-----------------
RDMA/cxgb4: Connect_request_upcall fixes
When processing an MPA Start Request, if the listening endpoint is DEAD, then abort the connection.
If the IWCM returns an error, then we must abort the connection and release resources. Also abort_connection() should not post a CLOSE event, so clean that up too.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
-----------------
Submitted by: Krishnamraju Eraparaju at chelsio dot com.
|
287631 |
10-Sep-2015 |
jhb |
Add a comment to clarify how to determine the amount of received DDP data.
Reviewed by: np Differential Revision: https://reviews.freebsd.org/D3619
|
286926 |
19-Aug-2015 |
np |
cxgbe(4): Save the flags for the last adapter-wide synchronized operation that was initiated successfully. (The caller and thread are already recorded).
MFC after: 1 week
|
286338 |
05-Aug-2015 |
np |
cxgbe(4): Update T5 and T4 firmwares bundled with the driver to 1.14.4.0. The changes in the firmwares since 1.11.27.0 are listed here (straight copy-paste from the "Release Notes.txt" accompanying the Chelsio Unified Wire 2.11.1.0 release on the website).
22.1. T5 Firmware
+++++++++++++++++++++++++++++++++
Version : 1.14.4.0
Date : 08/05/2015
================================================================================
FIXES
-----
BASE:
- Fixes a potential data path hang by properly programming PMTX congestion threshold settings.
- Fixes a potential initialization error when accessing a configuration file stored on the flash.
- Fixes a regression where SGE resources can be mis-sized if iWARP is disabled.
ETH:
- Fixes a timing issue that would prevent CR4 links from coming up with some switches.
FOFCoE:
- Defers fcoe linkdown mailbox command handling till LOGO is sent.
- Updates vlan prio for all outstanding IOs during dcbx update.
ENHANCEMENTS
------------
BASE:
- Adds support for PAUSE OFF watchdog.
- Reports devlog access information in PCIE_FW_PF register 7.
ETH:
- Enhances segmentation offload to include VxLAN and Geneve.
- Adds PTP support.
- Adds new interface to allow the driver to query the VI rss table base addresses.
- Allows the driver to program the SGE ingress context CongDrop field.
OFLD:
- Adds new interface for the driver to specify offloaded connections TCP snd and rcv scale factors.
iSCSI:
- Adds support for iscsi segmentation offload (ISO).
- Adds support for iscsi t10-dif offload.
FOiSCSI:
- Sets FORCE_BIT for cut through processing for FOiSCSI.
FOFCoE:
- Adds support for FCoE BB6.
- Improves WRITE performance.
================================================================================
================================================================================
Version : 1.13.32.0
Date : 03/25/2015
================================================================================
FIXES
-----
BASE:
- Fixes FW_CAPS_CONFIG_CMD return value on error (was positive instead of negative)
- Fixes FW_PARAMS_PARAM_DEV_FLOWC_BUFFIFO_SZ indication (was wrong on certain adapter configurations)
- Fixes config file based PL_TIMEOUT register programming
ETH:
- Fixes a potential EO UDP SEG header corruption
- Fixes an issue where 1000Base-X was not enabled correctly when using QSA modules
OFLD:
- Fixes timeout issue with half-open connections
- Fixes FW_FLOWC_WR processing when state is set to finwait1
FOFCoE:
- Fixes fcoe xchg leaks in linkdown/peer down path
- Fixes cleanup in FCoE linkdown and fixed buf timer flowid abuse
- Fixes fw crash by clearing fcf flowc during bye
FOiSCSI:
- Don't create a new tcp socket if ERL0 attempt has timed out.
ENHANCEMENTS
------------
BASE:
- Adds support for VFs on PFs 4 to 7
- Adds support for QPs/CQs on any physical and virtual function
ETH:
- Stops sending LACP frames on loopback interface
- Adds an AUTOEQU indication to CPL_SGE_EGR_UPDATE
- Adds support for CR4 links (BEAN/AEC on 40G TwinAx cables)
OFLD:
- Improves default settings of LAN and CLUSTER TCP timer settings
- Sends Negative Advice CPLs to software
FOiSCSI:
- Adds IPv6 support for foiscsi. Keeps backward compatibility with old foiscsi drivers which don't support ipv6.
FOFCoE:
- Added fcoe debug support in flowc dump
================================================================================
================================================================================
Version : 1.12.25.0
Date : 10/22/2014
================================================================================
FIXES
-----
BASE:
- Improves precision of the Weighted Round Robin Traffic Management Algorithm
- Fixes an issue where the link would intermittently fail to come up
- Fixes an issue where adapters with an external PHY couldn't run at 100Mbps
- Fixes an issue where active optical cables were not recognized
- Fixes link advertising issues on T520-BT (speed and pause frames) that would cause the link to negotiate unexpected settings
- Forces link restart when auto-negotiation is disabled
- Fix an issue where pause frames wouldn't be fully disabled even if requested
ETH:
- Fixes NVGRE Segmentation Offload network header generation.
DCBX:
- Fixes an issue where some settings were not being sent to the switch correctly
- Fixes an issue where back-to-back DCBX port updates could get overwritten by FW
- Fixes a firmware crash on DCBX APP information request before link up
FOiSCSI:
- Fixes abort task leak in tmf response handling
- Fixes TCP RST handling while in iSCSI ERL0
- Fixes a firmware crash on BYE without INIT
ENHANCEMENTS
------------
BASE:
- Adds link partner settings reporting when available
- Adds QSA support (in conjunction with QSA VPD)
- Adds T520-BT LED support
- Reports NOTSUPPORTED for modules with an unhandled identifier
DCBX:
- Adds version reporting (indicating which version FW is trying to negotiate)
- Adds IEEE support
- Reports LLDP time outs
FOiSCSI:
- Add support for multiple iSCSI DDP clients
- Sends DHCP renew request when lease expires
================================================================================
22.2. T4 Firmware
+++++++++++++++++
Version : 1.14.4.0
Date : 08/05/2015
================================================================================
FIXES
-----
BASE:
- Fixes a potential initialization error when accessing a configuration file stored on the flash.
- Initialize PCIE_DBG_INDIR_REQ.Enable to 0, as hardware failed to do so and register dumps could result in errors.
ETH:
- Fixes an issue that sometimes prevented the link from coming up in CR adapters.
ENHANCEMENTS
------------
BASE:
- Adds support for PAUSE OFF watchdog.
- Reports devlog access information in PCIE_FW_PF register 7.
ETH:
- Adds new interface to allow the driver to query the VI rss table base addresses.
OFLD:
- Adds new interface for the driver to specify offloaded connections TCP snd and rcv scale factors.
================================================================================
================================================================================
Version : 1.13.32.0
Date : 03/25/2015
================================================================================
FIXES
-----
BASE:
- Fixes FW_CAPS_CONFIG_CMD return value on error (was positive instead of negative)
- Fixes FW_PARAMS_PARAM_DEV_FLOWC_BUFFIFO_SZ indication (was wrong on certain adapter configurations)
- Fixes config file based PL_TIMEOUT register programming
ETH:
- Fixes a potential EO UDP SEG header corruption
OFLD:
- Fixes timeout issue with half-open connections
- Fixes FW_FLOWC_WR processing when state is set to finwait1
FOiSCSI:
- Don't create a new tcp socket if ERL0 attempt has timed out.
ENHANCEMENTS
------------
ETH:
- Stops sending LACP frames on loopback interface
- Adds an AUTOEQU indication to CPL_SGE_EGR_UPDATE
OFLD:
- Improves default settings of LAN and CLUSTER TCP timer settings
- Sends Negative Advice CPLs to software
================================================================================
================================================================================
Version : 1.12.25.0
Date : 10/22/2014
================================================================================
FIXES
-----
BASE:
- Improves precision of the Weighted Round Robin Traffic Management Algorithm
- Forces link restart when auto-negotiation is disabled
- Fix an issue where pause frames wouldn't be fully disabled even if requested
DCBX:
- Fixes an issue where some settings were not being sent to the switch correctly
- Fixes an issue where back-to-back DCBX port updates could get overwritten by FW
- Fixes a firmware crash on DCBX APP information request before link up
FOiSCSI:
- Fixes abort task leak in tmf response handling
- Fixes TCP RST handling while in iSCSI ERL0
- Fixes a firmware crash on BYE without INIT
ENHANCEMENTS
------------
BASE:
- Adds link partner settings reporting when available
- Firmware now reports NOTSUPPORTED for modules with an unhandled identifier
DCBX:
- Adds version reporting (indicating which version FW is trying to negotiate)
- Adds IEEE support
- Reports LLDP time outs
FOiSCSI:
- Adds support for multiple iSCSI DDP clients
- Sends DHCP renew request when lease expires
================================================================================
Obtained from: Chelsio Communications MFC after: 2 weeks Sponsored by: Chelsio Communications
|
286227 |
03-Aug-2015 |
jch |
Decompose TCP INP_INFO lock to increase short-lived TCP connections scalability:
- The existing TCP INP_INFO lock continues to protect the global inpcb list stability during full list traversal (e.g. tcp_pcblist()).
- A new INP_LIST lock protects inpcb list actual modifications (inp allocation and free) and inpcb global counters.
This allows critical paths (e.g. tcp_input()) to take only INP_INFO_RLOCK, leaving INP_INFO_WLOCK for occasional operations that walk all connections.
PR: 183659 Differential Revision: https://reviews.freebsd.org/D2599 Reviewed by: jhb, adrian Tested by: adrian, nitroboost-gmail.com Sponsored by: Verisign, Inc.
|
286107 |
31-Jul-2015 |
np |
cxgbe(4): initialize debug_flags from the kernel environment.
MFC after: 3 days
|
286001 |
29-Jul-2015 |
ae |
Convert in_ifaddr_lock and in6_ifaddr_lock to rmlock.
Both are used to protect access to IP addresses lists and they can be acquired for reading several times per packet. To reduce lock contention it is better to use rmlock here.
Reviewed by: gnn (previous version) Obtained from: Yandex LLC Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D3149
|
285648 |
17-Jul-2015 |
np |
cxgbe(4): Ask the firmware for the start of the RSS slice for a port and save it for later. This enables direct manipulation of the indirection tables (although the stock driver doesn't do that right now).
MFC after: 1 month
|
285527 |
14-Jul-2015 |
np |
cxgbe(4): Update T4 and T5 firmwares to 1.14.2.0.
Obtained from: Chelsio Communications MFC after: 3 days
|
285349 |
10-Jul-2015 |
luigi |
Sync netmap sources with the version in our private tree. This commit contains large contributions from Giuseppe Lettieri and Stefano Garzarella, is partly supported by grants from Verisign and Cisco, and brings in the following:
- fix zerocopy monitor ports and introduce copying monitor ports (the latter are lower performance but give access to all traffic in parallel with the application)
- exclusive open mode, useful to implement solutions that recover from crashes of the main netmap client (suggested by Patrick Kelsey)
- revised memory allocator in preparation for the 'passthrough mode' (ptnetmap) recently presented at bsdcan. ptnetmap is described in S. Garzarella, G. Lettieri, L. Rizzo; Virtual device passthrough for high speed VM networking, ACM/IEEE ANCS 2015, Oakland (CA) May 2015 http://info.iet.unipi.it/~luigi/research.html
- fix rx CRC handling on ixl
- add module dependencies for netmap when building drivers as modules
- minor simplifications to device-specific routines (*txsync, *rxsync)
- general code cleanup (remove unused variables, introduce macros to access rings and remove duplicate code)
Applications do not need to be recompiled, unless of course they want to use the new features (monitors and exclusive open).
Those willing to try this code on stable/10 can just update the sys/dev/netmap/*, sys/net/netmap* with the version in HEAD and apply the small patches to individual device drivers.
MFC after: 1 month Sponsored by: (partly) Verisign, Cisco
|
285221 |
06-Jul-2015 |
np |
cxgbe(4): Add a new knob that controls the congestion response of netmap rx queues. The default is to drop rather than backpressure.
This decouples the congestion settings of NIC and netmap rx queues.
MFC after: 3 days
|
285220 |
06-Jul-2015 |
np |
cxgbe(4): Do not override the global defaults for congestion drops. The hw.cxgbe.cong_drop knob is not affected by this change because the driver sets up congestion drop on a per-queue basis.
MFC after: 3 days
|
284988 |
01-Jul-2015 |
np |
cxgbe(4): request an automatic tx update when a netmap tx queue idles. The NIC tx queues already do this.
MFC after: 1 week Differential Revision:
|
284718 |
23-Jun-2015 |
np |
cxgbe: get_fl_payload returns a header mbuf when successful.
MFC after: 3 days
|
284445 |
16-Jun-2015 |
np |
cxgbe(4): Add the ability to dump mailbox commands and replies. It is enabled/disabled via bit 0 of adapter->debug_flags (which is available at dev.t5nex.<n>.debug_flags).
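The bit test described above can be sketched in userland C. This is only an illustration: `adapter_model`, `DBG_DUMP_MBOX`, and `mbox_dump_enabled` are hypothetical names, not taken from the driver. The only detail drawn from the commit message is that bit 0 of debug_flags enables the dump.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name for bit 0 of debug_flags (mailbox dump on/off). */
#define DBG_DUMP_MBOX (1u << 0)

/* Stand-in for the adapter softc; only the field discussed here. */
struct adapter_model {
	uint32_t debug_flags;
};

/* Returns true when mailbox commands and replies should be dumped. */
static bool
mbox_dump_enabled(const struct adapter_model *sc)
{
	return ((sc->debug_flags & DBG_DUMP_MBOX) != 0);
}
```

On a live system the flag would presumably be flipped through the sysctl named in the commit message (dev.t5nex.<n>.debug_flags) rather than by writing the field directly.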
MFC after: 1 week
|
284007 |
05-Jun-2015 |
np |
cxgbe: set the minimum burst size when fetching fl buffers to 128B for netmap rx queues too. This should have gone in as part of r283858.
|
283864 |
01-Jun-2015 |
np |
cxgbe: no need to display the per-lane GT/s rating of the pcie link.
MFC after: 1 week
|
283858 |
01-Jun-2015 |
np |
cxgbe: set minimum burst size when fetching freelist buffers to 128B.
MFC after: 3 days
|
283291 |
22-May-2015 |
jkim |
CALLOUT_MPSAFE has lost its meaning since r141428, i.e., for more than ten years for head. However, it is continuously misused as the mpsafe argument for callout_init(9). Deprecate the flag and clean up callout_init() calls to make them more consistent.
Differential Revision: https://reviews.freebsd.org/D2613 Reviewed by: jhb MFC after: 2 weeks
|
282039 |
26-Apr-2015 |
glebius |
Don't use ifm_data. It was used only for self checking debug.
Reviewed by: np
|
281649 |
17-Apr-2015 |
glebius |
Provide functions to determine presence of a given address configured on a given interface.
Discussed with: np Sponsored by: Nginx, Inc.
|
280878 |
31-Mar-2015 |
np |
cxgbe/tom: return rx credits promptly if the socket buffer's low water mark cannot be reached because the window advertised to the peer isn't wide enough. While here, tweak the normal credit return too.
MFC after: 1 month
|
280706 |
26-Mar-2015 |
np |
cxgbe(4): provide the exact RSS hash type instead of a catch-all value to the upper layers.
|
280403 |
23-Mar-2015 |
np |
cxgbe(4): Do not call sbuf_trim on an sbuf with a drain function.
MFC after: 1 week
|
280146 |
16-Mar-2015 |
jhb |
Move special DDP handling for closing a connection into a new handle_ddp_close() function in t4_ddp.c as the logic is similar to handle_ddp_data(). This allows all knowledge of the special DDP mbufs to be private to t4_ddp.c as well.
|
279993 |
14-Mar-2015 |
ian |
Set the SBUF_INCLUDENUL flag in sbuf_new_for_sysctl() so that sysctl strings returned to userland include the nulterm byte.
Some uses of sbuf_new_for_sysctl() write binary data rather than strings; clear the SBUF_INCLUDENUL flag after calling sbuf_new_for_sysctl() in those cases. (Note that the sbuf code still automatically adds a nulterm byte in sbuf_finish(), but since it's not included in the length it won't get copied to userland along with the binary data.)
Remove explicit adding of a nulterm byte in a couple places now that it gets done automatically by the sbuf drain code.
PR: 195668
|
279984 |
14-Mar-2015 |
ian |
Revert r279934, r279938; this is going to be fixed in sbuf instead.
PR: 195668
|
279969 |
14-Mar-2015 |
np |
cxgbe(4): fix if_media handling for T520-BT cards. 1Gbps and 100Mbps are valid for this card.
MFC after: 1 week
|
279938 |
12-Mar-2015 |
ian |
Fix a paste-o, sb is already a pointer in this one.
|
279934 |
12-Mar-2015 |
ian |
Nullterminate strings returned via sysctl.
PR: 195668
|
279892 |
11-Mar-2015 |
jhb |
Resize receive socket buffers that support autosizing when receiving TCP data via direct data placement.
Sponsored by: Chelsio MFC after: 1 week
|
279701 |
06-Mar-2015 |
np |
cxgbe(4): experimental rx packet sink for netmap queues. This is not intended for general use.
MFC after: 1 month
|
279700 |
06-Mar-2015 |
np |
cxgbe(4): knobs to experiment with the interrupt coalescing timer for netmap rx queues, and the "batchiness" of rx updates sent to the chip.
These knobs will probably become per-rxq in the near future and will be documented only after their final form is decided.
MFC after: 1 month
|
279691 |
06-Mar-2015 |
np |
cxgbe(4): provide the correct size of freelists associated with netmap rx queues to the chip. This will fix many problems with native netmap rx on ncxl/ncxgbe interfaces.
MFC after: 1 week
|
279251 |
24-Feb-2015 |
np |
cxgbe(4): allow tx hardware checksumming on the netmap interface.
It is disabled by default but users can set IFCAP_TXCSUM on the netmap ifnet (ifconfig ncxl0 txcsum) to override netmap and force the hardware to calculate and insert proper IP and L4 checksums in outbound frames.
MFC after: 2 weeks
|
279246 |
24-Feb-2015 |
np |
cxgbe(4): set up congestion management for netmap rx queues.
The hw.cxgbe.cong_drop knob controls the response of the chip when netmap queues are congested.
|
279245 |
24-Feb-2015 |
np |
cxgbe(4): do not set the netmap rxq interrupts on a hair-trigger.
MFC after: 2 weeks
|
279244 |
24-Feb-2015 |
np |
cxgbe(4): wait for the hardware to catch up before destroying a netmap txq.
MFC after: 2 weeks
|
279243 |
24-Feb-2015 |
np |
cxgbe(4): request an automatic tx update when a netmap txq idles.
MFC after: 2 weeks
|
279092 |
20-Feb-2015 |
np |
cxgbe(4): there is no need to force an "unimplemented" panic needlessly. The calls to free_nm_txq and free_nm_rxq are made just a few lines prior to the panic.
|
278886 |
17-Feb-2015 |
hselasky |
Update the infiniband stack to Mellanox's OFED version 2.1.
Highlights: - Multiple verbs API updates - Support for RoCE, RDMA over ethernet
All hardware drivers depending on the common infiniband stack have been updated as well.
Discussed with: np@ Sponsored by: Mellanox Technologies MFC after: 1 month
|
278485 |
10-Feb-2015 |
np |
cxgbe(4): allow the SET_FILTER_MODE ioctl to change the mode when it's safe to do so.
MFC after: 1 month
|
278374 |
08-Feb-2015 |
np |
cxgbe(4): tidy up some of the interaction between the Upper Layer Drivers (ULDs) and the base if_cxgbe driver.
Track the per-adapter activation of ULDs in a new "active_ulds" field. This was done pretty arbitrarily before this change -- via TOM_INIT_DONE in adapter->flags for TOM, and the (1 << MAX_NPORTS) bit in adapter->offload_map for iWARP.
iWARP and hw-accelerated iSCSI rely on the TOE (supported by the TOM ULD). The rules are: a) If the iWARP and/or iSCSI ULDs are available when TOE is enabled then iWARP and/or iSCSI are enabled too. b) When the iWARP and iSCSI modules are loaded they go looking for adapters with TOE enabled and enable themselves on that adapter. c) You cannot deactivate or unload the TOM module from underneath iWARP or iSCSI. Any such attempt will fail with EBUSY.
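Rule (c) above can be modelled with a per-adapter bitmask. The sketch below is a userland illustration under assumed names: the ULD ids, `adapter_model`, and `deactivate_tom_model` are hypothetical. Only the "active_ulds" field name and the EBUSY result come from the commit message.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical ULD ids; the real driver's numbering may differ. */
enum uld_id_model { ULD_M_TOM = 0, ULD_M_IWARP = 1, ULD_M_ISCSI = 2 };

/* Stand-in for the adapter softc with the new activation bitmask. */
struct adapter_model {
	uint32_t active_ulds;
};

/* Nonzero if the given ULD is currently active on this adapter. */
static int
uld_is_active(const struct adapter_model *sc, enum uld_id_model id)
{
	return ((sc->active_ulds & (1u << id)) != 0);
}

/* TOM cannot be pulled out from underneath iWARP or iSCSI. */
static int
deactivate_tom_model(struct adapter_model *sc)
{
	if (uld_is_active(sc, ULD_M_IWARP) || uld_is_active(sc, ULD_M_ISCSI))
		return (EBUSY);
	sc->active_ulds &= ~(1u << ULD_M_TOM);
	return (0);
}
```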
MFC after: 2 weeks
|
278372 |
08-Feb-2015 |
np |
cxgbe(4): adapter_full_init is always a synchronized operation.
MFC after: 1 week
|
278371 |
08-Feb-2015 |
np |
cxgbe(4): a change to the synchronization rules within the driver. This is purely cosmetic because the new rules are already followed.
MFC after: 1 week
|
278342 |
07-Feb-2015 |
np |
cxgbe(4): fix a test made while enabling TOE.
MFC after: 1 week
|
278303 |
06-Feb-2015 |
np |
cxgbe(4): Add a minimal if_cxl module that pulls in the real driver as a dependency. This ensures "ifconfig cxl<n> ..." does the right thing even when it's run with no driver loaded.
if_cxl.ko is the tiniest module in /boot/kernel.
MFC after: 2 weeks
|
278239 |
05-Feb-2015 |
np |
cxgbe(4): reserve id for iSCSI upper layer driver.
|
277763 |
26-Jan-2015 |
jhb |
Lock the socket buffer before jumping to the 'out' label if sblock() fails in t4_soreceive_ddp().
|
277761 |
26-Jan-2015 |
jhb |
- Update a disabled KASSERT() to use sbused() instead of accessing the no longer existent sb_cc sockbuf member. - Use sbavail() instead of sbused() in t4_soreceive_ddp() to match the usage in soreceive_stream() on which it is based.
Discussed with: glebius (2)
|
277759 |
26-Jan-2015 |
jhb |
Fix a couple of panics when detaching from a cxgbe/cxl interface that was never brought up: - Allow NULL to be passed to sglist_free(). - Don't try to stop an interface that was never fully initialized.
Reviewed by: np
|
277402 |
19-Jan-2015 |
hselasky |
Add missing linuxapi module dependencies and always use the FreeBSD "MODULE_VERSION" macro definition. Remove the redefinition of the "MODULE_VERSION" macro from the Linux kernel compatibility API.
MFC after: 1 month Reported by: np@ Sponsored by: Mellanox Technologies
|
277226 |
16-Jan-2015 |
np |
Allow cxgbe(4) to be built on i386. Driver attach will succeed only on a subset of i386 systems.
|
277135 |
13-Jan-2015 |
np |
cxgbe/iw_cxgbe: fix whitespace nit in r277102.
Reported by: stefanf@
|
277102 |
13-Jan-2015 |
np |
cxgbe/iw_cxgbe: allow any size during the initial MPA exchange.
MFC after: 1 month
|
276775 |
07-Jan-2015 |
np |
cxgbe/tom: allocate page pod addresses instead of ppod#.
MFC after: 2 weeks
|
276729 |
06-Jan-2015 |
np |
cxgbe/tom: use vmem(9) as the DDP page pod allocator.
MFC after: 1 month
|
276728 |
05-Jan-2015 |
np |
cxgbe(4): fix the description of a strange bunch of counters.
MFC after: 1 week
|
276597 |
03-Jan-2015 |
np |
cxgbe/tom: do not engage the TOE's payload chopper for payload < 2 MSS or for 10Gbps ports.
MFC after: 2 weeks
|
276574 |
02-Jan-2015 |
np |
cxgbe/tom: fix the MSS calculation for IPv6 connections handled by the TOE.
MFC after: 1 week
|
276570 |
02-Jan-2015 |
np |
cxgbe/tom: log some more details in send_flowc_wr.
MFC after: 1 week
|
276498 |
01-Jan-2015 |
np |
cxgbe(4): remove buf_ring specific restriction on the txq size.
MFC after: 2 months
|
276485 |
31-Dec-2014 |
np |
cxgbe(4): major tx rework.
a) Front load as much work as possible in if_transmit, before any driver lock or software queue has to get involved.
b) Replace buf_ring with a brand new mp_ring (multiproducer ring). This is specifically for the tx multiqueue model where one of the if_transmit producer threads becomes the consumer and other producers carry on as usual. mp_ring is implemented as standalone code and it should be possible to use it in any driver with tx multiqueue. It also has: - the ability to enqueue/dequeue multiple items. This might become significant if packet batching is ever implemented. - an abdication mechanism to allow a thread to give up writing tx descriptors and have another if_transmit thread take over. A thread that's writing tx descriptors can end up doing so for an unbounded time period if a) there are other if_transmit threads continuously feeding the software queue, and b) the chip keeps up with whatever the thread is throwing at it. - accurate statistics about interesting events even when the stats come at the expense of additional branches/conditional code.
The NIC txq lock is uncontested on the fast path at this point. I've left it there for synchronization with the control events (interface up/down, modload/unload).
c) Add support for "type 1" coalescing work request in the normal NIC tx path. This work request is optimized for frames with a single item in the DMA gather list. These are very common when forwarding packets. Note that netmap tx in cxgbe already uses these "type 1" work requests.
d) Do not request automatic cidx updates every 32 descriptors. Instead, request updates via bits in individual work requests (still every 32 descriptors approximately). Also, request an automatic final update when the queue idles after activity. This means NIC tx reclaim is still performed lazily but it will catch up quickly as soon as the queue idles. This seems to be the best middle ground and I'll probably do something similar for netmap tx as well.
e) Implement a faster tx path for WRQs (used by TOE tx and control queues, _not_ by the normal NIC tx). Allow work requests to be written directly to the hardware descriptor ring if room is available. I will convert t4_tom and iw_cxgbe modules to this faster style gradually.
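Item (d) amounts to counting descriptors and asking for a cidx update on the work request that crosses the threshold. A minimal userland sketch follows. The names `txq_model`, `UPDATE_INTERVAL`, and `wr_wants_update` are hypothetical, and the real driver's bookkeeping may differ. The figure 32 is the approximate interval quoted in the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Approximate interval from the commit message; hypothetical name. */
#define UPDATE_INTERVAL 32

/* Stand-in for a NIC tx queue; only the counter needed here. */
struct txq_model {
	uint32_t ndesc_since_update;
};

/*
 * Called once per work request that consumes ndesc descriptors.
 * Returns true when this work request should carry the bit that
 * asks the chip for a cidx update.
 */
static bool
wr_wants_update(struct txq_model *q, uint32_t ndesc)
{
	q->ndesc_since_update += ndesc;
	if (q->ndesc_since_update >= UPDATE_INTERVAL) {
		q->ndesc_since_update = 0;
		return (true);
	}
	return (false);
}
```

Because work requests vary in size, the request lands on whichever one crosses the threshold, which is why the interval is only approximately 32 descriptors.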
MFC after: 2 months
|
275808 |
15-Dec-2014 |
jhb |
Check for SS_NBIO in so->so_state instead of sb->sb_flags in soreceive_stream().
Differential Revision: https://reviews.freebsd.org/D1299 Reviewed by: bz, gnn MFC after: 1 week
|
275733 |
12-Dec-2014 |
np |
Move KTR_CXGBE from t4_tom.h to adapter.h so that the base if_cxgbe code can use it too.
MFC after: 1 week
|
275554 |
06-Dec-2014 |
np |
cxgbe(4): allow the driver to use rx buffers that do not end on a pack boundary.
MFC after: 2 weeks
|
275539 |
06-Dec-2014 |
np |
cxgbe(4): Allow for different pad and pack boundaries for different adapters. Set the pack boundary for T5 cards to be the same as the PCIe max payload size. The chip likes it this way.
In this revision the driver allocates rx buffers that align on both boundaries. This is not a strict requirement and a followup commit will switch the driver to a more relaxed allocation strategy.
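One way to picture "align on both boundaries" is to round a buffer size up to the larger of the two boundaries. This is a sketch under two assumptions: both boundaries are powers of two (so the larger one is a multiple of the smaller), and alignment here means the size is a multiple of each. The helper names are hypothetical and are not the driver's.

```c
#include <stdint.h>

/* Round x up to a multiple of boundary; boundary must be a power of 2. */
static uint32_t
roundup_pow2(uint32_t x, uint32_t boundary)
{
	return ((x + boundary - 1) & ~(boundary - 1));
}

/*
 * Size for an rx buffer that is a multiple of both the pad and the
 * pack boundary. With power-of-2 boundaries, rounding to the larger
 * one satisfies both.
 */
static uint32_t
rx_buf_size(uint32_t payload, uint32_t pad, uint32_t pack)
{
	uint32_t b = pad > pack ? pad : pack;

	return (roundup_pow2(payload, b));
}
```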
MFC after: 2 weeks
|
275358 |
01-Dec-2014 |
hselasky |
Start process of removing the use of the deprecated "M_FLOWID" flag from the FreeBSD network code. The flag is still kept around in the "sys/mbuf.h" header file, but no longer has any users. Instead the "m_pkthdr.rsstype" field in the mbuf structure is now used to decide the meaning of the "m_pkthdr.flowid" field. To modify the "m_pkthdr.rsstype" field please use the existing "M_HASHTYPE_XXX" macros as defined in the "sys/mbuf.h" header file.
This patch introduces new behaviour in the transmit direction. Previously network drivers checked if "M_FLOWID" was set in "m_flags" before using the "m_pkthdr.flowid" field. This check has now been replaced by checking if "M_HASHTYPE_GET(m)" is different from "M_HASHTYPE_NONE". In the future more hashtypes will be added, for example hashtypes for hardware dedicated flows.
"M_HASHTYPE_OPAQUE" indicates that the "m_pkthdr.flowid" value is valid and has no particular type. This change removes the need for an "if" statement in TCP transmit code checking for the presence of a valid flowid value. The "if" statement mentioned above is now a direct variable assignment which is then later checked by the respective network drivers like before.
Additional notes: - The SCTP code changes will be committed as a separate patch. - Removal of the "M_FLOWID" flag will also be done separately. - The FreeBSD version has been bumped.
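The transmit-side check described above can be illustrated outside the kernel. In the sketch below the struct, the enum values, and `pick_txq` are stand-ins invented for illustration. The kernel's own definitions live in "sys/mbuf.h" and use the M_HASHTYPE_* macros, and the queue-selection policy shown is just one plausible use of a valid flowid.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins for M_HASHTYPE_NONE / M_HASHTYPE_OPAQUE. */
enum { HASHTYPE_NONE = 0, HASHTYPE_OPAQUE = 255 };

/* Stand-in for the two m_pkthdr fields discussed in the commit. */
struct pkthdr_model {
	uint32_t flowid;
	uint8_t  rsstype;
};

/* New-style check: the flowid is meaningful iff a hash type is set. */
static bool
flowid_valid(const struct pkthdr_model *ph)
{
	return (ph->rsstype != HASHTYPE_NONE);
}

/* Pick a tx queue from the flowid when valid; ntxq must be nonzero. */
static uint32_t
pick_txq(const struct pkthdr_model *ph, uint32_t ntxq)
{
	return (flowid_valid(ph) ? ph->flowid % ntxq : 0);
}
```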
MFC after: 1 month Sponsored by: Mellanox Technologies
|
275329 |
30-Nov-2014 |
glebius |
Merge from projects/sendfile: extend protocols API to support sending not ready data: o Add new flag to pru_send() flags - PRUS_NOTREADY. o Add new protocol method pru_ready().
Sponsored by: Nginx, Inc. Sponsored by: Netflix
|
275326 |
30-Nov-2014 |
glebius |
Merge from projects/sendfile:
o Introduce a notion of "not ready" mbufs in socket buffers. These mbufs are being populated by some I/O in the background and are referenced from outside. This has the following implications: - An mbuf which is "not ready" can't be taken out of the buffer. - Neither can an mbuf that is behind a "not ready" one in the queue. - If the socket buffer is flushed, then "not ready" mbufs shouldn't be freed.
o In struct sockbuf the sb_cc field is split into sb_ccc and sb_acc. The sb_ccc stands for "claimed character count", or "committed character count". And the sb_acc is "available character count". Consumers of the socket buffer API should no longer access them directly, but use sbused() and sbavail() respectively. o Not ready mbufs are marked with M_NOTREADY, and ready but blocked ones with M_BLOCKED. o New field sb_fnrdy points to the first not ready mbuf, to avoid linear search. o New function sbready() is provided to activate a certain amount of mbufs in a socket buffer.
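The difference between the two counts can be shown with a toy model. This is not the kernel's implementation: the kernel keeps running sb_ccc and sb_acc counters, whereas this sketch recomputes them from a small array of segments, and every name ending in `_model` is hypothetical. It does reproduce the rule that data queued behind a "not ready" segment is blocked.

```c
#include <stddef.h>

#define SB_MAXSEG 16

/* One queued chunk of data; ready == 0 models an M_NOTREADY mbuf. */
struct sb_seg {
	unsigned len;
	int ready;
};

/* Toy socket buffer: an ordered list of segments. */
struct sb_model {
	struct sb_seg seg[SB_MAXSEG];
	size_t nseg;
};

/* Append a segment; returns its index, or -1 if the model is full. */
static int
sb_append_model(struct sb_model *sb, unsigned len, int ready)
{
	if (sb->nseg == SB_MAXSEG)
		return (-1);
	sb->seg[sb->nseg].len = len;
	sb->seg[sb->nseg].ready = ready;
	return ((int)sb->nseg++);
}

/* Claimed bytes: everything queued, ready or not. */
static unsigned
sbused_model(const struct sb_model *sb)
{
	unsigned total = 0;

	for (size_t i = 0; i < sb->nseg; i++)
		total += sb->seg[i].len;
	return (total);
}

/* Available bytes: ready data in front of the first not-ready segment. */
static unsigned
sbavail_model(const struct sb_model *sb)
{
	unsigned total = 0;

	for (size_t i = 0; i < sb->nseg && sb->seg[i].ready; i++)
		total += sb->seg[i].len;
	return (total);
}

/* Mark one segment ready once its background I/O has completed. */
static void
sbready_model(struct sb_model *sb, int idx)
{
	sb->seg[idx].ready = 1;
}
```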
A special note on SCTP: SCTP has its own sockbufs. Unfortunately, FreeBSD stack doesn't yet allow protocol specific sockbufs. Thus, SCTP does some hacks to make itself compatible with FreeBSD: it manages sockbufs on its own, but keeps sb_cc updated to inform the stack of amount of data in them. The new notion of "not ready" data isn't supported by SCTP. Instead, only a mechanical substitute is done: s/sb_cc/sb_ccc/. A proper solution would be to take away struct sockbuf from struct socket and allow protocols to implement their own socket buffers, like SCTP already does. This was discussed with rrs@.
Sponsored by: Netflix Sponsored by: Nginx, Inc.
|
274724 |
19-Nov-2014 |
np |
cxgbe(4): figure out the max payload size and save it for later.
MFC after: 1 week
|
274461 |
13-Nov-2014 |
np |
iw_cxgbe: don't forget to close the socket in c4iw_connect if soconnect fails.
Submitted by: hariprasad at chelsio dot com
|
274456 |
12-Nov-2014 |
np |
Fix some bad interaction between cxgbe(4) and lacp lagg(4) that could leave a port permanently disabled when a copper cable is unplugged and then plugged right back in.
lacp_linkstate goes looking for the current ifmedia on a link state change and it could get stale information from cxgbe(4) on a module unplug followed by replug. The fix is to process module events before link-state events within the driver, and to always rebuild the ifmedia list on a module change event (instead of rebuilding it lazily).
Thanks to asomers@ for the problem report and detailed analysis to go with it.
MFC after: 1 week
|
274421 |
12-Nov-2014 |
glebius |
In preparation of merging projects/sendfile, transform bare access to sb_cc member of struct sockbuf to a couple of inline functions:
sbavail() and sbused()
Right now they are equal, but once the notion of "not ready socket buffer data" is checked in, they are going to be different.
Sponsored by: Netflix Sponsored by: Nginx, Inc.
|
274402 |
11-Nov-2014 |
jhb |
Add device ID for the T502-BT (dual-port 1G) adapter.
Reviewed by: np MFC after: 1 week
|
274351 |
10-Nov-2014 |
np |
cxgbe(4): adjust PMRX and PMTX parameters.
MFC after: 1 week
|
273797 |
28-Oct-2014 |
np |
Always request a completion for every work request for iWARP. The initial MPA exchange must be tracked this way so that t4_tom's state for the tid is all clean at the time the tid transitions to RDMA mode. Once it does, t4_tom is out of the way and iw_cxgbe uses the qp endpoints directly.
Sponsored by: Chelsio Communications
|
273753 |
27-Oct-2014 |
np |
iwcm_event status needs to be populated for close_complete_upcall
Submitted by: Hariprasad at Chelsio dot com Sponsored by: Chelsio Communications
|
273750 |
27-Oct-2014 |
np |
Some cxgbe/iw_cxgbe fixes: - Free rt in c4iw_connect only if it is allocated. - Call soclose instead of so_shutdown if there is an abort from the peer. - Close socket and return failure if TOE is not enabled.
Submitted by: Hariprasad at Chelsio dot com Sponsored by: Chelsio Communications
|
273615 |
25-Oct-2014 |
np |
cxgbe(4): bump up PF4's share of some global resources.
This increases the size of the per-port RSS slice and also allows the driver to use a larger number of tx and rx queues.
MFC after: 2 weeks
|
273480 |
22-Oct-2014 |
np |
cxgbe/iw_cxgbe: wake up waiters after flushing the qp.
Obtained from: Chelsio
|
273377 |
21-Oct-2014 |
hselasky |
Fix multiple incorrect SYSCTL arguments in the kernel:
- Wrong integer type was specified.
- Wrong or missing "access" specifier. The "access" specifier sometimes included the SYSCTL type, which it should not, except for procedural SYSCTL nodes.
- Logical OR where binary OR was expected.
- Properly assert the "access" argument passed to all SYSCTL macros, using the CTASSERT macro. This applies to both static- and dynamically created SYSCTLs.
- Properly assert the data type for both static and dynamic SYSCTLs. In the case of static SYSCTLs we only assert that the data pointed to by the SYSCTL data pointer has the correct size, since there is no easy way to assert types in the C language outside a C function.
- Rewrote some code which doesn't pass a constant "access" specifier when creating dynamic SYSCTL nodes, which is now a requirement.
- Updated "EXAMPLES" section in SYSCTL manual page.
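The "logical OR where binary OR was expected" class of bug in the list above is easy to reproduce in plain C. The flag values in the usage note below are made up for illustration and are not the real CTLFLAG_* constants.

```c
/*
 * Logical OR collapses any pair of nonzero flags to 1, silently
 * discarding the individual bits.
 */
static unsigned
combine_logical(unsigned a, unsigned b)
{
	return (a || b);
}

/* Bitwise OR keeps every bit from both operands, as intended. */
static unsigned
combine_bitwise(unsigned a, unsigned b)
{
	return (a | b);
}
```

With two illustrative flags 0x1000 and 0x0200, the logical form yields 1 while the bitwise form yields 0x1200, so an "access" argument built with `||` ends up carrying none of the intended bits.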
MFC after: 3 days Sponsored by: Mellanox Technologies
|
273135 |
15-Oct-2014 |
hselasky |
Update the OFED Linux compatibility layer and Mellanox hardware driver(s):
- Properly name an inclusion guard - Fix compile warnings regarding unsigned enums - Add two new sysctl nodes - Remove all empty linux header files - Make an error printout more verbose - Use "mod_delayed_work()" instead of cancelling and starting a timeout. - Implement more Linux scatterlist functions.
MFC after: 3 days Sponsored by: Mellanox Technologies
|
272719 |
07-Oct-2014 |
np |
cxgbe/tom: don't leak resources tied to an active open request that cannot be sent to the chip because a prerequisite L2 resolution failed.
Submitted by: Hariprasad at chelsio dot com (original version) MFC after: 2 weeks.
|
272200 |
27-Sep-2014 |
np |
cxgbe(4): implement if_get_counter.
|
272190 |
26-Sep-2014 |
np |
cxgbe(4): explicitly set various if_hw_tso* values.
MFC after: 3 days
|
272183 |
26-Sep-2014 |
np |
Make sure the adapter's management queue and the event queue are available before any upper layer driver (TOE, iWARP, or iSCSI) registers with the base cxgbe(4) driver.
Submitted by: Hariprasad at chelsio dot com Reviewed by: np@
|
272080 |
24-Sep-2014 |
np |
Update comment (missed this bit in r272079).
|
272079 |
24-Sep-2014 |
np |
cxgbe/tom: Catch up with r271119, syncache_add doesn't need tcbinfo lock.
|
272051 |
23-Sep-2014 |
np |
cxgbe(4): Verify that the addresses in if_multiaddrs really are multicast addresses. (The chip doesn't really care, it's just that it needs to be told explicitly if unicast DMACs are checked for "hits" in the hash that is used after the TCAM entries are all used up).
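For reference, an Ethernet address is a group (multicast) address when the least significant bit of its first octet is set. The userland sketch below shows such a check. The function name is hypothetical and the driver itself would use the kernel's own helper rather than this code.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if the 6-byte Ethernet address has the group bit set. */
static bool
ether_addr_is_multicast(const uint8_t addr[6])
{
	return ((addr[0] & 0x01) != 0);
}
```

Note that the broadcast address ff:ff:ff:ff:ff:ff also has the group bit set and therefore passes this test.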
|
271490 |
12-Sep-2014 |
np |
cxgbe(4): add support for the SIOCGI2C ioctl.
|
271450 |
12-Sep-2014 |
np |
cxgbe(4): knobs to enable/disable PAUSE frame based flow control.
MFC after: 1 week
|
271420 |
11-Sep-2014 |
rwatson |
Add a new M_START() mbuf macro that returns a pointer to the start of an mbuf's storage (internal or external).
Add a new M_SIZE() mbuf macro that returns the size of an mbuf's storage (internal or external).
These contrast with m_data and m_len, which are with respect to data in the buffer, rather than the buffer itself.
Rewrite M_LEADINGSPACE() and M_TRAILINGSPACE() in terms of M_START() and M_SIZE().
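The relationship between the four macros can be sketched with a toy mbuf. Everything prefixed `MM_` and the `mbuf_model` struct are stand-ins invented here. The kernel versions also distinguish internal from external storage and apply checks (for example on writability) that this sketch leaves out.

```c
#include <stddef.h>

/* Toy mbuf with a single fixed internal storage area. */
struct mbuf_model {
	char  *m_data;        /* start of valid data within storage */
	size_t m_len;         /* number of valid data bytes */
	char   storage[256];
};

/* Start and size of the storage area, independent of m_data/m_len. */
#define MM_START(m) ((m)->storage)
#define MM_SIZE(m)  (sizeof((m)->storage))

/* Free bytes in front of the data. */
#define MM_LEADINGSPACE(m) \
	((size_t)((m)->m_data - MM_START(m)))

/* Free bytes after the data. */
#define MM_TRAILINGSPACE(m) \
	((size_t)((MM_START(m) + MM_SIZE(m)) - ((m)->m_data + (m)->m_len)))
```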
This is done as we currently have many instances of using mbuf flags to generate pointers or lengths for internal storage in header and regular mbufs, as well as to external storage. Rather than replicate this logic throughout the network stack, centralising the implementation will make it easier for us to refine mbuf storage. This should also help reduce bugs by limiting the amount of mbuf-type-specific pointer arithmetic. Followup changes will propagate use of the macros throughout the stack.
M_SIZE() conflicts with one macro in the Chelsio driver; rename that macro in a slightly unsatisfying way to eliminate the collision.
MFC after: 3 days Obtained from: jeff (with enhancements) Sponsored by: EMC / Isilon Storage Division Reviewed by: bz, glebius, np Differential Revision: https://reviews.freebsd.org/D753
|
271328 |
09-Sep-2014 |
np |
Whitespace nit.
MFC after: 1 week
|
270710 |
27-Aug-2014 |
hselasky |
- Update the OFED Linux Emulation layer as a preparation for a hardware driver update from Mellanox Technologies. - Remove empty files from the OFED Linux Emulation layer. - Fix compile warnings related to printf() and the "%lld" and "%llx" format specifiers. - Add some missing 2-clause BSD copyrights. - Add "Mellanox Technologies, Ltd." to list of copyright holders. - Add some new compatibility files. - Fix order of uninit in the mlx4ib module to avoid crash at unload using the new module_exit_order() function.
MFC after: 1 week Sponsored by: Mellanox Technologies
|
270063 |
16-Aug-2014 |
luigi |
Update to the current version of netmap. Mostly bugfixes or features developed in the past 6 months, so this is a 10.1 candidate.
Basically no user API changes (some bugfixes in sys/net/netmap_user.h).
In detail:
1. netmap support for virtio-net, including in netmap mode. Under bhyve and with a netmap backend [2] we reach over 1Mpps with standard APIs (e.g. libpcap), and 5-8 Mpps in netmap mode.
2. (kernel) add support for multiple memory allocators, so we can better partition physical and virtual interfaces giving access to separate users. The most visible effect is one additional argument to the various kernel functions to compute buffer addresses. All netmap-supported drivers are affected, but changes are mechanical and trivial
3. (kernel) simplify the prototype for *txsync() and *rxsync() driver methods. All netmap drivers affected, changes mostly mechanical.
4. add support for netmap-monitor ports. Think of it as a mirroring port on a physical switch: a netmap monitor port replicates traffic present on the main port. Restrictions apply. Drive carefully.
5. if_lem.c: support for various paravirtualization features, experimental and disabled by default. Most of these are described in our ANCS'13 paper [1]. Paravirtualized support in netmap mode is new, and beats the numbers in the paper by a large factor (under qemu-kvm, we measured guest-host throughput up to 10-12 Mpps).
A lot of refactoring and additional documentation in the files in sys/dev/netmap, but apart from #2 and #3 above, almost nothing of this stuff is visible to other kernel parts.
Example programs in tools/tools/netmap have been updated with bugfixes and to support more of the existing features.
This is meant to go into 10.1 so we plan an MFC before the Aug.22 deadline.
A lot of this code has been contributed by my colleagues at UNIPI, including Giuseppe Lettieri, Vincenzo Maffione, Stefano Garzarella.
MFC after: 3 days.
|
269731 |
08-Aug-2014 |
np |
cxgbe(4): Do not poke T4-only registers on a T5 (and vice versa).
Obtained from: Chelsio Communications MFC after: 1 week
|
269644 |
06-Aug-2014 |
np |
cxgbe(4): Let caller specify whether it's ok to sleep in t4_sched_config and t4_sched_params.
MFC after: 2 weeks
|
269537 |
04-Aug-2014 |
np |
cxgbe(4): Do not run any sleepable code in the SIOCSIFFLAGS handler when IFF_PROMISC or IFF_ALLMULTI is being flipped. bpf(4) holds its global mutex around ifpromisc in at least the bpf_dtor path.
MFC after: 3 days
|
269440 |
02-Aug-2014 |
np |
cxgbe(4): Remove an unused version of t4_enable_vi.
MFC after: 2 weeks
|
269428 |
02-Aug-2014 |
np |
cxgbe(4): some optimizations in freelist handling.
MFC after: 2 weeks.
|
269413 |
02-Aug-2014 |
np |
cxgbe(4): Fix an off by one error when looking for the BAR2 doorbell address of an egress queue.
MFC after: 2 weeks
|
269411 |
02-Aug-2014 |
np |
cxgbe(4): minor optimizations in ingress queue processing.
Reorganize struct sge_iq. Make the iq entry size a compile time constant. While here, eliminate RX_FL_ESIZE and use EQ_ESIZE directly.
MFC after: 2 weeks
|
269076 |
24-Jul-2014 |
np |
Some hooks in cxgbe(4) for the offloaded iSCSI driver.
(I'm committing this on behalf of my colleagues in the Storage team at Chelsio).
Submitted by: Sreenivasa Honnur <shonnur at chelsio dot com> Sponsored by: Chelsio Communications.
|
269032 |
23-Jul-2014 |
np |
cxgbe(4): Keep track of the clusters that have to be freed by the custom free routine (rxb_free) in the driver. Fail MOD_UNLOAD with EBUSY if any such cluster has been handed up to the kernel but hasn't been freed yet. This prevents a panic later when the cluster finally needs to be freed but rxb_free is gone from the kernel.
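The accounting described above boils down to an outstanding-buffer counter that gates module unload. The sketch below uses C11 atomics in userland. All names are hypothetical, and the driver's actual counter and locking may look different.

```c
#include <errno.h>
#include <stdatomic.h>

/* Clusters handed up to the stack that still need the custom free. */
static atomic_int rxb_outstanding;

/* Called when a cluster with the custom free routine is passed up. */
static void
cluster_handed_up(void)
{
	atomic_fetch_add(&rxb_outstanding, 1);
}

/* Called from the custom free routine when the stack is done. */
static void
cluster_freed(void)
{
	atomic_fetch_sub(&rxb_outstanding, 1);
}

/* Unload is refused while any such cluster is still in flight. */
static int
mod_unload_check(void)
{
	return (atomic_load(&rxb_outstanding) > 0 ? EBUSY : 0);
}
```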
MFC after: 1 week
|
268989 |
22-Jul-2014 |
np |
Add missing newline to an error message.
MFC after: 3 days
|
268971 |
22-Jul-2014 |
np |
Simplify r267600, there's no need to distinguish between allocated and inlined mbufs.
MFC after: 1 week
|
268706 |
15-Jul-2014 |
np |
cxgbe(4): Display CF facility correctly in the device log.
MFC after: 3 days
|
268640 |
15-Jul-2014 |
np |
Allow multi-byte reads in the private CHELSIO_T4_GET_I2C ioctl. The firmware allows up to 48B to be read this way but the driver limits itself to 8B at a time to remain compatible with old cxgbetool binaries.
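Given the 8B per-request limit, a caller that wants a longer range has to split it up. The loop below is a sketch of that splitting from the caller's side. The callback type, the fake backing store, and all function names are hypothetical, and no real ioctl is issued here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Per-request limit the driver imposes, per the commit message. */
#define I2C_MAX_CHUNK 8

/* Hypothetical backend: reads len (<= 8) bytes at offset into buf. */
typedef int (*i2c_read_fn)(void *ctx, unsigned off, unsigned len,
    uint8_t *buf);

/* Read an arbitrary length by issuing requests of at most 8 bytes. */
static int
i2c_read_chunked(i2c_read_fn rd, void *ctx, unsigned off, unsigned len,
    uint8_t *buf)
{
	while (len > 0) {
		unsigned n = len < I2C_MAX_CHUNK ? len : I2C_MAX_CHUNK;
		int rc = rd(ctx, off, n, buf);

		if (rc != 0)
			return (rc);
		off += n;
		buf += n;
		len -= n;
	}
	return (0);
}

/* Fake device used only to exercise the loop above. */
static uint8_t fake_eeprom[64];
static unsigned fake_calls;

static int
fake_read(void *ctx, unsigned off, unsigned len, uint8_t *buf)
{
	(void)ctx;
	if (len > I2C_MAX_CHUNK || off + len > sizeof(fake_eeprom))
		return (-1);
	memcpy(buf, fake_eeprom + off, len);
	fake_calls++;
	return (0);
}
```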
MFC after: 1 week
|
268536 |
11-Jul-2014 |
np |
cxgbe(4): Add an iSCSI softc to the adapter structure.
|
268529 |
11-Jul-2014 |
glebius |
All mbuf external free functions never fail, so let them be void.
Sponsored by: Nginx, Inc.
|
267992 |
28-Jun-2014 |
hselasky |
Pull in r267961 and r267973 again. Fix for issues reported will follow.
|
267985 |
27-Jun-2014 |
gjb |
Revert r267961, r267973:
These changes prevent sysctl(8) from returning proper output, such as:
1) no output from sysctl(8)
2) erroneously returning ENOMEM with tools like truss(1) or uname(1)
truss: can not get etype: Cannot allocate memory
|
267961 |
27-Jun-2014 |
hselasky |
Extend the meaning of the CTLFLAG_TUN flag to automatically check if there is an environment variable which shall initialize the SYSCTL during early boot. This works for all SYSCTL types, both statically and dynamically created ones, except for the SYSCTL NODE type and SYSCTLs which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to be used in the case a tunable sysctl has a custom initialisation function, allowing the sysctl to still be marked as a tunable.
The kernel SYSCTL API is mostly the same, with a few exceptions for some special operations like iterating the children of a static/extern SYSCTL node. This operation should probably be factored out into a common macro, since some device drivers use it. The reason for changing the SYSCTL API was the need for a SYSCTL parent OID pointer and not only the SYSCTL parent OID list pointer in order to quickly generate the sysctl path.
The motivation behind this patch is to avoid parameter loading kludges inside the OFED driver subsystem. Instead of adding special code to the OFED driver subsystem to post-load tunables into dynamically created sysctls, we generalize this in the kernel.
Other changes:
- Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask" to "hw.pcic.intr_mask".
- Removed redundant TUNABLE statements throughout the kernel.
- Some minor code rewrites in connection with removing unneeded TUNABLE statements.
- Added a missing SYSCTL_DECL().
- Wrapped two very long lines.
- Avoid malloc()/free() inside sysctl string handling, in case it is called to initialize a sysctl from a tunable, since malloc()/free() is not ready when sysctls from the sysctl dataset are registered.
- Bumped FreeBSD version to indicate SYSCTL API change.
MFC after: 2 weeks Sponsored by: Mellanox Technologies
|
267757 |
22-Jun-2014 |
np |
cxgbe(4): Update the bundled T4 and T5 firmwares to versions 1.11.27.0.
Obtained from: Chelsio MFC after: 3 days
|
267689 |
20-Jun-2014 |
np |
Consider the total number of descriptors available (and not just those that are ready to be reclaimed) when deciding whether to resume tx after a stall.
MFC after: 3 days
|
267600 |
18-Jun-2014 |
np |
cxgbe(4): Fix bug in the fast rx buffer recycle path. In some cases rx buffers were getting recycled when they should have been left alone.
MFC after: 3 days
|
267548 |
16-Jun-2014 |
attilio |
- Modify vm_page_unwire() and vm_page_enqueue() to directly accept the queue where to enqueue pages that are going to be unwired.
- Add stronger checks to the enqueue/dequeue for the pagequeues when adding and removing pages to them.
Of course, for unmanaged pages the queue parameter of vm_page_unwire() will be ignored, just as the active parameter is today. This makes adding new pagequeues quicker.
This change effectively modifies the KPI. However, __FreeBSD_version will be bumped only when the full cache of free pages is evicted.
Sponsored by: EMC / Isilon storage division Reviewed by: alc Tested by: pho
|
267082 |
05-Jun-2014 |
np |
cxgbe(4): Properly account for the freelist buffers used when returning early from service_iq due to a budget restriction. This fixes a potential rx hang when using INTx.
MFC after: 3 days
|
266908 |
30-May-2014 |
np |
cxgbe(4): Fix a NULL dereference when the very first call to get_scatter_segment() in get_fl_payload() fails. While here, fix the code to adjust fl_bufs_used when a failure occurs for any other scatter segment.
MFC after: 3 days
|
266757 |
27-May-2014 |
np |
cxgbe(4): netmap support for Terminator 5 (T5) based 10G/40G cards. Netmap gets its own hardware-assisted virtual interface and won't take over or disrupt the "normal" interface in any way. You can use both simultaneously.
For kernels with DEV_NETMAP, cxgbe(4) carves out an ncxl<N> interface (note the 'n' prefix) in the hardware to accompany each cxl<N> interface. These two ifnets per port share the same wire but really are separate interfaces in the hardware and software. Each gets its own L2 MAC addresses (unicast and multicast), MTU, checksum caps, etc. You should run netmap on the 'n' interfaces only; that's what they are for.
With this, pkt-gen is able to transmit > 45Mpps out of a single 40G port of a T580 card. 2 port tx is at ~56Mpps total (28M + 28M) as of now. Single port receive is at 33Mpps but this is very much a work in progress. I expect it to be closer to 40Mpps once done. In any case the current effort can already saturate multiple 10G ports of a T5 card at the smallest legal packet size. T4 gear is totally untested.
trantor:~# ./pkt-gen -i ncxl0 -f tx -D 00:07:43:ab:cd:ef
881.952141 main [1621] interface is ncxl0
881.952250 extract_ip_range [275] range is 10.0.0.1:0 to 10.0.0.1:0
881.952253 extract_ip_range [275] range is 10.1.0.1:0 to 10.1.0.1:0
881.962540 main [1804] mapped 334980KB at 0x801dff000
Sending on netmap:ncxl0: 4 queues, 1 threads and 1 cpus.
10.0.0.1 -> 10.1.0.1 (00:00:00:00:00:00 -> 00:07:43:ab:cd:ef)
881.962562 main [1882] Sending 512 packets every 0.000000000 s
881.962563 main [1884] Wait 2 secs for phy reset
884.088516 main [1886] Ready...
884.088535 nm_open [457] overriding ifname ncxl0 ringid 0x0 flags 0x1
884.088607 sender_body [996] start
884.093246 sender_body [1064] drop copy
885.090435 main_thread [1418] 45206353 pps (45289533 pkts in 1001840 usec)
886.091600 main_thread [1418] 45322792 pps (45375593 pkts in 1001165 usec)
887.092435 main_thread [1418] 45313992 pps (45351784 pkts in 1000834 usec)
888.094434 main_thread [1418] 45315765 pps (45406397 pkts in 1002000 usec)
889.095434 main_thread [1418] 45333218 pps (45378551 pkts in 1001000 usec)
890.097434 main_thread [1418] 45315247 pps (45405877 pkts in 1002000 usec)
891.099434 main_thread [1418] 45326515 pps (45417168 pkts in 1002000 usec)
892.101434 main_thread [1418] 45333039 pps (45423705 pkts in 1002000 usec)
893.103434 main_thread [1418] 45324105 pps (45414708 pkts in 1001999 usec)
894.105434 main_thread [1418] 45318042 pps (45408723 pkts in 1002001 usec)
895.106434 main_thread [1418] 45332430 pps (45377762 pkts in 1001000 usec)
896.107434 main_thread [1418] 45338072 pps (45383410 pkts in 1001000 usec)
...
Relnotes: Yes Sponsored by: Chelsio Communications.
|
266596 |
23-May-2014 |
bz |
Move the tcp_fields_to_host() and tcp_fields_to_net() (inline) functions to the tcp_var.h header file in order to avoid further duplication with upcoming commits.
Reviewed by: np MFC after: 2 weeks
|
266571 |
23-May-2014 |
np |
cxgbe(4): Remove stray if_up from the code that creates the tracing ifnet.
|
264621 |
17-Apr-2014 |
emax |
use correct (integer) type for the temperature sysctl
Reviewed by: np, scottl Obtained from: Netflix MFC after: 3 days
|
263457 |
21-Mar-2014 |
np |
cxgbe(4): Recognize the "spider" configuration where a T5 card's 40G QSFP port is presented as 4 distinct 10G SFP+ ports to the driver.
MFC after: 2 weeks
|
263415 |
20-Mar-2014 |
np |
cxgbe(4): Use ifi_oqdrops in if_data to count drops in the tx path.
|
263412 |
20-Mar-2014 |
np |
cxgbe(4): if_iqdrops statistic should include tunnel congestion drops.
MFC after: 1 week
|
263317 |
18-Mar-2014 |
np |
cxgbe(4): significant rx rework.
- More flexible cluster size selection, including the ability to fall back to a safe cluster size (PAGE_SIZE from zone_jumbop by default) in case an allocation of a larger size fails.
- A single get_fl_payload() function that assembles the payload into an mbuf chain for any kind of freelist. This replaces two variants: one for freelists with buffer packing enabled and another for those without.
- Buffer packing with any sized cluster. It was limited to 4K clusters only before this change.
- Enable buffer packing for TOE rx queues as well.
- Statistics and tunables to go with all these changes. The driver's man page will be updated separately.
MFC after: 5 weeks
|
261907 |
14-Feb-2014 |
dim |
In cxgbe, conditionalize the t4_pgprot_wc() function, since it is only used when DOT5 is defined.
Reviewed by: np MFC after: 3 days
|
261558 |
06-Feb-2014 |
scottl |
Add a new sysctl, dev.cxgbe.N.rsrv_noflow, and a companion tunable, hw.cxgbe.rsrv_noflow. When set, queue 0 of the port is reserved for TX packets without a flowid. The hash value of packets with a flowid is bumped up by 1. The intent is to provide a private queue for link-level packets like LACP that is unlikely to overflow or suffer deep queue latency.
Reviewed by: np Obtained from: Netflix MFC after: 3 days
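The queue selection described above can be sketched as follows. This is an illustration under assumptions, not the committed code: select_txq and its parameters are hypothetical, and reducing the flowid with a modulo over the remaining queues is assumed.

```c
#include <stdint.h>

/*
 * Pick a tx queue index for a packet.  When rsrv_noflowq is set (and the
 * port has more than one queue), index 0 is kept for packets without a
 * flowid and hashed packets are spread over indices 1..ntxq-1.
 */
static unsigned int
select_txq(int has_flowid, uint32_t flowid, unsigned int ntxq,
    int rsrv_noflowq)
{
	unsigned int rsrv = (rsrv_noflowq && ntxq > 1) ? 1 : 0;

	if (!has_flowid)
		return (0);
	/* "Bumped up by 1": skip over the reserved queue. */
	return (flowid % (ntxq - rsrv) + rsrv);
}
```

With 8 queues and the knob set, a flowid of 7 maps to queue 1 rather than queue 0, so link-level traffic such as LACP never shares a queue with hashed flows.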
|
261537 |
06-Feb-2014 |
np |
cxgbe(4): Use the rx channel map (instead of the tx channel map) as the congestion channel map.
MFC after: 1 week
|
261536 |
06-Feb-2014 |
np |
cxgbe(4): The T5 allows for a different freelist starvation threshold for queues with buffer packing. Use the correct value to calculate a freelist's low water mark.
MFC after: 1 week
|
261533 |
06-Feb-2014 |
np |
cxgbe(4): Use the port's tx channel to identify it to t4_clr_port_stats.
MFC after: 3 days
|
260210 |
02-Jan-2014 |
adrian |
Add an option to enable or disable the small RX packet copying that is done to improve performance of small frames.
When doing RX packing, the RX copying isn't necessarily required.
Reviewed by: np
|
259527 |
17-Dec-2013 |
np |
Do not create a hardware IPv6 server if the listen address is not in6addr_any and is not in the CLIP table either. This fixes a reported TOE+IPv6 NULL-dereference panic in do_pass_open_rpl().
While here, stop creating hardware servers for any loopback address. It's just a waste of server tids.
MFC after: 1 week
|
259382 |
14-Dec-2013 |
np |
Read card capabilities after firmware initialization, instead of setting them up as part of firmware initialization (which the driver gets to do only if it's the master driver).
Read the range of tids available for the ETHOFLD functionality if it's enabled.
New is_ftid() and is_etid() functions to test whether a tid falls within the range of filter tids or ETHOFLD tids respectively.
MFC after: 2 weeks
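A range test of the kind is_ftid() and is_etid() perform might look like the sketch below. The struct layout and field names (ftid_base, nftids, etid_base, netids) are assumptions made for the example and may not match the driver's tid_info.

```c
/* Hypothetical subset of the per-adapter tid bookkeeping. */
struct tid_ranges {
	unsigned int ftid_base;	/* first filter tid */
	unsigned int nftids;	/* number of filter tids */
	unsigned int etid_base;	/* first ETHOFLD tid */
	unsigned int netids;	/* number of ETHOFLD tids */
};

/* True if tid lies in [base, base + count). */
static int
tid_in_range(unsigned int tid, unsigned int base, unsigned int count)
{
	return (count > 0 && tid >= base && tid - base < count);
}

static int
is_ftid(const struct tid_ranges *t, unsigned int tid)
{
	return (tid_in_range(tid, t->ftid_base, t->nftids));
}

static int
is_etid(const struct tid_ranges *t, unsigned int tid)
{
	return (tid_in_range(tid, t->etid_base, t->netids));
}
```

Checking count first means a disabled feature (zero tids) never matches, whatever the base happens to be.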
|
259150 |
10-Dec-2013 |
adrian |
Print out the full PCIe link negotiation in dmesg.
I found this useful when checking whether a NIC is in a PCIe 3.0 x8 slot or not.
Reviewed by: np Sponsored by: Netflix, inc.
|
259145 |
09-Dec-2013 |
np |
Unstaticize t4_list and t4_uld_list. This works around a clang annoyance[1] and allows kgdb to find these symbols.
[1] http://lists.freebsd.org/pipermail/freebsd-hackers/2012-November/041166.html
MFC after: 3 days
|
259103 |
08-Dec-2013 |
np |
cxgbe(4): save a copy of the RSS map for each port for the driver's use.
|
258879 |
03-Dec-2013 |
np |
cxgbe(4): T4_SET_SCHED_CLASS and T4_SET_SCHED_QUEUE ioctls to program scheduling classes in the chip and to bind tx queue(s) to a scheduling class respectively. These can be used for various kinds of tx traffic throttling (to force selected tx queues to drain at a fixed Kbps rate, or a % of the port's total bandwidth, or at a fixed pps rate, etc.).
Obtained from: Chelsio
|
258689 |
27-Nov-2013 |
np |
Disable an assertion that relies on some code[1] that isn't in HEAD yet.
[1] http://lists.freebsd.org/pipermail/freebsd-net/2013-August/036573.html
|
258441 |
21-Nov-2013 |
np |
cxgbe(4): update the internal list of device features.
MFC after: 3 days
|
257772 |
07-Nov-2013 |
np |
cxgbe(4): Tidy up the display for payload memory statistics (pm_stats).
# sysctl -n dev.t4nex.0.misc.pm_stats
# sysctl -n dev.t5nex.0.misc.pm_stats
MFC after: 1 week
|
257654 |
04-Nov-2013 |
np |
cxgbe(4): Exclude MPS_RPLC_MAP_CTL (0x11114) from the register dump. Turns out it's a write-only register with strange side effects on read.
Submitted by: gnn MFC after: 3 days
|
257324 |
29-Oct-2013 |
glebius |
- Provide necessary includes.
- Remove unnecessary includes.
Sponsored by: Netflix Sponsored by: Nginx, Inc.
|
257241 |
28-Oct-2013 |
glebius |
Include necessary headers that now are available due to pollution via if_var.h.
Sponsored by: Netflix Sponsored by: Nginx, Inc.
|
257176 |
26-Oct-2013 |
glebius |
The r48589 promised to remove implicit inclusion of if_var.h soon. Prepare for this event by adding if_var.h to files that do need it. Also, include all headers that are currently pulled in only through implicit pollution via if_var.h.
Sponsored by: Netflix Sponsored by: Nginx, Inc.
|
256714 |
18-Oct-2013 |
np |
Fix typo in previous commit.
|
256713 |
18-Oct-2013 |
np |
iw_cxgbe should have a dependency on t4nex.
Reported by: trasz@
|
256694 |
17-Oct-2013 |
np |
iw_cxgbe: iWARP driver for Chelsio T4/T5 chips. This is a straight port of the iw_cxgb4 found in OFED distributions.
Obtained from: Chelsio
|
256477 |
14-Oct-2013 |
np |
cxgbe(4): Store the log2 of the # of doorbells per BAR2 page for both ingress and egress queues, and for both T4 and T5. These values are used by the T4/T5 iWARP driver.
|
256459 |
14-Oct-2013 |
np |
cxgbe(4): Update T4 and T5 firmwares to 1.9.12.0
|
256218 |
09-Oct-2013 |
glebius |
There are some high performance NICs that count statistics in hardware, and there are ifnets that do so via counter(9). Provide a flag that skips the cache-line-thrashing '+=' operation in ether_input().
Sponsored by: Netflix Sponsored by: Nginx, Inc. Reviewed by: melifaro, adrian Approved by: re (marius)
|
256131 |
07-Oct-2013 |
dim |
Fix kernel build on amd64 after r256118, since the machine/md_var.h header is not implicitly included there. So include it explicitly.
Approved by: re (delphij) Pointy hat to: dim MFC after: 3 days X-MFC-With: r256118
|
256118 |
07-Oct-2013 |
dim |
Remove redundant declaration of cpu_clflush_line_size in sys/dev/cxgbe/t4_sge.c, to silence a gcc warning.
Approved by: re (gjb) MFC after: 3 days
|
255411 |
09-Sep-2013 |
np |
Rework the tx credit mechanism between the cxgbe/tom driver and the card. This helps smooth out some burstiness in the exchange.
Approved by: re (glebius)
|
255410 |
09-Sep-2013 |
np |
Fix a miscalculation that caused cxgbe/tom to auto-increment a TOE socket's tx buffer size too aggressively.
Approved by: re (delphij)
|
255198 |
03-Sep-2013 |
np |
For TOE connections, the window scale factor in CPL_PASS_ACCEPT_REQ is set to 15 to indicate that the peer did not send a window scale option with its SYN. Do not send a window scale option in the SYN|ACK reply in that case.
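The decision above can be sketched as a small predicate. The function names are hypothetical; the only external fact relied on is that TCP limits the window scale shift to 14, so a reported value of 15 cannot be a real shift.

```c
/* Largest shift count TCP permits for the window scale option. */
#define MAX_WSCALE_SHIFT	14

/* The hardware reports 15 (an invalid shift) when the SYN had no option. */
static int
peer_sent_wscale(unsigned int wsf)
{
	return (wsf <= MAX_WSCALE_SHIFT);
}

/* Offer window scaling in the SYN|ACK only if the peer offered it first. */
static int
synack_carries_wscale(unsigned int wsf, int local_wscale_enabled)
{
	return (local_wscale_enabled && peer_sent_wscale(wsf));
}
```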
|
255052 |
30-Aug-2013 |
np |
Fix the sysctl that displays whether buffer packing is enabled or not.
|
255050 |
30-Aug-2013 |
np |
Implement support for rx buffer packing. Enable it by default for T5 cards.
This is a T4 and T5 chip feature which lets the chip deliver multiple Ethernet frames in a single buffer. This is more efficient within the chip, in the driver, and reduces wastage of space in rx buffers.
- Always allocate rx buffers from the jumbop zone, no matter what the MTU is. Do not use the normal cluster refcounting mechanism.
- Reserve space for an mbuf and a refcount in the cluster itself and let the chip DMA multiple frames in the rest.
- Use the embedded mbuf for the first frame and allocate mbufs on the fly for any additional frames delivered in the cluster. Each of these mbufs has a reference on the underlying cluster.
|
255015 |
29-Aug-2013 |
np |
Merge r254386 from user/np/cxl_tuning. Add an INET|INET6 check missing in said revision.
r254386: Flush inactive LRO entries periodically.
|
255011 |
28-Aug-2013 |
np |
Whitespace nit.
|
255006 |
28-Aug-2013 |
np |
Change t4_list_lock and t4_uld_list_lock from mutexes to sx'es.
- tom_uninit had to be reworked not to hold the adapter lock (a mutex) around t4_deactivate_uld, which acquires the uld_list_lock.
- the ifc_match for the interface cloner that creates the tracer ifnet had to be reworked as the kernel calls ifc_match with the global if_cloners_mtx held.
|
255005 |
28-Aug-2013 |
np |
Add hooks in base cxgbe(4) for the iWARP upper-layer driver. Update a couple of assertions in the TOE driver as well.
|
254933 |
26-Aug-2013 |
np |
Use correct mailbox and PCIe PF number when querying RDMA parameters.
|
254727 |
23-Aug-2013 |
np |
There is no need to hold the freelist lock around alloc/free of software descriptors. This also silences WITNESS warnings when the software descriptors are allocated with M_WAITOK.
MFC after: 1 week
|
254577 |
20-Aug-2013 |
np |
Display P/N information in the description.
Submitted by: gnn MFC after: 3 days
|
253890 |
02-Aug-2013 |
np |
Display temperature sensor data. Shows -1 if sensor not available on the card.
# sysctl dev.t4nex.0.temperature
# sysctl dev.t5nex.0.temperature
|
253889 |
02-Aug-2013 |
np |
Fix previous commit (r253873). "cong" has one bit per channel but the congestion channel map has 1 nibble per channel. So bits wxyz need to be blown up into 000w000x000y000z.
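The bit-to-nibble expansion described above (wxyz becoming 000w000x000y000z) can be written as a short loop. The helper name expand_cong_map is hypothetical and this is a sketch of the transformation only, not the driver's code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Expand a 4-bit channel bitmap (one bit per channel) into a 16-bit map
 * with one nibble per channel: bit n of the input becomes bit 4*n.
 */
static uint16_t
expand_cong_map(uint8_t cong)
{
	uint16_t map = 0;
	int ch;

	assert((cong & ~0xf) == 0);	/* only four channels */
	for (ch = 0; ch < 4; ch++) {
		if (cong & (1 << ch))
			map |= 1 << (ch * 4);
	}
	return (map);
}
```

For example, an input of 0x5 (channels 0 and 2) yields 0x0101.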
|
253873 |
01-Aug-2013 |
np |
Set up congestion manager context properly for T5 based cards.
MFC after: 3 days (will check with re@)
|
253829 |
31-Jul-2013 |
np |
Display SGE tunables in the sysctl tree.
dev.t5nex.0.fl_pktshift: payload DMA offset in rx buffer (bytes)
dev.t5nex.0.fl_pad: payload pad boundary (bytes)
dev.t5nex.0.spg_len: status page size (bytes)
dev.t5nex.0.cong_drop: congestion drop setting
Discussed with: scottl
|
253701 |
27-Jul-2013 |
np |
Display a string instead of a numeric code in the linkdnrc sysctl.
Submitted by: gnn@
|
253699 |
27-Jul-2013 |
np |
Expand the list of devices claimed by cxgbe(4).
|
253691 |
26-Jul-2013 |
np |
Add support for packet-sniffing tracers to cxgbe(4). This works with all T4 and T5 based cards and is useful for analyzing TSO, LRO, TOE, and for general purpose monitoring without tapping any cxgbe or cxl ifnet directly.
Tracers on the T4/T5 chips provide access to Ethernet frames exactly as they were received from or transmitted on the wire. On transmit, a tracer will capture a frame after TSO segmentation, hw VLAN tag insertion, hw L3 & L4 checksum insertion, etc. It will also capture frames generated by the TCP offload engine (TOE traffic is normally invisible to the kernel). On receive, a tracer will capture a frame before hw VLAN extraction, runt filtering, other badness filtering, before the steering/drop/L2-rewrite filters or the TOE have had a go at it, and of course before sw LRO in the driver.
There are 4 tracers on a chip. A tracer can trace only in one direction (tx or rx). For now cxgbetool will set up tracers to capture the first 128B of every transmitted or received frame on a given port. This is a small subset of what the hardware can do. A pseudo ifnet with the same name as the nexus driver (t4nex0 or t5nex0) will be created for tracing. The data delivered to this ifnet is an additional copy made inside the chip. Normal delivery to cxgbe<n> or cxl<n> will be made as usual.
/* watch cxl0, which is the first port hanging off t5nex0. */
# cxgbetool t5nex0 tracer 0 tx0 (watch what cxl0 is transmitting)
# cxgbetool t5nex0 tracer 1 rx0 (watch what cxl0 is receiving)
# cxgbetool t5nex0 tracer list
# tcpdump -i t5nex0 <== all that cxl0 sees and puts on the wire
If you were doing TSO, a tcpdump on cxl0 may have shown you ~64K "frames" with no L3/L4 checksum but this will show you the frames that were actually transmitted.
/* all done */
# cxgbetool t5nex0 tracer 0 disable
# cxgbetool t5nex0 tracer 1 disable
# cxgbetool t5nex0 tracer list
# ifconfig t5nex0 destroy
|
253688 |
26-Jul-2013 |
np |
Reserve room for ioctls that aren't in this copy of the driver yet.
|
253407 |
17-Jul-2013 |
np |
Specify a timeout for the PL block.
MFC after: 3 days
|
253217 |
11-Jul-2013 |
np |
Attach to the 4x10G T540-CR card.
|
252747 |
05-Jul-2013 |
np |
- Show the reason why link is down if this information is available.
- Display the temperature and PHY firmware version of the BT PHY.
MFC after: 1 day
|
252728 |
04-Jul-2013 |
np |
- Make note of interface MTU change if the rx queues exist, and not just when the interface is up.
- Add a tunable to control the TOE's rx coalesce feature (enabled by default as it always has been). Consider the interface MTU or the coalesce size when deciding which cluster zone to use to fill the offload rx queue's free list. The tunable is: dev.{t4nex,t5nex}.<N>.toe.rx_coalesce
MFC after: 1 day
|
252724 |
04-Jul-2013 |
np |
On-the-fly changes to the interrupt coalescing timer should apply to the TOE rx queues too.
MFC after: 1 day
|
252716 |
04-Jul-2013 |
np |
Pay attention to TCP_NODELAY when it's set/unset after the connection is established.
MFC after: 1 day
|
252715 |
04-Jul-2013 |
np |
Ring the egress queue's doorbell as soon as there are 8 or more descriptors ready to be processed.
MFC after: 1 day
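A threshold-based doorbell of this kind can be sketched as below. The struct, the counters, and the function names are hypothetical, and ring_doorbell merely counts calls in place of the real hardware register write.

```c
#define DB_THRESHOLD	8	/* descriptors that trigger a doorbell */

struct egress_q {
	unsigned int pending;	/* descriptors written but not announced */
	unsigned int doorbells;	/* doorbell rings, for illustration only */
};

/* Stand-in for the register write that tells the chip about new work. */
static void
ring_doorbell(struct egress_q *eq)
{
	eq->doorbells++;
	eq->pending = 0;
}

/* Account for ndesc new descriptors and ring once enough have piled up. */
static void
eq_descs_written(struct egress_q *eq, unsigned int ndesc)
{
	eq->pending += ndesc;
	if (eq->pending >= DB_THRESHOLD)
		ring_doorbell(eq);
}
```

A real transmit path would also need to ring for a final batch smaller than the threshold; that flush is left out of this sketch.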
|
252711 |
04-Jul-2013 |
np |
The T5 allows the driver to specify the ISS. Do so; use the ISS picked by the kernel.
MFC after: 1 day
|
252705 |
04-Jul-2013 |
np |
- Read all TP parameters in one place.
- Read the filter mode, calculate various shifts, and use them properly during active open (in select_ntuple).
MFC after: 1 day
|
252661 |
04-Jul-2013 |
np |
- Include the T5 firmware with the driver.
- Update the T4 firmware to the latest.
- Minor reorganization and updates to the version macros, etc.
Obtained from: Chelsio MFC after: 1 day
|
252469 |
01-Jul-2013 |
np |
Add a sysctl to get the number of filters available.
sysctl dev.t4nex.<N>.nfilters
sysctl dev.t5nex.<N>.nfilters
MFC after: 3 days
|
252312 |
27-Jun-2013 |
np |
Update T5 register ranges. This is so that regdump skips over registers with read side-effects.
MFC after: 3 days
|
251638 |
11-Jun-2013 |
np |
cxgbe/tom: Allow caller to select the queue (control or data) used to send the CPL_SET_TCB_FIELD request in t4_set_tcb_field().
MFC after: 1 week
|
251518 |
08-Jun-2013 |
np |
cxgbe/tom: Fix bad signed/unsigned mixup in the stid allocator. This fixes a panic when allocating a mixture of IPv6 and IPv4 stids.
MFC after: 1 week
|
251434 |
05-Jun-2013 |
np |
cxgbe(4): Never install a firmware if hw.cxgbe.fw_install is 0.
MFC after: 1 week
|
251358 |
04-Jun-2013 |
np |
cxgbe(4): Provide accurate hit count for filters on T5 cards. The location within the TCB and the size have both changed.
MFC after: 1 week
|
251213 |
01-Jun-2013 |
np |
cxgbe(4): Some more debug sysctls. These work on both T4 and T5 based cards.
dev.t5nex.0.misc.cim_ma_la: CIM MA logic analyzer
dev.t5nex.0.misc.cim_pif_la: CIM PIF logic analyzer
dev.t5nex.0.misc.mps_tcam: MPS TCAM entries
dev.t5nex.0.misc.tp_la: TP logic analyzer
dev.t5nex.0.misc.ulprx_la: ULPRX logic analyzer
Obtained from: Chelsio MFC after: 1 week
|
250697 |
16-May-2013 |
kib |
Add dependencies on the firmware, which allows the loading of the cxgb and cxgbe modules.
Reviewed and approved by: np MFC after: 1 week
|
250614 |
13-May-2013 |
np |
Deal correctly with 40G ports that don't have any transceiver plugged in. Do not claim that they have unknown transceivers.
MFC after: 3 days
|
250221 |
03-May-2013 |
np |
cxgbe: Switch to a better way to install firmware.
MFC after: 1 week
|
250218 |
03-May-2013 |
np |
cxgbe/tom: Do not use M_PROTO1 to mark rx zero-copy mbufs as special. All the M_PROTOn flags are clobbered when an mbuf is appended to the socket buffer.
MFC after: 1 week
|
250117 |
30-Apr-2013 |
np |
Fix DDP breakage introduced in r248925. Bitwise OR has higher precedence than ternary conditional.
MFC after: 1 week
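The precedence pitfall named above is easy to reproduce in isolation. In this sketch the flag names are hypothetical, and the first function spells out with explicit parentheses how an expression of the form F_BASE | ddp ? F_DDP : 0 is actually grouped.

```c
#define F_BASE	0x1	/* hypothetical flag that must always be set */
#define F_DDP	0x2	/* hypothetical flag that depends on ddp */

/*
 * How "F_BASE | ddp ? F_DDP : 0" is grouped: the | binds tighter, so the
 * whole OR becomes the condition and F_BASE never reaches the result.
 */
static int
flags_as_parsed(int ddp)
{
	return ((F_BASE | ddp) ? F_DDP : 0);
}

/* Intended meaning: parenthesize the conditional operand. */
static int
flags_fixed(int ddp)
{
	return (F_BASE | (ddp ? F_DDP : 0));
}
```

The mis-grouped form returns F_DDP regardless of ddp, because F_BASE makes the condition always true.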
|
250093 |
30-Apr-2013 |
np |
Attach to the T580 (2 x 40G) card.
MFC after: 1 week.
|
250092 |
30-Apr-2013 |
np |
- Provide accurate ifmedia information so that 40G ports/transceivers are displayed properly in ifconfig, etc.
- Use the same number of tx and rx queues for a 40G port as for a 10G port.
MFC after: 1 week
|
250090 |
30-Apr-2013 |
np |
cxgbe(4): Some updates to shared code.
Obtained from: Chelsio MFC after: 1 week
|
249629 |
18-Apr-2013 |
np |
cxgbe(4): Refuse to install T5 firmwares on a T4 card (and vice versa).
MFC after: 1 week
|
249627 |
18-Apr-2013 |
np |
cxgbe/tom: Update the CLIP table on the chip when there are changes to the list of IPv6 addresses on the system. The table is used for TOE+IPv6 only.
|
249393 |
11-Apr-2013 |
np |
Add pciids of the T5 based cards. The ones that I haven't tested with cxgbe(4) are disabled for now. This will change.
MFC after: 2 weeks
|
249392 |
11-Apr-2013 |
np |
Cosmetic change (s/wrwc/wcwr/;s/WRWC/WCWR/).
MFC after: 3 days.
|
249391 |
11-Apr-2013 |
np |
Auto-reduce the holdoff timers that are greater than the maximum value allowed by the hardware.
MFC after: 3 days
|
249385 |
11-Apr-2013 |
np |
cxgbe/tom: Slight simplification of code that calculates options2.
MFC after: 3 days
|
249383 |
11-Apr-2013 |
np |
Get rid of a couple of stray \n's.
MFC after: 3 days.
|
249382 |
11-Apr-2013 |
np |
There is no need for elaborate queries and error checking when trying to set FW4MSG_ENCAP.
MFC after: 3 days
|
249376 |
11-Apr-2013 |
np |
- Explain clearly why a different firmware is being installed (if/when it is being installed). Improve other error messages while here.
- Select special FPGA specific configuration profile when appropriate.
MFC after: 3 days
|
249370 |
11-Apr-2013 |
np |
cxgbe(4): Ensure that the MOD_LOAD handler runs before either t4nex or t5nex attach to their devices.
MFC after: 3 days
|
248925 |
30-Mar-2013 |
np |
cxgbe(4): Add support for Chelsio's Terminator 5 (aka T5) ASIC. This includes support for the NIC and TOE features of the 40G, 10G, and 1G/100M cards based on the T5.
The ASIC is mostly backward compatible with the Terminator 4 so cxgbe(4) has been updated instead of writing a brand new driver. T5 cards will show up as cxl (short for cxlgb) ports attached to the t5nex bus driver.
Sponsored by: Chelsio
|
247355 |
26-Feb-2013 |
np |
cxgbe(4): Report unusual out of band errors from the firmware.
Obtained from: Chelsio MFC after: 5 days
|
247347 |
26-Feb-2013 |
np |
cxgbe(4): Consider all the API versions of the interfaces exported by the firmware (instead of just the main firmware version) when evaluating firmware compatibility. Document the new "hw.cxgbe.fw_install" knob being introduced here.
This should fix kern/173584 too. Setting hw.cxgbe.fw_install=2 will mostly do what was requested in the PR but it's a bit more intelligent in that it won't reinstall the same firmware repeatedly if the knob is left set.
PR: kern/173584 MFC after: 5 days
|
247291 |
26-Feb-2013 |
np |
cxgbe(4): Ask the card's firmware to pad up tiny CPLs by encapsulating them in a firmware message if it is able to do so. This works out better for one of the FIFOs in the chip.
MFC after: 5 days
|
247289 |
26-Feb-2013 |
np |
cxgbe(4): Update firmware to 1.8.4.0.
MFC after: 5 days
|
247122 |
21-Feb-2013 |
np |
cxgbe(4): Add sysctls to extract debug information from the chip:
dev.t4nex.X.misc.cim_la logic analyzer dump
dev.t4nex.X.misc.cim_qcfg queue configuration
dev.t4nex.X.misc.cim_ibq_xxx inbound queues
dev.t4nex.X.misc.cim_obq_xxx outbound queues
Obtained from: Chelsio MFC after: 1 week
|
247062 |
20-Feb-2013 |
np |
cxgbe(4): Assume that CSUM_TSO in the transmit path implies CSUM_IP and CSUM_TCP too. They are all set explicitly by the kernel usually.
While here, fix an unrelated bug where hardware L4 checksum calculation was accidentally disabled for some IPv6 packets.
Reported by: alfred@ MFC after: 3 days
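The implication described above amounts to widening the checksum request before the offload bits are examined. The X_CSUM_* values below are placeholders invented for this sketch and are not the kernel's CSUM_* definitions.

```c
/* Placeholder flag bits; not the real mbuf checksum flag values. */
#define X_CSUM_IP	0x1
#define X_CSUM_TCP	0x2
#define X_CSUM_TSO	0x4

/* Treat a TSO request as also asking for IP and TCP checksum offload. */
static unsigned int
effective_csum_flags(unsigned int flags)
{
	if (flags & X_CSUM_TSO)
		flags |= X_CSUM_IP | X_CSUM_TCP;
	return (flags);
}
```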
|
246575 |
09-Feb-2013 |
np |
Do not hold locks around hardware context reads.
MFC after: 3 days
|
246385 |
06-Feb-2013 |
np |
Busy-wait when cold.
Reported by: gnn, jhb MFC after: 3 days
|
246093 |
29-Jan-2013 |
np |
Provide a statistic to track the number of drops in each of the port's txq's buf_ring. The aggregate for all the queues of a port is already provided in ifnet->if_snd.ifq_drops.
MFC after: 3 days.
|
245937 |
26-Jan-2013 |
np |
Install an extra hold on the newly allocated synq entry so that it cannot be freed while do_pass_accept_req is running. This closes a race where do_pass_establish on another CPU (the driver chose a different queue for the new tid) expands the synq entry into a full PCB and then releases the only hold on it, all while do_pass_accept_req is still running.
MFC after: 3 days
|
245936 |
26-Jan-2013 |
np |
Force the 404-BT card (4 x 1G) to use the "uwire" configuration file.
MFC after: 3 days
|
245935 |
26-Jan-2013 |
np |
Add a couple of missing error codes. Treat CPL_ERR_KEEPALV_NEG_ADVICE as negative advice and not a fatal error.
MFC after: 3 days
|
245933 |
26-Jan-2013 |
np |
cxgbe/tom: List IFCAP_TOE6 as supported now that all the required pieces are in place. You still have to enable it explicitly, after loading the t4_tom KLD.
|
245567 |
17-Jan-2013 |
np |
cxgbe: Make the for_each macros safer to use by turning them into a single statement each.
Submitted by: Christoph Mallon <christoph dot mallon at gmx dot de> MFC after: 1 week
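Why a single-statement macro is safer can be shown with a generic iterator. The struct layout and the macro name for_each_q are hypothetical, and the unsafe two-statement shape in the comment is only one plausible form of the problem, not a quote of the old driver code.

```c
struct queue {
	int id;
};

struct port {
	struct queue *q;	/* all queues of the adapter */
	int first_q;		/* index of this port's first queue */
	int nq;			/* number of queues owned by this port */
};

/*
 * Unsafe shape, shown only as a comment: two statements, so an unbraced
 * "if (cond) for_each_q(...)" would guard only the assignment.
 *
 *	qp = &(p)->q[(p)->first_q];
 *	for (iter = 0; iter < (p)->nq; ++iter, ++qp)
 */

/* Safe shape: the whole macro expands to one for statement. */
#define for_each_q(p, iter, qp)						\
	for (qp = &(p)->q[(p)->first_q], iter = 0; iter < (p)->nq;	\
	    ++iter, ++qp)

static int
sum_queue_ids(struct port *p, int enabled)
{
	struct queue *qp;
	int i, sum = 0;

	if (enabled)
		for_each_q(p, i, qp)
			sum += qp->id;
	return (sum);
}
```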
|
245518 |
17-Jan-2013 |
np |
cxgbe: Do a more thorough job in the CLEAR_STATS ioctl.
MFC after: 3 days
|
245517 |
16-Jan-2013 |
np |
cxgbe: Fix the for_each_foo macros -- the last argument should not share its name with any member of struct sge.
MFC after: 3 days
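The naming hazard above comes from how the preprocessor substitutes parameters: every matching identifier in the macro body is replaced, including one that follows a member-access operator. The structures below are simplified stand-ins, not the driver's definitions.

```c
struct txq {
	int id;
};

struct sge {
	struct txq *txq;	/* member named like the old macro parameter */
	int ntxq;
};

/*
 * Broken shape, shown only as a comment.  With a parameter named txq,
 * for_each_txq(s, i, t) would expand "(s)->txq" to "(s)->t", which is
 * not a member of struct sge:
 *
 *	#define for_each_txq(s, iter, txq) \
 *		for (txq = &(s)->txq[0], iter = 0; ...)
 */

/* Fixed shape: q is not the name of any member of struct sge. */
#define for_each_txq(s, iter, q)					\
	for (q = &(s)->txq[0], iter = 0; iter < (s)->ntxq; ++iter, ++q)

static int
last_txq_id(struct sge *s)
{
	struct txq *t;
	int i, id = -1;

	for_each_txq(s, i, t)
		id = t->id;
	return (id);
}
```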
|
245468 |
15-Jan-2013 |
np |
cxgbe/tom: Add support for fully offloaded TCP/IPv6 connections (passive open).
MFC after: 1 week
|
245467 |
15-Jan-2013 |
np |
cxgbe/tom: Add support for fully offloaded TCP/IPv6 connections (active open).
MFC after: 1 week
|
245448 |
15-Jan-2013 |
np |
cxgbe/tom: Basic CLIP table management.
This is the Compressed Local IPv6 table on the chip. To save space, the chip uses an index into this table instead of a full IPv6 address in some of its hardware data structures.
For now the driver fills this table with all the local IPv6 addresses that it sees at the time the table is initialized. I'll improve this later so that the table is updated whenever new IPv6 addresses are configured or existing ones deleted.
MFC after: 1 week
|
245441 |
15-Jan-2013 |
np |
cxgbe/tom: Miscellaneous updates for TOE+IPv6 support (more to follow).
- Teach find_best_mtu_idx() to deal with IPv6 endpoints.
- Install correct protosw in offloaded TCP/IPv6 sockets when DDP is enabled.
- Move set_tcp_ddp_ulp_mode to t4_tom.c so that t4_tom.h can be included without having to drag in t4_msg.h too. This was bothering the iWARP driver for some reason.
MFC after: 1 week
|
245434 |
14-Jan-2013 |
np |
cxgbe(4): Updates to the hardware L2 table management code.
- Add full support for IPv6 addresses.
- Read the size of the L2 table during attach. Do not assume that PCIe physical function 4 of the card has all of the table to itself.
- Use FNV instead of Jenkins to hash L3 addresses and drop the private copy of jhash.h from the driver.
MFC after: 1 week
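Hashing an L3 address with FNV can be sketched with a standalone 32-bit FNV-1 routine. Whether the driver uses FNV-1 or FNV-1a, and how it reduces the hash to a bucket, is not stated above; the variant, the modulo reduction, and the name l3_hash are assumptions of this example.

```c
#include <stddef.h>
#include <stdint.h>

#define FNV1_32_INIT	2166136261u	/* 32-bit FNV offset basis */
#define FNV_32_PRIME	16777619u	/* 32-bit FNV prime */

/* 32-bit FNV-1: multiply by the prime, then XOR in the next byte. */
static uint32_t
fnv1_32_buf(const void *buf, size_t len, uint32_t hval)
{
	const uint8_t *p = buf;

	while (len-- > 0) {
		hval *= FNV_32_PRIME;
		hval ^= *p++;
	}
	return (hval);
}

/* Map an IPv4 (4-byte) or IPv6 (16-byte) address to a hash bucket. */
static unsigned int
l3_hash(const void *addr, size_t addrlen, unsigned int nbuckets)
{
	return (fnv1_32_buf(addr, addrlen, FNV1_32_INIT) % nbuckets);
}
```

Because the routine works on a byte buffer of any length, the same code path serves both address families.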
|
245276 |
11-Jan-2013 |
np |
Overhaul the stid allocator so that it can be used for IPv6 servers too. The entry for an IPv6 server in the TCAM takes up the equivalent of two ordinary stids and must be properly aligned too.
MFC after: 1 week
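An allocator that hands out one slot for IPv4 and an aligned pair for IPv6 can be sketched over a small in-use table. Everything here is illustrative: the table size, the names, and the assumption that "properly aligned" means starting at an even index.

```c
#define NSTIDS	16	/* illustrative table size */

struct stid_tab {
	unsigned char used[NSTIDS];
};

/*
 * Allocate 1 stid (IPv4) or 2 consecutive stids (IPv6).  Stepping by the
 * request size keeps a two-slot allocation on an even index.  Returns the
 * first stid, or -1 if nothing suitable is free.
 */
static int
alloc_stid(struct stid_tab *t, int is_ipv6)
{
	int n = is_ipv6 ? 2 : 1;
	int i, j;

	for (i = 0; i + n <= NSTIDS; i += n) {
		for (j = 0; j < n && !t->used[i + j]; j++)
			continue;
		if (j == n) {
			for (j = 0; j < n; j++)
				t->used[i + j] = 1;
			return (i);
		}
	}
	return (-1);
}
```

A linear scan is enough for a sketch; it also shows how a mixture of IPv4 and IPv6 listeners leaves odd slots that only IPv4 can reuse.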
|
245274 |
11-Jan-2013 |
np |
cxgbe(4): Add functions to help synchronize "slow" operations (those not on the fast data path) and use them instead of frobbing the adapter lock and busy flag directly.
Other changes made while reworking all slow operations:
- Wait for the reply to a filter request (add/delete). This guarantees that the operation is complete by the time the ioctl returns.
- Tidy up the tid_info structure.
- Do not allow the tx queue size to be set to something that's not a power of 2.
MFC after: 1 week
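The power-of-2 restriction on the tx queue size can be enforced with the usual bit trick. The setter name and the choice of EINVAL as the error are assumptions made for this sketch.

```c
#include <errno.h>

/* A power of 2 has exactly one bit set, so x & (x - 1) clears it to 0. */
static int
is_pow2(unsigned int x)
{
	return (x != 0 && (x & (x - 1)) == 0);
}

/* Accept the requested tx queue size only if it is a power of 2. */
static int
set_txq_size(unsigned int *qsize, unsigned int requested)
{
	if (!is_pow2(requested))
		return (EINVAL);
	*qsize = requested;
	return (0);
}
```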
|
245243 |
09-Jan-2013 |
np |
cxgbe(4): updates to the configuration file that controls how hardware resources are partitioned.
- Reduce the number of virtual interfaces reserved for PF4. This leaves spare room in the source MAC table and allows the driver to set up filters that rewrite the source MAC address.
- Reduce the number of filters and use the freed up space for the CLIP (Compressed Local IPv6 addresses) table. This is a prerequisite for IPv6 TOE support which will follow separately in a series of commits.
MFC after: 1 week
|
244580 |
22-Dec-2012 |
np |
cxgbe(4): Add support for the T440-LP-CR card. This is the 4x10G low profile card with a QSFP+ transceiver.
MFC after: 3 days
|
244551 |
21-Dec-2012 |
np |
cxgbe(4): must hold a write-lock on the table while allocating an L2 entry for switching.
MFC after: 3 days
|
243857 |
04-Dec-2012 |
glebius |
Mechanically substitute flags from historic mbuf allocator with malloc(9) flags in sys/dev.
|
243681 |
29-Nov-2012 |
np |
cxgbe/tom: Handle the case where the chip falls out of DDP mode by itself. The hole in the receive sequence space corresponds to the number of bytes placed directly up to that point.
MFC after: 1 week
|
243680 |
29-Nov-2012 |
np |
cxgbe/tom: Add a flag to indicate that the L2 table entry for an embryonic connection has been set up, and never attempt to abort a tid before this is done. This fixes a bad race where a listening socket is closed while the driver is in the middle of step (b) below. The symptoms were "ARP miss" errors from the driver followed by tid leaks.
A hardware-offloaded passive open works this way:
a) A SYN "hits" the TCAM entry for a server tid and the chip delivers it to the queue associated with the server tid (say, queue A). It waits for a response from the driver telling it what to do.
b) The driver decides it is ok to proceed. It adds the new tid to the list of embryonic connections associated with the server tid and then hands off the SYN to the kernel's syncache to make sure that the kernel okays it too. If it does then the driver provides an L2 table entry, queue id (say, queue B), etc. and instructs the chip to send the SYN/ACK response.
c) The chip delivers a status to queue B depending on how the third step of the 3-way handshake goes. The driver removes the tid from its list of embryonic connections and either expands the syncache entry or destroys the tid. In any case all subsequent messages for the new tid will be delivered to queue B, not queue A. Anything running in queue B knows that the L2 entry has long been setup and the new flag is of no interest from here on. If the listener is closed it will deal with so_comp as normal.
MFC after: 1 week
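The rule in this commit (never abort an embryonic tid before its L2 entry exists) can be sketched as a flag check during listener teardown. This is only an illustration: the flag name, the structure, and the helper functions are hypothetical, and what the driver later does with entries that were skipped is outside the sketch.

```c
#include <stdbool.h>
#include <stddef.h>

#define SYNQE_HAS_L2 0x1u   /* hypothetical flag: L2 entry is set up */

struct synq_entry {
    int tid;
    unsigned int flags;
    bool aborted;
};

/* End of step (b): the driver has given the chip an L2 entry for this tid. */
static void
synqe_l2_ready(struct synq_entry *e)
{
    e->flags |= SYNQE_HAS_L2;
}

/*
 * Listener close: abort only the embryonic tids whose L2 entry exists.
 * Entries still in the middle of step (b) are left alone.
 * Returns the number of tids aborted.
 */
static size_t
listen_close(struct synq_entry *q, size_t n)
{
    size_t aborted = 0;

    for (size_t i = 0; i < n; i++) {
        if ((q[i].flags & SYNQE_HAS_L2) == 0)
            continue;   /* no L2 entry yet; do not touch */
        q[i].aborted = true;
        aborted++;
    }
    return (aborted);
}
```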
|
243110 |
16-Nov-2012 |
np |
cxgbe/tom: Plug mbuf leak.
MFC after: 3 days
|
242671 |
06-Nov-2012 |
np |
Make sure the inp hasn't been dropped before trying to access its socket and tcpcb.
MFC after: 3 days
|
242666 |
06-Nov-2012 |
np |
Remove the tid from the software table (and bump down the in-use counter) when the syncache doesn't want the driver to reply to an incoming SYN. This fixes a harmless bug where tids_in_use would go out of sync with the hardware counter.
MFC after: 3 days
|
241733 |
19-Oct-2012 |
ed |
Prefer __containerof() over __member2struct().
The former works better with qualifiers, but also properly type checks the input pointer.
|
241642 |
17-Oct-2012 |
np |
Always provide sndbuf and MSS values in a flowc command, even when the driver is going to abort the connection right after the flowc.
MFC after: 3 days
|
241626 |
17-Oct-2012 |
np |
Whitespace cleanup.
MFC after: 3 days
|
241494 |
12-Oct-2012 |
np |
Temporary fix for kern/172364.
PR: kern/172364 MFC after: 3 days
|
241493 |
12-Oct-2012 |
np |
Use global knob in the TP_PARA_REG3 register to disable congestion drops if the user has chosen this behaviour.
MFC after: 3 days
|
241409 |
10-Oct-2012 |
np |
Add a driver ioctl to clear a port's MAC statistics.
Submitted by: gnn@ MFC after: 3 days
|
241399 |
10-Oct-2012 |
np |
Add a driver ioctl to read a byte from any device on a port's i2c bus. This lets userspace read arbitrary information from the SFP+ modules etc. on this bus.
Reading multiple bytes in the same transaction isn't possible right now. I'll update the driver once the chip's firmware supports this.
MFC after: 3 days
|
241398 |
10-Oct-2012 |
np |
There is no need to report the same error twice.
MFC after: 3 days
|
241397 |
10-Oct-2012 |
np |
Remove unused item. cxgbe's rx queue's lock was removed a long time ago.
MFC after: 3 days
|
241394 |
10-Oct-2012 |
kevlo |
Revert previous commit...
Pointyhat to: kevlo (myself)
|
241370 |
09-Oct-2012 |
kevlo |
Prefer NULL over 0 for pointers
|
240693 |
19-Sep-2012 |
gavin |
Switch some PCI register reads from using magic numbers to using the names defined in pcireg.h
MFC after: 1 week
|
240680 |
18-Sep-2012 |
gavin |
Align the PCI Express #defines with the style used for the PCI-X #defines. This also has the advantage that it makes the names more compact, and also allows us to correct the non-uniform naming of the PCIM_LINK_* defines, making them all consistent amongst themselves.
This is a mostly mechanical rename:
s/PCIR_EXPRESS_/PCIER_/g
s/PCIM_EXP_/PCIEM_/g
s/PCIM_LINK_/PCIEM_LINK_/g
When this is MFC'd, #defines will be added for the old names to assist out-of-tree drivers.
Discussed with: jhb MFC after: 1 week
|
240453 |
13-Sep-2012 |
np |
Install interrupt handlers early, during attach, for the reason explained in r239913 by jhb.
MFC after: 1 week
|
240452 |
13-Sep-2012 |
np |
Use native FreeBSD facilities everywhere except the shared code in common/
MFC after: 1 week
|
240443 |
13-Sep-2012 |
np |
Update interface to firmware 1.6.2 and include the firmware in the driver.
Obtained from: Chelsio MFC after: 1 week
|
239544 |
21-Aug-2012 |
np |
Deal with the case where a syncache entry added by the TOE driver is evicted from the syncache but a later syncache_expand succeeds because of syncookies. The TOE driver has to resort to more direct means to install its hooks in the socket in this case.
|
239528 |
21-Aug-2012 |
np |
Avoid a NULL pointer dereference.
|
239527 |
21-Aug-2012 |
np |
Cannot hold a mutex around vm_fault_quick_hold_pages, so don't. Tweak some comments while here.
|
239514 |
21-Aug-2012 |
np |
Minor cleanup: use bitwise ops instead of pointless wrappers around setbit/clrbit.
|
239511 |
21-Aug-2012 |
np |
Correctly handle the case where an inp has already been dropped by the time the TOE driver reports that an active open failed. toe_connect_failed is supposed to handle this but it should be provided the inpcb instead of the tcpcb which may no longer be around.
|
239344 |
17-Aug-2012 |
np |
Support for TCP DDP (Direct Data Placement) in the T4 TOE module.
Basically, this is automatic rx zero copy when feasible. TCP payload is DMA'd directly into the userspace buffer described by the uio submitted in soreceive by an application.
- Works with sockets that are being handled by the TCP offload engine of a T4 chip (you need t4_tom.ko module loaded after cxgbe, and an "ifconfig +toe" on the cxgbe interface).
- Does not require any modification to the application.
- Not enabled by default. Use hw.t4nex.<X>.toe.ddp="1" to enable it.
|
239341 |
16-Aug-2012 |
np |
Initialize various DDP parameters in the main cxgbe(4) driver:
- Setup multiple DDP page sizes. When the driver attempts DDP it will try to combine physically contiguous pages into regions of these sizes.
- Set the indicate size such that the payload carried in the indicate can be copied in the header mbuf (and the 16K rx buffer can be recycled).
- Set DDP threshold to the max payload that the chip will coalesce and deliver to the driver (this is ~16K by default, which is also why the offload rx queue is backed by 16K buffers). If the chip is able to coalesce up to the max it's allowed to, it's a good sign that the peer is transmitting in bulk without any TCP PSH.
MFC after: 2 weeks
|
239339 |
16-Aug-2012 |
np |
Make room for DDP page pods in the default configuration profile. While here, bump up the L2 table's size to 4K entries.
MFC after: 2 weeks
|
239338 |
16-Aug-2012 |
np |
Add a routine (t4_set_tcb_field) to update arbitrary parts of a hardware TCB. Filters are programmed by modifying the TCB too (via a different routine) and the reply to any TCB update is delivered via a CPL_SET_TCB_RPL. Figure out whether the reply is for a filter-write or something else and route it appropriately.
MFC after: 2 weeks
|
239336 |
16-Aug-2012 |
np |
Allow for a different handler for each type of firmware message.
MFC after: 2 weeks
|
239266 |
15-Aug-2012 |
np |
The size of the buffers in an Ethernet freelist has to be higher than the interface's MTU. Initialize such freelists with correct values.
This wasn't a problem for common MTUs (1500 and 9000) as the buffers (2048 and 9216 in size) happened to have enough spare room. I ran into it when playing around with unusual MTUs.
MFC after: 2 weeks
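To see why unusual MTUs exposed the problem, here is a small sketch that picks the smallest buffer able to hold a full frame. The 2048 and 9216 sizes come from the commit message. The 4096 and 16384 candidates, the per-frame overhead (Ethernet header, one VLAN tag, and a leading pad), and the function name are assumptions made for illustration.

```c
#include <stddef.h>

#define SK_ETHER_HDR_LEN 14 /* Ethernet header */
#define SK_VLAN_TAG_LEN  4  /* one 802.1Q tag (assumed overhead) */

/*
 * Return the smallest candidate buffer size that fits a frame carrying
 * 'mtu' bytes of payload plus the assumed overhead, or 0 if none fits.
 * 'pktshift' is the padding placed before the frame.
 */
static size_t
fl_bufsize(unsigned int mtu, unsigned int pktshift)
{
    static const size_t sizes[] = { 2048, 4096, 9216, 16384 };
    size_t need = (size_t)mtu + SK_ETHER_HDR_LEN + SK_VLAN_TAG_LEN +
        pktshift;

    for (size_t i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++) {
        if (sizes[i] >= need)
            return (sizes[i]);
    }
    return (0);
}
```

Under these assumptions an MTU of 1500 or 9000 fits the 2048 or 9216 buffer with room to spare, while an MTU of 2040 or 9200 no longer fits and needs the next size up.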
|
239259 |
14-Aug-2012 |
np |
if_iqdrops should include frames truncated within the chip.
MFC after: 2 weeks
|
239258 |
14-Aug-2012 |
np |
Convert some fixed parameters to tunables (with reasonable default values).
- cong_drop specifies what to do on congestion: nothing, backpressure, or drop.
- fl_pktshift specifies the padding before Ethernet payload.
- fl_pad specifies the boundary up to which to pad Ethernet payload.
- spg_len controls the length of the status page.
MFC after: 2 weeks
|
239102 |
06-Aug-2012 |
dim |
In sys/dev/cxgbe/firmware/t4fw_interface.h, change the enum 'fw_hdr_intfver' into an anonymous enum, which avoids a clang 3.2 warning about all the enum values being the same value.
Reviewed by: np MFC after: 1 week
|
238313 |
09-Jul-2012 |
np |
Fix a bug in code that calculates the number of the first interrupt vector for a port. This affected the gigabit ports of T422 cards (the ones with 2x10G ports and 2x1G ports).
MFC after: will check with re@
|
238054 |
03-Jul-2012 |
np |
Fix inverted test that resulted in incorrect multicast hw programming.
|
238028 |
02-Jul-2012 |
np |
Instruct the firmware not to provision resources for TCP offload if the kernel is being built without TCP_OFFLOAD. But never override toecaps_allowed if it has been set manually.
|
237831 |
30-Jun-2012 |
np |
- Assign (don't OR) the CSUM_XXX bits to csum_flags in the rx checksum code.
- Fix TSO/TSO4 mixup.
- Add IFCAP_LINKSTATE to the available/enabled capabilities.
|
237819 |
29-Jun-2012 |
np |
cxgbe(4): support for IPv6 TSO and LRO.
Submitted by: bz (this is a modified version of that patch)
|
237799 |
29-Jun-2012 |
np |
cxgbe(4): support for IPv6 hardware checksumming (rx and tx).
|
237587 |
26-Jun-2012 |
np |
Allow cxgbe(4) running within a VM to attach to its devices that have been exported via PCI passthrough.
- Do not check for a specific physical function (PF) before claiming a device. Different PFs have different device-ids so this check is redundant anyway.
- Obtain the PF# from the WHOAMI register instead of pci_get_function().
- Setup the memory windows using the real BAR0 address, not what the VM says it is.
Obtained from: Chelsio Communications
|
237512 |
23-Jun-2012 |
np |
Better way to determine the status page length and rx pad boundary.
|
237463 |
22-Jun-2012 |
np |
Do not allocate extra vectors when adapter is not TOE capable (or toecaps have been disallowed by the user).
+ one very minor unrelated cleanup in t4_sge.c
|
237439 |
22-Jun-2012 |
np |
Do not read registers with read side effects while performing a register dump for cxgbetool.
|
237436 |
22-Jun-2012 |
np |
cxgbe(4): update to firmware interface 1.5.2.0; updates to shared code.
|
237263 |
19-Jun-2012 |
np |
- Updated TOE support in the kernel.
- Stateful TCP offload drivers for Terminator 3 and 4 (T3 and T4) ASICs. These are available as t3_tom and t4_tom modules that augment cxgb(4) and cxgbe(4) respectively. The cxgb/cxgbe drivers continue to work as usual with or without these extra features.
- iWARP driver for Terminator 3 ASIC (kernel verbs). T4 iWARP in the works and will follow soon.
Build-tested with make universe.
30s overview
============
What interfaces support TCP offload? Look for TOE4 and/or TOE6 in the capabilities of an interface:
# ifconfig -m | grep TOE
Enable/disable TCP offload on an interface (just like any other ifnet capability):
# ifconfig cxgbe0 toe
# ifconfig cxgbe0 -toe
Which connections are offloaded? Look for toe4 and/or toe6 in the output of netstat and sockstat:
# netstat -np tcp | grep toe
# sockstat -46c | grep toe
Reviewed by: bz, gnn Sponsored by: Chelsio communications. MFC after: ~3 months (after 9.1, and after ensuring MFC is feasible)
|
235944 |
24-May-2012 |
bz |
MFp4 bz_ipv6_fast:
Significantly update tcp_lro for mostly two things:
1) introduce basic support for IPv6 without extension headers.
2) try hard to also get the incremental checksum updates right, especially also in the IPv4 case for the IP and TCP header.
Move variables around for better locality, factor things out into functions, allow checksum updates to be compiled out, ...
Leave a few comments on further things to look at in the future, though that is not the full list.
Update drivers with appropriate #includes as needed for IPv6 data type in LRO.
Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems
Reviewed by: gnn (as part of the whole) MFC After: 3 days
|
234833 |
30-Apr-2012 |
np |
Change the default to not use packet counters to generate rx interrupts. Rely solely on the timer based mechanism.
Update man page to reflect this change.
MFC after: 1 week
|
234831 |
30-Apr-2012 |
np |
Make sure that the firmware version is available in dev.t4nex.X.firmware_version even if the driver fails to attach properly. At least it'll be easy to tell what we're dealing with.
MFC after: 1 week
|
231592 |
13-Feb-2012 |
np |
Use the non-sleeping variant of t4_wr_mbox in code that can be called with locks held.
MFC after: 1 day
|
231172 |
08-Feb-2012 |
np |
Program the MAC exact match table in batches of 7 addresses at a time when possible. This is more efficient than one at a time.
Submitted by: gnn MFC after: 3 days
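The batching described here amounts to walking the address list in chunks of at most seven. A minimal sketch follows, with `program_batch` as a hypothetical stand-in for the real firmware command (here it only records what it was asked to do). All names are invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define MAC_BATCH_MAX 7     /* addresses per command, per the commit */

struct mac_addr {
    uint8_t octet[6];
};

struct batch_log {
    size_t commands;        /* number of commands issued */
    size_t last_batch;      /* addresses in the most recent command */
};

/* Hypothetical stand-in for one firmware command carrying n addresses. */
static void
program_batch(const struct mac_addr *addrs, size_t n, struct batch_log *log)
{
    (void)addrs;
    log->commands++;
    log->last_batch = n;
}

/* Program 'total' addresses, up to MAC_BATCH_MAX per command. */
static size_t
program_exact_macs(const struct mac_addr *addrs, size_t total,
    struct batch_log *log)
{
    size_t done = 0;

    while (done < total) {
        size_t n = total - done;

        if (n > MAC_BATCH_MAX)
            n = MAC_BATCH_MAX;
        program_batch(addrs + done, n, log);
        done += n;
    }
    return (log->commands);
}
```

Sixteen addresses take three commands (7, 7, and 2) instead of sixteen.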
|
231120 |
07-Feb-2012 |
np |
Acquire the adapter lock before updating fields of the filter structure.
Submitted by: gnn (different version) MFC after: 3 days
|
231116 |
07-Feb-2012 |
np |
Remove if_start from cxgb and cxgbe.
Submitted by: jhb MFC after: 3 days
|
231115 |
07-Feb-2012 |
np |
cxgbe: reduce diffs with other branches. Will help future MFCs from HEAD.
MFC after: 3 days
|
228561 |
16-Dec-2011 |
np |
Many updates to cxgbe(4)
- Device configuration via plain text config file. Also able to operate when not attached to the chip as the master driver.
- Generic "work request" queue that serves as the base for both ctrl and ofld tx queues.
- Generic interrupt handler routine that can process any event on any kind of ingress queue (via a dispatch table).
- A couple of new driver ioctls. cxgbetool can now install a firmware to the card ("loadfw" command) and can read the card's memory ("memdump" and "tcb" commands).
- Lots of assorted information within dev.t4nex.X.misc.* This is primarily for debugging and won't show up in sysctl -a.
- Code to manage the L2 tables on the chip.
- Updates to cxgbe(4) man page to go with the tunables that have changed.
- Updates to the shared code in common/
- Updates to the driver-firmware interface (now at fw 1.4.16.0)
MFC after: 1 month
|
228491 |
14-Dec-2011 |
np |
Do not clobber the ingress queue's congestion setting.
MFC after: 1 month
|
228443 |
12-Dec-2011 |
mdf |
Do not define bool/true/false if the symbols already exist.
MFC after: 2 weeks Sponsored by: Isilon Systems, LLC
|
227843 |
22-Nov-2011 |
marius |
- There's no need to overwrite the default device method with the default one. Interestingly, these are actually the default for quite some time (bus_generic_driver_added(9) since r52045 and bus_generic_print_child(9) since r52045) but even recently added device drivers do this unnecessarily.
  Discussed with: jhb, marcel
- While at it, use DEVMETHOD_END.
  Discussed with: jhb
- Also while at it, use __FBSDID.
|
227309 |
07-Nov-2011 |
ed |
Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs.
The SYSCTL_NODE macro defines a list that stores all child-elements of that node. If there's no SYSCTL_DECL macro anywhere else, there's no reason why it shouldn't be static.
|
222973 |
11-Jun-2011 |
np |
- driver ioctl to get SGE context for any given queue.
- sysctls to display the context id, cidx, and pidx of all kinds of queues.
MFC after: 3 days
|
222703 |
04-Jun-2011 |
np |
Cause backpressure (instead of dropping frames) on congestion.
MFC after: 3 days
|
222701 |
04-Jun-2011 |
np |
Allow lazy fill up of freelists.
MFC after: 3 days
|
222552 |
01-Jun-2011 |
np |
Provide hit-count with rest of the information about a filter.
MFC after: 1 week
|
222551 |
31-May-2011 |
np |
Firmware device log.
# sysctl dev.t4nex.0.devlog
MFC after: mdf's sysctl+sbuf changes are MFC'd
|
222513 |
30-May-2011 |
np |
Update to firmware interface 1.3.10
MFC after: 1 week
|
222510 |
30-May-2011 |
np |
- Specialized ingress queues that take interrupts for other ingress queues. Try to have a set of these per port when possible, fall back to sharing a common pool between all ports otherwise.
- One control queue per port (used to be one per hardware channel).
- t4_eth_rx now handles Ethernet rx only.
- sysctls to display pidx/cidx for some queues.
MFC after: 1 week
|
222509 |
30-May-2011 |
np |
L2 table code. This is enough to get the T4's switch + L2 rewrite filters working. (All other filters - switch without L2 info rewrite, steer, and drop - were already fully-functional).
Some contrived examples of "switch" filters with L2 rewriting:
# cxgbetool t4nex0 iport 0 dport 80 action switch vlan +9 eport 3
Intercept all packets received on physical port 0 with TCP port 80 as destination, insert a vlan tag with VID 9, and send them out of port 3.
# cxgbetool t4nex0 sip 192.168.1.1/32 ivlan 5 action switch \
    vlan =9 smac aa:bb:cc:dd:ee:ff eport 0
Intercept all packets (received on any port) with source IP address 192.168.1.1 and VLAN id 5, rewrite the VLAN id to 9, rewrite source mac to aa:bb:cc:dd:ee:ff, and send it out of port 0.
MFC after: 1 week
|
222102 |
19-May-2011 |
np |
Simplify t4_os_find_pci_capability.
MFC after: 3 days
|
222085 |
18-May-2011 |
np |
- Enable per-channel congestion notification.
- Enable PCIe relaxed ordering for all egress queues and rx data buffers.
MFC after: 3 days
|
222003 |
17-May-2011 |
np |
Add missing header. The test for VLAN_CAPABILITIES later in the file doesn't make sense without it.
MFC after: 3 days
|
221911 |
14-May-2011 |
np |
sysctl that displays the absolute queue id of an rxq.
|
221516 |
05-May-2011 |
np |
Bump up the number of egress queues that the driver is allowed to use.
MFC after: 3 days
|
221477 |
05-May-2011 |
np |
T4 packet timestamps.
Reference code that shows how to get a packet's timestamp out of cxgbe(4). Disabled by default because we don't have a standard way today to pass this information up the stack.
The timestamp is 60 bits wide and each increment represents 1 tick of the T4's core clock. As an example, the timestamp granularity is ~4.4ns for this card:
# sysctl dev.t4nex.0.core_clock
dev.t4nex.0.core_clock: 228125
MFC after: 1 week
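The ~4.4ns figure follows from the clock value if the sysctl reports kHz, which is an assumption here (228125 kHz is 228.125 MHz, so one tick is about 4.38ns). A small sketch of the conversion, with hypothetical function names:

```c
#include <stdint.h>

#define TS_MASK ((UINT64_C(1) << 60) - 1)   /* timestamp is 60 bits wide */

/* Duration of one core clock tick in nanoseconds (clock assumed in kHz). */
static double
tick_ns(uint32_t core_clock_khz)
{
    return (1e6 / (double)core_clock_khz);
}

/* Convert a raw timestamp to nanoseconds, keeping only the low 60 bits. */
static double
ts_to_ns(uint64_t ts, uint32_t core_clock_khz)
{
    return ((double)(ts & TS_MASK) * tick_ns(core_clock_khz));
}
```

Under that assumption, a timestamp delta of 228125000 ticks on this card corresponds to roughly one second.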
|
221474 |
05-May-2011 |
np |
T4 packet filtering/steering.
- Enable 5-tuple and every-packet lookup.
- Setup the default filter mode to allow filtering/steering based on IP protocol, ingress port, inner VLAN ID, IP frag, FCoE, and MPS match type; all combined together. You can also filter based on MAC index, Ethernet type, IP TOS/IPv6 Traffic Class, and outer VLAN ID but you'll have to modify the default filter mode and exclude some of the match-fields in it.
IPv4 and IPv6 SIP/DIP/SPORT/DPORT are always available in all filter rules.
- Add driver ioctls to get/set the global filter mode.
- Add driver ioctls to program and delete hardware filters. A couple of the "switch" actions that rewrite Ethernet and VLAN information and switch the packet out of another port may not work as the L2 code is not yet in place. Everything else, including all "drop" and "pass" rules with RSS or absolute qid, should work.
Obtained from: Chelsio Communications
|
221464 |
04-May-2011 |
np |
Always re-arm an iq's interrupt before leaving the handler.
MFC after: 1 week
|
220905 |
20-Apr-2011 |
np |
Ring the freelist doorbell from within refill_fl. While here, fix a bug that could have allowed the hardware pidx to reach the cidx even though the freelist isn't empty. (Haven't actually seen this but it was there waiting to happen..)
MFC after: 1 week
|
220897 |
20-Apr-2011 |
np |
Use the correct free routine when destroying a control queue.
X-MFC after: r220873
|
220874 |
19-Apr-2011 |
np |
Use Toeplitz hash for RSS.
MFC after: 3 days
|
220873 |
19-Apr-2011 |
np |
- Move all Ethernet specific items from sge_eq to sge_txq. sge_eq is now a suitable base for all kinds of egress queues.
- Add control queues (sge_ctrlq) and allocate one of these per hardware channel. They can be used to program filters and steer traffic (and more).
MFC after: 1 week
|
220649 |
15-Apr-2011 |
np |
Fix a couple of bad races that can occur when a cxgbe interface is taken down. The ingress queue lock was unused and has been removed as part of these changes.
- An in-flight egress update from the SGE must be handled before the queue that requested it is destroyed. Wait for the update to arrive.
- Interrupt handlers must stop processing rx events for a queue before the queue is destroyed. Events that have not yet been processed should be ignored once the queue disappears.
MFC after: 1 week
|
220643 |
14-Apr-2011 |
np |
There is no need to request a tx credit flush if such a request is already pending.
MFC after: 3 days
|
220410 |
07-Apr-2011 |
np |
Modify read/write ioctls to work with 64 bit registers too.
MFC after: 3 days
|
220232 |
01-Apr-2011 |
np |
Update header and related code for firmware 1.3.8
MFC after: 3 days
|
219944 |
24-Mar-2011 |
np |
Do not over-allocate MSI interrupts for the case where each ingress queue has its own interrupt. If the exact number that we need is not a power of 2 and we're using MSI, then switch to interrupt multiplexing.
While here, replace the magic numbers with something more readable.
MFC after: 3 days
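MSI vectors are allocated in power-of-2 counts, which is why an odd requirement would otherwise be rounded up. The decision described in this commit can be sketched as below. The enum and function names are hypothetical, not the driver's.

```c
#include <stdbool.h>

enum intr_type { INTR_MSIX, INTR_MSI };

/* True if x is a non-zero power of 2. */
static bool
is_pow2(unsigned int x)
{
    return (x != 0 && (x & (x - 1)) == 0);
}

/*
 * Decide whether ingress queues should share vectors (multiplexing)
 * rather than each getting its own: only when using MSI and the exact
 * count needed is not a power of 2.
 */
static bool
must_multiplex(enum intr_type type, unsigned int needed)
{
    return (type == INTR_MSI && !is_pow2(needed));
}
```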
|
219883 |
22-Mar-2011 |
np |
Fix an error while constructing the table that maps context id -> egress queue.
MFC after: 1 day
|
219436 |
09-Mar-2011 |
np |
Display holdoff timers and packet counts as a list of numbers.
MFC after: 1 week
|
219392 |
08-Mar-2011 |
np |
cxgbe shouldn't directly know of the UMA zones where network buffers come from.
MFC after: 1 week
|
219299 |
05-Mar-2011 |
np |
Be sure to stay within the bounds of the mod_str array when displaying the transceiver type.
|
219293 |
05-Mar-2011 |
np |
There is no need to hold an ingress queue's lock while processing its descriptors.
MFC after: 1 week
|
219292 |
05-Mar-2011 |
np |
Calculate how many descriptors can be reclaimed before calling reclaim_tx_descs
|
219290 |
05-Mar-2011 |
np |
Tweaks for rx:
- everything related to LRO should be in #ifdef INET blocks
- reorder sge_iq's fields so that the most frequently used are all together
- pull all rx code into t4_intr_data directly
- let go of the ingress queue lock when passing up data
- refill the freelist only if it is short of at least 32 buffers
|
219289 |
05-Mar-2011 |
np |
Store the ifnet rather than the port_info in each txq and rxq struct.
MFC after: 1 week
|
219288 |
05-Mar-2011 |
np |
A txpkts work request should have a valid FID.
MFC after: 1 week
|
219287 |
05-Mar-2011 |
np |
Upgrade the firmware on the card automatically if a better version is available. Downgrade only for a major version mismatch.
MFC after: 1 week
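One way to read this policy is as a three-way decision between the version on the card and the version bundled with the driver. The sketch below assumes that "better" means a numerically higher major.minor.micro.build tuple, which the commit does not define. The types and names are hypothetical.

```c
#include <stdint.h>

struct fw_ver {
    uint8_t major, minor, micro, build;
};

enum fw_action { FW_KEEP, FW_UPGRADE, FW_DOWNGRADE };

/* Pack a version into one integer so that tuples compare numerically. */
static uint32_t
fw_pack(struct fw_ver v)
{
    return ((uint32_t)v.major << 24 | (uint32_t)v.minor << 16 |
        (uint32_t)v.micro << 8 | (uint32_t)v.build);
}

/*
 * Upgrade when the bundled firmware is newer than the card's.
 * Otherwise replace the card's firmware only if the major versions differ.
 */
static enum fw_action
fw_decide(struct fw_ver card, struct fw_ver bundled)
{
    if (fw_pack(bundled) > fw_pack(card))
        return (FW_UPGRADE);
    if (bundled.major != card.major)
        return (FW_DOWNGRADE);
    return (FW_KEEP);
}
```

A card running a newer minor release than the driver bundles is left alone. Only a major version mismatch forces the older bundled firmware onto it.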
|
219286 |
05-Mar-2011 |
np |
Resume tx immediately in response to an SGE egress update from the hardware.
MFC after: 1 week
|
219285 |
05-Mar-2011 |
np |
Fix incorrect assertion.
MFC after: 3 days
|
218792 |
18-Feb-2011 |
np |
cxgbe(4) - NIC driver for Chelsio T4 (Terminator 4) based 10Gb/1Gb adapters.
MFC after: 3 weeks
|