#
355242 |
|
30-Nov-2019 |
np |
MFC r349500:
cxgbe/t4_tom: Fix regression in t_maxseg usage within t4_tom.
t_maxseg was changed in r293284 to not have any adjustment for TCP timestamps. t4_tom inadvertently went back to pre-r293284 semantics in r332506.
Sponsored by: Chelsio Communications
|
#
351236 |
|
19-Aug-2019 |
jhb |
MFC 349467: Hold an explicit reference on the socket for the aiotx task.
Previously, the aiotx task relied on the aio jobs in the queue to hold a reference on the socket. However, when the last job is completed, there is nothing left to hold a reference to the socket buffer lock used to check if the queue is empty. In addition, if the last job on the queue is cancelled, the task can run with no queued jobs holding a reference to the socket buffer lock the task uses to notice the queue is empty.
Fix these races by holding an explicit reference on the socket when the task is queued and dropping that reference when the task completes.
|
#
348704 |
|
05-Jun-2019 |
np |
MFC r348491:
cxgbe/t4_tom: adjust the hardware receive window to match changes to the receive sockbuf's high water mark.
Calculate rx credits on the spot instead of tracking sbused/sb_cc and rx_credits in the toepcb. The previous method worked when the high water mark changed due to SB_AUTOSIZE but not when it was adjusted directly (for example, by the soreserve in nfsrvd_addsock).
This fixes a connection hang while running iozone over an NFS mounted share where nfsd's TCP sockets are being handled by t4_tom.
Sponsored by: Chelsio Communications
Approved by: re@ (gjb@)
|
#
346970 |
|
30-Apr-2019 |
np |
MFC r342208:
cxgbe/t4_tom: fixes for issues on the passive open side.
- Fix PR 227760 by getting the TOE to respond to the SYN after the call to toe_syncache_add, not during it. The kernel syncache code calls syncache_respond just before syncache_insert. If the ACK to the syncache_respond is processed in another thread it may run before the syncache_insert and won't find the entry. Note that this affects only t4_tom because it's the only driver trying to insert and expand syncache entries from different threads.
- Do not leak resources if an embryonic connection terminates at SYN_RCVD because of L2 lookup failures.
- Retire lctx->synq and associated code because there is never a need to walk the list of embryonic connections associated with a listener. The per-tid state is still called a synq entry in the driver even though the synq itself is now gone.
PR: 227760 Sponsored by: Chelsio Communications
|
#
346947 |
|
30-Apr-2019 |
np |
MFC r342356:
Remove unused macros from t4_tom.h.
|
#
346934 |
|
29-Apr-2019 |
np |
MFC r341172, r341270. t4_clip.c had to be manually adjusted because Concurrency Kit is not available in stable/11.
r341172: Move CLIP table handling out of TOM and into the base driver.
- Store the clip table in 'struct adapter' instead of in the TOM softc. - Init the clip table during attach and teardown during detach. - While here, add a dev.<nexus>.<unit>.misc.clip sysctl to dump the CLIP table.
This does mean that we update the clip table even if TOE is not enabled, but non-TOE things need the CLIP table anyway.
Reviewed by: np, Krishnamraju Eraparaju @ Chelsio Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D18010
r341270: Make most of the CLIP code conditional on #ifdef INET6.
This fixes builds of kernels without INET6 such as LINT-NOINET6.
Reported by: arybchik Reviewed by: np Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D18384
|
#
346852 |
|
28-Apr-2019 |
np |
MFC r333114:
cxgbe(4): Use opaque cookies or tid range-checks to determine the intended recipient of a CPL when it can't be determined solely from the opcode. Retire the per-queue handlers for such CPLs in favor of the new scheme.
Sponsored by: Chelsio Communications
|
#
346850 |
|
28-Apr-2019 |
np |
MFC r333043:
cxgbe(4): Move release_tid to the base NIC driver for future consumers.
Sponsored by: Chelsio Communications.
|
#
346805 |
|
28-Apr-2019 |
np |
MFC r317849 (partial), r332506, and r332787.
r317849 (partial, required by r332506): cxgbe/t4_tom: Per-connection rate limiting for TCP sockets handled by the TOE.
Sponsored by: Chelsio Communications
r332506: cxgbe(4): Add support for Connection Offload Policy (aka COP).
COP allows fine-grained control on whether to offload a TCP connection using t4_tom, and what settings to apply to a connection selected for offload. t4_tom must still be loaded and IFCAP_TOE must still be enabled for full TCP offload to take place on an interface. The difference is that IFCAP_TOE used to be the only knob and would enable TOE for all new connections on the inteface, but now the driver will also consult the COP, if any, before offloading to the hardware TOE.
A policy is a plain text file with any number of rules, one per line. Each rule has a "match" part consisting of a socket-type (L = listen, A = active open, P = passive open, D = don't care) and a pcap-filter(7) expression, and a "settings" part that specifies whether to offload the connection or not and the parameters to use if so. The general format of a rule is: [socket-type] expr => settings
Example. See cxgbetool(8) for more information. [L] ip && port http => offload [L] port 443 => !offload [L] port ssh => offload [P] src net 192.168/16 && dst port ssh => offload !nagle !timestamp cong newreno [P] dst port ssh => offload !nagle ecn cong tahoe [P] dst port http => offload [A] dst port 443 => offload tls [A] dst net 192.168/16 => offload !timestamp cong highspeed
The driver processes the rules for each new listen, active open, or passive open and stops at the first match. There is an implicit rule at the end of every policy that prohibits offload when no rule in the policy matches: [D] all => !offload
This is a reworked and expanded version of a patch submitted by Krishnamraju Eraparaju @ Chelsio.
Sponsored by: Chelsio Communications
r332787: cxgbe(4): Fix bugs in the handling of COP rules that match on VLAN tag.
Retrieve the tag from the correct ifnet and use the provided tag (instead of hardcoded 0xffff, implying no tag) in the routines that process offload policy.
Submitted by: Krishnamraju Eraparaju @ Chelsio Sponsored by: Chelsio Communications
|
#
345664 |
|
28-Mar-2019 |
jhb |
MFC 330040,330041,330079,330884,330946,330947,331649,333068,333810,337722, 340466,340468,340469,340473: Add TOE-based TLS offload.
Note that this requires a modified OpenSSL library.
330040: Fetch TLS key parameters from the firmware.
The parameters describe how much of the adapter's memory is reserved for storing TLS keys. The 'meminfo' sysctl now lists this region of adapter memory as 'TLS keys' if present.
330041: Move ccr_aes_getdeckey() from ccr(4) to the cxgbe(4) driver.
This routine will also be used by the TOE module to manage TLS keys.
330079: Move #include for rijndael.h out of x86-specific region.
The #include was added inside of the conditional by accident and the lack of it broke non-x86 builds.
330884: Support for TLS offload of TOE connections on T6 adapters.
The TOE engine in Chelsio T6 adapters supports offloading of TLS encryption and TCP segmentation for offloaded connections. Sockets using TLS are required to use a set of custom socket options to upload RX and TX keys to the NIC and to enable RX processing. Currently these socket options are implemented as TCP options in the vendor specific range. A patched OpenSSL library will be made available in a port / package for use with the TLS TOE support.
TOE sockets can either offload both transmit and reception of TLS records or just transmit. TLS offload (both RX and TX) is enabled by setting the dev.t6nex.<x>.tls sysctl to 1 and requires TOE to be enabled on the relevant interface. Transmit offload can be used on any "normal" or TLS TOE socket by using the custom socket option to program a transmit key. This permits most TOE sockets to transparently offload TLS when applications use a patched SSL library (e.g. using LD_LIBRARY_PATH to request use of a patched OpenSSL library). Receive offload can only be used with TOE sockets using the TLS mode. The dev.t6nex.0.toe.tls_rx_ports sysctl can be set to a list of TCP port numbers. Any connection with either a local or remote port number in that list will be created as a TLS socket rather than a plain TOE socket. Note that although this sysctl accepts an arbitrary list of port numbers, the sysctl(8) tool is only able to set sysctl nodes to a single value. A TLS socket will hang without receiving data if used by an application that is not using a patched SSL library. Thus, the tls_rx_ports node should be used with care. For a server mostly concerned with offloading TLS transmit, this node is not needed as plain TOE sockets will fall back to software crypto when using an unpatched SSL library.
New per-interface statistics nodes are added giving counts of TLS packets and payload bytes (payload bytes do not include TLS headers or authentication tags/MACs) offloaded via the TOE engine, e.g.:
dev.cc.0.stats.rx_tls_octets: 149 dev.cc.0.stats.rx_tls_records: 13 dev.cc.0.stats.tx_tls_octets: 26501823 dev.cc.0.stats.tx_tls_records: 1620
TLS transmit work requests are constructed by a new variant of t4_push_frames() called t4_push_tls_records() in tom/t4_tls.c.
TLS transmit work requests require a buffer containing IVs. If the IVs are too large to fit into the work request, a separate buffer is allocated when constructing a work request. This buffer is associated with the transmit descriptor and freed when the descriptor is ACKed by the adapter.
Received TLS frames use two new CPL messages. The first message is a CPL_TLS_DATA containing the decryped payload of a single TLS record. The handler places the mbuf containing the received payload on an mbufq in the TOE pcb. The second message is a CPL_RX_TLS_CMP message which includes a copy of the TLS header and indicates if there were any errors. The handler for this message places the TLS header into the socket buffer followed by the saved mbuf with the payload data. Both of these handlers are contained in tom/t4_tls.c.
A few routines were exposed from t4_cpl_io.c for use by t4_tls.c including send_rx_credits(), a new send_rx_modulate(), and t4_close_conn().
TLS keys for both transmit and receive are stored in onboard memory in the NIC in the "TLS keys" memory region.
In some cases a TLS socket can hang with pending data available in the NIC that is not delivered to the host. As a workaround, TLS sockets are more aggressive about sending CPL_RX_DATA_ACK messages anytime that any data is read from a TLS socket. In addition, a fallback timer will periodically send CPL_RX_DATA_ACK messages to the NIC for connections that are still in the handshake phase. Once the connection has finished the handshake and programmed RX keys via the socket option, the timer is stopped.
A new function select_ulp_mode() is used to determine what sub-mode a given TOE socket should use (plain TOE, DDP, or TLS). The existing set_tcpddp_ulp_mode() function has been renamed to set_ulp_mode() and handles initialization of TLS-specific state when necessary in addition to DDP-specific state.
Since TLS sockets do not receive individual TCP segments but always receive full TLS records, they can receive more data than is available in the current window (e.g. if a 16k TLS record is received but the socket buffer is itself 16k). To cope with this, just drop the window to 0 when this happens, but track the overage and "eat" the overage as it is read from the socket buffer not opening the window (or adding rx_credits) for the overage bytes.
330946: Remove TLS-related inlines from t4_tom.h to fix iw_cxgbe(4) build.
- Remove the one use of is_tls_offload() and the function. AIO special handling only needs to be disabled when a TOE socket is actively doing TLS offload on transmit. The TOE socket's mode (which affects receive operation) doesn't matter, so remove the check for the socket's mode and only check if a TOE socket has TLS transmit keys configured to determine if an AIO write request should fall back to the normal socket handling instead of the TOE fast path. - Move can_tls_offload() into t4_tls.c. It is not used in critical paths, so inlining isn't that important. Change return type to bool while here.
330947: Fix the check for an empty send socket buffer on a TOE TLS socket.
Compare sbavail() with the cached sb_off of already-sent data instead of always comparing with zero. This will correctly close the connection and send the FIN if the socket buffer contains some previously-sent data but no unsent data.
331649: Use the offload transmit queue to set flags on TLS connections.
Requests to modify the state of TLS connections need to be sent on the same queue as TLS record transmit requests to ensure ordering.
However, in order to use the offload transmit queue in t4_set_tcb_field(), the function needs to be updated to do proper flow control / credit management when queueing a request to an offload queue. This required passing a pointer to the toepcb itself to this function, so while here remove the 'tid' and 'iqid' parameters and obtain those values from the toepcb in t4_set_tcb_field() itself.
333068: Use the correct key address when renegotiating the transmit key.
Previously, get_keyid() was returning the address of the receive key instead of the transmit key when renegotiating the transmit key. This could either hang the card (if a connection was only offloading TLS TX and thus had a receive key address of -1) or cause the connection to fail by overwriting the wrong key (if both RX and TX TLS were offloaded).
333810: Be more robust against garbage input on a TOE TLS TX socket.
If a socket is closed or shutdown and a partial record (or what appears to be a partial record) is waiting in the socket buffer, discard the partial record and close the connection rather than waiting forever for the rest of the record.
337722: Whitespace nit in t4_tom.h
340466: Move the TLS key map into the adapter softc so non-TOE code can use it.
340468: Change the quantum for TLS key addresses to 32 bytes.
The addresses passed when reading and writing keys are always shifted right by 5 as the memory locations are addressed in 32-byte chunks, so the quantum needs to be 32, not 8.
340469: Remove bogus roundup2() of the key programming work request header.
The key context is always placed immediately after the work request header. The total work request length has to be rounded up by 16 however.
340473: Restore the <sys/vmem.h> header to fix build of cxgbe(4) TOM.
vmem's are not just used for TLS memory in TOM and the #include actually predates the TLS code so should not have been removed when the TLS vmem moved in r340466.
Sponsored by: Chelsio Communications
|
#
344856 |
|
06-Mar-2019 |
jhb |
MFC 330882: Simplify error handling in t4_tom.ko module loading.
- Change t4_ddp_mod_load() to return void instead of always returning success. This avoids having to pretend to have proper support for unloading when only part of t4_tom_mod_load() has run. - If t4_register_uld() fails, don't invoke t4_tom_mod_unload() directly. The module handling code in the kernel invokes MOD_UNLOAD on a module whose MOD_LOAD fails with an error already.
|
#
331722 |
|
29-Mar-2018 |
eadler |
Revert r330897:
This was intended to be a non-functional change. It wasn't. The commit message was thus wrong. In addition it broke arm, and merged crypto related code.
Revert with prejudice.
This revert skips files touched in r316370 since that commit was since MFCed. This revert also skips files that require $FreeBSD$ property changes.
Thank you to those who helped me get out of this mess including but not limited to gonzo, kevans, rgrimes.
Requested by: gjb (re)
|
#
331645 |
|
27-Mar-2018 |
jhb |
MFC 329785: Move DDP PCB state into a helper structure.
This consolidates all of the DDP state in one place. Also, the code has now been fixed to ensure that DDP state is only accessed for DDP connections. This should not be a functional change but makes it cleaner and easier to add state for other TOE socket modes in the future.
Sponsored by: Chelsio Communications
|
#
330897 |
|
14-Mar-2018 |
eadler |
Partial merge of the SPDX changes
These changes are incomplete but are making it difficult to determine what other changes can/should be merged.
No objections from: pfg
|
#
323884 |
|
21-Sep-2017 |
jhb |
MFC 323630: Avoid reusing the wrong buffer for a DDP AIO request.
To optimize the case of ping-ponging between two buffers, the DDP code caches the last two buffers used keeping the pages wired and page pods stored in the NIC's RAM. If a new aio_read() request uses one of the same buffers, then the work of holding pages, etc. can be avoided. However, the starting virtual address of an aio buffer was not saved, only the page count, length, and initial page offset. Thus, an aio_read() request could match a different buffer in the address space. (Earlier during development vm_fault_hold_quick_pages() was always called and the vm_page_t values were compared, but that was eventually removed without being adequately replaced.) Fix by storing the starting virtual address and comparing that (along with other fields) to determine if a buffer can be reused.
Sponsored by: Chelsio Communications
|
#
318803 |
|
24-May-2017 |
np |
MFC r313346:
cxgbe/t4_tom: Fix CLIP entry refcounting on the passive side. Every IPv6 connection being handled by the TOE should have a reference on its CLIP entry.
Sponsored by: Chelsio Communications
|
#
313178 |
|
03-Feb-2017 |
jhb |
MFC 312906: Unregister CPL handlers for TOE-related messages when unloading TOM.
Sponsored by: Chelsio Communications
|
#
312116 |
|
14-Jan-2017 |
np |
MFC r311569, r311657, and r311949.
r311569: Fix comment in t4_tom. No functional change.
r311657: cxgbe/t4_tom: Fix tid accounting. An offloaded IPv6 connection uses 2 tids, not 1, in the hardware.
r311949: cxgbe/tom: Add VIMAGE support to the TOE driver.
Active Open: - Save the socket's vnet at the time of the active open (t4_connect) and switch to it when processing the reply (do_act_open_rpl or do_act_establish).
Passive Open: - Save the listening socket's vnet in the driver's listen_ctx and switch to it when processing incoming SYNs for the socket. - Reject SYNs that arrive on an ifnet that's not in the same vnet as the listening socket.
CLIP (Compressed Local IPv6) table: - Add only those IPv6 addresses to the CLIP that are in a vnet associated with one of the card's ifnets.
Misc: - Set vnet from the toepcb when processing TCP state transitions. - The kernel sets the vnet when calling the driver's output routine so t4_push_frames runs in proper vnet context already. One exception is when incoming credits trigger tx within the driver's ithread. Set the vnet explicitly in do_fw4_ack for that case.
Sponsored by: Chelsio Communications
|
#
309555 |
|
05-Dec-2016 |
jhb |
MFC 303688,303750,305166,305167: Centralize and rework page pod handling.
303688: cxgbe/t4_tom: Read the chip's DDP page sizes and save them in a per-adapter data structure. This replaces a global array with hardcoded page sizes.
303750: cxgbe/t4_tom: The page pod arena allocates from pod address space and not index space. The minimum valid allocation out of this arena is the size of a single page pod.
305166: cxgbe/t4_tom: Add general purpose routines to deal with page pod regions and allocations within them. Switch to these routines to manage the TOE DDP region.
305167: cxgbe/t4_tom: Two new routines to allocate and write page pods for a buffer in the kernel's address space.
Sponsored by: Chelsio Communications
|
#
306661 |
|
03-Oct-2016 |
jhb |
MFC 303405: Add support for zero-copy aio_write() on TOE sockets.
AIO write requests for a TOE socket on a Chelsio T4+ adapter can now DMA directly from the user-supplied buffer. This is implemented by wiring the pages backing the user-supplied buffer and queueing special mbufs backed by raw VM pages to the socket buffer. The TOE code recognizes these special mbufs and builds a sglist from the VM page array associated with the mbuf when queueing a work request to the TOE.
Because these mbufs do not have an associated virtual address, m_data is not valid. Thus, the AIO handler does not invoke sosend() directly for these mbufs but instead inlines portions of sosend_generic() and tcp_usr_send().
An aiotx_buffer structure is used to describe the user buffer (e.g. it holds the array of VM pages and a reference to the AIO job). The special mbufs reference this structure via m_ext. Note that a single job might be split across multiple mbufs (e.g. if it is larger than the socket buffer size). The 'ext_arg2' member of each mbuf gives an offset relative to the backing aiotx_buffer. The AIO job associated with an aiotx_buffer structure is completed when the last reference to the structure is released.
Zero-copy aio_write()'s for connections associated with a given adapter can be enabled/disabled at runtime via the 'dev.t[45]nex.N.toe.tx_zcopy' sysctl.
Sponsored by: Chelsio Communications
|
#
302408 |
|
07-Jul-2016 |
gjb |
Copy head@r302406 to stable/11 as part of the 11.0-RELEASE cycle. Prune svn:mergeinfo from the new branch, as nothing has been merged here.
Additional commits post-branch will follow.
Approved by: re (implicit) Sponsored by: The FreeBSD Foundation |
#
302339 |
|
04-Jul-2016 |
np |
cxgbe(4): Changes to the CPL-handler registration mechanism and code related to "shared" CPLs.
a) Combine t4_set_tcb_field and t4_set_tcb_field_rpl into a single function. Allow callers to direct the response to any iq. Tidy up set_ulp_mode_iscsi while there to use names from t4_tcb.h instead of magic constants.
b) Remove all CPL handler tables from struct adapter. This reduces its size by around 2KB. All handlers are now registered at MOD_LOAD instead of attach or some kind of initialization/activation. The registration functions do not need an adapter parameter any more.
c) Add per-iq handlers to deal with CPLs whose destination cannot be determined solely from the opcode. There are 2 such CPLs in use right now: SET_TCB_RPL and L2T_WRITE_RPL. The base driver continues to send filter and L2T_WRITEs over the mgmtq and solicits the reply on fwq. t4_tom (including the DDP code) now uses the port's ctrlq to send L2T_WRITEs and SET_TCB_FIELDs and solicits the reply on an ofld_rxq. fwq and ofld_rxq have different handlers that know what kind of tid to expect in the reply. Update t4_write_l2e and callers to to support any wrq/iq combination.
Approved by: re@ (kib@) Sponsored by: Chelsio Communications
|
#
299210 |
|
06-May-2016 |
jhb |
Use DDP to implement zerocopy TCP receive with aio_read().
Chelsio's TCP offload engine supports direct DMA of received TCP payload into wired user buffers. This feature is known as Direct-Data Placement. However, to scale well the adapter needs to prepare buffers for DDP before data arrives. aio_read() is more amenable to this requirement than read() as applications often call read() only after data is available in the socket buffer.
When DDP is enabled, TOE sockets use the recently added pru_aio_queue protocol hook to claim aio_read(2) requests instead of letting them use the default AIO socket logic. The DDP feature supports scheduling DMA to two buffers at a time so that the second buffer is ready for use after the first buffer is filled. The aio/DDP code optimizes the case of an application ping-ponging between two buffers (similar to the zero-copy bpf(4) code) by keeping the two most recently used AIO buffers wired. If a buffer is reused, the aio/DDP code is able to reuse the vm_page_t array as well as page pod mappings (a kind of MMU mapping the Chelsio NIC uses to describe user buffers). The generation of the vmspace of the calling process is used in conjunction with the user buffer's address and length to determine if a user buffer matches a previously used buffer. If an application queues a buffer for AIO that does not match a previously used buffer then the least recently used buffer is unwired before the new buffer is wired. This ensures that no more than two user buffers per socket are ever wired.
Note that this feature is best suited to applications sending a steady stream of data vs short bursts of traffic.
Discussed with: np Relnotes: yes Sponsored by: Chelsio Communications
|
#
292736 |
|
25-Dec-2015 |
np |
cxgbe(4): Updates to the base NIC driver and t4_tom to support the iSCSI offload driver. These changes come from projects/cxl_iscsi.
|
#
291665 |
|
02-Dec-2015 |
jhb |
Add support for configuring additional virtual interfaces (VIs) on a port.
Each virtual interface has its own MAC address, queues, and statistics. The dedicated netmap interfaces (ncxgbeX / ncxlX) were already implemented as additional VIs on each port. This change allows additional non-netmap interfaces to be configured on each port. Additional virtual interfaces use the naming scheme vcxgbeX or vcxlX.
Additional VIs are enabled by setting the hw.cxgbe.num_vis tunable to a value greater than 1 before loading the cxgbe(4) or cxl(4) driver. NB: The first VI on each port is the "main" interface (cxgbeX or cxlX).
T4/T5 NICs provide a limited number of MAC addresses for each physical port. As a result, a maximum of six VIs can be configured on each port (including the "main" interface and the netmap interface when netmap is enabled).
One user-visible result is that when netmap is enabled, packets received or transmitted via the netmap interface are no longer counted in the stats for the "main" interface, but are not accounted to the netmap interface.
The netmap interfaces now also have a new-bus device and export various information sysctl nodes via dev.n(cxgbe|cxl).X.
The cxgbetool 'clearstats' command clears the stats for all VIs on the specified port along with the port's stats. There is currently no way to clear the stats of an individual VI.
Reviewed by: np MFC after: 1 month Sponsored by: Chelsio
|
#
280146 |
|
16-Mar-2015 |
jhb |
Move special DDP handling for closing a connection into a new handle_ddp_close() function in t4_ddp.c as the logic is similar to handle_ddp_data(). This allows all knowledge of the special DDP mbufs to be private to t4_ddp.c as well.
|
#
276775 |
|
07-Jan-2015 |
np |
cxgbe/tom: allocate page pod addresses instead of ppod#.
MFC after: 2 weeks
|
#
276729 |
|
05-Jan-2015 |
np |
cxgbe/tom: use vmem(9) as the DDP page pod allocator.
MFC after: 1 month
|
#
275733 |
|
12-Dec-2014 |
np |
Move KTR_CXGBE from t4_tom.h to adapter.h so that the base if_cxgbe code can use it too.
MFC after: 1 week
|
#
272719 |
|
07-Oct-2014 |
np |
cxgbe/tom: don't leak resources tied to an active open request that cannot be sent to the chip because a prerequisite L2 resolution failed.
Submitted by: Hariprasad at chelsio dot com (original version) MFC after: 2 weeks.
|
#
269076 |
|
24-Jul-2014 |
np |
Some hooks in cxgbe(4) for the offloaded iSCSI driver.
(I'm committing this on behalf of my colleagues in the Storage team at Chelsio).
Submitted by: Sreenivasa Honnur <shonnur at chelsio dot com> Sponsored by: Chelsio Communications.
|
#
259527 |
|
17-Dec-2013 |
np |
Do not create a hardware IPv6 server if the listen address is not in6addr_any and is not in the CLIP table either. This fixes a reported TOE+IPv6 NULL-dereference panic in do_pass_open_rpl().
While here, stop creating hardware servers for any loopback address. It's just a waste of server tids.
MFC after: 1 week
|
#
255411 |
|
09-Sep-2013 |
np |
Rework the tx credit mechanism between the cxgbe/tom driver and the card. This helps smooth out some burstiness in the exchange.
Approved by: re (glebius)
|
#
252705 |
|
04-Jul-2013 |
np |
- Read all TP parameters in one place. - Read the filter mode, calculate various shifts, and use them properly during active open (in select_ntuple).
MFC after: 1 day
|
#
251638 |
|
11-Jun-2013 |
np |
cxgbe/tom: Allow caller to select the queue (control or data) used to send the CPL_SET_TCB_FIELD request in t4_set_tcb_field().
MFC after: 1 week
|
#
250218 |
|
03-May-2013 |
np |
cxgbe/tom: Do not use M_PROTO1 to mark rx zero-copy mbufs as special. All the M_PROTOn flags are clobbered when an mbuf is appended to the socket buffer.
MFC after: 1 week
|
#
249627 |
|
18-Apr-2013 |
np |
cxgbe/tom: Update the CLIP table on the chip when there are changes to the list of IPv6 addresses on the system. The table is used for TOE+IPv6 only.
|
#
248925 |
|
30-Mar-2013 |
np |
cxgbe(4): Add support for Chelsio's Terminator 5 (aka T5) ASIC. This includes support for the NIC and TOE features of the 40G, 10G, and 1G/100M cards based on the T5.
The ASIC is mostly backward compatible with the Terminator 4 so cxgbe(4) has been updated instead of writing a brand new driver. T5 cards will show up as cxl (short for cxlgb) ports attached to the t5nex bus driver.
Sponsored by: Chelsio
|
#
245935 |
|
26-Jan-2013 |
np |
Add a couple of missing error codes. Treat CPL_ERR_KEEPALV_NEG_ADVICE as negative advice and not a fatal error.
MFC after: 3 days
|
#
245448 |
|
15-Jan-2013 |
np |
cxgbe/tom: Basic CLIP table management.
This is the Compressed Local IPv6 table on the chip. To save space, the chip uses an index into this table instead of a full IPv6 address in some of its hardware data structures.
For now the driver fills this table with all the local IPv6 addresses that it sees at the time the table is initialized. I'll improve this later so that the table is updated whenever new IPv6 addresses are configured or existing ones deleted.
MFC after: 1 week
|
#
245441 |
|
14-Jan-2013 |
np |
cxgbe/tom: Miscellaneous updates for TOE+IPv6 support (more to follow).
- Teach find_best_mtu_idx() to deal with IPv6 endpoints.
- Install correct protosw in offloaded TCP/IPv6 sockets when DDP is enabled.
- Move set_tcp_ddp_ulp_mode to t4_tom.c so that t4_tom.h can be included without having to drag in t4_msg.h too. This was bothering the iWARP driver for some reason.
MFC after: 1 week
|
#
245276 |
|
10-Jan-2013 |
np |
Overhaul the stid allocator so that it can be used for IPv6 servers too. The entry for an IPv6 server in the TCAM takes up the equivalent of two ordinary stids and must be properly aligned too.
MFC after: 1 week
|
#
243681 |
|
29-Nov-2012 |
np |
cxgbe/tom: Handle the case where the chip falls out of DDP mode by itself. The hole in the receive sequence space corresponds to the number of bytes placed directly up to that point.
MFC after: 1 week
|
#
243680 |
|
29-Nov-2012 |
np |
cxgbe/tom: Add a flag to indicate that the L2 table entry for an embryonic connection has been setup and never attempt to abort a tid before this is done. This fixes a bad race where a listening socket is closed when the driver is in the middle of step (b) here. The symptom of this were "ARP miss" errors from the driver followed by tid leaks.
A hardware-offloaded passive open works this way:
a) A SYN "hits" the TCAM entry for a server tid and the chip delivers it to the queue associated with the server tid (say, queue A). It waits for a response from the driver telling it what to do.
b) The driver decides it is ok to proceed. It adds the new tid to the list of embryonic connections associated with the server tid and then hands off the SYN to the kernel's syncache to make sure that the kernel okays it too. If it does then the driver provides an L2 table entry, queue id (say, queue B), etc. and instructs the chip to send the SYN/ACK response.
c) The chip delivers a status to queue B depending on how the third step of the 3-way handshake goes. The driver removes the tid from its list of embryonic connections and either expands the syncache entry or destroys the tid. In any case all subsequent messages for the new tid will be delivered to queue B, not queue A. Anything running in queue B knows that the L2 entry has long been setup and the new flag is of no interest from here on. If the listener is closed it will deal with so_comp as normal.
MFC after: 1 week
|
#
241733 |
|
19-Oct-2012 |
ed |
Prefer __containerof() over __member2struct().
The former works better with qualifiers, but also properly type checks the input pointer.
|
#
239544 |
|
21-Aug-2012 |
np |
Deal with the case where a syncache entry added by the TOE driver is evicted from the syncache but a later syncache_expand succeeds because of syncookies. The TOE driver has to resort to more direct means to install its hooks in the socket in this case.
|
#
239514 |
|
21-Aug-2012 |
np |
Minor cleanup: use bitwise ops instead of pointless wrappers around setbit/clrbit.
|
#
239344 |
|
16-Aug-2012 |
np |
Support for TCP DDP (Direct Data Placement) in the T4 TOE module.
Basically, this is automatic rx zero copy when feasible. TCP payload is DMA'd directly into the userspace buffer described by the uio submitted in soreceive by an application.
- Works with sockets that are being handled by the TCP offload engine of a T4 chip (you need t4_tom.ko module loaded after cxgbe, and an "ifconfig +toe" on the cxgbe interface). - Does not require any modification to the application. - Not enabled by default. Use hw.t4nex.<X>.toe.ddp="1" to enable it.
|
#
239338 |
|
16-Aug-2012 |
np |
Add a routine (t4_set_tcb_field) to update arbitrary parts of a hardware TCB. Filters are programmed by modifying the TCB too (via a different routine) and the reply to any TCB update is delivered via a CPL_SET_TCB_RPL. Figure out whether the reply is for a filter-write or something else and route it appropriately.
MFC after: 2 weeks
|
#
237263 |
|
19-Jun-2012 |
np |
- Updated TOE support in the kernel.
- Stateful TCP offload drivers for Terminator 3 and 4 (T3 and T4) ASICs. These are available as t3_tom and t4_tom modules that augment cxgb(4) and cxgbe(4) respectively. The cxgb/cxgbe drivers continue to work as usual with or without these extra features.
- iWARP driver for Terminator 3 ASIC (kernel verbs). T4 iWARP in the works and will follow soon.
Build-tested with make universe.
30s overview ============ What interfaces support TCP offload? Look for TOE4 and/or TOE6 in the capabilities of an interface: # ifconfig -m | grep TOE
Enable/disable TCP offload on an interface (just like any other ifnet capability): # ifconfig cxgbe0 toe # ifconfig cxgbe0 -toe
Which connections are offloaded? Look for toe4 and/or toe6 in the output of netstat and sockstat: # netstat -np tcp | grep toe # sockstat -46c | grep toe
Reviewed by: bz, gnn Sponsored by: Chelsio communications. MFC after: ~3 months (after 9.1, and after ensuring MFC is feasible)
|