History log of /freebsd-current/sys/dev/mlx5/mlx5_en/mlx5_en_hw_tls.c
Revision Date Author Comments
# 95ee2897 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: two-line .h pattern

Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/


# 5dc00f00 19-Sep-2022 Justin Hibbits <jhibbits@FreeBSD.org>

Mechanically convert mlx5en(4) to IfAPI

Reviewed by: zlei
Sponsored by: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D38595


# 015f22f5 16-Feb-2022 Hans Petter Selasky <hselasky@FreeBSD.org>

mlx5en: Fix TLS worker thread race.

Create a dedicated free state, in case the taskqueue worker is still pending,
to avoid re-activation of a freed send tag.

MFC after: 1 week
Sponsored by: NVIDIA Networking


# ebdb7006 16-Feb-2022 Hans Petter Selasky <hselasky@FreeBSD.org>

mlx5en: Improve RX- and TX- TLS refcounting.

Use the send tag refcounting mechanism to refcount the RX- and TX- TLS
send tags. Then it is no longer needed to wait for refcounts to reach
zero when destroying RX- and TX- TLS send tags as a result of pending
data or WQE commands.

This also ensures that when TX-TLS and rate limiting is used at the same
time, the underlying SQ is not prematurely destroyed.

MFC after: 1 week
Sponsored by: NVIDIA Networking


# 235ed6a4 15-Feb-2022 Mark Johnston <markj@FreeBSD.org>

mlx5e: Make TLS tag zones unmanaged

These zones are cache zones used to allocate TLS offload contexts from
firmware. Releasing items from the cache is a sleepable operation due
to the need to await a response from the firmware command freeing the
tag, so items cannot be reclaimed from the zone in non-sleepable
contexts. Since the cache size is limited by firmware limits, avoid
this by setting UMA_ZONE_UNMANAGED to avoid reclamation by uma_timeout()
and the low memory handler.

Reviewed by: hselasky, kib
MFC after: 3 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34142


# 75767cb8 01-Feb-2022 Hans Petter Selasky <hselasky@FreeBSD.org>

mlx5en: Share DEK objects with TLS RX.

The TLS RX support also needs to be able to allocate DEK objects.
Share the available objects 1:1.

MFC after: 1 week
Sponsored by: NVIDIA Networking


# 0f7b6e11 15-Nov-2021 Konstantin Belousov <konstantinb@nvidia.com>

mlx5en: Use a UMA cache zone for managing TLS send tags

Instead of allocating directly from a normal zone. This way
import and release are guaranteed to process all allocated and then
deallocated items. Also, the release occurs in a sleepable context when
caller of uma_zfree() or uma_zdestroy() can sleep itself.

MFC after: 1 week
Sponsored by: NVIDIA Networking


# 89918a23 14-Jun-2021 Konstantin Belousov <konstantinb@nvidia.com>

mlx5en: idiomatic use of preprocessor, in particular paths

MFC after: 1 week
Sponsored by: NVIDIA Networking


# b984b956 14-Jun-2021 Konstantin Belousov <konstantinb@nvidia.com>

mlx5en: normalize use of the opt_*.h files

MFC after: 1 week
Sponsored by: NVIDIA Networking


# c782ea8b 14-Sep-2021 John Baldwin <jhb@FreeBSD.org>

Add a switch structure for send tags.

Move the type and function pointers for operations on existing send
tags (modify, query, next, free) out of 'struct ifnet' and into a new
'struct if_snd_tag_sw'. A pointer to this structure is added to the
generic part of send tags and is initialized by m_snd_tag_init()
(which now accepts a switch structure as a new argument in place of
the type).

Previously, device driver ifnet methods switched on the type to call
type-specific functions. Now, those type-specific functions are saved
in the switch structure and invoked directly. In addition, this more
gracefully permits multiple implementations of the same tag within a
driver. In particular, NIC TLS for future Chelsio adapters will use a
different implementation than the existing NIC TLS support for T6
adapters.

Reviewed by: gallatin, hselasky, kib (older version)
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D31572


# 3a934ba7 16-Jun-2021 Hans Petter Selasky <hselasky@FreeBSD.org>

mlx5en: Wait for all TLS connections to terminate when unloading driver.

The driver expects all TLS tags to be returned to the driver before
it can free the UMA zone where the TLS tags reside.

MFC after: 1 week
Reviewed by: kib
Sponsored by: Mellanox Technologies // NVIDIA Networking


# caf43971 24-Nov-2020 Hans Petter Selasky <hselasky@FreeBSD.org>

Remove erradic assert after SVN r367149 in mlx5en(4).

The ratelimit tags may be shared, especially for unlimited TLS
traffic, and then the refcount is allowed to be greater than one
when freeing the send tag.

MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking


# b7d92a66 30-Oct-2020 John Baldwin <jhb@FreeBSD.org>

Remove IF_SND_TAG_TYPE_TLS_RATE_LIMIT conditionals.

Support for TLS rate limit tags is now in the tree, so this macro is
always defined.

Reviewed by: hselasky
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D27020


# 418b5444 29-Oct-2020 John Baldwin <jhb@FreeBSD.org>

Fix a couple of silly bugs in r367149.

- Assign the TLS rate limit value to the correct member of the
rl_params for the nested rate limit tag.

- Remove a dead condition.

Pointy hat to: jhb


# 36e0a362 29-Oct-2020 John Baldwin <jhb@FreeBSD.org>

Add m_snd_tag_alloc() as a wrapper around if_snd_tag_alloc().

This gives a more uniform API for send tag life cycle management.

Reviewed by: gallatin, hselasky
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D27000


# 638000c0 29-Oct-2020 John Baldwin <jhb@FreeBSD.org>

Use public interfaces to manage the nested rate limit send tag.

Each TLS send tag in mlx5 contains a nested rate limit send tag.
Previously, the driver was calling internal functions to manage the
nested tag. Calling free methods directly instead of m_snd_tag_rele()
leaked send tag references and references on the ifp. Changes to use
the ifp methods for the nested tag for other methods are more cosmetic
but do simplify the code.

Reviewed by: gallatin, hselasky
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D26996


# 56fb710f 06-Oct-2020 John Baldwin <jhb@FreeBSD.org>

Store the send tag type in the common send tag header.

Both cxgbe(4) and mlx5(4) wrapped the existing send tag header with
their own identical headers that stored the type that the
type-specific tag structures inherited from, so in practice it seems
drivers need this in the tag anyway. This permits removing these
extra header indirections (struct cxgbe_snd_tag and struct
mlx5e_snd_tag).

In addition, this permits driver-independent code to query the type of
a tag, e.g. to know what type of tag is being queried via
if_snd_query.

Reviewed by: gallatin, hselasky, np, kib
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D26689


# 78ae1e6e 31-Aug-2020 Hans Petter Selasky <hselasky@FreeBSD.org>

Make hardware TLS send tag allocation synchronous in mlx5en(4).

Previously the send tag was setup in the background, and all packets for
the given send tag were dropped until ready. Change this to be blocking
behaviour so that once the setsocketopt() for enabling TLS completes,
the socket is ready to send packets. Do this by simply flushing the
work request which does the needed firmware programming during send
tag allocation.

MFC after: 1 week
Sponsored by: Mellanox Technologies // Nvidia


# 11304ef5 17-Jun-2020 Hans Petter Selasky <hselasky@FreeBSD.org>

Fix HW TLS offload regression issue after r359919, in mlx5en(4).

Changes in the mbuf layout regarding HW TLS, resulted in wrong detection
of starting mbuf. Use a boolean variable to handle this and pass m_adj()
the top mbuf, so that the packet header is adjusted correctly.

MFC after: 1 week
Sponsored by: Mellanox Technologies


# 9eb1e4aa 11-Jun-2020 Hans Petter Selasky <hselasky@FreeBSD.org>

Use const keyword when parsing the TCP/IP header in the fast path in mlx5en(4).

When parsing the TCP/IP header in the fast path, make it clear by using
the const keyword, no fields are to be modified inside the transmitted
packet.

No functional change.

MFC after: 1 week
Sponsored by: Mellanox Technologies


# ce69b842 24-May-2020 Hans Petter Selasky <hselasky@FreeBSD.org>

Improve set progress parameters, SET PSV for HW TLS in mlx5en(4).

There is no need for a fence and there is no need to provide
the TCP sequence number.

Sponsored by: Mellanox Technologies


# 233a6665 24-May-2020 Hans Petter Selasky <hselasky@FreeBSD.org>

Correctly set the initial vector for TLS v1.3 for mlx5en(4).

For TLS v1.3 the 12 bytes of the initial vector, IV, should just be copied
as-is from the kernel to the gcm_iv field, which hold the first 4 bytes,
and the remaining 8 bytes go to the subsequent implicit_iv field.
There is no need to consider the byte order on the 12 bytes of IV like
initially done.

Sponsored by: Mellanox Technologies


# 9550e340 24-May-2020 Hans Petter Selasky <hselasky@FreeBSD.org>

Update the TLS capability bit after recent PRM changes in mlx5en(4).

A CX6-DX firmware version equal to or newer than 12.27.0372 is
now required.

Sponsored by: Mellanox Technologies


# 6edfd179 02-May-2020 Gleb Smirnoff <glebius@FreeBSD.org>

Step 4.1: mechanically rename M_NOMAP to M_EXTPG

Reviewed by: gallatin
Differential Revision: https://reviews.freebsd.org/D24598


# 7b6c99d0 02-May-2020 Gleb Smirnoff <glebius@FreeBSD.org>

Step 3: anonymize struct mbuf_ext_pgs and move all its fields into mbuf
within m_epg namespace.
All edits except the 'struct mbuf' declaration and mb_dupcl() were done
mechanically with sed:

s/->m_ext_pgs.nrdy/->m_epg_nrdy/g
s/->m_ext_pgs.hdr_len/->m_epg_hdrlen/g
s/->m_ext_pgs.trail_len/->m_epg_trllen/g
s/->m_ext_pgs.first_pg_off/->m_epg_1st_off/g
s/->m_ext_pgs.last_pg_len/->m_epg_last_len/g
s/->m_ext_pgs.flags/->m_epg_flags/g
s/->m_ext_pgs.record_type/->m_epg_record_type/g
s/->m_ext_pgs.enc_cnt/->m_epg_enc_cnt/g
s/->m_ext_pgs.tls/->m_epg_tls/g
s/->m_ext_pgs.so/->m_epg_so/g
s/->m_ext_pgs.seqno/->m_epg_seqno/g
s/->m_ext_pgs.stailq/->m_epg_stailq/g

Reviewed by: gallatin
Differential Revision: https://reviews.freebsd.org/D24598


# 6fbcdeb6 02-May-2020 Gleb Smirnoff <glebius@FreeBSD.org>

Step 2.4: Stop using 'struct mbuf_ext_pgs' in drivers.

Reviewed by: gallatin, hselasky
Differential Revision: https://reviews.freebsd.org/D24598


# 23feb563 14-Apr-2020 Andrew Gallatin <gallatin@FreeBSD.org>

KTLS: Re-work unmapped mbufs to carry ext_pgs in the mbuf itself.

While the original implementation of unmapped mbufs was a large
step forward in terms of reducing cache misses by enabling mbufs
to carry more than a single page for sendfile, they are rather
cache unfriendly when accessing the ext_pgs metadata and
data. This is because the ext_pgs part of the mbuf is allocated
separately, and almost guaranteed to be cold in cache.

This change takes advantage of the fact that unmapped mbufs
are never used at the same time as pkthdr mbufs. Given this
fact, we can overlap the ext_pgs metadata with the mbuf
pkthdr, and carry the ext_pgs meta directly in the mbuf itself.
Similarly, we can carry the ext_pgs data (TLS hdr/trailer/array
of pages) directly after the existing m_ext.

In order to be able to carry 5 pages (which is the minimum
required for a 16K TLS record which is not perfectly aligned) on
LP64, I've had to steal ext_arg2. The only user of this in the
xmit path is sendfile, and I've adjusted it to use arg1 when
using unmapped mbufs.

This change is almost entirely mechanical, except that we
change mb_alloc_ext_pgs() to no longer allow allocating
pkthdrs, the change to avoid ext_arg2 as mentioned above,
and the removal of the ext_pgs zone,

This change saves roughly 2% "raw" CPU (~59% -> 57%), or over
3% "scaled" CPU on a Netflix 100% software kTLS workload at
90+ Gb/s on Broadwell Xeons.

In a follow-on commit, I plan to remove some hacks to avoid
access ext_pgs fields of mbufs, since they will now be in
cache.

Many thanks to glebius for helping to make this better in
the Netflix tree.

Reviewed by: hselasky, jhb, rrs, glebius (early version)
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D24213


# 7029da5c 26-Feb-2020 Pawel Biernacki <kaktus@FreeBSD.org>

Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)

r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE. All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT

Approved by: kib (mentor, blanket)
Commented by: kib, gallatin, melifaro
Differential Revision: https://reviews.freebsd.org/D23718


# ff8c6681 13-Feb-2020 John Baldwin <jhb@FreeBSD.org>

Don't check the auth algorithm for GCM.

The upstream OpenSSL changes only set the cipher for GCM since the
authentication is redundant, and changes to OCF will soon remove the
GCM authentication algorithm constants entirely for the same reason.
In addition, ktls_create_session() already validates these fields and
wouldn't pass down an invalid auth_algorithm value to any drivers or
ktls backends.

Reviewed by: hselasky
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D23671


# 7272f9cd 06-Dec-2019 Hans Petter Selasky <hselasky@FreeBSD.org>

Implement hardware TLS via send tags for mlx5en(4), which is supported by
ConnectX-6 DX.

Currently TLS v1.2 and v1.3 with AES 128/256 crypto over TCP/IP (v4
and v6) is supported.

A per PCI device UMA zone is used to manage the memory of the send
tags. To optimize performance some crypto contexts may be cached by
the UMA zone, until the UMA zone finishes the memory of the given send
tag.

An asynchronous task is used manage setup of the send tags towards the
firmware. Most importantly setting the AES 128/256 bit pre-shared keys
for the crypto context.

Updating the state of the AES crypto engine and encrypting data, is
all done in the fast path. Each send tag tracks the TCP sequence
number in order to detect non-contiguous blocks of data, which may
require a dump of prior unencrypted data, to restore the crypto state
prior to wire transmission.

Statistics counters have been added to count the amount of TLS data
transmitted in total, and the amount of TLS data which has been dumped
prior to transmission. When non-contiguous TCP sequence numbers are
detected, the software needs to dump the beginning of the current TLS
record up until the point of retransmission. All TLS counters utilize
the counter(9) API.

In order to enable hardware TLS offload the following sysctls must be set:
kern.ipc.mb_use_ext_pgs=1
kern.ipc.tls.ifnet.permitted=1
kern.ipc.tls.enable=1

Sponsored by: Mellanox Technologies