History log of /freebsd-current/sys/dev/iscsi/icl_soft.c
Revision Date Author Comments
# fdafd315 24-Nov-2023 Warner Losh <imp@FreeBSD.org>

sys: Automated cleanup of cdefs and other formatting

Apply the following automated changes to try to eliminate
no-longer-needed sys/cdefs.h includes as well as now-empty
blank lines in a row.

Remove /^#if.*\n#endif.*\n#include\s+<sys/cdefs.h>.*\n/
Remove /\n+#include\s+<sys/cdefs.h>.*\n+#if.*\n#endif.*\n+/
Remove /\n+#if.*\n#endif.*\n+/
Remove /^#if.*\n#endif.*\n/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/types.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/param.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/capsicum.h>/

Sponsored by: Netflix


# 685dc743 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: one-line .c pattern

Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/


# 4d846d26 10-May-2023 Warner Losh <imp@FreeBSD.org>

spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD

The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.

Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix


# 02a226ac 27-Jul-2022 Dimitry Andric <dim@FreeBSD.org>

Suppress possible unused variable warning for icl_soft.c

With clang 15, the following -Werror warning is produced on i386:

sys/dev/iscsi//icl_soft.c:1277:6: error: variable 'i' set but not used [-Werror,-Wunused-but-set-variable]
int i;
^

The 'i' variable is used later in the icl_soft_conn_pdu_get_bio()
function, via the PHYS_TO_DMAP() macro. However, on i386 and some other
architectures, this macro is defined to panic immediately, so in those
cases, 'i' is indeed not used. Suppress the warning by marking 'i' as
unused.

MFC after: 3 days
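
The situation described above can be reproduced in a small userspace sketch. The macro MAP_INDEX and the HAVE_DIRECT_MAP switch are hypothetical stand-ins for PHYS_TO_DMAP() and its per-architecture definitions; they are assumptions for illustration, not the kernel's code.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for a macro such as PHYS_TO_DMAP(): without
 * HAVE_DIRECT_MAP it discards its argument, so a variable that is only
 * passed to it is set but never read.
 */
#ifdef HAVE_DIRECT_MAP
#define	MAP_INDEX(i)	((unsigned long)(i) * 4096UL)
#else
#define	MAP_INDEX(i)	(0UL)	/* argument is discarded */
#endif

unsigned long
sum_mapped(int n)
{
	/* The FreeBSD kernel spells this attribute __unused. */
	int i __attribute__((unused));
	unsigned long total = 0;

	assert(n >= 0);
	for (int k = 0; k < n; k++) {
		i = k;			/* set here ... */
		total += MAP_INDEX(i);	/* ... read only if the macro uses it */
	}
	return (total);
}
```

Marking the variable keeps one declaration for every architecture instead of wrapping it in per-platform #ifdef blocks.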


# f4f84701 24-Jul-2022 Dimitry Andric <dim@FreeBSD.org>

Fix unused variable warning in icl_soft.c

With clang 15, the following -Werror warning is produced:

sys/dev/iscsi//icl_soft.c:886:6: error: variable 'coalesced' set but not used [-Werror,-Wunused-but-set-variable]
int coalesced, error;
^

The 'coalesced' variable is eventually used only in an #if 0'd block,
obviously meant for debugging. Ensure that 'coalesced' is only declared
and used when DEBUG_COALESCED is defined, so the debugging can be easily
turned on later, if desired.

MFC after: 3 days
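
A minimal sketch of the pattern described above, where a debugging counter is declared and updated only when a macro is defined. The function send_all() and its arguments are hypothetical; only the DEBUG_COALESCED name comes from the commit message.

```c
#include <assert.h>
#include <stdio.h>

/* Build with -DDEBUG_COALESCED to enable the counter and the report. */
int
send_all(const int *lens, int n)
{
#ifdef DEBUG_COALESCED
	int coalesced = 0;	/* exists only in debug builds */
#endif
	int total = 0;

	assert(n >= 0);
	for (int k = 0; k < n; k++) {
		total += lens[k];
#ifdef DEBUG_COALESCED
		coalesced++;
#endif
	}
#ifdef DEBUG_COALESCED
	fprintf(stderr, "coalesced %d PDUs\n", coalesced);
#endif
	return (total);
}
```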


# 7b02c1e8 18-Apr-2022 John Baldwin <jhb@FreeBSD.org>

iscsi: Fetch limits based on a socket rather than assuming global limits.

cxgbei needs the ability to return different limits based on the
connection (e.g. if the connection is over a T5 adapter or a T6
adapter as well as factoring in the MTU).

This change plumbs through the changes in the ioctls without changing
any of the backends. The limits callback passed to icl_register now
accepts a second socket argument which holds the integer file
descriptor. To support ABI compatibility for old binaries, the
callback should return "global" values if the socket fd is zero.

The CTL_ISCSI_LIMITS argument used with CTL_ISCSI by ctld(8) now
accepts the socket fd in a field that was previously part of a
reserved spare field. Old binaries zero this request which results in
passing a socket fd of 0 to the limits callback.

The ISCSIDREQUEST ioctl no longer returns limits. Instead, iscsid(8)
invokes a new ISCSIDLIMITS ioctl after establishing the connection via
connect(2). For ABI compat, if the old ISCSIDREQUEST is invoked, the
global limits are still fetched (with a socket fd of 0) and returned.

Reviewed by: mav
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D34928
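
The fallback rule described above (a socket fd of zero yields the "global" values) might look roughly like the following. The struct limits type, example_conn_cap(), and the numeric values are all hypothetical; this is not the real struct icl_drv_limits or any backend's actual callback.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified limits record. */
struct limits {
	size_t max_recv;
	size_t max_send;
};

/* Hypothetical per-connection cap; a real backend would inspect the socket. */
static size_t
example_conn_cap(int fd)
{
	return ((fd % 2) != 0 ? 8192 : 16384);
}

/* Limits callback sketch: fd == 0 means "old binary, return global values". */
int
example_limits(struct limits *l, int fd)
{
	assert(l != NULL);
	l->max_recv = 256 * 1024;	/* assumed global defaults */
	l->max_send = 256 * 1024;
	if (fd == 0)
		return (0);
	l->max_recv = example_conn_cap(fd);
	l->max_send = example_conn_cap(fd);
	return (0);
}
```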


# 832acea9 10-Mar-2022 John Baldwin <jhb@FreeBSD.org>

icl_soft: Use PHYS_TO_DMAP instead of pmap_map_io_transient.

The latter API is not actually MI but is only supported on amd64,
arm64, and RISC-V.

Sponsored by: Chelsio Communications


# 530e725d 10-Mar-2022 John Baldwin <jhb@FreeBSD.org>

iscsi: Support unmapped I/O requests in the default initiator.

- Add icl_pdu_append_bio and icl_pdu_get_bio methods.

- When ICL_NOCOPY is used to append data from an unmapped I/O request
to a PDU, construct unmapped mbufs from the relevant pages backing
the struct bio.

- Use m_apply with a helper to compute crc32 digests on mbuf chains
to handle unmapped mbufs. Since m_apply requires PMAP_HAS_DMAP
for unmapped mbufs, only support unmapped requests when PMAP_HAS_DMAP
is true.

Reviewed by: mav
Differential Revision: https://reviews.freebsd.org/D34406
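
As a userspace illustration of computing a digest with a per-segment callback, in the style of m_apply(), the sketch below walks a chain of buffers and feeds each piece to a CRC32C update step (iSCSI digests use CRC32C). The struct seg type and the seg_apply(), crc_helper(), and chain_digest() names are hypothetical; the kernel uses its own mbuf and CRC routines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for an mbuf chain. */
struct seg {
	const void *data;
	size_t len;
	struct seg *next;
};

/* Bitwise CRC32C update (reflected polynomial 0x82F63B78). */
static uint32_t
crc32c_update(uint32_t crc, const void *buf, size_t len)
{
	const uint8_t *p = buf;

	while (len-- > 0) {
		crc ^= *p++;
		for (int k = 0; k < 8; k++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return (crc);
}

/* Callback: fold one contiguous piece into the running CRC. */
static int
crc_helper(void *arg, const void *data, size_t len)
{
	uint32_t *crc = arg;

	*crc = crc32c_update(*crc, data, len);
	return (0);
}

/* Invoke f on every segment in order; stop at the first non-zero return. */
static int
seg_apply(struct seg *s, int (*f)(void *, const void *, size_t), void *arg)
{
	int error;

	for (; s != NULL; s = s->next) {
		error = f(arg, s->data, s->len);
		if (error != 0)
			return (error);
	}
	return (0);
}

uint32_t
chain_digest(struct seg *s)
{
	uint32_t crc = 0xFFFFFFFFu;
	int error = seg_apply(s, crc_helper, &crc);

	assert(error == 0);
	(void)error;
	return (~crc);
}
```

Because the callback only ever sees one contiguous piece, the result does not depend on how the payload is split across segments.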


# 8903d8e3 04-Jan-2022 John Baldwin <jhb@FreeBSD.org>

iscsi: Pass the request PDU to icl_conn_transfer_setup().

This matches icl_conn_task_setup() which passes the PDU and avoids the
need for a layering violation in cxgbei to fetch the request PDU from
the ctl_io.

Reviewed by: mav
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D33746


# e900338c 05-Nov-2021 John Baldwin <jhb@FreeBSD.org>

Move the ICL_CONN_*LOCK* macros to <dev/iscsi/icl.h>.

These macros are not backend-specific but reference a
backend-independent field in struct icl_conn.

Reviewed by: mav
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D32858


# 87322a90 05-Aug-2021 John Baldwin <jhb@FreeBSD.org>

iscsi: Remove icl_soft-only fields from struct icl_conn.

Create a struct icl_soft_conn which extends struct icl_conn and
move fields only used by icl_soft from struct icl_conn to
struct icl_soft_conn.

Reviewed by: mav
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D31414
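
The "struct that extends another struct" arrangement described above is commonly done in C by embedding the generic structure as the first member. The sketch below uses hypothetical, trimmed-down struct conn and struct soft_conn types with made-up fields; it is not the actual layout of struct icl_conn or struct icl_soft_conn.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical generic connection, shared by all backends. */
struct conn {
	int max_recv;
};

/* Hypothetical backend-specific extension. */
struct soft_conn {
	struct conn base;	/* must be first so both pointers alias */
	int send_running;	/* state only this backend needs */
};

/* Recover the backend view from a generic pointer. */
static struct soft_conn *
to_soft(struct conn *c)
{
	return ((struct soft_conn *)c);
}

int
soft_conn_start(struct conn *c)
{
	struct soft_conn *sc;

	assert(c != NULL);
	sc = to_soft(c);
	sc->send_running = 1;
	return (sc->send_running);
}
```

Generic code keeps passing struct conn pointers around, while the backend casts back to reach its private fields.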


# a1002174 14-Jun-2021 Mark Johnston <markj@FreeBSD.org>

Consistently use the SOCKBUF_MTX() and SOCK_MTX() macros

This makes it easier to change the socket locking protocols. No
functional change intended.

MFC after: 1 week
Sponsored by: The FreeBSD Foundation


# 0cc7d64a 20-May-2021 John Baldwin <jhb@FreeBSD.org>

iscsi: Move the maximum data segment limits into 'struct icl_conn'.

This fixes a few bugs in iSCSI backends where the backends were using
the limits they advertised initially during the login phase as the
final values instead of the values negotiated with the other end.

Reported by: Jithesh Arakkan @ Chelsio
Reviewed by: mav
Differential Revision: https://reviews.freebsd.org/D30271


# afc3e54e 03-Mar-2021 Alexander Motin <mav@FreeBSD.org>

Move ic_check_send_space clear to the actual check.

This closes a tiny race in which the flag could be set between being
cleared and the space being checked, which would create some extra
work for us. Setting the flag is protected by both locks, so we can
clear it in either place, but both locks are dropped in between.

MFC after: 1 week


# aff9b9ee 03-Mar-2021 Alexander Motin <mav@FreeBSD.org>

Restore condition removed in df3747c6607b.

I think it avoided some TX thread wakeups while the socket buffer is
full. But add another option there: if ic_check_send_space is set,
the socket has just reported that new space appeared, so it may make
sense to pull more data from ic_to_send for better TX coalescing.

MFC after: 1 week


# df3747c6 02-Mar-2021 Alexander Motin <mav@FreeBSD.org>

Replace STAILQ_SWAP() with simpler STAILQ_CONCAT().

Also remove a stray STAILQ_REMOVE_AFTER(), which caused no problems only
because STAILQ_SWAP() fixed the corrupted stqh_last.

MFC after: 1 week
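
For readers unfamiliar with the queue(3) macros, the sketch below shows STAILQ_CONCAT() moving every element from one tail queue onto the end of another. The struct pdu type and drain_queue() are hypothetical names for illustration; the example assumes a <sys/queue.h> that provides STAILQ_CONCAT(), as the BSDs and glibc do.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical queue element. */
struct pdu {
	int id;
	STAILQ_ENTRY(pdu) link;
};
STAILQ_HEAD(pdu_queue, pdu);

/* Move everything from 'from' to the end of 'to'; return the new length. */
size_t
drain_queue(struct pdu_queue *to, struct pdu_queue *from)
{
	struct pdu *p;
	size_t n = 0;

	assert(to != NULL && from != NULL);
	STAILQ_CONCAT(to, from);	/* 'from' is left empty */
	STAILQ_FOREACH(p, to, link)
		n++;
	return (n);
}
```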


# 06e9c710 02-Mar-2021 Alexander Motin <mav@FreeBSD.org>

Fix initiator panic after 6895f89fe54e.

There are sessions without a socket that are not disconnecting yet.

MFC after: 3 weeks


# b85a67f5 01-Mar-2021 Alexander Motin <mav@FreeBSD.org>

Optimize TX coalescing by keeping pointer to last mbuf.

Previously, m_cat() traversed the whole coalesced chain each time.

MFC after: 1 week
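
The idea of remembering the last element so that appends do not rescan the chain can be sketched with a plain linked list. The struct node and struct chain types are hypothetical stand-ins for mbufs; this is an illustration of the technique, not the driver's code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an mbuf. */
struct node {
	int val;
	struct node *next;
};

/* Chain with a cached tail pointer. */
struct chain {
	struct node *head;
	struct node *tail;
};

/* Append the (possibly multi-node) chain n without walking c itself. */
void
chain_append(struct chain *c, struct node *n)
{
	assert(c != NULL && n != NULL);
	if (c->tail == NULL)
		c->head = n;
	else
		c->tail->next = n;
	/* Only the newly appended piece is walked to find the new tail. */
	while (n->next != NULL)
		n = n->next;
	c->tail = n;
}
```

Each append costs time proportional to the appended piece rather than to everything coalesced so far.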


# 6895f89f 21-Feb-2021 Alexander Motin <mav@FreeBSD.org>

Coalesce socket reads in software iSCSI.

Instead of 2-4 socket reads per PDU this can do as few as one read
per megabyte, dramatically reducing TCP overhead and lock contention.

With this on an iSCSI target I can write more than 4GB/s through a
single connection.

MFC after: 1 month


# b75168ed 28-Jan-2021 Alexander Motin <mav@FreeBSD.org>

Make software iSCSI more configurable.

Move software iSCSI tunables/sysctls into kern.icl.soft subtree.
Replace several hardcoded length constants there with variables.

While there, stretch the limits to better match Linux' open-iscsi
and our own initiator with new MAXPHYS of 1MB. Our CTL target is
also optimized for up to 1MB I/Os, so there is also a match now.
For Windows 10 and VMware 6.7 initiators at default settings it
should make no change, since previous limits were sufficient there.

Tests of QD1 1MB writes from FreeBSD over a 10GigE link show a throughput
increase of 29% on an idle connection and 132% with concurrent QD8 reads.

MFC after: 3 days
Sponsored by: iXsystems, Inc.


# ff751ee0 20-Jan-2021 Alexander Motin <mav@FreeBSD.org>

Remove FirstBurstLength limit for software iSCSI.

For hardware offload, solicited data may potentially be handled more
efficiently than unsolicited data due to direct data placement, or
there can be some unsolicited write buffering limitations. This may
create situations where the FirstBurstLength limit is really useful.

The software driver, though, has none of those factors: it has to do
a memcopy() anyway and has no such hard limit on temporary storage.
At the same time, more active use of unsolicited transfers avoids
some of the Ready To Transfer (R2T) PDU round-trip times and processing.

This change effectively doubles, from 64KB to 128KB, the maximum size
of a write command that can be transferred within one link RTT. Tests
of (64KB, 128KB] QD1 writes mixed with simultaneous QD8 reads over
the same connection, which increase RTT, show almost double the write
speed and half the latency, while we should be able to afford a few
megabytes of RAM for additional buffering on a target these days.

MFC after: 2 weeks
Sponsored by: iXsystems, Inc.


# 6b3a9a0f 11-Jan-2021 Mateusz Guzik <mjg@FreeBSD.org>

Convert remaining cap_rights_init users to cap_rights_init_one

semantic patch:

@@

expression rights, r;

@@

- cap_rights_init(&rights, r)
+ cap_rights_init_one(&rights, r)


# bce7ee9d 28-Oct-2020 Edward Tomasz Napierala <trasz@FreeBSD.org>

Drop "All rights reserved" from all my stuff. This includes
Foundation copyrights, approved by emaste@. It does not include
files which carry other people's copyrights; if you're one
of those people, feel free to make a similar change.

Reviewed by: emaste, imp, gbe (manpages)
Differential Revision: https://reviews.freebsd.org/D26980


# 2140d5b6 01-Sep-2020 Mateusz Guzik <mjg@FreeBSD.org>

iscsi: clean up empty lines in .c and .h files


# 9a4510ac 08-Jun-2020 Alexander Motin <mav@FreeBSD.org>

Implement zero-copy iSCSI target transmission/read.

Add ICL_NOCOPY flag to icl_pdu_append_data(), specifying that the method
can just reference the data buffer instead of immediately copying it.

Extend the offload KPI with an optional PDU queue method that allows
specifying a completion callback, called when all the data referenced as
above has been transferred and won't be accessed any more (the buffers
can be freed).

Implement the above functionality in the software iSCSI driver using mbufs
with external storage and a reference counter. Note that some NICs (ixl(4))
may keep the mbuf in the TX queue for a long time, so CTL has to be ready.

Add an optional method to struct ctl_scsiio for buffer reference counting.
Implement it for the CTL block backend, allowing the free of the struct
ctl_be_block_io and the memory it references to be delayed as needed. In the
first incarnation of the patch I tried to delay the whole I/O as is done for
FibreChannel, which was cleaner, but due to the above callback delays I had
to rewrite it this way to avoid leaving the LUN referenced potentially for
hours or more.

Altogether, on sequential reads from ZFS ARC this saves about 30% of CPU
time and memory bandwidth by avoiding one of 3 memory copies (the other
two are from ZFS ARC to DMU cache and then from DMU cache to CTL buffers).
On tests with 2x Xeon Silver 4114 this makes it possible to reach the full
line rate of a 100GigE NIC. Tests with Gold CPUs and two 100GigE NICs are
still TBD, but expectations of saturating them are pretty high. ;)

Discussed with: Chelsio
Sponsored by: iXsystems, Inc.
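
The combination of "reference the buffer instead of copying it" and "call back once nobody references it any more" can be sketched as a reference-counted wrapper. The struct ext_buf type and the ext_init(), ext_ref(), ext_rele(), and count_done() functions are hypothetical; the sketch is single-threaded and is not how mbuf external storage is actually implemented.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical wrapper around caller-owned data that is referenced, not copied. */
struct ext_buf {
	const void *data;
	size_t len;
	int refs;
	void (*done)(void *);	/* completion callback */
	void *done_arg;
};

void
ext_init(struct ext_buf *b, const void *data, size_t len,
    void (*done)(void *), void *done_arg)
{
	b->data = data;
	b->len = len;
	b->refs = 1;		/* the creator holds the first reference */
	b->done = done;
	b->done_arg = done_arg;
}

void
ext_ref(struct ext_buf *b)
{
	assert(b->refs > 0);
	b->refs++;
}

/* Drop one reference; the last drop reports that the data is unused. */
void
ext_rele(struct ext_buf *b)
{
	assert(b->refs > 0);
	if (--b->refs == 0 && b->done != NULL)
		b->done(b->done_arg);
}

/* Example callback: count completions in the int that arg points to. */
void
count_done(void *arg)
{
	(*(int *)arg)++;
}
```

The owner of the data may free or reuse it only after the callback has fired, however long a lower layer holds on to its reference.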


# 1f29b46c 22-May-2020 Alexander Motin <mav@FreeBSD.org>

Do not try to fill socket send buffer to the last byte.

Setting so_snd.sb_lowat to at least 1/8 of the socket buffer size allows
the send thread to use PDU coalescing more actively, which dramatically
reduces TCP lock congestion and the number of context switches when the
socket is full and PDUs are small.

MFC after: 1 week
Sponsored by: iXsystems, Inc.
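
The "at least 1/8 of the buffer size" rule amounts to a simple floor on the low-water mark. In this sketch pick_lowat() and its parameters are hypothetical names; the real change operates on the socket's send buffer fields.

```c
#include <assert.h>

/* Raise the low-water mark to at least one eighth of the buffer size. */
long
pick_lowat(long lowat, long bufsize)
{
	long floor;

	assert(lowat >= 0 && bufsize >= 0);
	floor = bufsize / 8;
	return (lowat < floor ? floor : lowat);
}
```

With a higher low-water mark the sender is woken less often and has room for several PDUs per wakeup.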


# f89d2072 17-Jun-2019 Xin LI <delphij@FreeBSD.org>

Separate kernel crc32() implementation to its own header (gsb_crc32.h) and
rename the source to gsb_crc32.c.

This is a prerequisite of unifying kernel zlib instances.

PR: 229763
Submitted by: Yoshihiro Ota <ota at j.email.ne.jp>
Differential Revision: https://reviews.freebsd.org/D20193


# 43ee6e9d 24-Jan-2018 Edward Tomasz Napierala <trasz@FreeBSD.org>

Add SPDX tags to iscsi(4).

MFC after: 2 weeks


# 22d3bb26 09-Dec-2017 Edward Tomasz Napierala <trasz@FreeBSD.org>

Move the DIAGNOSTIC check for lost iSCSI PDUs from icl_conn_close()
to icl_conn_free(). It's perfectly valid for the counter to be non-zero
in the former.

MFC after: 2 weeks
Sponsored by: playkey.net


# 9ac7c5a6 23-Nov-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

Make sure the iSCSI I/O limits are set properly so that the ISCSIDSEND IOCTL
can be used prior to the ISCSIDHANDOFF IOCTL, which sets the negotiated values.
Otherwise the login PDU will fail when passing the "-r" option to "iscsictl",
which means iSCSI over RDMA instead of TCP/IP.

Discussed with: np@ and trasz@
Sponsored by: Mellanox Technologies
MFC after: 1 week


# 82f7fa7a 02-Mar-2017 Alexander Motin <mav@FreeBSD.org>

Inline some trivial wrapper functions.

MFC after: 2 weeks


# 605703b5 15-Feb-2017 Alexander Motin <mav@FreeBSD.org>

Fix handling of negative sbspace() return values.

I found that, at least with Chelsio NICs, TOE sockets quite often report
negative sbspace() values. Using an unsigned variable to store the value
resulted in attempts to aggregate too much data in one sosend() call, which
caused errors and subsequent connection termination.

MFC after: 2 weeks
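
The failure mode described above is easy to demonstrate in isolation: a negative space value stored in an unsigned variable turns into a very large one. The functions can_send() and can_send_buggy() are hypothetical and only model the comparison, not the driver's send loop.

```c
#include <assert.h>
#include <stddef.h>

/* Buggy variant: a negative value converted to size_t becomes huge. */
int
can_send_buggy(long available, long pdu_len)
{
	size_t space = (size_t)available;

	return (space >= (size_t)pdu_len);
}

/* Fixed variant: keep the value signed so "negative" means "no room". */
int
can_send(long available, long pdu_len)
{
	assert(pdu_len >= 0);
	return (available >= pdu_len);
}
```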


# 33d9db92 14-Feb-2017 Alexander Motin <mav@FreeBSD.org>

Directly call m_gethdr() instead of m_getm2() for BHS.

All this code is based on the assumption that data will be stored in one
piece, and since the buffer size is known and fixed, it is easier to
hardcode it.

MFC after: 2 weeks


# 875ac6cf 14-Feb-2017 Alexander Motin <mav@FreeBSD.org>

Temporarily attach AHS to BHS to calculate the header digest.

MFC after: 2 weeks


# d0d587c7 14-Feb-2017 Alexander Motin <mav@FreeBSD.org>

Do not rely on data alignment after m_pullup().

In the general case m_pullup() does not really guarantee any data alignment.
Instead of depending on side effects caused by data always being copied
out of the mbuf cluster (which is probably a bug in itself), always allocate
an aligned BHS buffer and read data there directly from the socket.

While there, reuse the new icl_conn_receive_buf() function to read digests.
The code could probably be optimized even further to aggregate those reads,
but until that is done, this is still easier than the way it was before.

MFC after: 2 weeks


# 898fd11f 13-Feb-2017 Alexander Motin <mav@FreeBSD.org>

Remove M_PKTHDR from m_getm2() in icl_pdu_append_data().

ip_data_mbuf is always appended to ip_bhs_mbuf, so it does not need its own
packet header. This change first avoids allocation/initialization of the
header, and then avoids dropping it when it later gets to the socket buffer.

MFC after: 2 weeks


# 97b84d34 24-Aug-2016 Navdeep Parhar <np@FreeBSD.org>

Make the iSCSI parameter negotiation more flexible.

Decouple the send and receive limits on the amount of data in a single
iSCSI PDU. MaxRecvDataSegmentLength is declarative, not negotiated, and
is direction-specific so there is no reason for both ends to limit
themselves to the same min(initiator, target) value in both directions.

Allow iSCSI drivers to report their send, receive, first burst, and max
burst limits explicitly instead of using hardcoded values or trying to
derive all of them from the receive limit (which was the only limit
reported by the drivers prior to this change).

Display the send and receive limits separately in the userspace iSCSI
utilities.

Reviewed by: jpaetzel@ (earlier version), trasz@
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D7279
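
Because MaxRecvDataSegmentLength is declared by each side for its own receive direction, the two limits can be settled independently. The sketch below uses a hypothetical struct negotiated and settle() function to illustrate that rule; it is not the code used by the iSCSI utilities.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result of settling per-direction data segment limits. */
struct negotiated {
	size_t max_send;	/* largest data segment we may send */
	size_t max_recv;	/* largest data segment we agreed to receive */
};

struct negotiated
settle(size_t our_send_limit, size_t our_recv_limit, size_t peer_declared_recv)
{
	struct negotiated n;

	assert(our_send_limit > 0 && our_recv_limit > 0);
	/* We receive what we declared; the peer's value does not shrink it. */
	n.max_recv = our_recv_limit;
	/* We send no more than the peer declared it can receive. */
	n.max_send = our_send_limit < peer_declared_recv ?
	    our_send_limit : peer_declared_recv;
	return (n);
}
```

A driver with a small send limit and a large receive limit therefore keeps both, instead of being forced down to a single minimum in both directions.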


# b8911594 24-May-2016 Edward Tomasz Napierala <trasz@FreeBSD.org>

Add mechanism for choosing iSER-capable ICL modules.

MFC after: 1 month
Sponsored by: The FreeBSD Foundation


# 7deb68ab 21-May-2016 Edward Tomasz Napierala <trasz@FreeBSD.org>

Provide a way for ICL modules to declare they support PIM_UNMAPPED.

MFC after: 1 month
Sponsored by: The FreeBSD Foundation


# 906a424b 20-May-2016 Edward Tomasz Napierala <trasz@FreeBSD.org>

Call the ICL module's handoff method even when using ICL proxy.
The upcoming iSER code uses this.

MFC after: 1 month
Sponsored by: The FreeBSD Foundation


# f41492b0 17-May-2016 Edward Tomasz Napierala <trasz@FreeBSD.org>

Add icl_conn_connect() ICL method, required for iSER.

Obtained from: Mellanox Technologies (earlier version)
MFC after: 1 month
Sponsored by: The FreeBSD Foundation


# 604c023f 17-May-2016 Edward Tomasz Napierala <trasz@FreeBSD.org>

Extend the ICL interface to include the PDU pointer in the task_setup
method. This is required for upcoming iSER support.

Obtained from: Mellanox Technologies (earlier version)
MFC after: 1 month
Sponsored by: The FreeBSD Foundation


# 266078c6 03-May-2016 Pedro F. Giffuni <pfg@FreeBSD.org>

dev/iscsi: minor spelling fixes.

No functional change.

Reviewed by: trasz


# 5b157f21 15-May-2015 Alexander Motin <mav@FreeBSD.org>

Close some potential races around socket start/close.

There are some reports of panics on an ic->ic_socket NULL dereference.
This kind of race is the only way I can imagine it happening.

MFC after: 2 weeks


# aca050aa 04-Apr-2015 Edward Tomasz Napierala <trasz@FreeBSD.org>

Remove icl_conn_connected(); was unused.

MFC after: 1 month
Sponsored by: The FreeBSD Foundation


# 7a03d007 08-Feb-2015 Edward Tomasz Napierala <trasz@FreeBSD.org>

Extend ICL to add receive offload methods. For the software ICL backend
they are no-ops.

MFC after: 1 month
Sponsored by: The FreeBSD Foundation


# d4b195d3 08-Feb-2015 Edward Tomasz Napierala <trasz@FreeBSD.org>

Make output of "iscsictl -v" and "ctladm islist -v" a little prettier
by capitalizing "None".

MFC after: 1 month
Sponsored by: The FreeBSD Foundation


# 5aabcd7c 07-Feb-2015 Edward Tomasz Napierala <trasz@FreeBSD.org>

Tidy up; no functional changes.

MFC after: 1 month
Sponsored by: The FreeBSD Foundation


# 82babffb 04-Feb-2015 Edward Tomasz Napierala <trasz@FreeBSD.org>

Make it possible to set (via iscsi.conf(5)) and query (via iscsictl -v)
initiator iSCSI offload. Pass maximum data segment size supported by
chosen offload module to iscsid(8), and make iscsid(8) not try to negotiate
anything larger than that.

MFC after: 1 month
Sponsored by: The FreeBSD Foundation


# 872d2d92 31-Jan-2015 Edward Tomasz Napierala <trasz@FreeBSD.org>

Use proper module name in MODULE_VERSION().

Sponsored by: The FreeBSD Foundation


# 321b17ec 31-Jan-2015 Edward Tomasz Napierala <trasz@FreeBSD.org>

Add kobj interface between ICL and the rest of the iSCSI stack.
Review note - icl.c was moved to icl_soft.c.

MFC after: 1 month
Sponsored by: The FreeBSD Foundation