History log of /freebsd-current/sys/ofed/drivers/infiniband/ulp/sdp/sdp_main.c
Revision Date Author Comments
# 296a4cb5 16-Jan-2024 Gleb Smirnoff <glebius@FreeBSD.org>

sockets: provide correct pr_shutdown for keysock and SDP

My failure to run all kinds of kernel builds lead to missing the keysock
and incorrectly assuming SDP as not having a shutdown method.

Fixes: 5bba2728079ed4da33f727dbc2b6ae1de02ba897


# cd996b84 30-Nov-2023 Gleb Smirnoff <glebius@FreeBSD.org>

ofed: garbage collect now unused sdp_sockaddr()

Submitted by: zlei


# 0fac350c 30-Nov-2023 Gleb Smirnoff <glebius@FreeBSD.org>

sockets: don't malloc/free sockaddr memory on getpeername/getsockname

Just like it was done for accept(2) in cfb1e92912b4, use same approach
for two simplier syscalls that return socket addresses. Although,
these two syscalls aren't performance critical, this change generalizes
some code between 3 syscalls trimming code size.

Following example of accept(2), provide VNET-aware and INVARIANT-checking
wrappers sopeeraddr() and sosockaddr() around protosw methods.

Reviewed by: tuexen
Differential Revision: https://reviews.freebsd.org/D42694


# cfb1e929 30-Nov-2023 Gleb Smirnoff <glebius@FreeBSD.org>

sockets: don't malloc/free sockaddr memory on accept(2)

Let the accept functions provide stack memory for protocols to fill it in.
Generic code should provide sockaddr_storage, specialized code may provide
smaller structure.

While rewriting accept(2) make 'addrlen' a true in/out parameter, reporting
required length in case if provided length was insufficient. Our manual
page accept(2) and POSIX don't explicitly require that, but one can read
the text as they do. Linux also does that. Update tests accordingly.

Reviewed by: rscheff, tuexen, zlei, dchagin
Differential Revision: https://reviews.freebsd.org/D42635


# fdafd315 24-Nov-2023 Warner Losh <imp@FreeBSD.org>

sys: Automated cleanup of cdefs and other formatting

Apply the following automated changes to try to eliminate
no-longer-needed sys/cdefs.h includes as well as now-empty
blank lines in a row.

Remove /^#if.*\n#endif.*\n#include\s+<sys/cdefs.h>.*\n/
Remove /\n+#include\s+<sys/cdefs.h>.*\n+#if.*\n#endif.*\n+/
Remove /\n+#if.*\n#endif.*\n+/
Remove /^#if.*\n#endif.*\n/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/types.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/param.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/capsicum.h>/

Sponsored by: Netflix


# 685dc743 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: one-line .c pattern

Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/


# e7d02be1 17-Aug-2022 Gleb Smirnoff <glebius@FreeBSD.org>

protosw: refactor protosw and domain static declaration and load

o Assert that every protosw has pr_attach. Now this structure is
only for socket protocols declarations and nothing else.
o Merge struct pr_usrreqs into struct protosw. This was suggested
in 1996 by wollman@ (see 7b187005d18ef), and later reiterated
in 2006 by rwatson@ (see 6fbb9cf860dcd).
o Make struct domain hold a variable sized array of protosw pointers.
For most protocols these pointers are initialized statically.
Those domains that may have loadable protocols have spacers. IPv4
and IPv6 have 8 spacers each (andre@ dff3237ee54ea).
o For inetsw and inet6sw leave a comment noting that many protosw
entries very likely are dead code.
o Refactor pf_proto_[un]register() into protosw_[un]register().
o Isolate pr_*_notsupp() methods into uipc_domain.c

Reviewed by: melifaro
Differential revision: https://reviews.freebsd.org/D36232


# 01f11ee1 16-Aug-2022 Gleb Smirnoff <glebius@FreeBSD.org>

sdp: garbage collect sdp_ctlinput

The pr_ctlinput method was a feature of IPv4/IPv6 with exception of
pfctlinput(), which broadcasted a call to pr_ctlinput on all protocols
ever registered statically or with pf_proto_register(). Now that
this broadcast call is gone, the only protocols that get their
pr_ctlinput ever called are those that have registered itselves with
ipproto_register() or ip6proto_register().

It is entirely possible that code deleted now was dead code from very
beginning. Just a copy-paste from TCP.

Reviewed by: rstone
Differential revision: https://reviews.freebsd.org/D36208


# f277746e 12-Aug-2022 Gleb Smirnoff <glebius@FreeBSD.org>

protosw: change prototype for pr_control

For some reason protosw.h is used during world complation and userland
is not aware of caddr_t, a relic from the first version of C. Broken
buildworld is good reason to get rid of yet another caddr_t in kernel.

Fixes: 886fc1e80490fb03e72e306774766cbb2c733ac6


# d8596171 04-Jul-2022 Gleb Smirnoff <glebius@FreeBSD.org>

sockets: use only soref()/sorele() as socket reference count

o Retire SS_FDREF as it is basically a debug flag on top of already
existing soref()/sorele().
o Convert SS_PROTOREF into soref()/sorele().
o Change reference model for the listen queues, see below.
o Make sofree() private. The correct KPI to use is only sorele().
o Make soabort() respect the model and sorele() instead of sofree().

Note on listening queues. Until now the sockets on a queue had zero
reference count. And the reference were given only upon accept(2). The
assumption was that there is no way to see the queued socket from anywhere
except its head. This is not true, since queued sockets already have pcbs,
which are linked at least into the global pcb lists. With this change we
put the reference right in the sonewconn() and on accept(2) path we just
hand the existing reference to the file descriptor.

Differential revision: https://reviews.freebsd.org/D35679


# 43283184 12-May-2022 Gleb Smirnoff <glebius@FreeBSD.org>

sockets: use socket buffer mutexes in struct socket directly

Since c67f3b8b78e the sockbuf mutexes belong to the containing socket,
and socket buffers just point to it. In 74a68313b50 macros that access
this mutex directly were added. Go over the core socket code and
eliminate code that reaches the mutex by dereferencing the sockbuf
compatibility pointer.

This change requires a KPI change, as some functions were given the
sockbuf pointer only without any hint if it is a receive or send buffer.

This change doesn't cover the whole kernel, many protocols still use
compatibility pointers internally. However, it allows operation of a
protocol that doesn't use them.

Reviewed by: markj
Differential revision: https://reviews.freebsd.org/D35152


# 24e1c6ae 03-Jan-2022 Gleb Smirnoff <glebius@FreeBSD.org>

domains: init with standard SYSINIT(9) or VNET_SYSINIT()

There left only three modules that used dom_init(). And netipsec
was the last one to use dom_destroy().

Differential revision: https://reviews.freebsd.org/D33540


# fa0463c3 14-Sep-2021 Mark Johnston <markj@FreeBSD.org>

socket: De-duplicate SBLOCKWAIT() definitions

MFC after: 1 week
Sponsored by: The FreeBSD Foundation


# f94acf52 07-Sep-2021 Mark Johnston <markj@FreeBSD.org>

socket: Rename sb(un)lock() and interlock with listen(2)

In preparation for moving sockbuf locks into the containing socket,
provide alternative macros for the sockbuf I/O locks:
SOCK_IO_SEND_(UN)LOCK() and SOCK_IO_RECV_(UN)LOCK(). These operate on a
socket rather than a socket buffer. Note that these locks are used only
to prevent concurrent readers and writters from interleaving I/O.

When locking for I/O, return an error if the socket is a listening
socket. Currently the check is racy since the sockbuf sx locks are
destroyed during the transition to a listening socket, but that will no
longer be true after some follow-up changes.

Modify a few places to check for errors from
sblock()/SOCK_IO_(SEND|RECV)_LOCK() where they were not before. In
particular, add checks to sendfile() and sorflush().

Reviewed by: tuexen, gallatin
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31657


# f161d294 02-May-2021 Mark Johnston <markj@FreeBSD.org>

Add missing sockaddr length and family validation to various protocols

Several protocol methods take a sockaddr as input. In some cases the
sockaddr lengths were not being validated, or were validated after some
out-of-bounds accesses could occur. Add requisite checking to various
protocol entry points, and convert some existing checks to assertions
where appropriate.

Reported by: syzkaller+KASAN
Reviewed by: tuexen, melifaro
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29519


# 4c0bef07 20-Jan-2021 Kyle Evans <kevans@FreeBSD.org>

kern: net: remove TCP_LINGERTIME

TCP_LINGERTIME can be traced back to BSD 4.4 Lite and perhaps beyond, in
exactly the same form that it appears here modulo slightly different
context. It used to be the case that there was a single pr_usrreq
method with requests dispatched to it; these exact two lines appeared in
tcp_usrreq's PRU_ATTACH handling.

The only purpose of this that I can find is to cause surprising behavior
on accepted connections. Newly-created sockets will never hit these
paths as one cannot set SO_LINGER prior to socket(2). If SO_LINGER is
set on a listening socket and inherited, one would expect the timeout to
be inherited rather than changed arbitrarily like this -- noting that
SO_LINGER is nonsense on a listening socket beyond inheritance, since
they cannot be 'connected' by definition.

Neither Illumos nor Linux reset the timer like this based on testing and
inspection of Illumos, and testing of Linux.

Reviewed by: rscheff, tuexen
Differential Revision: https://reviews.freebsd.org/D28265


# 7029da5c 26-Feb-2020 Pawel Biernacki <kaktus@FreeBSD.org>

Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)

r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE. All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT

Approved by: kib (mentor, blanket)
Commented by: kib, gallatin, melifaro
Differential Revision: https://reviews.freebsd.org/D23718


# 6669d545 16-Aug-2019 Conrad Meyer <cem@FreeBSD.org>

SDP: SYSCTL_DECL SDP-wide sysctl node in header

This allows use of the shared _net_inet_sdp in more than one compilation
unit. (Nothing in-tree uses this today, but some of Isilon's out-of-tree
SDP enhancements add sysctls below the node.)

Sponsored by: Dell EMC Isilon


# e12be321 20-May-2019 Conrad Meyer <cem@FreeBSD.org>

Include eventhandler.h in more compilation units

This was enumerated with exhaustive search for sys/eventhandler.h includes,
cross-referenced against EVENTHANDLER_* usage with the comm(1) utility. Manual
checking was performed to avoid redundant includes in some drivers where a
common os_bsd.h (for example) included sys/eventhandler.h indirectly, but it is
possible some of these are redundant with driver-specific headers in ways I
didn't notice.

(These CUs did not show up as missing eventhandler.h in tinderbox.)

X-MFC-With: r347984


# 2f2ddd68 04-Jan-2019 Mark Johnston <markj@FreeBSD.org>

Support MSG_DONTWAIT in send*(2).

As it does for recv*(2), MSG_DONTWAIT indicates that the call should
not block, returning EAGAIN instead. Linux and OpenBSD both implement
this, so the change makes porting easier, especially since we do not
return EINVAL or so when unrecognized flags are specified.

Submitted by: Greg V <greg@unrelenting.technology>
Reviewed by: tuexen
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D18728


# 79db6fe7 22-Nov-2018 Mark Johnston <markj@FreeBSD.org>

Plug some networking sysctl leaks.

Various network protocol sysctl handlers were not zero-filling their
output buffers and thus would export uninitialized stack memory to
userland. Fix a number of such handlers.

Reported by: Thomas Barabosch, Fraunhofer FKIE
Reviewed by: tuexen
MFC after: 3 days
Security: kernel memory disclosure
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D18301


# 51369649 20-Nov-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

sys: further adoption of SPDX licensing ID tags.

Mainly focus on files that use BSD 3-Clause license.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.


# c69c74b8 16-Sep-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

Adapt the existing SDP ULP code to the new ibcore APIs.

Requested by: Sobczak, Bartosz <bartosz.sobczak@intel.com>
Sponsored by: Mellanox Technologies


# 779f106a 08-Jun-2017 Gleb Smirnoff <glebius@FreeBSD.org>

Listening sockets improvements.

o Separate fields of struct socket that belong to listening from
fields that belong to normal dataflow, and unionize them. This
shrinks the structure a bit.
- Take out selinfo's from the socket buffers into the socket. The
first reason is to support braindamaged scenario when a socket is
added to kevent(2) and then listen(2) is cast on it. The second
reason is that there is future plan to make socket buffers pluggable,
so that for a dataflow socket a socket buffer can be changed, and
in this case we also want to keep same selinfos through the lifetime
of a socket.
- Remove struct struct so_accf. Since now listening stuff no longer
affects struct socket size, just move its fields into listening part
of the union.
- Provide sol_upcall field and enforce that so_upcall_set() may be called
only on a dataflow socket, which has buffers, and for listening sockets
provide solisten_upcall_set().

o Remove ACCEPT_LOCK() global.
- Add a mutex to socket, to be used instead of socket buffer lock to lock
fields of struct socket that don't belong to a socket buffer.
- Allow to acquire two socket locks, but the first one must belong to a
listening socket.
- Make soref()/sorele() to use atomic(9). This allows in some situations
to do soref() without owning socket lock. There is place for improvement
here, it is possible to make sorele() also to lock optionally.
- Most protocols aren't touched by this change, except UNIX local sockets.
See below for more information.

o Reduce copy-and-paste in kernel modules that accept connections from
listening sockets: provide function solisten_dequeue(), and use it in
the following modules: ctl(4), iscsi(4), ng_btsocket(4), ng_ksocket(4),
infiniband, rpc.

o UNIX local sockets.
- Removal of ACCEPT_LOCK() global uncovered several races in the UNIX
local sockets. Most races exist around spawning a new socket, when we
are connecting to a local listening socket. To cover them, we need to
hold locks on both PCBs when spawning a third one. This means holding
them across sonewconn(). This creates a LOR between pcb locks and
unp_list_lock.
- To fix the new LOR, abandon the global unp_list_lock in favor of global
unp_link_lock. Indeed, separating these two locks didn't provide us any
extra parralelism in the UNIX sockets.
- Now call into uipc_attach() may happen with unp_link_lock hold if, we
are accepting, or without unp_link_lock in case if we are just creating
a socket.
- Another problem in UNIX sockets is that uipc_close() basicly did nothing
for a listening socket. The vnode remained opened for connections. This
is fixed by removing vnode in uipc_close(). Maybe the right way would be
to do it for all sockets (not only listening), simply move the vnode
teardown from uipc_detach() to uipc_close()?

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D9770


# 817e1ad9 21-Mar-2017 Gleb Smirnoff <glebius@FreeBSD.org>

Make sdp compilable after r315662.


# fbbd9655 28-Feb-2017 Warner Losh <imp@FreeBSD.org>

Renumber copyright clause 4

Renumber cluase 4 to 3, per what everybody else did when BSD granted
them permission to remove clause 3. My insistance on keeping the same
numbering for legal reasons is too pedantic, so give up on that point.

Submitted by: Jan Schaumann <jschauma@stevens.edu>
Pull Request: https://github.com/freebsd/freebsd/pull/96


# adaf3c49 29-Jul-2016 Mark Johnston <markj@FreeBSD.org>

sdp: Destroy the RDMA ID after destroying the connection's queue pair.

This is the ordering documented by rdma_destroy_qp(). Also add a useful
KASSERT to sdp_pcbfree().

Sponsored by: EMC / Isilon Storage Division


# d3461164 29-Jul-2016 Mark Johnston <markj@FreeBSD.org>

sdp: Use malloc(9) instead of the Linux compat layer.

SDP transmit and receive rings are always created in a sleepable context,
so we can use M_WAITOK and remove error checks.

Sponsored by: EMC / Isilon Storage Division


# fa46ade8 29-Jul-2016 Mark Johnston <markj@FreeBSD.org>

sdp: Destroy the PCB lock before freeing to the zone.

Sponsored by: EMC / Isilon Storage Division


# 2cefa87b 29-Jul-2016 Mark Johnston <markj@FreeBSD.org>

sdp: Use an mbufq for received control packets.

This is simpler than the hand-rolled queue, and fixes a use-after-free.

Sponsored by: EMC / Isilon Storage Division


# 8a8f7d5b 17-Feb-2015 Hans Petter Selasky <hselasky@FreeBSD.org>

Fix compilation of the SDP driver and a compile warning after r278886.
Also fix the kernel build rule for mlx4_exp.c.
This fixes the LINT kernel target for amd64.

Sponsored by: Mellanox Technologies
MFC after: 1 month


# 651e4e6a 30-Nov-2014 Gleb Smirnoff <glebius@FreeBSD.org>

Merge from projects/sendfile: extend protocols API to support
sending not ready data:
o Add new flag to pru_send() flags - PRUS_NOTREADY.
o Add new protocol method pru_ready().

Sponsored by: Nginx, Inc.
Sponsored by: Netflix


# cfa6009e 12-Nov-2014 Gleb Smirnoff <glebius@FreeBSD.org>

In preparation of merging projects/sendfile, transform bare access to
sb_cc member of struct sockbuf to a couple of inline functions:

sbavail() and sbused()

Right now they are equal, but once notion of "not ready socket buffer data",
will be checked in, they are going to be different.

Sponsored by: Netflix
Sponsored by: Nginx, Inc.


# 4813ad54 28-Jun-2014 Hans Petter Selasky <hselasky@FreeBSD.org>

Compile fixes:

Remove duplicate "debug_ktr.mask" sysctl definition.
Remove now unused variable from "kern_ktr.c".
This fixes build of "ktr" which was broken by r267961.

Let the default value for "vm_kmem_size_scale" be zero. It is setup
after that the sysctl has been initialized from "getenv()" in the
"kmeminit()" function to equal the "VM_KMEM_SIZE_MAX" value, if
zero. On Sparc64 the "VM_KMEM_SIZE_MAX" macro is not a constant. This
fixes build of Sparc64 which was broken by r267961.

Add a special macro to dynamically create SYSCTL root nodes, because
root nodes have a special parent. This fixes build of existing OFED
module and CANBUS module for pc98 which was broken by r267961.

Add missing "sysctl.h" includes to get the needed sysctl header file
declarations. This is needed after r267961.

MFC after: 2 weeks


# e5fb5622 03-Apr-2013 John Baldwin <jhb@FreeBSD.org>

Check for SS_NBIO in the socket state field rather than socket buffer
flags.

Submitted by: Vijay Singh
MFC after: 1 week


# eb1b1807 05-Dec-2012 Gleb Smirnoff <glebius@FreeBSD.org>

Mechanically substitute flags from historic mbuf allocator with
malloc(9) flags within sys.

Exceptions:

- sys/contrib not touched
- sys/mbuf.h edited manually


# 8317ddc8 19-Mar-2012 John Baldwin <jhb@FreeBSD.org>

Fix build of OFED bits with debugging options enabled.


# 6472ac3d 07-Nov-2011 Ed Schouten <ed@FreeBSD.org>

Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs.

The SYSCTL_NODE macro defines a list that stores all child-elements of
that node. If there's no SYSCTL_DECL macro anywhere else, there's no
reason why it shouldn't be static.


# d745c852 06-Nov-2011 Ed Schouten <ed@FreeBSD.org>

Mark MALLOC_DEFINEs static that have no corresponding MALLOC_DECLAREs.

This means that their use is restricted to a single C file.


# aa0a1e58 21-Mar-2011 Jeff Roberson <jeff@FreeBSD.org>

- Merge in OFED 1.5.3 from projects/ofed/head