History log of /openbsd-current/sys/kern/sys_socket.c
Revision Date Author Comments
# 1.65 30-Apr-2024 mvs

Push solock() down to sosend() for SOCK_RAW sockets.

Raw sockets are the simplest inet sockets, so use them to start landing
`sb_mtx' mutex(9) protection for the `so_snd' buffer. Now solock() is taken
only around pru_send*(); the rest of sosend() is serialized by sblock() and
`sb_mtx'. The unlocked SS_ISCONNECTED check is fine, because
rip{,6}_send() check it. Also, previously SS_ISCONNECTED could be
lost due to the solock() release around the following m_getuio().

ok bluhm


# 1.64 11-Apr-2024 mvs

Take solock_shared() in soo_stat().

Only unix(4) and tcp(4) sockets set the (*pru_sense)() handler. The rest of
soo_stat() is read-only access.

ok bluhm


# 1.63 31-Mar-2024 mvs

Mark `so_rcv' sockbuf of udp(4) sockets as SB_OWNLOCK.

sbappend*() and soreceive() of SB_MTXLOCK marked sockets use `sb_mtx'
mutex(9) for protection, while the buffer usage check and corresponding
sbwait() sleep are still serialized by solock(). Mark udp(4) as SB_OWNLOCK
to avoid solock() serialization and rely on `sb_mtx' mutex(9). The
`sb_state' and `sb_flags' modifications must be protected by `sb_mtx'
too.

ok bluhm


# 1.62 26-Mar-2024 mvs

Use `sb_mtx' to protect `so_rcv' receive buffer of unix(4) sockets.

This makes re-locking unnecessary in the uipc_*send() paths, because
it's enough to lock one socket to prevent peer from concurrent
disconnection. As a little bonus, one unix(4) socket can perform
simultaneous transmission and reception with one exception for
uipc_rcvd(), which still requires the re-lock for connection oriented
sockets.

The socket lock is not held while filt_soread() and filt_soexcept()
are called from uipc_*send() through sorwakeup(). However, the unlocked
access to the `so_options', `so_state' and `so_error' is fine.

The receiving socket can't be or become a listening socket. It also can't
be disconnected concurrently. This makes the SO_ACCEPTCONN,
SS_ISDISCONNECTED and SS_ISCONNECTED bits immutable: they are clear, clear
and set respectively.

`so_error' is set on the peer sockets only by unp_detach(), which also
can't be called concurrently on sending socket.

This is also true for filt_fiforead() and filt_fifoexcept(). For other
callers like kevent(2) or doaccept() the socket lock is still held.

ok bluhm


Revision tags: OPENBSD_7_4_BASE OPENBSD_7_5_BASE
# 1.61 15-Apr-2023 kn

return directly to drop needless error variable; OK mvs


Revision tags: OPENBSD_7_3_BASE
# 1.60 22-Jan-2023 mvs

Move SS_CANTRCVMORE and SS_RCVATMARK bits from `so_state' to `sb_state' of
the receive buffer. As was done for the SS_CANTSENDMORE bit, the definitions
are kept as is, but now these bits belong to the `sb_state' of the receive
buffer. `sb_state' is ORed with `so_state' when socket data is exported to
userland.

ok bluhm@


# 1.59 21-Jan-2023 mvs

Introduce per-sockbuf `sb_state' to use it with SS_CANTSENDMORE.

This time, socket's buffer lock requires solock() to be held. As a part of
the socket buffers standalone locking work, move socket state bits which
represent its buffers' state to per-buffer state.

Unlike the previous reverted diff, the SS_CANTSENDMORE definition is left
as is, but it is used only with `sb_state'. `sb_state' is ORed with the
original `so_state' when the socket's data is exported to userland, so the
ABI is kept as it was.

Inputs from deraadt@.

ok bluhm@


# 1.58 12-Dec-2022 tb

Revert sb_state changes to unbreak tree.


# 1.57 11-Dec-2022 mvs

This time, socket's buffer lock requires solock() to be held. As a part of
socket buffers standalone locking work, move socket state bits which
represent its buffers state to per buffer state. Introduce `sb_state' and
turn SS_CANTSENDMORE to SBS_CANTSENDMORE. This bit will be processed on
`so_snd' buffer only.

Move SS_CANTRCVMORE and SS_RCVATMARK bits with separate diff to make
review easier and exclude possible so_rcv/so_snd mistypes.

Also, don't adjust the remaining SS_* bits right now.

ok millert@


# 1.56 19-Nov-2022 kn

Push kernel lock into pru_control() aka. in6_control() / in_control()

so->so_state is already read without kernel lock inside soo_ioctl()
which calls pru_control() aka in6_control() and in_control().

OK mvs


# 1.55 08-Nov-2022 kn

Push kernel lock down into ifioctl()

This is a mechanical diff without semantical changes, locking ioctls
individually inside ifioctl() rather than all of them around it.

This allows us to unlock ioctls one by one.

OK mpi


Revision tags: OPENBSD_7_2_BASE
# 1.54 02-Sep-2022 mvs

Move PRU_CONTROL request to (*pru_control)().

The 'proc *' arg is not used for PRU_CONTROL request, so remove it from
pru_control() wrapper.

Split out {tcp,udp}6_usrreqs from {tcp,udp}_usrreqs and use them for
inet6 case.

ok guenther@ bluhm@


# 1.53 14-Aug-2022 jsg

remove unneeded includes in sys/kern
ok mpi@ miod@


# 1.52 13-Aug-2022 mvs

Introduce the pru_*() wrappers for corresponding (*pr_usrreq)() calls.

This is helpful for the following (*pr_usrreq)() split to multiple
handlers. But right now this makes code more readable.

Also add '#ifndef _SYS_SOCKETVAR_H_' to sys/socketvar.h. This prevents the
collisions when both sys/protosw.h and sys/socketvar.h are included
together. Both 'socket' and 'protosw' structures are required to be
defined before pru_*() wrappers, so we need to include sys/socketvar.h to
sys/protosw.h.

ok bluhm@


# 1.51 20-Jun-2022 visa

Remove unused struct fileops field fo_poll and callbacks.

OK mpi@


# 1.50 06-Jun-2022 claudio

Simplify solock() and sounlock(). There is no reason to return a value
for the lock operation and to pass a value to the unlock operation.
sofree() still needs an extra flag to know if sounlock() should be called
or not. But sofree() is called less often and mostly without keeping the lock.
OK mpi@ mvs@


Revision tags: OPENBSD_7_1_BASE
# 1.49 25-Feb-2022 guenther

Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com
Revert the pr_usrreqs move: syzkaller found a NULL pointer deref
and I won't be available to monitor for followup issues for a bit


# 1.48 25-Feb-2022 guenther

Move pr_attach and pr_detach to a new structure pr_usrreqs that can
then be shared among protosw structures, following the same basic
direction as NetBSD and FreeBSD for this.

Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the
proper prototype to eliminate the previously necessary casts.

ok mvs@ bluhm@


# 1.47 24-Oct-2021 jsg

use NULL not 0 for pointer values in kern
ok semarie@


Revision tags: OPENBSD_7_0_BASE
# 1.46 07-Jun-2021 mpi

Kill SS_ASYNC and only check SB_ASYNC when async signals are wanted.

This socket flag was redundant with the socket buffer one.

ok mvs@


Revision tags: OPENBSD_6_7_BASE OPENBSD_6_8_BASE OPENBSD_6_9_BASE
# 1.45 22-Feb-2020 anton

In preparation for unlocking ioctl(2), grab the kernel lock as needed.

ok kettenis@ mpi@ visa@


# 1.44 08-Jan-2020 visa

Unify handling of ioctls FIOSETOWN/SIOCSPGRP/TIOCSPGRP and
FIOGETOWN/SIOCGPGRP/TIOCGPGRP. Do this by determining the meaning of
the ID parameter inside the sigio code. Also add cases for FIOSETOWN
and FIOGETOWN where there have been TIOCSPGRP and TIOCGPGRP before.
These changes allow removing the ID translation from sys_fcntl() and
sys_ioctl().

Idea from NetBSD

OK mpi@, claudio@


# 1.43 05-Jan-2020 visa

Constify instances of struct fileops.

OK anton@, mpi@, bluhm@


Revision tags: OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.42 19-Nov-2018 visa

Utilize sigio with sockets.

OK mpi@


Revision tags: OPENBSD_6_4_BASE
# 1.41 20-Aug-2018 mpi

Reorder checks in the read/write(2) family of syscalls to prepare making
file operations mp-safe.

This change makes it clear that `f_offset' is only accessed in vn_read()
and vn_write(), which will help taking it out of the KERNEL_LOCK().

This refactoring uncovered a race in vn_read() which is now documented
and will be addressed in a later diff.

ok visa@


# 1.40 30-Jul-2018 mpi

Use FNONBLOCK instead of SS_NBIO to check/indicate that the I/O mode
for sockets is non-blocking.

This allows us to G/C SS_NBIO. Having to keep the two flags in sync
in a mp-safe way is complicated.

This change introduces a behavior change in sosplice(): it can now
always block. However, this should not matter much due to the socket
lock being taken beforehand.

ok bluhm@, benno@, visa@


# 1.39 10-Jul-2018 mpi

Move socket & pipe specific logic in their ioctl handler.

ok visa@, tb@


# 1.38 06-Jun-2018 mpi

Pass the socket to sounlock(); this prepares the terrain for per-socket
locking.

ok visa@, bluhm@


# 1.37 26-Apr-2018 pirofti

Remove solock() surrounding PRU_CONTROL in soo_ioctl().

We do not need the lock there.
Missed this in my former commit pushing NET_LOCK() down the stack.

Found the hard way by naddy@, sorry!

OK mpi@.


# 1.36 10-Apr-2018 mpi

Convert 'struct fileops' definitions to C99.

ok millert@, deraadt@, florian@


Revision tags: OPENBSD_6_3_BASE
# 1.35 10-Dec-2017 mpi

Move SB_SPLICE, SB_WAIT and SB_SEL to `sb_flags', serialized by solock().

SB_KNOTE remains the only bit set on `sb_flagsintr' as it is set/unset in
contexts related to kqueue(2) where we'd like to avoid grabbing solock().

While here add some KERNEL_LOCK()/UNLOCK() dances around selwakeup() and
csignal() to mark which remaining functions need to be addressed in the
socket layer.

ok visa@, bluhm@


# 1.34 14-Nov-2017 tb

Push the NET_LOCK into ifioctl() and use the NET_RLOCK in ifioctl_get().
In particular, this allows SIOCGIF* requests to run in parallel.

lots of help & ok mpi, ok visa, sashan


Revision tags: OPENBSD_6_2_BASE
# 1.33 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.32 24-Jul-2017 mpi

Grab the socket lock in soo_ioctl() where `so_state', `so_rcv'
and `so_snd' are modified.

ok bluhm@, visa@


# 1.31 20-Jul-2017 mpi

Extend the scope of the socket lock in soo_stat() to protect `so_state'
and `so_rcv'.

ok bluhm@, claudio@, visa@


Revision tags: OPENBSD_6_1_BASE
# 1.30 22-Feb-2017 mpi

Do not grab the NET_LOCK() when poll(2)ing on unix domain sockets.

Fix the 'X freeze' while scanning with wireless interfaces. Problem
reported by pirofti@.

ok tb@, bluhm@


# 1.29 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent multiple deadlocks related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.28 31-Jan-2017 mpi

Remove the inifioctl hack; checking for an unheld NET_LOCK() in
tsleep(9) & friends seems to only produce false positives and cannot
be easily disabled.


# 1.27 25-Jan-2017 mpi

Introduce a hack to remove false-positives when looking for memory
allocation that can sleep while holding the NET_LOCK().

To be removed once we're confident the remaining code paths are safe.

Discussed with deraadt@


# 1.26 19-Dec-2016 mpi

Introduce the NET_LOCK(), a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replaces all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.25 22-Nov-2016 mpi

Enforce that ifioctl() is called at IPL_SOFTNET.

This will allow us to keep locking simple as soon as we trade
splsoftnet() for a rwlock.

ok bluhm@


# 1.24 21-Nov-2016 mpi

Kill rtioctl() stub, returning EOPNOTSUPP since tree import.

ok jsg@


# 1.23 21-Nov-2016 mpi

Enforce that pr_usrreq functions are called at IPL_SOFTNET.

This will allow us to keep locking simple as soon as we trade
splsoftnet() for a rwlock.

ok bluhm@, claudio@


# 1.22 06-Oct-2016 bluhm

Separate splsoftnet() from variable initialization.
From mpi@'s netlock diff; OK mikeb@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.21 05-Dec-2015 tedu

remove stale lint annotations


Revision tags: OPENBSD_5_8_BASE
# 1.20 13-May-2015 jsg

test mbuf pointers against NULL not 0
ok krw@ miod@


Revision tags: OPENBSD_5_6_BASE OPENBSD_5_7_BASE
# 1.19 13-Jul-2014 tedu

bzero -> memset. for the speeds.


# 1.18 30-Mar-2014 guenther

Eliminates struct pcred by moving the real and saved ugids into
struct ucred; struct process then directly links to the ucred

Based on a discussion at c2k10 or so before noting that FreeBSD and
NetBSD did this too.

ok matthew@


Revision tags: OPENBSD_5_5_BASE
# 1.17 28-Sep-2013 millert

poll(2) on a socket should set POLLHUP on EOF. This makes the
behavior of socketpair(2) match that of pipe(2) when the other end
is closed. OK guenther@


Revision tags: OPENBSD_5_4_BASE
# 1.16 05-Apr-2013 tedu

remove some obsolete casts


Revision tags: OPENBSD_5_3_BASE
# 1.15 15-Jan-2013 bluhm

Changing the socket buffer flags sb_flags was not interrupt safe
as |= and &= are non-atomic operations. To avoid additional locks,
put the flags that have to be accessed from interrupt into a separate
sb_flagsintr 32 bit integer field. sb_flagsintr is protected by
splsoftnet.
Input from miod@ deraadt@; OK deraadt@


Revision tags: OPENBSD_4_5_BASE OPENBSD_4_6_BASE OPENBSD_4_7_BASE OPENBSD_4_8_BASE OPENBSD_4_9_BASE OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.14 22-Feb-2009 otto

fix PR 6082: do not create more fd's than will fit in the message on
the receiving side when passing fd's. ok deraadt@ kettenis@


# 1.13 02-Oct-2008 deraadt

A closed, disconnected, or otherwise failed socket is still a socket
and should return stat information instead of EINVAL from deep in the
guts of tcp_usrreq. While there, put some more information into struct
stat, inspired by FreeBSD. EINVAL problem reported in PR 5943


Revision tags: OPENBSD_4_4_BASE
# 1.12 23-May-2008 thib

Deal with the situation when TCP nfs mounts timeout and processes
get hung in nfs_reconnect() because they do not have the proper
privileges to bind to a socket, by adding a struct proc * argument
to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind)
and do the sobind() with proc0 in nfs_connect.

OK markus@, blambert@.
"go ahead" deraadt@.

Fixes an issue reported by bernd@ (Tested by bernd@).
Fixes PR5135 too.


Revision tags: OPENBSD_4_1_BASE OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.11 26-Feb-2007 kurt

exclude control data from the number of bytes returned by FIONREAD ioctl()
by adding a sb_datacc count to sockbuf that counts data excluding
MT_CONTROL and MT_SONAME mbuf types. w/help from deraadt@.
okay deraadt@ claudio@


Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
# 1.10 13-Dec-2005 jsg

ansi/deregister. No binary change.


Revision tags: OPENBSD_3_5_BASE OPENBSD_3_6_BASE OPENBSD_3_7_BASE OPENBSD_3_8_BASE SMP_SYNC_A SMP_SYNC_B
# 1.9 23-Sep-2003 millert

Replace select backends with poll backends. selscan() and pollscan()
now call the poll backend. With this change we implement greater
poll(2) functionality instead of emulating it via the select backend.
Adapted from NetBSD and including some changes from FreeBSD.
Tested by many, deraadt@ OK


Revision tags: OPENBSD_3_4_BASE
# 1.8 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: OPENBSD_3_0_BASE OPENBSD_3_1_BASE OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_BASE UBC_SYNC_A UBC_SYNC_B
# 1.7 14-May-2001 art

Add a fo_stat member to struct fileops. Used soon.
Also add a stat function for kqueue from FreeBSD.


# 1.6 14-May-2001 art

More generic arguments to soo_stat.


Revision tags: OPENBSD_2_9_BASE
# 1.5 01-Mar-2001 provos

port kqueue changes from freebsd, plus all required openbsd glue.
okay deraadt@, millert@
from jlemon@freebsd.org:
extend kqueue down to the device layer, backwards compatible approach
suggested by peter@freebsd.org


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE
# 1.4 19-Apr-2000 csapuntz

Change struct file interface methods read and write to pass file offset in
and out.

Make pread/pwrite in netbsd & linux thread safe - which is the whole point
anyway.


Revision tags: OPENBSD_2_2_BASE OPENBSD_2_3_BASE OPENBSD_2_4_BASE OPENBSD_2_5_BASE OPENBSD_2_6_BASE SMP_BASE kame_19991208
# 1.3 31-Aug-1997 deraadt

branches: 1.3.12;
for non-tty TIOCSPGRP/F_SETOWN/FIOSETOWN pgid setting calls, store uid
and euid as well, then deliver them using new csignal() interface
which ensures that pgid setting process is permitted to signal the
pgid process(es). Thanks to newsham@aloha.net for extensive help and
discussion.


Revision tags: OPENBSD_2_1_BASE
# 1.2 24-Feb-1997 niklas

OpenBSD tags


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.64 11-Apr-2024 mvs

Take solock_shared() in soo_stat().

Only unix(4) and tcp(4) sockets set (*pru_sence)() handler. The rest of
soo_stat() is the read only access.

ok bluhm


# 1.63 31-Mar-2024 mvs

Mark `so_rcv' sockbuf of udp(4) sockets as SB_OWNLOCK.

sbappend*() and soreceive() of SB_MTXLOCK marked sockets uses `sb_mtx'
mutex(9) for protection, meanwhile buffer usage check and corresponding
sbwait() sleep still serialized by solock(). Mark udp(4) as SB_OWNLOCK
to avoid solock() serialization and rely to `sb_mtx' mutex(9). The
`sb_state' and `sb_flags' modifications must be protected by `sb_mtx'
too.

ok bluhm


# 1.62 26-Mar-2024 mvs

Use `sb_mtx' to protect `so_rcv' receive buffer of unix(4) sockets.

This makes re-locking unnecessary in the uipc_*send() paths, because
it's enough to lock one socket to prevent peer from concurrent
disconnection. As the little bonus, one unix(4) socket can perform
simultaneous transmission and reception with one exception for
uipc_rcvd(), which still requires the re-lock for connection oriented
sockets.

The socket lock is not held while filt_soread() and filt_soexcept()
called from uipc_*send() through sorwakeup(). However, the unlocked
access to the `so_options', `so_state' and `so_error' is fine.

The receiving socket can't be or became listening socket. It also can't
be disconnected concurrently. This makes immutable SO_ACCEPTCONN,
SS_ISDISCONNECTED and SS_ISCONNECTED bits which are clean and set
respectively.

`so_error' is set on the peer sockets only by unp_detach(), which also
can't be called concurrently on sending socket.

This is also true for filt_fiforead() and filt_fifoexcept(). For other
callers like kevent(2) or doaccept() the socket lock is still held.

ok bluhm


Revision tags: OPENBSD_7_4_BASE OPENBSD_7_5_BASE
# 1.61 15-Apr-2023 kn

return directly to drop needless error variable; OK mvs


Revision tags: OPENBSD_7_3_BASE
# 1.60 22-Jan-2023 mvs

Move SS_CANTRCVMORE and SS_RCVATMARK bits from `so_state' to `sb_state' of
receive buffer. As it was done for SS_CANTSENDMORE bit, the definition
kept as is, but now these bits belongs to the `sb_state' of receive
buffer. `sb_state' ored with `so_state' when socket data exporting to the
userland.

ok bluhm@


# 1.59 21-Jan-2023 mvs

Introduce per-sockbuf `sb_state' to use it with SS_CANTSENDMORE.

This time, socket's buffer lock requires solock() to be held. As a part of
socket buffers standalone locking work, move socket state bits which
represent its buffers state to per buffer state.

Opposing the previous reverted diff, the SS_CANTSENDMORE definition left
as is, but it used only with `sb_state'. `sb_state' ored with original
`so_state' when socket's data exported to the userland, so the ABI kept as
it was.

Inputs from deraadt@.

ok bluhm@


# 1.58 12-Dec-2022 tb

Revert sb_state changes to unbreak tree.


# 1.57 11-Dec-2022 mvs

This time, socket's buffer lock requires solock() to be held. As a part of
socket buffers standalone locking work, move socket state bits which
represent its buffers state to per buffer state. Introduce `sb_state' and
turn SS_CANTSENDMORE to SBS_CANTSENDMORE. This bit will be processed on
`so_snd' buffer only.

Move SS_CANTRCVMORE and SS_RCVATMARK bits with separate diff to make
review easier and exclude possible so_rcv/so_snd mistypes.

Also, don't adjust the remaining SS_* bits right now.

ok millert@


# 1.56 19-Nov-2022 kn

Push kernel lock into pru_control() aka. in6_control() / in_control()

so->so_state is already read without kernel lock inside soo_ioctl()
which calls pru_control() aka in6_control() and in_control().

OK mvs


# 1.55 08-Nov-2022 kn

Push kernel lock down into ifioctl()

This is a mechanical diff without semantical changes, locking ioctls
individually inside ifioctl() rather than all of them around it.

This allows us to unlock ioctls one by one.

OK mpi


Revision tags: OPENBSD_7_2_BASE
# 1.54 02-Sep-2022 mvs

Move PRU_CONTROL request to (*pru_control)().

The 'proc *' arg is not used for PRU_CONTROL request, so remove it from
pru_control() wrapper.

Split out {tcp,udp}6_usrreqs from {tcp,udp}_usrreqs and use them for
inet6 case.

ok guenther@ bluhm@


# 1.53 14-Aug-2022 jsg

remove unneeded includes in sys/kern
ok mpi@ miod@


# 1.52 13-Aug-2022 mvs

Introduce the pru_*() wrappers for corresponding (*pr_usrreq)() calls.

This is helpful for the following (*pr_usrreq)() split to multiple
handlers. But right now this makes code more readable.

Also add '#ifndef _SYS_SOCKETVAR_H_' to sys/socketvar.h. This prevents the
collisions when both sys/protosw.h and sys/socketvar.h are included
together. Both 'socket' and 'protosw' structures are required to be
defined before pru_*() wrappers, so we need to include sys/socketvar.h to
sys/protosw.h.

ok bluhm@


# 1.51 20-Jun-2022 visa

Remove unused struct fileops field fo_poll and callbacks.

OK mpi@


# 1.50 06-Jun-2022 claudio

Simplify solock() and sounlock(). There is no reason to return a value
for the lock operation and to pass a value to the unlock operation.
sofree() still needs an extra flag to know if sounlock() should be called
or not. But sofree() is called less often and mostly without keeping the lock.
OK mpi@ mvs@


Revision tags: OPENBSD_7_1_BASE
# 1.49 25-Feb-2022 guenther

Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com
Revert the pr_usrreqs move: syzkaller found a NULL pointer deref
and I won't be available to monitor for followup issues for a bit


# 1.48 25-Feb-2022 guenther

Move pr_attach and pr_detach to a new structure pr_usrreqs that can
then be shared among protosw structures, following the same basic
direction as NetBSD and FreeBSD for this.

Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the
proper prototype to eliminate the previously necessary casts.

ok mvs@ bluhm@


# 1.47 24-Oct-2021 jsg

use NULL not 0 for pointer values in kern
ok semarie@


Revision tags: OPENBSD_7_0_BASE
# 1.46 07-Jun-2021 mpi

Kill SS_ASYNC and only check SB_ASYNC when async signals are wanted.

This socket flag was redundant with the socket buffer one.

ok mvs@


Revision tags: OPENBSD_6_7_BASE OPENBSD_6_8_BASE OPENBSD_6_9_BASE
# 1.45 22-Feb-2020 anton

In preparation for unlocking ioctl(2), grab the kernel lock as needed.

ok kettenis@ mpi@ visa@


# 1.44 08-Jan-2020 visa

Unify handling of ioctls FIOSETOWN/SIOCSPGRP/TIOCSPGRP and
FIOGETOWN/SIOCGPGRP/TIOCGPGRP. Do this by determining the meaning of
the ID parameter inside the sigio code. Also add cases for FIOSETOWN
and FIOGETOWN where there have been TIOCSPGRP and TIOCGPGRP before.
These changes allow removing the ID translation from sys_fcntl() and
sys_ioctl().

Idea from NetBSD

OK mpi@, claudio@


# 1.43 05-Jan-2020 visa

Constify instances of struct fileops.

OK anton@, mpi@, bluhm@


Revision tags: OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.42 19-Nov-2018 visa

Utilize sigio with sockets.

OK mpi@


Revision tags: OPENBSD_6_4_BASE
# 1.41 20-Aug-2018 mpi

Reorder checks in the read/write(2) family of syscalls to prepare making
file operations mp-safe.

This change makes it clear that `f_offset' is only accessed in vn_read()
and vn_write(), which will help taking it out of the KERNEL_LOCK().

This refactoring uncovered a race in vn_read() which is now documented
and will be addressed in a later diff.

ok visa@


# 1.40 30-Jul-2018 mpi

Use FNONBLOCK instead of SS_NBIO to check/indicate that the I/O mode
for sockets is non-blocking.

This allows us to G/C SS_NBIO. Having to keep the two flags in sync
in a mp-safe way is complicated.

This change introduce a behavior change in sosplice(), it can now
always block. However this should not matter much due to the socket
lock being taken beforhand.

ok bluhm@, benno@, visa@


# 1.39 10-Jul-2018 mpi

Move socket & pipe specific logic in their ioctl handler.

ok visa@, tb@


# 1.38 06-Jun-2018 mpi

Pass the socket to sounlock(), this prepare the terrain for per-socket
locking.

ok visa@, bluhm@


# 1.37 26-Apr-2018 pirofti

Remove solock() surrounding PRU_CONTROL in soo_ioctl().

We do not need the lock there.
Missed this in my former commit pushing NET_LOCK() down the stack.

Found the hard way by naddy@, sorry!

OK mpi@.


# 1.36 10-Apr-2018 mpi

Convert 'struct fileops' definitions to C99.

ok millert@, deraadt@, florian@


Revision tags: OPENBSD_6_3_BASE
# 1.35 10-Dec-2017 mpi

Move SB_SPLICE, SB_WAIT and SB_SEL to `sb_flags', serialized by solock().

SB_KNOTE remains the only bit set on `sb_flagsintr' as it is set/unset in
contexts related to kqueue(2) where we'd like to avoid grabbing solock().

While here add some KERNEL_LOCK()/UNLOCK() dances around selwakeup() and
csignal() to mark which remaining functions need to be addressed in the
socket layer.

ok visa@, bluhm@


# 1.34 14-Nov-2017 tb

Push the NET_LOCK into ifioctl() and use the NET_RLOCK in ifioctl_get().
In particular, this allows SIOCGIF* requests to run in parallel.

lots of help & ok mpi, ok visa, sashan


Revision tags: OPENBSD_6_2_BASE
# 1.33 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.32 24-Jul-2017 mpi

Grab the socket lock in soo_ioctl() where `so_state', `so_rcv'
and `so_snd' are modified.

ok bluhm@, visa@


# 1.31 20-Jul-2017 mpi

Extend the scope of the socket lock in soo_stat() to protect `so_state'
and `so_rcv'.

ok bluhm@, claudio@, visa@


Revision tags: OPENBSD_6_1_BASE
# 1.30 22-Feb-2017 mpi

Do not grab the NET_LOCK() when poll(2)ing on unix domain sockets.

Fix the 'X freeze' while scanning with wireless interfaces. Problem
reported by pirofti@.

ok tb@, bluhm@


# 1.29 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlock related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.28 31-Jan-2017 mpi

Remove the inifioctl hack, checking for an unheld NET_LOCK() in
tsleep(9) & friends seem to only produce false positives and cannot
be easily disabled.


# 1.27 25-Jan-2017 mpi

Introduce a hack to remove false-positives when looking for memory
allocation that can sleep while holding the NET_LOCK().

To be removed once we're confident the remaining code paths are safe.

Discussed with deraadt@


# 1.26 19-Dec-2016 mpi

Introduce the NET_LOCK() a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replace all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.25 22-Nov-2016 mpi

Enforce that ifioctl() is called at IPL_SOFTNET.

This will allow us to keep locking simple as soon as we trade
splsoftnet() for a rwlock.

ok bluhm@


# 1.24 21-Nov-2016 mpi

Kill rtioctl() stub, returning EOPNOTSUPP since tree import.

ok jsg@


# 1.23 21-Nov-2016 mpi

Enforce that pr_usrreq functions are called at IPL_SOFTNET.

This will allow us to keep locking simple as soon as we trade
splsoftnet() for a rwlock.

ok bluhm@, claudio@


# 1.22 06-Oct-2016 bluhm

Separate splsoftnet() from variable initialization.
From mpi@'s netlock diff; OK mikeb@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.21 05-Dec-2015 tedu

remove stale lint annotations


Revision tags: OPENBSD_5_8_BASE
# 1.20 13-May-2015 jsg

test mbuf pointers against NULL not 0
ok krw@ miod@


Revision tags: OPENBSD_5_6_BASE OPENBSD_5_7_BASE
# 1.19 13-Jul-2014 tedu

bzero -> memset. for the speeds.


# 1.18 30-Mar-2014 guenther

Eliminates struct pcred by moving the real and saved ugids into
struct ucred; struct process then directly links to the ucred

Based on a discussion at c2k10 or so before noting that FreeBSD and
NetBSD did this too.

ok matthew@


Revision tags: OPENBSD_5_5_BASE
# 1.17 28-Sep-2013 millert

poll(2) on a socket should set POLLHUP on EOF. This makes the
behavior of socketpair(2) match that of pipe(2) when the other end
is closed. OK guenther@


Revision tags: OPENBSD_5_4_BASE
# 1.16 05-Apr-2013 tedu

remove some obsolete casts


Revision tags: OPENBSD_5_3_BASE
# 1.15 15-Jan-2013 bluhm

Changing the socket buffer flags sb_flags was not interrupt safe
as |= and &= are non-atomic operations. To avoid additional locks,
put the flags that have to be accessed from interrupt into a separate
sb_flagsintr 32 bit integer field. sb_flagsintr is protected by
splsoftnet.
Input from miod@ deraadt@; OK deraadt@


Revision tags: OPENBSD_4_5_BASE OPENBSD_4_6_BASE OPENBSD_4_7_BASE OPENBSD_4_8_BASE OPENBSD_4_9_BASE OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.14 22-Feb-2009 otto

fix PR 6082: do not create more fd's than will fit in the message on
the receiving side when passing fd's. ok deraadt@ kettenis@


# 1.13 02-Oct-2008 deraadt

A closed, disconnected, or otherwise failed socket is still a socket
and should return stat information instead of EINVAL from deep in the
guts of tcp_usrreq. While there, put some more information into struct
stat, inspired by FreeBSD. EINVAL problem reported in PR 5943


Revision tags: OPENBSD_4_4_BASE
# 1.12 23-May-2008 thib

Deal with the situation when TCP nfs mounts timeout and processes
get hung in nfs_reconnect() because they do not have the proper
privilages to bind to a socket, by adding a struct proc * argument
to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind)
and do the sobind() with proc0 in nfs_connect.

OK markus@, blambert@.
"go ahead" deraadt@.

Fixes an issue reported by bernd@ (Tested by bernd@).
Fixes PR5135 too.


Revision tags: OPENBSD_4_1_BASE OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.11 26-Feb-2007 kurt

exclude control data from the number of bytes returned by FIONREAD ioctl()
by adding a sb_datacc count to sockbuf that counts data excluding
MT_CONTROL and MT_SONAME mbuf types. w/help from deraadt@.
okay deraadt@ claudio@


Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
# 1.10 13-Dec-2005 jsg

ansi/deregister. No binary change.


Revision tags: OPENBSD_3_5_BASE OPENBSD_3_6_BASE OPENBSD_3_7_BASE OPENBSD_3_8_BASE SMP_SYNC_A SMP_SYNC_B
# 1.9 23-Sep-2003 millert

Replace select backends with poll backends. selscan() and pollscan()
now call the poll backend. With this change we implement greater
poll(2) functionality instead of emulating it via the select backend.
Adapted from NetBSD and including some changes from FreeBSD.
Tested by many, deraadt@ OK


Revision tags: OPENBSD_3_4_BASE
# 1.8 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: OPENBSD_3_0_BASE OPENBSD_3_1_BASE OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_BASE UBC_SYNC_A UBC_SYNC_B
# 1.7 14-May-2001 art

Add a fo_stat member to struct fileops. Used soon.
Also add a stat function for kqueue from FreeBSD.


# 1.6 14-May-2001 art

More generic arguments to soo_stat.


Revision tags: OPENBSD_2_9_BASE
# 1.5 01-Mar-2001 provos

port kqueue changes from freebsd, plus all required openbsd glue.
okay deraadt@, millert@
from jlemon@freebsd.org:
extend kqueue down to the device layer, backwards compatible approach
suggested by peter@freebsd.org


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE
# 1.4 19-Apr-2000 csapuntz

Change struct file interface methods read and write to pass file offset in
and out.

Make pread/pwrite in netbsd & linux thread safe - which is the whole point
anyway.


Revision tags: OPENBSD_2_2_BASE OPENBSD_2_3_BASE OPENBSD_2_4_BASE OPENBSD_2_5_BASE OPENBSD_2_6_BASE SMP_BASE kame_19991208
# 1.3 31-Aug-1997 deraadt

branches: 1.3.12;
for non-tty TIOCSPGRP/F_SETOWN/FIOSETOWN pgid setting calls, store uid
and euid as well, then deliver them using new csignal() interface
which ensures that pgid setting process is permitted to signal the
pgid process(es). Thanks to newsham@aloha.net for extensive help and
discussion.


Revision tags: OPENBSD_2_1_BASE
# 1.2 24-Feb-1997 niklas

OpenBSD tags


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


Revision tags: OPENBSD_7_4_BASE OPENBSD_7_5_BASE
# 1.61 15-Apr-2023 kn

return directly to drop needless error variable; OK mvs


Revision tags: OPENBSD_7_3_BASE
# 1.60 22-Jan-2023 mvs

Move SS_CANTRCVMORE and SS_RCVATMARK bits from `so_state' to `sb_state' of
the receive buffer. As was done for the SS_CANTSENDMORE bit, the definitions
are kept as is, but these bits now belong to the `sb_state' of the receive
buffer. `sb_state' is ORed with `so_state' when socket data is exported to
userland.

ok bluhm@


# 1.59 21-Jan-2023 mvs

Introduce per-sockbuf `sb_state' to use it with SS_CANTSENDMORE.

This time, the socket's buffer lock requires solock() to be held. As a part
of the socket buffers standalone locking work, move the socket state bits
which represent the state of its buffers to per-buffer state.

Unlike the previous reverted diff, the SS_CANTSENDMORE definition is left
as is, but it is used only with `sb_state'. `sb_state' is ORed with the
original `so_state' when the socket's data is exported to userland, so the
ABI is kept as it was.

Inputs from deraadt@.

ok bluhm@


# 1.58 12-Dec-2022 tb

Revert sb_state changes to unbreak tree.


# 1.57 11-Dec-2022 mvs

This time, the socket's buffer lock requires solock() to be held. As a part
of the socket buffers standalone locking work, move the socket state bits
which represent the state of its buffers to per-buffer state. Introduce
`sb_state' and turn SS_CANTSENDMORE into SBS_CANTSENDMORE. This bit will be
processed on the `so_snd' buffer only.

Move the SS_CANTRCVMORE and SS_RCVATMARK bits in a separate diff to make
review easier and rule out possible so_rcv/so_snd mistypes.

Also, don't adjust the remaining SS_* bits right now.

ok millert@


# 1.56 19-Nov-2022 kn

Push kernel lock into pru_control() aka. in6_control() / in_control()

so->so_state is already read without kernel lock inside soo_ioctl()
which calls pru_control() aka in6_control() and in_control().

OK mvs


# 1.55 08-Nov-2022 kn

Push kernel lock down into ifioctl()

This is a mechanical diff without semantic changes, locking ioctls
individually inside ifioctl() rather than all of them around it.

This allows us to unlock ioctls one by one.

OK mpi


Revision tags: OPENBSD_7_2_BASE
# 1.54 02-Sep-2022 mvs

Move PRU_CONTROL request to (*pru_control)().

The 'proc *' arg is not used for PRU_CONTROL request, so remove it from
pru_control() wrapper.

Split out {tcp,udp}6_usrreqs from {tcp,udp}_usrreqs and use them for
inet6 case.

ok guenther@ bluhm@


# 1.53 14-Aug-2022 jsg

remove unneeded includes in sys/kern
ok mpi@ miod@


# 1.52 13-Aug-2022 mvs

Introduce the pru_*() wrappers for corresponding (*pr_usrreq)() calls.

This is helpful for the upcoming split of (*pr_usrreq)() into multiple
handlers. For now it makes the code more readable.

Also add '#ifndef _SYS_SOCKETVAR_H_' to sys/socketvar.h. This prevents
collisions when both sys/protosw.h and sys/socketvar.h are included
together. Both the 'socket' and 'protosw' structures are required to be
defined before the pru_*() wrappers, so we need to include sys/socketvar.h
in sys/protosw.h.

ok bluhm@


# 1.51 20-Jun-2022 visa

Remove unused struct fileops field fo_poll and callbacks.

OK mpi@


# 1.50 06-Jun-2022 claudio

Simplify solock() and sounlock(). There is no reason to return a value
for the lock operation and to pass a value to the unlock operation.
sofree() still needs an extra flag to know if sounlock() should be called
or not. But sofree() is called less often and mostly without keeping the lock.
OK mpi@ mvs@


Revision tags: OPENBSD_7_1_BASE
# 1.49 25-Feb-2022 guenther

Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com
Revert the pr_usrreqs move: syzkaller found a NULL pointer deref
and I won't be available to monitor for followup issues for a bit


# 1.48 25-Feb-2022 guenther

Move pr_attach and pr_detach to a new structure pr_usrreqs that can
then be shared among protosw structures, following the same basic
direction as NetBSD and FreeBSD for this.

Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the
proper prototype to eliminate the previously necessary casts.

ok mvs@ bluhm@


# 1.47 24-Oct-2021 jsg

use NULL not 0 for pointer values in kern
ok semarie@


Revision tags: OPENBSD_7_0_BASE
# 1.46 07-Jun-2021 mpi

Kill SS_ASYNC and only check SB_ASYNC when async signals are wanted.

This socket flag was redundant with the socket buffer one.

ok mvs@


Revision tags: OPENBSD_6_7_BASE OPENBSD_6_8_BASE OPENBSD_6_9_BASE
# 1.45 22-Feb-2020 anton

In preparation for unlocking ioctl(2), grab the kernel lock as needed.

ok kettenis@ mpi@ visa@


# 1.44 08-Jan-2020 visa

Unify handling of ioctls FIOSETOWN/SIOCSPGRP/TIOCSPGRP and
FIOGETOWN/SIOCGPGRP/TIOCGPGRP. Do this by determining the meaning of
the ID parameter inside the sigio code. Also add cases for FIOSETOWN
and FIOGETOWN where there have been TIOCSPGRP and TIOCGPGRP before.
These changes allow removing the ID translation from sys_fcntl() and
sys_ioctl().

Idea from NetBSD

OK mpi@, claudio@


# 1.43 05-Jan-2020 visa

Constify instances of struct fileops.

OK anton@, mpi@, bluhm@


Revision tags: OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.42 19-Nov-2018 visa

Utilize sigio with sockets.

OK mpi@


Revision tags: OPENBSD_6_4_BASE
# 1.41 20-Aug-2018 mpi

Reorder checks in the read/write(2) family of syscalls to prepare making
file operations mp-safe.

This change makes it clear that `f_offset' is only accessed in vn_read()
and vn_write(), which will help taking it out of the KERNEL_LOCK().

This refactoring uncovered a race in vn_read() which is now documented
and will be addressed in a later diff.

ok visa@


# 1.40 30-Jul-2018 mpi

Use FNONBLOCK instead of SS_NBIO to check/indicate that the I/O mode
for sockets is non-blocking.

This allows us to G/C SS_NBIO. Having to keep the two flags in sync
in a mp-safe way is complicated.

This change introduces a behavior change in sosplice(): it can now
always block. However this should not matter much due to the socket
lock being taken beforehand.

ok bluhm@, benno@, visa@


# 1.39 10-Jul-2018 mpi

Move socket & pipe specific logic into their ioctl handlers.

ok visa@, tb@


# 1.38 06-Jun-2018 mpi

Pass the socket to sounlock(); this prepares the terrain for per-socket
locking.

ok visa@, bluhm@


# 1.37 26-Apr-2018 pirofti

Remove solock() surrounding PRU_CONTROL in soo_ioctl().

We do not need the lock there.
Missed this in my former commit pushing NET_LOCK() down the stack.

Found the hard way by naddy@, sorry!

OK mpi@.


# 1.36 10-Apr-2018 mpi

Convert 'struct fileops' definitions to C99.

ok millert@, deraadt@, florian@


Revision tags: OPENBSD_6_3_BASE
# 1.35 10-Dec-2017 mpi

Move SB_SPLICE, SB_WAIT and SB_SEL to `sb_flags', serialized by solock().

SB_KNOTE remains the only bit set on `sb_flagsintr' as it is set/unset in
contexts related to kqueue(2) where we'd like to avoid grabbing solock().

While here add some KERNEL_LOCK()/UNLOCK() dances around selwakeup() and
csignal() to mark which remaining functions need to be addressed in the
socket layer.

ok visa@, bluhm@


# 1.34 14-Nov-2017 tb

Push the NET_LOCK into ifioctl() and use the NET_RLOCK in ifioctl_get().
In particular, this allows SIOCGIF* requests to run in parallel.

lots of help & ok mpi, ok visa, sashan


Revision tags: OPENBSD_6_2_BASE
# 1.33 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.32 24-Jul-2017 mpi

Grab the socket lock in soo_ioctl() where `so_state', `so_rcv'
and `so_snd' are modified.

ok bluhm@, visa@


# 1.31 20-Jul-2017 mpi

Extend the scope of the socket lock in soo_stat() to protect `so_state'
and `so_rcv'.

ok bluhm@, claudio@, visa@


Revision tags: OPENBSD_6_1_BASE
# 1.30 22-Feb-2017 mpi

Do not grab the NET_LOCK() when poll(2)ing on unix domain sockets.

Fix the 'X freeze' while scanning with wireless interfaces. Problem
reported by pirofti@.

ok tb@, bluhm@


# 1.29 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlocks related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.28 31-Jan-2017 mpi

Remove the inifioctl hack; checking for an unheld NET_LOCK() in
tsleep(9) & friends seems to only produce false positives and cannot
be easily disabled.


# 1.27 25-Jan-2017 mpi

Introduce a hack to remove false positives when looking for memory
allocations that can sleep while holding the NET_LOCK().

To be removed once we're confident the remaining code paths are safe.

Discussed with deraadt@


# 1.26 19-Dec-2016 mpi

Introduce the NET_LOCK(), a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replaces all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.25 22-Nov-2016 mpi

Enforce that ifioctl() is called at IPL_SOFTNET.

This will allow us to keep locking simple as soon as we trade
splsoftnet() for a rwlock.

ok bluhm@


# 1.24 21-Nov-2016 mpi

Kill rtioctl() stub, returning EOPNOTSUPP since tree import.

ok jsg@


# 1.23 21-Nov-2016 mpi

Enforce that pr_usrreq functions are called at IPL_SOFTNET.

This will allow us to keep locking simple as soon as we trade
splsoftnet() for a rwlock.

ok bluhm@, claudio@


# 1.22 06-Oct-2016 bluhm

Separate splsoftnet() from variable initialization.
From mpi@'s netlock diff; OK mikeb@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.21 05-Dec-2015 tedu

remove stale lint annotations


Revision tags: OPENBSD_5_8_BASE
# 1.20 13-May-2015 jsg

test mbuf pointers against NULL not 0
ok krw@ miod@


Revision tags: OPENBSD_5_6_BASE OPENBSD_5_7_BASE
# 1.19 13-Jul-2014 tedu

bzero -> memset. for the speeds.


# 1.18 30-Mar-2014 guenther

Eliminates struct pcred by moving the real and saved ugids into
struct ucred; struct process then directly links to the ucred

Based on a discussion at c2k10 or so before noting that FreeBSD and
NetBSD did this too.

ok matthew@


Revision tags: OPENBSD_5_5_BASE
# 1.17 28-Sep-2013 millert

poll(2) on a socket should set POLLHUP on EOF. This makes the
behavior of socketpair(2) match that of pipe(2) when the other end
is closed. OK guenther@


Revision tags: OPENBSD_5_4_BASE
# 1.16 05-Apr-2013 tedu

remove some obsolete casts


Revision tags: OPENBSD_5_3_BASE
# 1.15 15-Jan-2013 bluhm

Changing the socket buffer flags sb_flags was not interrupt safe
as |= and &= are non-atomic operations. To avoid additional locks,
put the flags that have to be accessed from interrupt into a separate
sb_flagsintr 32 bit integer field. sb_flagsintr is protected by
splsoftnet.
Input from miod@ deraadt@; OK deraadt@


Revision tags: OPENBSD_4_5_BASE OPENBSD_4_6_BASE OPENBSD_4_7_BASE OPENBSD_4_8_BASE OPENBSD_4_9_BASE OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.14 22-Feb-2009 otto

fix PR 6082: do not create more fd's than will fit in the message on
the receiving side when passing fd's. ok deraadt@ kettenis@


# 1.13 02-Oct-2008 deraadt

A closed, disconnected, or otherwise failed socket is still a socket
and should return stat information instead of EINVAL from deep in the
guts of tcp_usrreq. While there, put some more information into struct
stat, inspired by FreeBSD. EINVAL problem reported in PR 5943


# 1.58 12-Dec-2022 tb

Revert sb_state changes to unbreak tree.


# 1.57 11-Dec-2022 mvs

This time, socket's buffer lock requires solock() to be held. As a part of
socket buffers standalone locking work, move socket state bits which
represent its buffers state to per buffer state. Introduce `sb_state' and
turn SS_CANTSENDMORE to SBS_CANTSENDMORE. This bit will be processed on
`so_snd' buffer only.

Move SS_CANTRCVMORE and SS_RCVATMARK bits with separate diff to make
review easier and exclude possible so_rcv/so_snd mistypes.

Also, don't adjust the remaining SS_* bits right now.

ok millert@


# 1.56 19-Nov-2022 kn

Push kernel lock into pru_control() aka. in6_control() / in_control()

so->so_state is already read without kernel lock inside soo_ioctl()
which calls pru_control() aka in6_control() and in_control().

OK mvs


# 1.55 08-Nov-2022 kn

Push kernel lock down into ifioctl()

This is a mechanical diff without semantical changes, locking ioctls
individually inside ifioctl() rather than all of them around it.

This allows us to unlock ioctls one by one.

OK mpi


Revision tags: OPENBSD_7_2_BASE
# 1.54 02-Sep-2022 mvs

Move PRU_CONTROL request to (*pru_control)().

The 'proc *' arg is not used for PRU_CONTROL request, so remove it from
pru_control() wrapper.

Split out {tcp,udp}6_usrreqs from {tcp,udp}_usrreqs and use them for
inet6 case.

ok guenther@ bluhm@


# 1.53 14-Aug-2022 jsg

remove unneeded includes in sys/kern
ok mpi@ miod@


# 1.52 13-Aug-2022 mvs

Introduce the pru_*() wrappers for corresponding (*pr_usrreq)() calls.

This is helpful for the following (*pr_usrreq)() split to multiple
handlers. But right now this makes code more readable.

Also add '#ifndef _SYS_SOCKETVAR_H_' to sys/socketvar.h. This prevents the
collisions when both sys/protosw.h and sys/socketvar.h are included
together. Both 'socket' and 'protosw' structures are required to be
defined before pru_*() wrappers, so we need to include sys/socketvar.h to
sys/protosw.h.

ok bluhm@


# 1.51 20-Jun-2022 visa

Remove unused struct fileops field fo_poll and callbacks.

OK mpi@


# 1.50 06-Jun-2022 claudio

Simplify solock() and sounlock(). There is no reason to return a value
for the lock operation and to pass a value to the unlock operation.
sofree() still needs an extra flag to know if sounlock() should be called
or not. But sofree() is called less often and mostly without keeping the lock.
OK mpi@ mvs@


Revision tags: OPENBSD_7_1_BASE
# 1.49 25-Feb-2022 guenther

Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com
Revert the pr_usrreqs move: syzkaller found a NULL pointer deref
and I won't be available to monitor for followup issues for a bit


# 1.48 25-Feb-2022 guenther

Move pr_attach and pr_detach to a new structure pr_usrreqs that can
then be shared among protosw structures, following the same basic
direction as NetBSD and FreeBSD for this.

Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the
proper prototype to eliminate the previously necessary casts.

ok mvs@ bluhm@


# 1.47 24-Oct-2021 jsg

use NULL not 0 for pointer values in kern
ok semarie@


Revision tags: OPENBSD_7_0_BASE
# 1.46 07-Jun-2021 mpi

Kill SS_ASYNC and only check SB_ASYNC when async signals are wanted.

This socket flag was redundant with the socket buffer one.

ok mvs@


Revision tags: OPENBSD_6_7_BASE OPENBSD_6_8_BASE OPENBSD_6_9_BASE
# 1.45 22-Feb-2020 anton

In preparation for unlocking ioctl(2), grab the kernel lock as needed.

ok kettenis@ mpi@ visa@


# 1.44 08-Jan-2020 visa

Unify handling of ioctls FIOSETOWN/SIOCSPGRP/TIOCSPGRP and
FIOGETOWN/SIOCGPGRP/TIOCGPGRP. Do this by determining the meaning of
the ID parameter inside the sigio code. Also add cases for FIOSETOWN
and FIOGETOWN where there have been TIOCSPGRP and TIOCGPGRP before.
These changes allow removing the ID translation from sys_fcntl() and
sys_ioctl().

Idea from NetBSD

OK mpi@, claudio@


# 1.43 05-Jan-2020 visa

Constify instances of struct fileops.

OK anton@, mpi@, bluhm@


Revision tags: OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.42 19-Nov-2018 visa

Utilize sigio with sockets.

OK mpi@


Revision tags: OPENBSD_6_4_BASE
# 1.41 20-Aug-2018 mpi

Reorder checks in the read/write(2) family of syscalls to prepare making
file operations mp-safe.

This change makes it clear that `f_offset' is only accessed in vn_read()
and vn_write(), which will help taking it out of the KERNEL_LOCK().

This refactoring uncovered a race in vn_read() which is now documented
and will be addressed in a later diff.

ok visa@


# 1.40 30-Jul-2018 mpi

Use FNONBLOCK instead of SS_NBIO to check/indicate that the I/O mode
for sockets is non-blocking.

This allows us to G/C SS_NBIO. Having to keep the two flags in sync
in a mp-safe way is complicated.

This change introduces a behavior change in sosplice(): it can now
always block. However, this should not matter much due to the socket
lock being taken beforehand.

ok bluhm@, benno@, visa@


# 1.39 10-Jul-2018 mpi

Move socket & pipe specific logic into their ioctl handlers.

ok visa@, tb@


# 1.38 06-Jun-2018 mpi

Pass the socket to sounlock(); this prepares the terrain for per-socket
locking.

ok visa@, bluhm@


# 1.37 26-Apr-2018 pirofti

Remove solock() surrounding PRU_CONTROL in soo_ioctl().

We do not need the lock there.
Missed this in my former commit pushing NET_LOCK() down the stack.

Found the hard way by naddy@, sorry!

OK mpi@.


# 1.36 10-Apr-2018 mpi

Convert 'struct fileops' definitions to C99.

ok millert@, deraadt@, florian@


Revision tags: OPENBSD_6_3_BASE
# 1.35 10-Dec-2017 mpi

Move SB_SPLICE, SB_WAIT and SB_SEL to `sb_flags', serialized by solock().

SB_KNOTE remains the only bit set on `sb_flagsintr' as it is set/unset in
contexts related to kqueue(2) where we'd like to avoid grabbing solock().

While here add some KERNEL_LOCK()/UNLOCK() dances around selwakeup() and
csignal() to mark which remaining functions need to be addressed in the
socket layer.

ok visa@, bluhm@


# 1.34 14-Nov-2017 tb

Push the NET_LOCK into ifioctl() and use the NET_RLOCK in ifioctl_get().
In particular, this allows SIOCGIF* requests to run in parallel.

lots of help & ok mpi, ok visa, sashan


Revision tags: OPENBSD_6_2_BASE
# 1.33 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.32 24-Jul-2017 mpi

Grab the socket lock in soo_ioctl() where `so_state', `so_rcv'
and `so_snd' are modified.

ok bluhm@, visa@


# 1.31 20-Jul-2017 mpi

Extend the scope of the socket lock in soo_stat() to protect `so_state'
and `so_rcv'.

ok bluhm@, claudio@, visa@


Revision tags: OPENBSD_6_1_BASE
# 1.30 22-Feb-2017 mpi

Do not grab the NET_LOCK() when poll(2)ing on unix domain sockets.

Fix the 'X freeze' while scanning with wireless interfaces. Problem
reported by pirofti@.

ok tb@, bluhm@


# 1.29 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent multiple deadlocks related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.28 31-Jan-2017 mpi

Remove the inifioctl hack; checking for an unheld NET_LOCK() in
tsleep(9) & friends seems to only produce false positives and cannot
be easily disabled.


# 1.27 25-Jan-2017 mpi

Introduce a hack to remove false positives when looking for memory
allocations that can sleep while holding the NET_LOCK().

To be removed once we're confident the remaining code paths are safe.

Discussed with deraadt@


# 1.26 19-Dec-2016 mpi

Introduce the NET_LOCK(), a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replaces all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.25 22-Nov-2016 mpi

Enforce that ifioctl() is called at IPL_SOFTNET.

This will allow us to keep locking simple as soon as we trade
splsoftnet() for a rwlock.

ok bluhm@


# 1.24 21-Nov-2016 mpi

Kill rtioctl() stub, returning EOPNOTSUPP since tree import.

ok jsg@


# 1.23 21-Nov-2016 mpi

Enforce that pr_usrreq functions are called at IPL_SOFTNET.

This will allow us to keep locking simple as soon as we trade
splsoftnet() for a rwlock.

ok bluhm@, claudio@


# 1.22 06-Oct-2016 bluhm

Separate splsoftnet() from variable initialization.
From mpi@'s netlock diff; OK mikeb@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.21 05-Dec-2015 tedu

remove stale lint annotations


Revision tags: OPENBSD_5_8_BASE
# 1.20 13-May-2015 jsg

test mbuf pointers against NULL not 0
ok krw@ miod@


Revision tags: OPENBSD_5_6_BASE OPENBSD_5_7_BASE
# 1.19 13-Jul-2014 tedu

bzero -> memset. for the speeds.


# 1.18 30-Mar-2014 guenther

Eliminates struct pcred by moving the real and saved ugids into
struct ucred; struct process then directly links to the ucred

Based on a discussion at c2k10 or so before noting that FreeBSD and
NetBSD did this too.

ok matthew@


Revision tags: OPENBSD_5_5_BASE
# 1.17 28-Sep-2013 millert

poll(2) on a socket should set POLLHUP on EOF. This makes the
behavior of socketpair(2) match that of pipe(2) when the other end
is closed. OK guenther@


Revision tags: OPENBSD_5_4_BASE
# 1.16 05-Apr-2013 tedu

remove some obsolete casts


Revision tags: OPENBSD_5_3_BASE
# 1.15 15-Jan-2013 bluhm

Changing the socket buffer flags sb_flags was not interrupt safe
as |= and &= are non-atomic operations. To avoid additional locks,
put the flags that have to be accessed from interrupt into a separate
sb_flagsintr 32 bit integer field. sb_flagsintr is protected by
splsoftnet.
Input from miod@ deraadt@; OK deraadt@


Revision tags: OPENBSD_4_5_BASE OPENBSD_4_6_BASE OPENBSD_4_7_BASE OPENBSD_4_8_BASE OPENBSD_4_9_BASE OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.14 22-Feb-2009 otto

fix PR 6082: do not create more fd's than will fit in the message on
the receiving side when passing fd's. ok deraadt@ kettenis@


# 1.13 02-Oct-2008 deraadt

A closed, disconnected, or otherwise failed socket is still a socket
and should return stat information instead of EINVAL from deep in the
guts of tcp_usrreq. While there, put some more information into struct
stat, inspired by FreeBSD. EINVAL problem reported in PR 5943


Revision tags: OPENBSD_4_4_BASE
# 1.12 23-May-2008 thib

Deal with the situation when TCP nfs mounts time out and processes
get hung in nfs_reconnect() because they do not have the proper
privileges to bind to a socket, by adding a struct proc * argument
to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind)
and do the sobind() with proc0 in nfs_connect.

OK markus@, blambert@.
"go ahead" deraadt@.

Fixes an issue reported by bernd@ (Tested by bernd@).
Fixes PR5135 too.


Revision tags: OPENBSD_4_1_BASE OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.11 26-Feb-2007 kurt

exclude control data from the number of bytes returned by FIONREAD ioctl()
by adding a sb_datacc count to sockbuf that counts data excluding
MT_CONTROL and MT_SONAME mbuf types. w/help from deraadt@.
okay deraadt@ claudio@


Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
# 1.10 13-Dec-2005 jsg

ansi/deregister. No binary change.


Revision tags: OPENBSD_3_5_BASE OPENBSD_3_6_BASE OPENBSD_3_7_BASE OPENBSD_3_8_BASE SMP_SYNC_A SMP_SYNC_B
# 1.9 23-Sep-2003 millert

Replace select backends with poll backends. selscan() and pollscan()
now call the poll backend. With this change we implement greater
poll(2) functionality instead of emulating it via the select backend.
Adapted from NetBSD and including some changes from FreeBSD.
Tested by many, deraadt@ OK


Revision tags: OPENBSD_3_4_BASE
# 1.8 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: OPENBSD_3_0_BASE OPENBSD_3_1_BASE OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_BASE UBC_SYNC_A UBC_SYNC_B
# 1.7 14-May-2001 art

Add a fo_stat member to struct fileops. Used soon.
Also add a stat function for kqueue from FreeBSD.


# 1.6 14-May-2001 art

More generic arguments to soo_stat.


Revision tags: OPENBSD_2_9_BASE
# 1.5 01-Mar-2001 provos

port kqueue changes from freebsd, plus all required openbsd glue.
okay deraadt@, millert@
from jlemon@freebsd.org:
extend kqueue down to the device layer, backwards compatible approach
suggested by peter@freebsd.org


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE
# 1.4 19-Apr-2000 csapuntz

Change struct file interface methods read and write to pass file offset in
and out.

Make pread/pwrite in netbsd & linux thread safe - which is the whole point
anyway.


Revision tags: OPENBSD_2_2_BASE OPENBSD_2_3_BASE OPENBSD_2_4_BASE OPENBSD_2_5_BASE OPENBSD_2_6_BASE SMP_BASE kame_19991208
# 1.3 31-Aug-1997 deraadt

branches: 1.3.12;
for non-tty TIOCSPGRP/F_SETOWN/FIOSETOWN pgid setting calls, store uid
and euid as well, then deliver them using new csignal() interface
which ensures that pgid setting process is permitted to signal the
pgid process(es). Thanks to newsham@aloha.net for extensive help and
discussion.


Revision tags: OPENBSD_2_1_BASE
# 1.2 24-Feb-1997 niklas

OpenBSD tags


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.55 08-Nov-2022 kn

Push kernel lock down into ifioctl()

This is a mechanical diff without semantical changes, locking ioctls
individually inside ifioctl() rather than all of them around it.

This allows us to unlock ioctls one by one.

OK mpi


Revision tags: OPENBSD_7_2_BASE
# 1.54 02-Sep-2022 mvs

Move PRU_CONTROL request to (*pru_control)().

The 'proc *' arg is not used for PRU_CONTROL request, so remove it from
pru_control() wrapper.

Split out {tcp,udp}6_usrreqs from {tcp,udp}_usrreqs and use them for
inet6 case.

ok guenther@ bluhm@


# 1.53 14-Aug-2022 jsg

remove unneeded includes in sys/kern
ok mpi@ miod@


# 1.52 13-Aug-2022 mvs

Introduce the pru_*() wrappers for corresponding (*pr_usrreq)() calls.

This is helpful for the following (*pr_usrreq)() split to multiple
handlers. But right now this makes code more readable.

Also add '#ifndef _SYS_SOCKETVAR_H_' to sys/socketvar.h. This prevents the
collisions when both sys/protosw.h and sys/socketvar.h are included
together. Both 'socket' and 'protosw' structures are required to be
defined before pru_*() wrappers, so we need to include sys/socketvar.h to
sys/protosw.h.

ok bluhm@


# 1.51 20-Jun-2022 visa

Remove unused struct fileops field fo_poll and callbacks.

OK mpi@


# 1.50 06-Jun-2022 claudio

Simplify solock() and sounlock(). There is no reason to return a value
for the lock operation and to pass a value to the unlock operation.
sofree() still needs an extra flag to know if sounlock() should be called
or not. But sofree() is called less often and mostly without keeping the lock.
OK mpi@ mvs@


Revision tags: OPENBSD_7_1_BASE
# 1.49 25-Feb-2022 guenther

Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com
Revert the pr_usrreqs move: syzkaller found a NULL pointer deref
and I won't be available to monitor for followup issues for a bit


# 1.48 25-Feb-2022 guenther

Move pr_attach and pr_detach to a new structure pr_usrreqs that can
then be shared among protosw structures, following the same basic
direction as NetBSD and FreeBSD for this.

Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the
proper prototype to eliminate the previously necessary casts.

ok mvs@ bluhm@


# 1.47 24-Oct-2021 jsg

use NULL not 0 for pointer values in kern
ok semarie@


Revision tags: OPENBSD_7_0_BASE
# 1.46 07-Jun-2021 mpi

Kill SS_ASYNC and only check SB_ASYNC when async signals are wanted.

This socket flag was redundant with the socket buffer one.

ok mvs@


Revision tags: OPENBSD_6_7_BASE OPENBSD_6_8_BASE OPENBSD_6_9_BASE
# 1.45 22-Feb-2020 anton

In preparation for unlocking ioctl(2), grab the kernel lock as needed.

ok kettenis@ mpi@ visa@


# 1.44 08-Jan-2020 visa

Unify handling of ioctls FIOSETOWN/SIOCSPGRP/TIOCSPGRP and
FIOGETOWN/SIOCGPGRP/TIOCGPGRP. Do this by determining the meaning of
the ID parameter inside the sigio code. Also add cases for FIOSETOWN
and FIOGETOWN where there have been TIOCSPGRP and TIOCGPGRP before.
These changes allow removing the ID translation from sys_fcntl() and
sys_ioctl().

Idea from NetBSD

OK mpi@, claudio@


# 1.43 05-Jan-2020 visa

Constify instances of struct fileops.

OK anton@, mpi@, bluhm@


Revision tags: OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.42 19-Nov-2018 visa

Utilize sigio with sockets.

OK mpi@


Revision tags: OPENBSD_6_4_BASE
# 1.41 20-Aug-2018 mpi

Reorder checks in the read/write(2) family of syscalls to prepare making
file operations mp-safe.

This change makes it clear that `f_offset' is only accessed in vn_read()
and vn_write(), which will help taking it out of the KERNEL_LOCK().

This refactoring uncovered a race in vn_read() which is now documented
and will be addressed in a later diff.

ok visa@


# 1.40 30-Jul-2018 mpi

Use FNONBLOCK instead of SS_NBIO to check/indicate that the I/O mode
for sockets is non-blocking.

This allows us to G/C SS_NBIO. Having to keep the two flags in sync
in a mp-safe way is complicated.

This change introduce a behavior change in sosplice(), it can now
always block. However this should not matter much due to the socket
lock being taken beforhand.

ok bluhm@, benno@, visa@


# 1.39 10-Jul-2018 mpi

Move socket & pipe specific logic in their ioctl handler.

ok visa@, tb@


# 1.38 06-Jun-2018 mpi

Pass the socket to sounlock(), this prepare the terrain for per-socket
locking.

ok visa@, bluhm@


# 1.37 26-Apr-2018 pirofti

Remove solock() surrounding PRU_CONTROL in soo_ioctl().

We do not need the lock there.
Missed this in my former commit pushing NET_LOCK() down the stack.

Found the hard way by naddy@, sorry!

OK mpi@.


# 1.36 10-Apr-2018 mpi

Convert 'struct fileops' definitions to C99.

ok millert@, deraadt@, florian@


Revision tags: OPENBSD_6_3_BASE
# 1.35 10-Dec-2017 mpi

Move SB_SPLICE, SB_WAIT and SB_SEL to `sb_flags', serialized by solock().

SB_KNOTE remains the only bit set on `sb_flagsintr' as it is set/unset in
contexts related to kqueue(2) where we'd like to avoid grabbing solock().

While here add some KERNEL_LOCK()/UNLOCK() dances around selwakeup() and
csignal() to mark which remaining functions need to be addressed in the
socket layer.

ok visa@, bluhm@


# 1.34 14-Nov-2017 tb

Push the NET_LOCK into ifioctl() and use the NET_RLOCK in ifioctl_get().
In particular, this allows SIOCGIF* requests to run in parallel.

lots of help & ok mpi, ok visa, sashan


Revision tags: OPENBSD_6_2_BASE
# 1.33 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.32 24-Jul-2017 mpi

Grab the socket lock in soo_ioctl() where `so_state', `so_rcv'
and `so_snd' are modified.

ok bluhm@, visa@


# 1.31 20-Jul-2017 mpi

Extend the scope of the socket lock in soo_stat() to protect `so_state'
and `so_rcv'.

ok bluhm@, claudio@, visa@


Revision tags: OPENBSD_6_1_BASE
# 1.30 22-Feb-2017 mpi

Do not grab the NET_LOCK() when poll(2)ing on unix domain sockets.

Fix the 'X freeze' while scanning with wireless interfaces. Problem
reported by pirofti@.

ok tb@, bluhm@


# 1.29 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlock related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.28 31-Jan-2017 mpi

Remove the inifioctl hack, checking for an unheld NET_LOCK() in
tsleep(9) & friends seem to only produce false positives and cannot
be easily disabled.


# 1.27 25-Jan-2017 mpi

Introduce a hack to remove false-positives when looking for memory
allocation that can sleep while holding the NET_LOCK().

To be removed once we're confident the remaining code paths are safe.

Discussed with deraadt@


# 1.26 19-Dec-2016 mpi

Introduce the NET_LOCK() a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replace all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.25 22-Nov-2016 mpi

Enforce that ifioctl() is called at IPL_SOFTNET.

This will allow us to keep locking simple as soon as we trade
splsoftnet() for a rwlock.

ok bluhm@


# 1.24 21-Nov-2016 mpi

Kill rtioctl() stub, returning EOPNOTSUPP since tree import.

ok jsg@


# 1.23 21-Nov-2016 mpi

Enforce that pr_usrreq functions are called at IPL_SOFTNET.

This will allow us to keep locking simple as soon as we trade
splsoftnet() for a rwlock.

ok bluhm@, claudio@


# 1.22 06-Oct-2016 bluhm

Separate splsoftnet() from variable initialization.
From mpi@'s netlock diff; OK mikeb@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.21 05-Dec-2015 tedu

remove stale lint annotations


Revision tags: OPENBSD_5_8_BASE
# 1.20 13-May-2015 jsg

test mbuf pointers against NULL not 0
ok krw@ miod@


Revision tags: OPENBSD_5_6_BASE OPENBSD_5_7_BASE
# 1.19 13-Jul-2014 tedu

bzero -> memset. for the speeds.


# 1.18 30-Mar-2014 guenther

Eliminates struct pcred by moving the real and saved ugids into
struct ucred; struct process then directly links to the ucred

Based on a discussion at c2k10 or so before noting that FreeBSD and
NetBSD did this too.

ok matthew@


Revision tags: OPENBSD_5_5_BASE
# 1.17 28-Sep-2013 millert

poll(2) on a socket should set POLLHUP on EOF. This makes the
behavior of socketpair(2) match that of pipe(2) when the other end
is closed. OK guenther@


Revision tags: OPENBSD_5_4_BASE
# 1.16 05-Apr-2013 tedu

remove some obsolete casts


Revision tags: OPENBSD_5_3_BASE
# 1.15 15-Jan-2013 bluhm

Changing the socket buffer flags sb_flags was not interrupt safe
as |= and &= are non-atomic operations. To avoid additional locks,
put the flags that have to be accessed from interrupt into a separate
sb_flagsintr 32 bit integer field. sb_flagsintr is protected by
splsoftnet.
Input from miod@ deraadt@; OK deraadt@


Revision tags: OPENBSD_4_5_BASE OPENBSD_4_6_BASE OPENBSD_4_7_BASE OPENBSD_4_8_BASE OPENBSD_4_9_BASE OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.14 22-Feb-2009 otto

fix PR 6082: do not create more fd's than will fit in the message on
the receiving side when passing fd's. ok deraadt@ kettenis@


# 1.13 02-Oct-2008 deraadt

A closed, disconnected, or otherwise failed socket is still a socket
and should return stat information instead of EINVAL from deep in the
guts of tcp_usrreq. While there, put some more information into struct
stat, inspired by FreeBSD. EINVAL problem reported in PR 5943


Revision tags: OPENBSD_4_4_BASE
# 1.12 23-May-2008 thib

Deal with the situation when TCP nfs mounts timeout and processes
get hung in nfs_reconnect() because they do not have the proper
privileges to bind to a socket, by adding a struct proc * argument
to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind)
and do the sobind() with proc0 in nfs_connect.

OK markus@, blambert@.
"go ahead" deraadt@.

Fixes an issue reported by bernd@ (Tested by bernd@).
Fixes PR5135 too.


Revision tags: OPENBSD_4_1_BASE OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.11 26-Feb-2007 kurt

exclude control data from the number of bytes returned by FIONREAD ioctl()
by adding a sb_datacc count to sockbuf that counts data excluding
MT_CONTROL and MT_SONAME mbuf types. w/help from deraadt@.
okay deraadt@ claudio@


Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
# 1.10 13-Dec-2005 jsg

ansi/deregister. No binary change.


Revision tags: OPENBSD_3_5_BASE OPENBSD_3_6_BASE OPENBSD_3_7_BASE OPENBSD_3_8_BASE SMP_SYNC_A SMP_SYNC_B
# 1.9 23-Sep-2003 millert

Replace select backends with poll backends. selscan() and pollscan()
now call the poll backend. With this change we implement greater
poll(2) functionality instead of emulating it via the select backend.
Adapted from NetBSD and including some changes from FreeBSD.
Tested by many, deraadt@ OK


Revision tags: OPENBSD_3_4_BASE
# 1.8 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: OPENBSD_3_0_BASE OPENBSD_3_1_BASE OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_BASE UBC_SYNC_A UBC_SYNC_B
# 1.7 14-May-2001 art

Add a fo_stat member to struct fileops. Used soon.
Also add a stat function for kqueue from FreeBSD.


# 1.6 14-May-2001 art

More generic arguments to soo_stat.


Revision tags: OPENBSD_2_9_BASE
# 1.5 01-Mar-2001 provos

port kqueue changes from freebsd, plus all required openbsd glue.
okay deraadt@, millert@
from jlemon@freebsd.org:
extend kqueue down to the device layer, backwards compatible approach
suggested by peter@freebsd.org


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE
# 1.4 19-Apr-2000 csapuntz

Change struct file interface methods read and write to pass file offset in
and out.

Make pread/pwrite in netbsd & linux thread safe - which is the whole point
anyway.


Revision tags: OPENBSD_2_2_BASE OPENBSD_2_3_BASE OPENBSD_2_4_BASE OPENBSD_2_5_BASE OPENBSD_2_6_BASE SMP_BASE kame_19991208
# 1.3 31-Aug-1997 deraadt

branches: 1.3.12;
for non-tty TIOCSPGRP/F_SETOWN/FIOSETOWN pgid setting calls, store uid
and euid as well, then deliver them using new csignal() interface
which ensures that pgid setting process is permitted to signal the
pgid process(es). Thanks to newsham@aloha.net for extensive help and
discussion.


Revision tags: OPENBSD_2_1_BASE
# 1.2 24-Feb-1997 niklas

OpenBSD tags


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.54 02-Sep-2022 mvs

Move PRU_CONTROL request to (*pru_control)().

The 'proc *' arg is not used for PRU_CONTROL request, so remove it from
pru_control() wrapper.

Split out {tcp,udp}6_usrreqs from {tcp,udp}_usrreqs and use them for
inet6 case.

ok guenther@ bluhm@


# 1.53 14-Aug-2022 jsg

remove unneeded includes in sys/kern
ok mpi@ miod@


# 1.52 13-Aug-2022 mvs

Introduce the pru_*() wrappers for corresponding (*pr_usrreq)() calls.

This is helpful for the following (*pr_usrreq)() split to multiple
handlers. But right now this makes code more readable.

Also add '#ifndef _SYS_SOCKETVAR_H_' to sys/socketvar.h. This prevents the
collisions when both sys/protosw.h and sys/socketvar.h are included
together. Both 'socket' and 'protosw' structures are required to be
defined before pru_*() wrappers, so we need to include sys/socketvar.h to
sys/protosw.h.

ok bluhm@


# 1.51 20-Jun-2022 visa

Remove unused struct fileops field fo_poll and callbacks.

OK mpi@


# 1.50 06-Jun-2022 claudio

Simplify solock() and sounlock(). There is no reason to return a value
for the lock operation and to pass a value to the unlock operation.
sofree() still needs an extra flag to know if sounlock() should be called
or not. But sofree() is called less often and mostly without keeping the lock.
OK mpi@ mvs@


Revision tags: OPENBSD_7_1_BASE
# 1.49 25-Feb-2022 guenther

Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com
Revert the pr_usrreqs move: syzkaller found a NULL pointer deref
and I won't be available to monitor for followup issues for a bit


# 1.48 25-Feb-2022 guenther

Move pr_attach and pr_detach to a new structure pr_usrreqs that can
then be shared among protosw structures, following the same basic
direction as NetBSD and FreeBSD for this.

Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the
proper prototype to eliminate the previously necessary casts.

ok mvs@ bluhm@


# 1.47 24-Oct-2021 jsg

use NULL not 0 for pointer values in kern
ok semarie@


Revision tags: OPENBSD_7_0_BASE
# 1.46 07-Jun-2021 mpi

Kill SS_ASYNC and only check SB_ASYNC when async signals are wanted.

This socket flag was redundant with the socket buffer one.

ok mvs@


Revision tags: OPENBSD_6_7_BASE OPENBSD_6_8_BASE OPENBSD_6_9_BASE
# 1.45 22-Feb-2020 anton

In preparation for unlocking ioctl(2), grab the kernel lock as needed.

ok kettenis@ mpi@ visa@


# 1.44 08-Jan-2020 visa

Unify handling of ioctls FIOSETOWN/SIOCSPGRP/TIOCSPGRP and
FIOGETOWN/SIOCGPGRP/TIOCGPGRP. Do this by determining the meaning of
the ID parameter inside the sigio code. Also add cases for FIOSETOWN
and FIOGETOWN where there have been TIOCSPGRP and TIOCGPGRP before.
These changes allow removing the ID translation from sys_fcntl() and
sys_ioctl().

Idea from NetBSD

OK mpi@, claudio@


# 1.43 05-Jan-2020 visa

Constify instances of struct fileops.

OK anton@, mpi@, bluhm@


Revision tags: OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.42 19-Nov-2018 visa

Utilize sigio with sockets.

OK mpi@


Revision tags: OPENBSD_6_4_BASE
# 1.41 20-Aug-2018 mpi

Reorder checks in the read/write(2) family of syscalls to prepare making
file operations mp-safe.

This change makes it clear that `f_offset' is only accessed in vn_read()
and vn_write(), which will help taking it out of the KERNEL_LOCK().

This refactoring uncovered a race in vn_read() which is now documented
and will be addressed in a later diff.

ok visa@


# 1.40 30-Jul-2018 mpi

Use FNONBLOCK instead of SS_NBIO to check/indicate that the I/O mode
for sockets is non-blocking.

This allows us to G/C SS_NBIO. Having to keep the two flags in sync
in a mp-safe way is complicated.

This change introduce a behavior change in sosplice(), it can now
always block. However this should not matter much due to the socket
lock being taken beforehand.

ok bluhm@, benno@, visa@


# 1.39 10-Jul-2018 mpi

Move socket & pipe specific logic in their ioctl handler.

ok visa@, tb@


# 1.38 06-Jun-2018 mpi

Pass the socket to sounlock(), this prepare the terrain for per-socket
locking.

ok visa@, bluhm@


# 1.37 26-Apr-2018 pirofti

Remove solock() surrounding PRU_CONTROL in soo_ioctl().

We do not need the lock there.
Missed this in my former commit pushing NET_LOCK() down the stack.

Found the hard way by naddy@, sorry!

OK mpi@.


# 1.36 10-Apr-2018 mpi

Convert 'struct fileops' definitions to C99.

ok millert@, deraadt@, florian@


Revision tags: OPENBSD_6_3_BASE
# 1.35 10-Dec-2017 mpi

Move SB_SPLICE, SB_WAIT and SB_SEL to `sb_flags', serialized by solock().

SB_KNOTE remains the only bit set on `sb_flagsintr' as it is set/unset in
contexts related to kqueue(2) where we'd like to avoid grabbing solock().

While here add some KERNEL_LOCK()/UNLOCK() dances around selwakeup() and
csignal() to mark which remaining functions need to be addressed in the
socket layer.

ok visa@, bluhm@


# 1.34 14-Nov-2017 tb

Push the NET_LOCK into ifioctl() and use the NET_RLOCK in ifioctl_get().
In particular, this allows SIOCGIF* requests to run in parallel.

lots of help & ok mpi, ok visa, sashan


Revision tags: OPENBSD_6_2_BASE
# 1.33 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.32 24-Jul-2017 mpi

Grab the socket lock in soo_ioctl() where `so_state', `so_rcv'
and `so_snd' are modified.

ok bluhm@, visa@


# 1.31 20-Jul-2017 mpi

Extend the scope of the socket lock in soo_stat() to protect `so_state'
and `so_rcv'.

ok bluhm@, claudio@, visa@


Revision tags: OPENBSD_6_1_BASE
# 1.30 22-Feb-2017 mpi

Do not grab the NET_LOCK() when poll(2)ing on unix domain sockets.

Fix the 'X freeze' while scanning with wireless interfaces. Problem
reported by pirofti@.

ok tb@, bluhm@


# 1.29 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlock related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.28 31-Jan-2017 mpi

Remove the inifioctl hack, checking for an unheld NET_LOCK() in
tsleep(9) & friends seem to only produce false positives and cannot
be easily disabled.


# 1.27 25-Jan-2017 mpi

Introduce a hack to remove false-positives when looking for memory
allocation that can sleep while holding the NET_LOCK().

To be removed once we're confident the remaining code paths are safe.

Discussed with deraadt@


# 1.26 19-Dec-2016 mpi

Introduce the NET_LOCK() a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replace all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.25 22-Nov-2016 mpi

Enforce that ifioctl() is called at IPL_SOFTNET.

This will allow us to keep locking simple as soon as we trade
splsoftnet() for a rwlock.

ok bluhm@


# 1.24 21-Nov-2016 mpi

Kill rtioctl() stub, returning EOPNOTSUPP since tree import.

ok jsg@


# 1.23 21-Nov-2016 mpi

Enforce that pr_usrreq functions are called at IPL_SOFTNET.

This will allow us to keep locking simple as soon as we trade
splsoftnet() for a rwlock.

ok bluhm@, claudio@


# 1.22 06-Oct-2016 bluhm

Separate splsoftnet() from variable initialization.
From mpi@'s netlock diff; OK mikeb@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.21 05-Dec-2015 tedu

remove stale lint annotations


Revision tags: OPENBSD_5_8_BASE
# 1.20 13-May-2015 jsg

test mbuf pointers against NULL not 0
ok krw@ miod@


Revision tags: OPENBSD_5_6_BASE OPENBSD_5_7_BASE
# 1.19 13-Jul-2014 tedu

bzero -> memset. for the speeds.


# 1.18 30-Mar-2014 guenther

Eliminates struct pcred by moving the real and saved ugids into
struct ucred; struct process then directly links to the ucred

Based on a discussion at c2k10 or so before noting that FreeBSD and
NetBSD did this too.

ok matthew@


Revision tags: OPENBSD_5_5_BASE
# 1.17 28-Sep-2013 millert

poll(2) on a socket should set POLLHUP on EOF. This makes the
behavior of socketpair(2) match that of pipe(2) when the other end
is closed. OK guenther@


Revision tags: OPENBSD_5_4_BASE
# 1.16 05-Apr-2013 tedu

remove some obsolete casts


Revision tags: OPENBSD_5_3_BASE
# 1.15 15-Jan-2013 bluhm

Changing the socket buffer flags sb_flags was not interrupt safe
as |= and &= are non-atomic operations. To avoid additional locks,
put the flags that have to be accessed from interrupt into a separate
sb_flagsintr 32 bit integer field. sb_flagsintr is protected by
splsoftnet.
Input from miod@ deraadt@; OK deraadt@


Revision tags: OPENBSD_4_5_BASE OPENBSD_4_6_BASE OPENBSD_4_7_BASE OPENBSD_4_8_BASE OPENBSD_4_9_BASE OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.14 22-Feb-2009 otto

fix PR 6082: do not create more fd's than will fit in the message on
the receiving side when passing fd's. ok deraadt@ kettenis@


# 1.13 02-Oct-2008 deraadt

A closed, disconnected, or otherwise failed socket is still a socket
and should return stat information instead of EINVAL from deep in the
guts of tcp_usrreq. While there, put some more information into struct
stat, inspired by FreeBSD. EINVAL problem reported in PR 5943


Revision tags: OPENBSD_4_4_BASE
# 1.12 23-May-2008 thib

Deal with the situation when TCP nfs mounts timeout and processes
get hung in nfs_reconnect() because they do not have the proper
privileges to bind to a socket, by adding a struct proc * argument
to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind)
and do the sobind() with proc0 in nfs_connect.

OK markus@, blambert@.
"go ahead" deraadt@.

Fixes an issue reported by bernd@ (Tested by bernd@).
Fixes PR5135 too.


Revision tags: OPENBSD_4_1_BASE OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.11 26-Feb-2007 kurt

exclude control data from the number of bytes returned by FIONREAD ioctl()
by adding a sb_datacc count to sockbuf that counts data excluding
MT_CONTROL and MT_SONAME mbuf types. w/help from deraadt@.
okay deraadt@ claudio@


Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
# 1.10 13-Dec-2005 jsg

ansi/deregister. No binary change.


Revision tags: OPENBSD_3_5_BASE OPENBSD_3_6_BASE OPENBSD_3_7_BASE OPENBSD_3_8_BASE SMP_SYNC_A SMP_SYNC_B
# 1.9 23-Sep-2003 millert

Replace select backends with poll backends. selscan() and pollscan()
now call the poll backend. With this change we implement greater
poll(2) functionality instead of emulating it via the select backend.
Adapted from NetBSD and including some changes from FreeBSD.
Tested by many, deraadt@ OK


Revision tags: OPENBSD_3_4_BASE
# 1.8 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: OPENBSD_3_0_BASE OPENBSD_3_1_BASE OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_BASE UBC_SYNC_A UBC_SYNC_B
# 1.7 14-May-2001 art

Add a fo_stat member to struct fileops. Used soon.
Also add a stat function for kqueue from FreeBSD.


# 1.6 14-May-2001 art

More generic arguments to soo_stat.


Revision tags: OPENBSD_2_9_BASE
# 1.5 01-Mar-2001 provos

port kqueue changes from freebsd, plus all required openbsd glue.
okay deraadt@, millert@
from jlemon@freebsd.org:
extend kqueue down to the device layer, backwards compatible approach
suggested by peter@freebsd.org


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE
# 1.4 19-Apr-2000 csapuntz

Change struct file interface methods read and write to pass file offset in
and out.

Make pread/pwrite in netbsd & linux thread safe - which is the whole point
anyway.


Revision tags: OPENBSD_2_2_BASE OPENBSD_2_3_BASE OPENBSD_2_4_BASE OPENBSD_2_5_BASE OPENBSD_2_6_BASE SMP_BASE kame_19991208
# 1.3 31-Aug-1997 deraadt

branches: 1.3.12;
for non-tty TIOCSPGRP/F_SETOWN/FIOSETOWN pgid setting calls, store uid
and euid as well, then deliver them using new csignal() interface
which ensures that pgid setting process is permitted to signal the
pgid process(es). Thanks to newsham@aloha.net for extensive help and
discussion.


Revision tags: OPENBSD_2_1_BASE
# 1.2 24-Feb-1997 niklas

OpenBSD tags


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.51 20-Jun-2022 visa

Remove unused struct fileops field fo_poll and callbacks.

OK mpi@


# 1.50 06-Jun-2022 claudio

Simplify solock() and sounlock(). There is no reason to return a value
for the lock operation and to pass a value to the unlock operation.
sofree() still needs an extra flag to know if sounlock() should be called
or not. But sofree() is called less often and mostly without keeping the lock.
OK mpi@ mvs@


Revision tags: OPENBSD_7_1_BASE
# 1.49 25-Feb-2022 guenther

Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com
Revert the pr_usrreqs move: syzkaller found a NULL pointer deref
and I won't be available to monitor for followup issues for a bit


# 1.48 25-Feb-2022 guenther

Move pr_attach and pr_detach to a new structure pr_usrreqs that can
then be shared among protosw structures, following the same basic
direction as NetBSD and FreeBSD for this.

Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the
proper prototype to eliminate the previously necessary casts.

ok mvs@ bluhm@


# 1.47 24-Oct-2021 jsg

use NULL not 0 for pointer values in kern
ok semarie@


Revision tags: OPENBSD_7_0_BASE
# 1.46 07-Jun-2021 mpi

Kill SS_ASYNC and only check SB_ASYNC when async signals are wanted.

This socket flag was redundant with the socket buffer one.

ok mvs@


Revision tags: OPENBSD_6_7_BASE OPENBSD_6_8_BASE OPENBSD_6_9_BASE
# 1.45 22-Feb-2020 anton

In preparation for unlocking ioctl(2), grab the kernel lock as needed.

ok kettenis@ mpi@ visa@


# 1.44 08-Jan-2020 visa

Unify handling of ioctls FIOSETOWN/SIOCSPGRP/TIOCSPGRP and
FIOGETOWN/SIOCGPGRP/TIOCGPGRP. Do this by determining the meaning of
the ID parameter inside the sigio code. Also add cases for FIOSETOWN
and FIOGETOWN where there have been TIOCSPGRP and TIOCGPGRP before.
These changes allow removing the ID translation from sys_fcntl() and
sys_ioctl().

Idea from NetBSD

OK mpi@, claudio@


# 1.43 05-Jan-2020 visa

Constify instances of struct fileops.

OK anton@, mpi@, bluhm@


Revision tags: OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.42 19-Nov-2018 visa

Utilize sigio with sockets.

OK mpi@


Revision tags: OPENBSD_6_4_BASE
# 1.41 20-Aug-2018 mpi

Reorder checks in the read/write(2) family of syscalls to prepare making
file operations mp-safe.

This change makes it clear that `f_offset' is only accessed in vn_read()
and vn_write(), which will help taking it out of the KERNEL_LOCK().

This refactoring uncovered a race in vn_read() which is now documented
and will be addressed in a later diff.

ok visa@


# 1.40 30-Jul-2018 mpi

Use FNONBLOCK instead of SS_NBIO to check/indicate that the I/O mode
for sockets is non-blocking.

This allows us to G/C SS_NBIO. Having to keep the two flags in sync
in a mp-safe way is complicated.

This change introduce a behavior change in sosplice(), it can now
always block. However this should not matter much due to the socket
lock being taken beforhand.

ok bluhm@, benno@, visa@


# 1.39 10-Jul-2018 mpi

Move socket & pipe specific logic in their ioctl handler.

ok visa@, tb@


# 1.38 06-Jun-2018 mpi

Pass the socket to sounlock(), this prepare the terrain for per-socket
locking.

ok visa@, bluhm@


# 1.37 26-Apr-2018 pirofti

Remove solock() surrounding PRU_CONTROL in soo_ioctl().

We do not need the lock there.
Missed this in my former commit pushing NET_LOCK() down the stack.

Found the hard way by naddy@, sorry!

OK mpi@.


# 1.36 10-Apr-2018 mpi

Convert 'struct fileops' definitions to C99.

ok millert@, deraadt@, florian@


Revision tags: OPENBSD_6_3_BASE
# 1.35 10-Dec-2017 mpi

Move SB_SPLICE, SB_WAIT and SB_SEL to `sb_flags', serialized by solock().

SB_KNOTE remains the only bit set on `sb_flagsintr' as it is set/unset in
contexts related to kqueue(2) where we'd like to avoid grabbing solock().

While here add some KERNEL_LOCK()/UNLOCK() dances around selwakeup() and
csignal() to mark which remaining functions need to be addressed in the
socket layer.

ok visa@, bluhm@


# 1.34 14-Nov-2017 tb

Push the NET_LOCK into ifioctl() and use the NET_RLOCK in ifioctl_get().
In particular, this allows SIOCGIF* requests to run in parallel.

lots of help & ok mpi, ok visa, sashan


Revision tags: OPENBSD_6_2_BASE
# 1.33 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.32 24-Jul-2017 mpi

Grab the socket lock in soo_ioctl() where `so_state', `so_rcv'
and `so_snd' are modified.

ok bluhm@, visa@


# 1.31 20-Jul-2017 mpi

Extend the scope of the socket lock in soo_stat() to protect `so_state'
and `so_rcv'.

ok bluhm@, claudio@, visa@


Revision tags: OPENBSD_6_1_BASE
# 1.30 22-Feb-2017 mpi

Do not grab the NET_LOCK() when poll(2)ing on unix domain sockets.

Fix the 'X freeze' while scanning with wireless interfaces. Problem
reported by pirofti@.

ok tb@, bluhm@


# 1.29 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlocks related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.28 31-Jan-2017 mpi

Remove the inifioctl hack: checking for an unheld NET_LOCK() in
tsleep(9) & friends seems to produce only false positives and cannot
be easily disabled.


# 1.27 25-Jan-2017 mpi

Introduce a hack to remove false-positives when looking for memory
allocation that can sleep while holding the NET_LOCK().

To be removed once we're confident the remaining code paths are safe.

Discussed with deraadt@


# 1.26 19-Dec-2016 mpi

Introduce the NET_LOCK(), a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replaces all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.25 22-Nov-2016 mpi

Enforce that ifioctl() is called at IPL_SOFTNET.

This will allow us to keep locking simple as soon as we trade
splsoftnet() for a rwlock.

ok bluhm@


# 1.24 21-Nov-2016 mpi

Kill rtioctl() stub, returning EOPNOTSUPP since tree import.

ok jsg@


# 1.23 21-Nov-2016 mpi

Enforce that pr_usrreq functions are called at IPL_SOFTNET.

This will allow us to keep locking simple as soon as we trade
splsoftnet() for a rwlock.

ok bluhm@, claudio@


# 1.22 06-Oct-2016 bluhm

Separate splsoftnet() from variable initialization.
From mpi@'s netlock diff; OK mikeb@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.21 05-Dec-2015 tedu

remove stale lint annotations


Revision tags: OPENBSD_5_8_BASE
# 1.20 13-May-2015 jsg

test mbuf pointers against NULL not 0
ok krw@ miod@


Revision tags: OPENBSD_5_6_BASE OPENBSD_5_7_BASE
# 1.19 13-Jul-2014 tedu

bzero -> memset. for the speeds.


# 1.18 30-Mar-2014 guenther

Eliminates struct pcred by moving the real and saved ugids into
struct ucred; struct process then directly links to the ucred

Based on a discussion at c2k10 or so before noting that FreeBSD and
NetBSD did this too.

ok matthew@


Revision tags: OPENBSD_5_5_BASE
# 1.17 28-Sep-2013 millert

poll(2) on a socket should set POLLHUP on EOF. This makes the
behavior of socketpair(2) match that of pipe(2) when the other end
is closed. OK guenther@


Revision tags: OPENBSD_5_4_BASE
# 1.16 05-Apr-2013 tedu

remove some obsolete casts


Revision tags: OPENBSD_5_3_BASE
# 1.15 15-Jan-2013 bluhm

Changing the socket buffer flags sb_flags was not interrupt safe
as |= and &= are non-atomic operations. To avoid additional locks,
put the flags that have to be accessed from interrupt into a separate
sb_flagsintr 32 bit integer field. sb_flagsintr is protected by
splsoftnet.
Input from miod@ deraadt@; OK deraadt@


Revision tags: OPENBSD_4_5_BASE OPENBSD_4_6_BASE OPENBSD_4_7_BASE OPENBSD_4_8_BASE OPENBSD_4_9_BASE OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.14 22-Feb-2009 otto

fix PR 6082: do not create more fd's than will fit in the message on
the receiving side when passing fd's. ok deraadt@ kettenis@


# 1.13 02-Oct-2008 deraadt

A closed, disconnected, or otherwise failed socket is still a socket
and should return stat information instead of EINVAL from deep in the
guts of tcp_usrreq. While there, put some more information into struct
stat, inspired by FreeBSD. EINVAL problem reported in PR 5943


Revision tags: OPENBSD_4_4_BASE
# 1.12 23-May-2008 thib

Deal with the situation when TCP nfs mounts time out and processes
get hung in nfs_reconnect() because they do not have the proper
privileges to bind to a socket, by adding a struct proc * argument
to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind)
and do the sobind() with proc0 in nfs_connect.

OK markus@, blambert@.
"go ahead" deraadt@.

Fixes an issue reported by bernd@ (Tested by bernd@).
Fixes PR5135 too.


Revision tags: OPENBSD_4_1_BASE OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.11 26-Feb-2007 kurt

exclude control data from the number of bytes returned by FIONREAD ioctl()
by adding a sb_datacc count to sockbuf that counts data excluding
MT_CONTROL and MT_SONAME mbuf types. w/help from deraadt@.
okay deraadt@ claudio@


Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
# 1.10 13-Dec-2005 jsg

ansi/deregister. No binary change.


Revision tags: OPENBSD_3_5_BASE OPENBSD_3_6_BASE OPENBSD_3_7_BASE OPENBSD_3_8_BASE SMP_SYNC_A SMP_SYNC_B
# 1.9 23-Sep-2003 millert

Replace select backends with poll backends. selscan() and pollscan()
now call the poll backend. With this change we implement greater
poll(2) functionality instead of emulating it via the select backend.
Adapted from NetBSD and including some changes from FreeBSD.
Tested by many, deraadt@ OK


Revision tags: OPENBSD_3_4_BASE
# 1.8 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: OPENBSD_3_0_BASE OPENBSD_3_1_BASE OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_BASE UBC_SYNC_A UBC_SYNC_B
# 1.7 14-May-2001 art

Add a fo_stat member to struct fileops. Used soon.
Also add a stat function for kqueue from FreeBSD.


# 1.6 14-May-2001 art

More generic arguments to soo_stat.


Revision tags: OPENBSD_2_9_BASE
# 1.5 01-Mar-2001 provos

port kqueue changes from freebsd, plus all required openbsd glue.
okay deraadt@, millert@
from jlemon@freebsd.org:
extend kqueue down to the device layer, backwards compatible approach
suggested by peter@freebsd.org


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE
# 1.4 19-Apr-2000 csapuntz

Change struct file interface methods read and write to pass file offset in
and out.

Make pread/pwrite in netbsd & linux thread safe - which is the whole point
anyway.


Revision tags: OPENBSD_2_2_BASE OPENBSD_2_3_BASE OPENBSD_2_4_BASE OPENBSD_2_5_BASE OPENBSD_2_6_BASE SMP_BASE kame_19991208
# 1.3 31-Aug-1997 deraadt

branches: 1.3.12;
for non-tty TIOCSPGRP/F_SETOWN/FIOSETOWN pgid setting calls, store uid
and euid as well, then deliver them using new csignal() interface
which ensures that pgid setting process is permitted to signal the
pgid process(es). Thanks to newsham@aloha.net for extensive help and
discussion.


Revision tags: OPENBSD_2_1_BASE
# 1.2 24-Feb-1997 niklas

OpenBSD tags


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.50 06-Jun-2022 claudio

Simplify solock() and sounlock(). There is no reason to return a value
for the lock operation and to pass a value to the unlock operation.
sofree() still needs an extra flag to know if sounlock() should be called
or not. But sofree() is called less often and mostly without keeping the lock.
OK mpi@ mvs@


Revision tags: OPENBSD_7_1_BASE
# 1.49 25-Feb-2022 guenther

Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com
Revert the pr_usrreqs move: syzkaller found a NULL pointer deref
and I won't be available to monitor for followup issues for a bit


# 1.48 25-Feb-2022 guenther

Move pr_attach and pr_detach to a new structure pr_usrreqs that can
then be shared among protosw structures, following the same basic
direction as NetBSD and FreeBSD for this.

Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the
proper prototype to eliminate the previously necessary casts.

ok mvs@ bluhm@


# 1.47 24-Oct-2021 jsg

use NULL not 0 for pointer values in kern
ok semarie@


Revision tags: OPENBSD_7_0_BASE
# 1.46 07-Jun-2021 mpi

Kill SS_ASYNC and only check SB_ASYNC when async signals are wanted.

This socket flag was redundant with the socket buffer one.

ok mvs@


Revision tags: OPENBSD_6_7_BASE OPENBSD_6_8_BASE OPENBSD_6_9_BASE
# 1.45 22-Feb-2020 anton

In preparation for unlocking ioctl(2), grab the kernel lock as needed.

ok kettenis@ mpi@ visa@


# 1.44 08-Jan-2020 visa

Unify handling of ioctls FIOSETOWN/SIOCSPGRP/TIOCSPGRP and
FIOGETOWN/SIOCGPGRP/TIOCGPGRP. Do this by determining the meaning of
the ID parameter inside the sigio code. Also add cases for FIOSETOWN
and FIOGETOWN where there have been TIOCSPGRP and TIOCGPGRP before.
These changes allow removing the ID translation from sys_fcntl() and
sys_ioctl().

Idea from NetBSD

OK mpi@, claudio@


# 1.43 05-Jan-2020 visa

Constify instances of struct fileops.

OK anton@, mpi@, bluhm@


Revision tags: OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.42 19-Nov-2018 visa

Utilize sigio with sockets.

OK mpi@


Revision tags: OPENBSD_6_4_BASE
# 1.41 20-Aug-2018 mpi

Reorder checks in the read/write(2) family of syscalls to prepare making
file operations mp-safe.

This change makes it clear that `f_offset' is only accessed in vn_read()
and vn_write(), which will help taking it out of the KERNEL_LOCK().

This refactoring uncovered a race in vn_read() which is now documented
and will be addressed in a later diff.

ok visa@


# 1.26 19-Dec-2016 mpi

Introduce the NET_LOCK() a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replace all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.25 22-Nov-2016 mpi

Enforce that ifioctl() is called at IPL_SOFTNET.

This will allow us to keep locking simple as soon as we trade
splsoftnet() for a rwlock.

ok bluhm@


# 1.24 21-Nov-2016 mpi

Kill rtioctl() stub, returning EOPNOTSUPP since tree import.

ok jsg@


# 1.23 21-Nov-2016 mpi

Enforce that pr_usrreq functions are called at IPL_SOFTNET.

This will allow us to keep locking simple as soon as we trade
splsoftnet() for a rwlock.

ok bluhm@, claudio@


# 1.22 06-Oct-2016 bluhm

Separate splsoftnet() from variable initialization.
From mpi@'s netlock diff; OK mikeb@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.21 05-Dec-2015 tedu

remove stale lint annotations


Revision tags: OPENBSD_5_8_BASE
# 1.20 13-May-2015 jsg

test mbuf pointers against NULL not 0
ok krw@ miod@


Revision tags: OPENBSD_5_6_BASE OPENBSD_5_7_BASE
# 1.19 13-Jul-2014 tedu

bzero -> memset. for the speeds.


# 1.18 30-Mar-2014 guenther

Eliminates struct pcred by moving the real and saved ugids into
struct ucred; struct process then directly links to the ucred

Based on a discussion at c2k10 or so before noting that FreeBSD and
NetBSD did this too.

ok matthew@


Revision tags: OPENBSD_5_5_BASE
# 1.17 28-Sep-2013 millert

poll(2) on a socket should set POLLHUP on EOF. This makes the
behavior of socketpair(2) match that of pipe(2) when the other end
is closed. OK guenther@


Revision tags: OPENBSD_5_4_BASE
# 1.16 05-Apr-2013 tedu

remove some obsolete casts


Revision tags: OPENBSD_5_3_BASE
# 1.15 15-Jan-2013 bluhm

Changing the socket buffer flags sb_flags was not interrupt safe
as |= and &= are non-atomic operations. To avoid additional locks,
put the flags that have to be accessed from interrupt into a separate
sb_flagsintr 32 bit integer field. sb_flagsintr is protected by
splsoftnet.
Input from miod@ deraadt@; OK deraadt@


Revision tags: OPENBSD_4_5_BASE OPENBSD_4_6_BASE OPENBSD_4_7_BASE OPENBSD_4_8_BASE OPENBSD_4_9_BASE OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.14 22-Feb-2009 otto

fix PR 6082: do not create more fd's than will fit in the message on
the receiving side when passing fd's. ok deraadt@ kettenis@


# 1.13 02-Oct-2008 deraadt

A closed, disconnected, or otherwise failed socket is still a socket
and should return stat information instead of EINVAL from deep in the
guts of tcp_usrreq. While there, put some more information into struct
stat, inspired by FreeBSD. EINVAL problem reported in PR 5943


Revision tags: OPENBSD_4_4_BASE
# 1.12 23-May-2008 thib

Deal with the situation when TCP nfs mounts timeout and processes
get hung in nfs_reconnect() because they do not have the proper
privileges to bind to a socket, by adding a struct proc * argument
to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind)
and do the sobind() with proc0 in nfs_connect.

OK markus@, blambert@.
"go ahead" deraadt@.

Fixes an issue reported by bernd@ (Tested by bernd@).
Fixes PR5135 too.


Revision tags: OPENBSD_4_1_BASE OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.11 26-Feb-2007 kurt

exclude control data from the number of bytes returned by FIONREAD ioctl()
by adding a sb_datacc count to sockbuf that counts data excluding
MT_CONTROL and MT_SONAME mbuf types. w/help from deraadt@.
okay deraadt@ claudio@


Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
# 1.10 13-Dec-2005 jsg

ansi/deregister. No binary change.


Revision tags: OPENBSD_3_5_BASE OPENBSD_3_6_BASE OPENBSD_3_7_BASE OPENBSD_3_8_BASE SMP_SYNC_A SMP_SYNC_B
# 1.9 23-Sep-2003 millert

Replace select backends with poll backends. selscan() and pollscan()
now call the poll backend. With this change we implement greater
poll(2) functionality instead of emulating it via the select backend.
Adapted from NetBSD and including some changes from FreeBSD.
Tested by many, deraadt@ OK


Revision tags: OPENBSD_3_4_BASE
# 1.8 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: OPENBSD_3_0_BASE OPENBSD_3_1_BASE OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_BASE UBC_SYNC_A UBC_SYNC_B
# 1.7 14-May-2001 art

Add a fo_stat member to struct fileops. Used soon.
Also add a stat function for kqueue from FreeBSD.


# 1.6 14-May-2001 art

More generic arguments to soo_stat.


Revision tags: OPENBSD_2_9_BASE
# 1.5 01-Mar-2001 provos

port kqueue changes from freebsd, plus all required openbsd glue.
okay deraadt@, millert@
from jlemon@freebsd.org:
extend kqueue down to the device layer, backwards compatible approach
suggested by peter@freebsd.org


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE
# 1.4 19-Apr-2000 csapuntz

Change struct file interface methods read and write to pass file offset in
and out.

Make pread/pwrite in netbsd & linux thread safe - which is the whole point
anyway.


Revision tags: OPENBSD_2_2_BASE OPENBSD_2_3_BASE OPENBSD_2_4_BASE OPENBSD_2_5_BASE OPENBSD_2_6_BASE SMP_BASE kame_19991208
# 1.3 31-Aug-1997 deraadt

branches: 1.3.12;
for non-tty TIOCSPGRP/F_SETOWN/FIOSETOWN pgid setting calls, store uid
and euid as well, then deliver them using new csignal() interface
which ensures that pgid setting process is permitted to signal the
pgid process(es). Thanks to newsham@aloha.net for extensive help and
discussion.


Revision tags: OPENBSD_2_1_BASE
# 1.2 24-Feb-1997 niklas

OpenBSD tags


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.47 24-Oct-2021 jsg

use NULL not 0 for pointer values in kern
ok semarie@


Revision tags: OPENBSD_7_0_BASE
# 1.46 07-Jun-2021 mpi

Kill SS_ASYNC and only check SB_ASYNC when async signals are wanted.

This socket flag was redundant with the socket buffer one.

ok mvs@


Revision tags: OPENBSD_6_7_BASE OPENBSD_6_8_BASE OPENBSD_6_9_BASE
# 1.45 22-Feb-2020 anton

In preparation for unlocking ioctl(2), grab the kernel lock as needed.

ok kettenis@ mpi@ visa@


# 1.44 08-Jan-2020 visa

Unify handling of ioctls FIOSETOWN/SIOCSPGRP/TIOCSPGRP and
FIOGETOWN/SIOCGPGRP/TIOCGPGRP. Do this by determining the meaning of
the ID parameter inside the sigio code. Also add cases for FIOSETOWN
and FIOGETOWN where there have been TIOCSPGRP and TIOCGPGRP before.
These changes allow removing the ID translation from sys_fcntl() and
sys_ioctl().

Idea from NetBSD

OK mpi@, claudio@


# 1.45 22-Feb-2020 anton

In preparation for unlocking ioctl(2), grab the kernel lock as needed.

ok kettenis@ mpi@ visa@


# 1.44 08-Jan-2020 visa

Unify handling of ioctls FIOSETOWN/SIOCSPGRP/TIOCSPGRP and
FIOGETOWN/SIOCGPGRP/TIOCGPGRP. Do this by determining the meaning of
the ID parameter inside the sigio code. Also add cases for FIOSETOWN
and FIOGETOWN where there have been TIOCSPGRP and TIOCGPGRP before.
These changes allow removing the ID translation from sys_fcntl() and
sys_ioctl().

Idea from NetBSD

OK mpi@, claudio@


# 1.43 05-Jan-2020 visa

Constify instances of struct fileops.

OK anton@, mpi@, bluhm@


Revision tags: OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.42 19-Nov-2018 visa

Utilize sigio with sockets.

OK mpi@


Revision tags: OPENBSD_6_4_BASE
# 1.41 20-Aug-2018 mpi

Reorder checks in the read/write(2) family of syscalls to prepare making
file operations mp-safe.

This change makes it clear that `f_offset' is only accessed in vn_read()
and vn_write(), which will help taking it out of the KERNEL_LOCK().

This refactoring uncovered a race in vn_read() which is now documented
and will be addressed in a later diff.

ok visa@


# 1.40 30-Jul-2018 mpi

Use FNONBLOCK instead of SS_NBIO to check/indicate that the I/O mode
for sockets is non-blocking.

This allows us to G/C SS_NBIO. Having to keep the two flags in sync
in a mp-safe way is complicated.

This change introduce a behavior change in sosplice(), it can now
always block. However this should not matter much due to the socket
lock being taken beforhand.

ok bluhm@, benno@, visa@


# 1.39 10-Jul-2018 mpi

Move socket & pipe specific logic in their ioctl handler.

ok visa@, tb@


# 1.38 06-Jun-2018 mpi

Pass the socket to sounlock(), this prepare the terrain for per-socket
locking.

ok visa@, bluhm@


# 1.37 26-Apr-2018 pirofti

Remove solock() surrounding PRU_CONTROL in soo_ioctl().

We do not need the lock there.
Missed this in my former commit pushing NET_LOCK() down the stack.

Found the hard way by naddy@, sorry!

OK mpi@.


# 1.36 10-Apr-2018 mpi

Convert 'struct fileops' definitions to C99.

ok millert@, deraadt@, florian@


Revision tags: OPENBSD_6_3_BASE
# 1.35 10-Dec-2017 mpi

Move SB_SPLICE, SB_WAIT and SB_SEL to `sb_flags', serialized by solock().

SB_KNOTE remains the only bit set on `sb_flagsintr' as it is set/unset in
contexts related to kqueue(2) where we'd like to avoid grabbing solock().

While here add some KERNEL_LOCK()/UNLOCK() dances around selwakeup() and
csignal() to mark which remaining functions need to be addressed in the
socket layer.

ok visa@, bluhm@


# 1.34 14-Nov-2017 tb

Push the NET_LOCK into ifioctl() and use the NET_RLOCK in ifioctl_get().
In particular, this allows SIOCGIF* requests to run in parallel.

lots of help & ok mpi, ok visa, sashan


Revision tags: OPENBSD_6_2_BASE
# 1.33 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.32 24-Jul-2017 mpi

Grab the socket lock in soo_ioctl() where `so_state', `so_rcv'
and `so_snd' are modified.

ok bluhm@, visa@


# 1.31 20-Jul-2017 mpi

Extend the scope of the socket lock in soo_stat() to protect `so_state'
and `so_rcv'.

ok bluhm@, claudio@, visa@


Revision tags: OPENBSD_6_1_BASE
# 1.30 22-Feb-2017 mpi

Do not grab the NET_LOCK() when poll(2)ing on unix domain sockets.

Fix the 'X freeze' while scanning with wireless interfaces. Problem
reported by pirofti@.

ok tb@, bluhm@


# 1.29 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlock related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.28 31-Jan-2017 mpi

Remove the inifioctl hack, checking for an unheld NET_LOCK() in
tsleep(9) & friends seem to only produce false positives and cannot
be easily disabled.


# 1.27 25-Jan-2017 mpi

Introduce a hack to remove false-positives when looking for memory
allocation that can sleep while holding the NET_LOCK().

To be removed once we're confident the remaining code paths are safe.

Discussed with deraadt@


# 1.26 19-Dec-2016 mpi

Introduce the NET_LOCK(), a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replaces all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.25 22-Nov-2016 mpi

Enforce that ifioctl() is called at IPL_SOFTNET.

This will allow us to keep locking simple as soon as we trade
splsoftnet() for a rwlock.

ok bluhm@


# 1.24 21-Nov-2016 mpi

Kill rtioctl() stub, returning EOPNOTSUPP since tree import.

ok jsg@


# 1.23 21-Nov-2016 mpi

Enforce that pr_usrreq functions are called at IPL_SOFTNET.

This will allow us to keep locking simple as soon as we trade
splsoftnet() for a rwlock.

ok bluhm@, claudio@


# 1.22 06-Oct-2016 bluhm

Separate splsoftnet() from variable initialization.
From mpi@'s netlock diff; OK mikeb@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.21 05-Dec-2015 tedu

remove stale lint annotations


Revision tags: OPENBSD_5_8_BASE
# 1.20 13-May-2015 jsg

test mbuf pointers against NULL not 0
ok krw@ miod@


Revision tags: OPENBSD_5_6_BASE OPENBSD_5_7_BASE
# 1.19 13-Jul-2014 tedu

bzero -> memset. for the speeds.


# 1.18 30-Mar-2014 guenther

Eliminates struct pcred by moving the real and saved ugids into
struct ucred; struct process then directly links to the ucred

Based on a discussion at c2k10 or so before noting that FreeBSD and
NetBSD did this too.

ok matthew@


Revision tags: OPENBSD_5_5_BASE
# 1.17 28-Sep-2013 millert

poll(2) on a socket should set POLLHUP on EOF. This makes the
behavior of socketpair(2) match that of pipe(2) when the other end
is closed. OK guenther@


Revision tags: OPENBSD_5_4_BASE
# 1.16 05-Apr-2013 tedu

remove some obsolete casts


Revision tags: OPENBSD_5_3_BASE
# 1.15 15-Jan-2013 bluhm

Changing the socket buffer flags sb_flags was not interrupt safe
as |= and &= are non-atomic operations. To avoid additional locks,
put the flags that have to be accessed from interrupt into a separate
sb_flagsintr 32 bit integer field. sb_flagsintr is protected by
splsoftnet.
Input from miod@ deraadt@; OK deraadt@


Revision tags: OPENBSD_4_5_BASE OPENBSD_4_6_BASE OPENBSD_4_7_BASE OPENBSD_4_8_BASE OPENBSD_4_9_BASE OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.14 22-Feb-2009 otto

fix PR 6082: do not create more fd's than will fit in the message on
the receiving side when passing fd's. ok deraadt@ kettenis@


# 1.13 02-Oct-2008 deraadt

A closed, disconnected, or otherwise failed socket is still a socket
and should return stat information instead of EINVAL from deep in the
guts of tcp_usrreq. While there, put some more information into struct
stat, inspired by FreeBSD. EINVAL problem reported in PR 5943


Revision tags: OPENBSD_4_4_BASE
# 1.12 23-May-2008 thib

Deal with the situation when TCP nfs mounts timeout and processes
get hung in nfs_reconnect() because they do not have the proper
privileges to bind to a socket, by adding a struct proc * argument
to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind)
and do the sobind() with proc0 in nfs_connect.

OK markus@, blambert@.
"go ahead" deraadt@.

Fixes an issue reported by bernd@ (Tested by bernd@).
Fixes PR5135 too.


Revision tags: OPENBSD_4_1_BASE OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.11 26-Feb-2007 kurt

exclude control data from the number of bytes returned by FIONREAD ioctl()
by adding a sb_datacc count to sockbuf that counts data excluding
MT_CONTROL and MT_SONAME mbuf types. w/help from deraadt@.
okay deraadt@ claudio@


Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
# 1.10 13-Dec-2005 jsg

ansi/deregister. No binary change.


Revision tags: OPENBSD_3_5_BASE OPENBSD_3_6_BASE OPENBSD_3_7_BASE OPENBSD_3_8_BASE SMP_SYNC_A SMP_SYNC_B
# 1.9 23-Sep-2003 millert

Replace select backends with poll backends. selscan() and pollscan()
now call the poll backend. With this change we implement greater
poll(2) functionality instead of emulating it via the select backend.
Adapted from NetBSD and including some changes from FreeBSD.
Tested by many, deraadt@ OK


Revision tags: OPENBSD_3_4_BASE
# 1.8 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: OPENBSD_3_0_BASE OPENBSD_3_1_BASE OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_BASE UBC_SYNC_A UBC_SYNC_B
# 1.7 14-May-2001 art

Add a fo_stat member to struct fileops. Used soon.
Also add a stat function for kqueue from FreeBSD.


# 1.6 14-May-2001 art

More generic arguments to soo_stat.


Revision tags: OPENBSD_2_9_BASE
# 1.5 01-Mar-2001 provos

port kqueue changes from freebsd, plus all required openbsd glue.
okay deraadt@, millert@
from jlemon@freebsd.org:
extend kqueue down to the device layer, backwards compatible approach
suggested by peter@freebsd.org


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE
# 1.4 19-Apr-2000 csapuntz

Change struct file interface methods read and write to pass file offset in
and out.

Make pread/pwrite in netbsd & linux thread safe - which is the whole point
anyway.


Revision tags: OPENBSD_2_2_BASE OPENBSD_2_3_BASE OPENBSD_2_4_BASE OPENBSD_2_5_BASE OPENBSD_2_6_BASE SMP_BASE kame_19991208
# 1.3 31-Aug-1997 deraadt

branches: 1.3.12;
for non-tty TIOCSPGRP/F_SETOWN/FIOSETOWN pgid setting calls, store uid
and euid as well, then deliver them using new csignal() interface
which ensures that pgid setting process is permitted to signal the
pgid process(es). Thanks to newsham@aloha.net for extensive help and
discussion.


Revision tags: OPENBSD_2_1_BASE
# 1.2 24-Feb-1997 niklas

OpenBSD tags


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.44 08-Jan-2020 visa

Unify handling of ioctls FIOSETOWN/SIOCSPGRP/TIOCSPGRP and
FIOGETOWN/SIOCGPGRP/TIOCGPGRP. Do this by determining the meaning of
the ID parameter inside the sigio code. Also add cases for FIOSETOWN
and FIOGETOWN where there have been TIOCSPGRP and TIOCGPGRP before.
These changes allow removing the ID translation from sys_fcntl() and
sys_ioctl().

Idea from NetBSD

OK mpi@, claudio@


# 1.43 05-Jan-2020 visa

Constify instances of struct fileops.

OK anton@, mpi@, bluhm@


Revision tags: OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.42 19-Nov-2018 visa

Utilize sigio with sockets.

OK mpi@


Revision tags: OPENBSD_6_4_BASE
# 1.41 20-Aug-2018 mpi

Reorder checks in the read/write(2) family of syscalls to prepare making
file operations mp-safe.

This change makes it clear that `f_offset' is only accessed in vn_read()
and vn_write(), which will help taking it out of the KERNEL_LOCK().

This refactoring uncovered a race in vn_read() which is now documented
and will be addressed in a later diff.

ok visa@


# 1.40 30-Jul-2018 mpi

Use FNONBLOCK instead of SS_NBIO to check/indicate that the I/O mode
for sockets is non-blocking.

This allows us to G/C SS_NBIO. Having to keep the two flags in sync
in a mp-safe way is complicated.

This change introduces a behavior change in sosplice(): it can now
always block. However this should not matter much due to the socket
lock being taken beforehand.

ok bluhm@, benno@, visa@


# 1.39 10-Jul-2018 mpi

Move socket- and pipe-specific logic into their ioctl handlers.

ok visa@, tb@


# 1.38 06-Jun-2018 mpi

Pass the socket to sounlock(); this prepares the terrain for per-socket
locking.

ok visa@, bluhm@


# 1.37 26-Apr-2018 pirofti

Remove solock() surrounding PRU_CONTROL in soo_ioctl().

We do not need the lock there.
Missed this in my former commit pushing NET_LOCK() down the stack.

Found the hard way by naddy@, sorry!

OK mpi@.


# 1.42 19-Nov-2018 visa

Utilize sigio with sockets.

OK mpi@


Revision tags: OPENBSD_6_4_BASE
# 1.41 20-Aug-2018 mpi

Reorder checks in the read/write(2) family of syscalls to prepare making
file operations mp-safe.

This change makes it clear that `f_offset' is only accessed in vn_read()
and vn_write(), which will help taking it out of the KERNEL_LOCK().

This refactoring uncovered a race in vn_read() which is now documented
and will be addressed in a later diff.

ok visa@


# 1.40 30-Jul-2018 mpi

Use FNONBLOCK instead of SS_NBIO to check/indicate that the I/O mode
for sockets is non-blocking.

This allows us to G/C SS_NBIO. Having to keep the two flags in sync
in a mp-safe way is complicated.

This change introduce a behavior change in sosplice(), it can now
always block. However this should not matter much due to the socket
lock being taken beforhand.

ok bluhm@, benno@, visa@


# 1.39 10-Jul-2018 mpi

Move socket & pipe specific logic in their ioctl handler.

ok visa@, tb@


# 1.38 06-Jun-2018 mpi

Pass the socket to sounlock(), this prepare the terrain for per-socket
locking.

ok visa@, bluhm@


# 1.37 26-Apr-2018 pirofti

Remove solock() surrounding PRU_CONTROL in soo_ioctl().

We do not need the lock there.
Missed this in my former commit pushing NET_LOCK() down the stack.

Found the hard way by naddy@, sorry!

OK mpi@.


# 1.36 10-Apr-2018 mpi

Convert 'struct fileops' definitions to C99.

ok millert@, deraadt@, florian@


Revision tags: OPENBSD_6_3_BASE
# 1.35 10-Dec-2017 mpi

Move SB_SPLICE, SB_WAIT and SB_SEL to `sb_flags', serialized by solock().

SB_KNOTE remains the only bit set on `sb_flagsintr' as it is set/unset in
contexts related to kqueue(2) where we'd like to avoid grabbing solock().

While here add some KERNEL_LOCK()/UNLOCK() dances around selwakeup() and
csignal() to mark which remaining functions need to be addressed in the
socket layer.

ok visa@, bluhm@


# 1.34 14-Nov-2017 tb

Push the NET_LOCK into ifioctl() and use the NET_RLOCK in ifioctl_get().
In particular, this allows SIOCGIF* requests to run in parallel.

lots of help & ok mpi, ok visa, sashan


Revision tags: OPENBSD_6_2_BASE
# 1.33 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.32 24-Jul-2017 mpi

Grab the socket lock in soo_ioctl() where `so_state', `so_rcv'
and `so_snd' are modified.

ok bluhm@, visa@


# 1.31 20-Jul-2017 mpi

Extend the scope of the socket lock in soo_stat() to protect `so_state'
and `so_rcv'.

ok bluhm@, claudio@, visa@


Revision tags: OPENBSD_6_1_BASE
# 1.30 22-Feb-2017 mpi

Do not grab the NET_LOCK() when poll(2)ing on unix domain sockets.

Fix the 'X freeze' while scanning with wireless interfaces. Problem
reported by pirofti@.

ok tb@, bluhm@


# 1.29 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlock related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.28 31-Jan-2017 mpi

Remove the inifioctl hack, checking for an unheld NET_LOCK() in
tsleep(9) & friends seem to only produce false positives and cannot
be easily disabled.


# 1.27 25-Jan-2017 mpi

Introduce a hack to remove false-positives when looking for memory
allocation that can sleep while holding the NET_LOCK().

To be removed once we're confident the remaining code paths are safe.

Discussed with deraadt@


# 1.26 19-Dec-2016 mpi

Introduce the NET_LOCK() a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replace all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.25 22-Nov-2016 mpi

Enforce that ifioctl() is called at IPL_SOFTNET.

This will allow us to keep locking simple as soon as we trade
splsoftnet() for a rwlock.

ok bluhm@


# 1.24 21-Nov-2016 mpi

Kill rtioctl() stub, returning EOPNOTSUPP since tree import.

ok jsg@


# 1.23 21-Nov-2016 mpi

Enforce that pr_usrreq functions are called at IPL_SOFTNET.

This will allow us to keep locking simple as soon as we trade
splsoftnet() for a rwlock.

ok bluhm@, claudio@


# 1.22 06-Oct-2016 bluhm

Separate splsoftnet() from variable initialization.
From mpi@'s netlock diff; OK mikeb@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.21 05-Dec-2015 tedu

remove stale lint annotations


Revision tags: OPENBSD_5_8_BASE
# 1.20 13-May-2015 jsg

test mbuf pointers against NULL not 0
ok krw@ miod@


Revision tags: OPENBSD_5_6_BASE OPENBSD_5_7_BASE
# 1.19 13-Jul-2014 tedu

bzero -> memset. for the speeds.


# 1.18 30-Mar-2014 guenther

Eliminates struct pcred by moving the real and saved ugids into
struct ucred; struct process then directly links to the ucred

Based on a discussion at c2k10 or so before noting that FreeBSD and
NetBSD did this too.

ok matthew@


Revision tags: OPENBSD_5_5_BASE
# 1.17 28-Sep-2013 millert

poll(2) on a socket should set POLLHUP on EOF. This makes the
behavior of socketpair(2) match that of pipe(2) when the other end
is closed. OK guenther@


Revision tags: OPENBSD_5_4_BASE
# 1.16 05-Apr-2013 tedu

remove some obsolete casts


Revision tags: OPENBSD_5_3_BASE
# 1.15 15-Jan-2013 bluhm

Changing the socket buffer flags sb_flags was not interrupt safe
as |= and &= are non-atomic operations. To avoid additional locks,
put the flags that have to be accessed from interrupt into a separate
sb_flagsintr 32 bit integer field. sb_flagsintr is protected by
splsoftnet.
Input from miod@ deraadt@; OK deraadt@


Revision tags: OPENBSD_4_5_BASE OPENBSD_4_6_BASE OPENBSD_4_7_BASE OPENBSD_4_8_BASE OPENBSD_4_9_BASE OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.14 22-Feb-2009 otto

fix PR 6082: do not create more fd's than will fit in the message on
the receiving side when passing fd's. ok deraadt@ kettenis@


# 1.13 02-Oct-2008 deraadt

A closed, disconnected, or otherwise failed socket is still a socket
and should return stat information instead of EINVAL from deep in the
guts of tcp_usrreq. While there, put some more information into struct
stat, inspired by FreeBSD. EINVAL problem reported in PR 5943


Revision tags: OPENBSD_4_4_BASE
# 1.12 23-May-2008 thib

Deal with the situation when TCP nfs mounts timeout and processes
get hung in nfs_reconnect() because they do not have the proper
privilages to bind to a socket, by adding a struct proc * argument
to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind)
and do the sobind() with proc0 in nfs_connect.

OK markus@, blambert@.
"go ahead" deraadt@.

Fixes an issue reported by bernd@ (Tested by bernd@).
Fixes PR5135 too.


Revision tags: OPENBSD_4_1_BASE OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.11 26-Feb-2007 kurt

exclude control data from the number of bytes returned by FIONREAD ioctl()
by adding a sb_datacc count to sockbuf that counts data excluding
MT_CONTROL and MT_SONAME mbuf types. w/help from deraadt@.
okay deraadt@ claudio@
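
As a rough userland illustration of what FIONREAD reports, the sketch below writes a few bytes into one end of a socketpair(2) and asks the other end how much is pending. The helper name pending_bytes_after_write() is hypothetical. No control message is sent here, so the sketch only shows the plain data count and does not by itself demonstrate the exclusion of MT_CONTROL or MT_SONAME mbufs.

```c
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len bytes into one end of a socketpair and
 * return the byte count that FIONREAD reports on the other end, or -1.
 */
static int
pending_bytes_after_write(const void *buf, size_t len)
{
	int sv[2];
	int pending = -1;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return -1;
	if (write(sv[0], buf, len) == (ssize_t)len) {
		/* FIONREAD stores the number of readable bytes in an int. */
		if (ioctl(sv[1], FIONREAD, &pending) == -1)
			pending = -1;
	}
	close(sv[0]);
	close(sv[1]);
	return pending;
}
```

Calling pending_bytes_after_write("hello", 5) should report the five data bytes that were queued.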


Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
# 1.10 13-Dec-2005 jsg

ansi/deregister. No binary change.


Revision tags: OPENBSD_3_5_BASE OPENBSD_3_6_BASE OPENBSD_3_7_BASE OPENBSD_3_8_BASE SMP_SYNC_A SMP_SYNC_B
# 1.9 23-Sep-2003 millert

Replace select backends with poll backends. selscan() and pollscan()
now call the poll backend. With this change we implement greater
poll(2) functionality instead of emulating it via the select backend.
Adapted from NetBSD and including some changes from FreeBSD.
Tested by many, deraadt@ OK


Revision tags: OPENBSD_3_4_BASE
# 1.8 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: OPENBSD_3_0_BASE OPENBSD_3_1_BASE OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_BASE UBC_SYNC_A UBC_SYNC_B
# 1.7 14-May-2001 art

Add a fo_stat member to struct fileops. Used soon.
Also add a stat function for kqueue from FreeBSD.


# 1.6 14-May-2001 art

More generic arguments to soo_stat.


Revision tags: OPENBSD_2_9_BASE
# 1.5 01-Mar-2001 provos

port kqueue changes from freebsd, plus all required openbsd glue.
okay deraadt@, millert@
from jlemon@freebsd.org:
extend kqueue down to the device layer, backwards compatible approach
suggested by peter@freebsd.org


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE
# 1.4 19-Apr-2000 csapuntz

Change struct file interface methods read and write to pass file offset in
and out.

Make pread/pwrite in netbsd & linux thread safe - which is the whole point
anyway.


Revision tags: OPENBSD_2_2_BASE OPENBSD_2_3_BASE OPENBSD_2_4_BASE OPENBSD_2_5_BASE OPENBSD_2_6_BASE SMP_BASE kame_19991208
# 1.3 31-Aug-1997 deraadt

branches: 1.3.12;
for non-tty TIOCSPGRP/F_SETOWN/FIOSETOWN pgid setting calls, store uid
and euid as well, then deliver them using new csignal() interface
which ensures that pgid setting process is permitted to signal the
pgid process(es). Thanks to newsham@aloha.net for extensive help and
discussion.


Revision tags: OPENBSD_2_1_BASE
# 1.2 24-Feb-1997 niklas

OpenBSD tags


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.41 20-Aug-2018 mpi

Reorder checks in the read/write(2) family of syscalls to prepare making
file operations mp-safe.

This change makes it clear that `f_offset' is only accessed in vn_read()
and vn_write(), which will help taking it out of the KERNEL_LOCK().

This refactoring uncovered a race in vn_read() which is now documented
and will be addressed in a later diff.

ok visa@


# 1.40 30-Jul-2018 mpi

Use FNONBLOCK instead of SS_NBIO to check/indicate that the I/O mode
for sockets is non-blocking.

This allows us to G/C SS_NBIO. Having to keep the two flags in sync
in a mp-safe way is complicated.

This change introduces a behavior change in sosplice(): it can now
always block. However this should not matter much due to the socket
lock being taken beforehand.

ok bluhm@, benno@, visa@
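
For readers less familiar with the flag: FNONBLOCK is, as far as we know, the kernel-side name for the file flag that userland sets as O_NONBLOCK; treat that mapping as an assumption here. The sketch below only shows the userland-visible effect of the non-blocking I/O mode on a socket. The helper name nonblocking_read_errno() is hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: mark one end of a socketpair non-blocking, try to
 * read from it while nothing is queued, and return the resulting errno
 * (0 if the read unexpectedly succeeded, -1 on setup failure).
 */
static int
nonblocking_read_errno(void)
{
	int sv[2], flags, saved = 0;
	char c;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return -1;
	flags = fcntl(sv[0], F_GETFL);
	if (flags == -1 || fcntl(sv[0], F_SETFL, flags | O_NONBLOCK) == -1)
		saved = -1;
	else if (read(sv[0], &c, 1) == -1)
		saved = errno;		/* nothing queued: expect EAGAIN */
	close(sv[0]);
	close(sv[1]);
	return saved;
}
```

The point of the sketch is that the non-blocking mode lives on the open file rather than on the socket, which is why a single flag is enough to track it.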


# 1.39 10-Jul-2018 mpi

Move socket & pipe specific logic into their ioctl handlers.

ok visa@, tb@


# 1.38 06-Jun-2018 mpi

Pass the socket to sounlock(); this prepares the terrain for per-socket
locking.

ok visa@, bluhm@


# 1.37 26-Apr-2018 pirofti

Remove solock() surrounding PRU_CONTROL in soo_ioctl().

We do not need the lock there.
Missed this in my former commit pushing NET_LOCK() down the stack.

Found the hard way by naddy@, sorry!

OK mpi@.


# 1.36 10-Apr-2018 mpi

Convert 'struct fileops' definitions to C99.

ok millert@, deraadt@, florian@
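
"C99" in 1.36 most likely refers to designated initializers. The sketch below shows the style on a made-up operations table; struct demo_ops and its members are hypothetical and much smaller than the real struct fileops.

```c
#include <stddef.h>

/* Hypothetical operations table, for illustration only. */
struct demo_ops {
	int (*op_read)(int);
	int (*op_write)(int);
	int (*op_close)(int);
};

static int
demo_read(int x)
{
	return x + 1;
}

static int
demo_close(int x)
{
	(void)x;
	return 0;
}

/*
 * Members are named, so their order does not matter, and any member
 * that is not listed (op_write here) is initialized to NULL.
 */
static const struct demo_ops demo_sockops = {
	.op_close = demo_close,
	.op_read = demo_read,
};
```

Compared with positional initializers, this form keeps working when members are added to or reordered in the structure, which makes such tables easier to maintain.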


Revision tags: OPENBSD_6_3_BASE
# 1.35 10-Dec-2017 mpi

Move SB_SPLICE, SB_WAIT and SB_SEL to `sb_flags', serialized by solock().

SB_KNOTE remains the only bit set on `sb_flagsintr' as it is set/unset in
contexts related to kqueue(2) where we'd like to avoid grabbing solock().

While here add some KERNEL_LOCK()/UNLOCK() dances around selwakeup() and
csignal() to mark which remaining functions need to be addressed in the
socket layer.

ok visa@, bluhm@


# 1.34 14-Nov-2017 tb

Push the NET_LOCK into ifioctl() and use the NET_RLOCK in ifioctl_get().
In particular, this allows SIOCGIF* requests to run in parallel.

lots of help & ok mpi, ok visa, sashan


Revision tags: OPENBSD_6_2_BASE
# 1.33 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.32 24-Jul-2017 mpi

Grab the socket lock in soo_ioctl() where `so_state', `so_rcv'
and `so_snd' are modified.

ok bluhm@, visa@


# 1.31 20-Jul-2017 mpi

Extend the scope of the socket lock in soo_stat() to protect `so_state'
and `so_rcv'.

ok bluhm@, claudio@, visa@


Revision tags: OPENBSD_6_1_BASE
# 1.30 22-Feb-2017 mpi

Do not grab the NET_LOCK() when poll(2)ing on unix domain sockets.

Fix the 'X freeze' while scanning with wireless interfaces. Problem
reported by pirofti@.

ok tb@, bluhm@


# 1.29 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlocks related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.28 31-Jan-2017 mpi

Remove the inifioctl hack: checking for an unheld NET_LOCK() in
tsleep(9) & friends seems to only produce false positives and cannot
be easily disabled.


# 1.27 25-Jan-2017 mpi

Introduce a hack to remove false-positives when looking for memory
allocation that can sleep while holding the NET_LOCK().

To be removed once we're confident the remaining code paths are safe.

Discussed with deraadt@


# 1.26 19-Dec-2016 mpi

Introduce the NET_LOCK(), a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replaces all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.25 22-Nov-2016 mpi

Enforce that ifioctl() is called at IPL_SOFTNET.

This will allow us to keep locking simple as soon as we trade
splsoftnet() for a rwlock.

ok bluhm@


# 1.24 21-Nov-2016 mpi

Kill rtioctl() stub, returning EOPNOTSUPP since tree import.

ok jsg@


# 1.23 21-Nov-2016 mpi

Enforce that pr_usrreq functions are called at IPL_SOFTNET.

This will allow us to keep locking simple as soon as we trade
splsoftnet() for a rwlock.

ok bluhm@, claudio@


# 1.22 06-Oct-2016 bluhm

Separate splsoftnet() from variable initialization.
From mpi@'s netlock diff; OK mikeb@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.21 05-Dec-2015 tedu

remove stale lint annotations


Revision tags: OPENBSD_5_8_BASE
# 1.20 13-May-2015 jsg

test mbuf pointers against NULL not 0
ok krw@ miod@


Revision tags: OPENBSD_5_6_BASE OPENBSD_5_7_BASE
# 1.19 13-Jul-2014 tedu

bzero -> memset. for the speeds.


# 1.37 26-Apr-2018 pirofti

Remove solock() surrounding PRU_CONTROL in soo_ioctl().

We do not need the lock there.
Missed this in my former commit pushing NET_LOCK() down the stack.

Found the hard way by naddy@, sorry!

OK mpi@.


# 1.36 10-Apr-2018 mpi

Convert 'struct fileops' definitions to C99.

ok millert@, deraadt@, florian@


Revision tags: OPENBSD_6_3_BASE
# 1.35 10-Dec-2017 mpi

Move SB_SPLICE, SB_WAIT and SB_SEL to `sb_flags', serialized by solock().

SB_KNOTE remains the only bit set on `sb_flagsintr' as it is set/unset in
contexts related to kqueue(2) where we'd like to avoid grabbing solock().

While here add some KERNEL_LOCK()/UNLOCK() dances around selwakeup() and
csignal() to mark which remaining functions need to be addressed in the
socket layer.

ok visa@, bluhm@


# 1.34 14-Nov-2017 tb

Push the NET_LOCK into ifioctl() and use the NET_RLOCK in ifioctl_get().
In particular, this allows SIOCGIF* requests to run in parallel.

lots of help & ok mpi, ok visa, sashan


Revision tags: OPENBSD_6_2_BASE
# 1.33 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.32 24-Jul-2017 mpi

Grab the socket lock in soo_ioctl() where `so_state', `so_rcv'
and `so_snd' are modified.

ok bluhm@, visa@


# 1.31 20-Jul-2017 mpi

Extend the scope of the socket lock in soo_stat() to protect `so_state'
and `so_rcv'.

ok bluhm@, claudio@, visa@


Revision tags: OPENBSD_6_1_BASE
# 1.30 22-Feb-2017 mpi

Do not grab the NET_LOCK() when poll(2)ing on unix domain sockets.

Fix the 'X freeze' while scanning with wireless interfaces. Problem
reported by pirofti@.

ok tb@, bluhm@


# 1.29 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlock related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.28 31-Jan-2017 mpi

Remove the inifioctl hack, checking for an unheld NET_LOCK() in
tsleep(9) & friends seem to only produce false positives and cannot
be easily disabled.


# 1.27 25-Jan-2017 mpi

Introduce a hack to remove false-positives when looking for memory
allocation that can sleep while holding the NET_LOCK().

To be removed once we're confident the remaining code paths are safe.

Discussed with deraadt@


# 1.26 19-Dec-2016 mpi

Introduce the NET_LOCK() a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replace all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.25 22-Nov-2016 mpi

Enforce that ifioctl() is called at IPL_SOFTNET.

This will allow us to keep locking simple as soon as we trade
splsoftnet() for a rwlock.

ok bluhm@


# 1.24 21-Nov-2016 mpi

Kill rtioctl() stub, returning EOPNOTSUPP since tree import.

ok jsg@


# 1.23 21-Nov-2016 mpi

Enforce that pr_usrreq functions are called at IPL_SOFTNET.

This will allow us to keep locking simple as soon as we trade
splsoftnet() for a rwlock.

ok bluhm@, claudio@


# 1.22 06-Oct-2016 bluhm

Separate splsoftnet() from variable initialization.
From mpi@'s netlock diff; OK mikeb@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.21 05-Dec-2015 tedu

remove stale lint annotations


Revision tags: OPENBSD_5_8_BASE
# 1.20 13-May-2015 jsg

test mbuf pointers against NULL not 0
ok krw@ miod@


Revision tags: OPENBSD_5_6_BASE OPENBSD_5_7_BASE
# 1.19 13-Jul-2014 tedu

bzero -> memset. for the speeds.


# 1.18 30-Mar-2014 guenther

Eliminates struct pcred by moving the real and saved ugids into
struct ucred; struct process then directly links to the ucred

Based on a discussion at c2k10 or so before noting that FreeBSD and
NetBSD did this too.

ok matthew@


Revision tags: OPENBSD_5_5_BASE
# 1.17 28-Sep-2013 millert

poll(2) on a socket should set POLLHUP on EOF. This makes the
behavior of socketpair(2) match that of pipe(2) when the other end
is closed. OK guenther@
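
The behavior described in 1.17 can be observed from userland. The sketch
below is illustrative only and is not taken from sys_socket.c; the helper
name revents_after_peer_close() is hypothetical. It assumes an AF_UNIX
stream socketpair and a system where poll(2) reports POLLHUP once the
peer has closed.

```c
#include <assert.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a socketpair, close one end, and return
 * the revents that poll(2) reports for the surviving end.
 * Returns -1 if socketpair(2) or poll(2) fails.
 */
int
revents_after_peer_close(void)
{
	int sv[2];
	struct pollfd pfd;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return -1;

	/* The peer goes away, so sv[0] is now at EOF. */
	close(sv[1]);

	pfd.fd = sv[0];
	pfd.events = POLLIN;
	pfd.revents = 0;

	/* Timeout 0: report the current state without blocking. */
	if (poll(&pfd, 1, 0) == -1) {
		close(sv[0]);
		return -1;
	}

	close(sv[0]);
	return pfd.revents;
}
```

A caller would test the result with (revents & POLLHUP), the same check
commonly used for a pipe(2) whose write end has been closed.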


Revision tags: OPENBSD_5_4_BASE
# 1.16 05-Apr-2013 tedu

remove some obsolete casts


Revision tags: OPENBSD_5_3_BASE
# 1.15 15-Jan-2013 bluhm

Changing the socket buffer flags sb_flags was not interrupt safe
as |= and &= are non-atomic operations. To avoid additional locks,
put the flags that have to be accessed from interrupt into a separate
sb_flagsintr 32 bit integer field. sb_flagsintr is protected by
splsoftnet.
Input from miod@ deraadt@; OK deraadt@


Revision tags: OPENBSD_4_5_BASE OPENBSD_4_6_BASE OPENBSD_4_7_BASE OPENBSD_4_8_BASE OPENBSD_4_9_BASE OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.14 22-Feb-2009 otto

fix PR 6082: do not create more fd's than will fit in the message on
the receiving side when passing fd's. ok deraadt@ kettenis@


# 1.13 02-Oct-2008 deraadt

A closed, disconnected, or otherwise failed socket is still a socket
and should return stat information instead of EINVAL from deep in the
guts of tcp_usrreq. While there, put some more information into struct
stat, inspired by FreeBSD. EINVAL problem reported in PR 5943


Revision tags: OPENBSD_4_4_BASE
# 1.12 23-May-2008 thib

Deal with the situation when TCP nfs mounts time out and processes
get hung in nfs_reconnect() because they do not have the proper
privileges to bind to a socket, by adding a struct proc * argument
to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind)
and do the sobind() with proc0 in nfs_connect.

OK markus@, blambert@.
"go ahead" deraadt@.

Fixes an issue reported by bernd@ (Tested by bernd@).
Fixes PR5135 too.


Revision tags: OPENBSD_4_1_BASE OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.11 26-Feb-2007 kurt

exclude control data from the number of bytes returned by FIONREAD ioctl()
by adding a sb_datacc count to sockbuf that counts data excluding
MT_CONTROL and MT_SONAME mbuf types. w/help from deraadt@.
okay deraadt@ claudio@
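
The effect of 1.11 can be checked from userland with a short sketch. It
is illustrative only and not part of the commit; the helper name
fionread_with_control() is hypothetical. It sends five data bytes
together with an SCM_RIGHTS control message over an AF_UNIX socketpair
and asks FIONREAD how much is pending on the receiving side. The
descriptor passed (sv[0]) is an arbitrary choice made for brevity.

```c
#include <assert.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper: send 5 data bytes plus one passed descriptor,
 * then return the byte count FIONREAD reports on the receiving end.
 * Returns -1 on any failure.
 */
int
fionread_with_control(void)
{
	int sv[2], n = -1;
	char data[] = "hello";
	union {
		struct cmsghdr hdr;	/* forces correct alignment */
		char buf[CMSG_SPACE(sizeof(int))];
	} cmsgbuf;
	struct iovec iov;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return -1;

	/* Regular data: 5 bytes, without the terminating NUL. */
	iov.iov_base = data;
	iov.iov_len = 5;

	memset(&msg, 0, sizeof(msg));
	memset(&cmsgbuf, 0, sizeof(cmsgbuf));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	/* Control data: pass one descriptor with SCM_RIGHTS. */
	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &sv[0], sizeof(int));

	if (sendmsg(sv[0], &msg, 0) == 5) {
		/* Ask how many bytes are readable on the other end. */
		if (ioctl(sv[1], FIONREAD, &n) == -1)
			n = -1;
	}

	close(sv[0]);
	close(sv[1]);
	return n;
}
```

On a system that excludes control data from the count, as this commit
does, the helper should return the size of the data payload only.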


Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
# 1.10 13-Dec-2005 jsg

ansi/deregister. No binary change.


Revision tags: OPENBSD_3_5_BASE OPENBSD_3_6_BASE OPENBSD_3_7_BASE OPENBSD_3_8_BASE SMP_SYNC_A SMP_SYNC_B
# 1.9 23-Sep-2003 millert

Replace select backends with poll backends. selscan() and pollscan()
now call the poll backend. With this change we implement greater
poll(2) functionality instead of emulating it via the select backend.
Adapted from NetBSD and including some changes from FreeBSD.
Tested by many, deraadt@ OK


Revision tags: OPENBSD_3_4_BASE
# 1.8 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: OPENBSD_3_0_BASE OPENBSD_3_1_BASE OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_BASE UBC_SYNC_A UBC_SYNC_B
# 1.7 14-May-2001 art

Add a fo_stat member to struct fileops. Used soon.
Also add a stat function for kqueue from FreeBSD.


# 1.6 14-May-2001 art

More generic arguments to soo_stat.


Revision tags: OPENBSD_2_9_BASE
# 1.5 01-Mar-2001 provos

port kqueue changes from freebsd, plus all required openbsd glue.
okay deraadt@, millert@
from jlemon@freebsd.org:
extend kqueue down to the device layer, backwards compatible approach
suggested by peter@freebsd.org


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE
# 1.4 19-Apr-2000 csapuntz

Change struct file interface methods read and write to pass file offset in
and out.

Make pread/pwrite in netbsd & linux thread safe - which is the whole point
anyway.


Revision tags: OPENBSD_2_2_BASE OPENBSD_2_3_BASE OPENBSD_2_4_BASE OPENBSD_2_5_BASE OPENBSD_2_6_BASE SMP_BASE kame_19991208
# 1.3 31-Aug-1997 deraadt

branches: 1.3.12;
for non-tty TIOCSPGRP/F_SETOWN/FIOSETOWN pgid setting calls, store uid
and euid as well, then deliver them using new csignal() interface
which ensures that pgid setting process is permitted to signal the
pgid process(es). Thanks to newsham@aloha.net for extensive help and
discussion.


Revision tags: OPENBSD_2_1_BASE
# 1.2 24-Feb-1997 niklas

OpenBSD tags


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision

