#
337975 |
|
17-Aug-2018 |
markj |
MFC r337328: Don't check rcv sockbuf limits when sending on a unix stream socket.
PR: 181741, 212812
|
#
331722 |
|
29-Mar-2018 |
eadler |
Revert r330897:
This was intended to be a non-functional change. It wasn't. The commit message was thus wrong. In addition it broke arm, and merged crypto related code.
Revert with prejudice.
This revert skips files touched in r316370 since that commit was since MFCed. This revert also skips files that require $FreeBSD$ property changes.
Thank you to those who helped me get out of this mess including but not limited to gonzo, kevans, rgrimes.
Requested by: gjb (re)
|
#
330897 |
|
14-Mar-2018 |
eadler |
Partial merge of the SPDX changes
These changes are incomplete but are making it difficult to determine what other changes can/should be merged.
No objections from: pfg
|
#
302408 |
|
07-Jul-2016 |
gjb |
Copy head@r302406 to stable/11 as part of the 11.0-RELEASE cycle. Prune svn:mergeinfo from the new branch, as nothing has been merged here.
Additional commits post-branch will follow.
Approved by: re (implicit) Sponsored by: The FreeBSD Foundation |
#
296277 |
|
01-Mar-2016 |
jhb |
Refactor the AIO subsystem to permit file-type-specific handling and improve cancellation robustness.
Introduce a new file operation, fo_aio_queue, which is responsible for queueing and completing an asynchronous I/O request for a given file. The AIO subystem now exports library of routines to manipulate AIO requests as well as the ability to run a handler function in the "default" pool of AIO daemons to service a request.
A default implementation for file types which do not include an fo_aio_queue method queues requests to the "default" pool invoking the fo_read or fo_write methods as before.
The AIO subsystem permits file types to install a private "cancel" routine when a request is queued to permit safe dequeueing and cleanup of cancelled requests.
Sockets now use their own pool of AIO daemons and service per-socket requests in FIFO order. Socket requests will not block indefinitely permitting timely cancellation of all requests.
Due to the now-tight coupling of the AIO subsystem with file types, the AIO subsystem is now a standard part of all kernels. The VFS_AIO kernel option and aio.ko module are gone.
Many file types may block indefinitely in their fo_read or fo_write callbacks resulting in a hung AIO daemon. This can result in hung user processes (when processes attempt to cancel all outstanding requests during exit) or a hung system. To protect against this, AIO requests are only permitted for known "safe" files by default. AIO requests for all file types can be enabled by setting the new vfs.aio.enable_usafe sysctl to a non-zero value. The AIO tests have been updated to skip operations on unsafe file types if the sysctl is zero.
Currently, AIO requests on sockets and raw disks are considered safe and are enabled by default. aio_mlock() is also enabled by default.
Reviewed by: cem, jilles Discussed with: kib (earlier version) Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D5289
|
#
295676 |
|
16-Feb-2016 |
jhb |
The locking annotations for struct sockbuf originally used the key from struct socket. When sockbuf.h was moved out of socketvar.h, the locking key was no longer nearby. Instead, add a new key for sockbuf and use a single item for the socket buffer lock instead of separate entries for receive vs send buffers.
Reviewed by: adrian Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D4901
|
#
293432 |
|
08-Jan-2016 |
glebius |
Make it possible for sbappend() to preserve M_NOTREADY on mbufs, just like sbappendstream() does. Although, M_NOTREADY may appear only on SOCK_STREAM sockets, due to sendfile(2) supporting only the latter, there is a corner case of AF_UNIX/SOCK_STREAM socket, that still uses records for the sake of control data, albeit being stream socket.
Provide private version of m_clrprotoflags(), which understands PRUS_NOTREADY, similar to m_demote().
|
#
278729 |
|
13-Feb-2015 |
sjg |
sbspace: size of bleft, mleft must match sockbuf fields to avoid overflow on amd64
Submitted by: anshukla@juniper.net Obtained from: Juniper Networks
|
#
275329 |
|
30-Nov-2014 |
glebius |
Merge from projects/sendfile: extend protocols API to support sending not ready data: o Add new flag to pru_send() flags - PRUS_NOTREADY. o Add new protocol method pru_ready().
Sponsored by: Nginx, Inc. Sponsored by: Netflix
|
#
275326 |
|
30-Nov-2014 |
glebius |
Merge from projects/sendfile:
o Introduce a notion of "not ready" mbufs in socket buffers. These mbufs are now being populated by some I/O in background and are referenced outside. This forces following implications: - An mbuf which is "not ready" can't be taken out of the buffer. - An mbuf that is behind a "not ready" in the queue neither. - If sockbet buffer is flushed, then "not ready" mbufs shouln't be freed.
o In struct sockbuf the sb_cc field is split into sb_ccc and sb_acc. The sb_ccc stands for ""claimed character count", or "committed character count". And the sb_acc is "available character count". Consumers of socket buffer API shouldn't already access them directly, but use sbused() and sbavail() respectively. o Not ready mbufs are marked with M_NOTREADY, and ready but blocked ones with M_BLOCKED. o New field sb_fnrdy points to the first not ready mbuf, to avoid linear search. o New function sbready() is provided to activate certain amount of mbufs in a socket buffer.
A special note on SCTP: SCTP has its own sockbufs. Unfortunately, FreeBSD stack doesn't yet allow protocol specific sockbufs. Thus, SCTP does some hacks to make itself compatible with FreeBSD: it manages sockbufs on its own, but keeps sb_cc updated to inform the stack of amount of data in them. The new notion of "not ready" data isn't supported by SCTP. Instead, only a mechanical substitute is done: s/sb_cc/sb_ccc/. A proper solution would be to take away struct sockbuf from struct socket and allow protocols to implement their own socket buffers, like SCTP already does. This was discussed with rrs@.
Sponsored by: Netflix Sponsored by: Nginx, Inc.
|
#
275315 |
|
30-Nov-2014 |
glebius |
- Move sbcheck() declaration under SOCKBUF_DEBUG. - Improve SOCKBUF_DEBUG macros. - Improve sbcheck().
Sponsored by: Netflix Sponsored by: Nginx, Inc.
|
#
275312 |
|
30-Nov-2014 |
glebius |
Make sballoc() and sbfree() functions. Ideally, they could be marked as static, but unfortunately Infiniband (ab)uses them.
Sponsored by: Nginx, Inc.
|
#
274421 |
|
12-Nov-2014 |
glebius |
In preparation of merging projects/sendfile, transform bare access to sb_cc member of struct sockbuf to a couple of inline functions:
sbavail() and sbused()
Right now they are equal, but once notion of "not ready socket buffer data", will be checked in, they are going to be different.
Sponsored by: Netflix Sponsored by: Nginx, Inc.
|
#
271946 |
|
22-Sep-2014 |
hselasky |
Improve transmit sending offload, TSO, algorithm in general.
The current TSO limitation feature only takes the total number of bytes in an mbuf chain into account and does not limit by the number of mbufs in a chain. Some kinds of hardware is limited by two factors. One is the fragment length and the second is the fragment count. Both of these limits need to be taken into account when doing TSO. Else some kinds of hardware might have to drop completely valid mbuf chains because they cannot loaded into the given hardware's DMA engine. The new way of doing TSO limitation has been made backwards compatible as input from other FreeBSD developers and will use defaults for values not set.
Reviewed by: adrian, rmacklem Sponsored by: Mellanox Technologies MFC after: 1 week
|
#
263116 |
|
13-Mar-2014 |
asomers |
Replace 4.4BSD Lite's unix domain socket backpressure hack with a cleaner mechanism, based on the new SB_STOP sockbuf flag. The old hack dynamically changed the sending sockbuf's high water mark whenever adding or removing data from the receiving sockbuf. It worked for stream sockets, but it never worked for SOCK_SEQPACKET sockets because of their atomic nature. If the sockbuf was partially full, it might return EMSGSIZE instead of blocking.
The new solution is based on DragonFlyBSD's fix from commit 3a6117bbe0ed6a87605c1e43e12a1438d8844380 on 2008-05-27. It adds an SB_STOP flag to sockbufs. Whenever uipc_send surpasses the socket's size limit, it sets SB_STOP on the sending sockbuf. sbspace() will then return 0 for that sockbuf, causing sosend_generic and friends to block. uipc_rcvd will likewise clear SB_STOP. There are two fringe benefits: uipc_{send,rcvd} no longer need to call chgsbsize() on every send and receive because they don't change the sockbuf's high water mark. Also, uipc_sense no longer needs to acquire the UIPC linkage lock, because it's simpler to compute the st_blksizes.
There is one drawback: since sbspace() will only ever return 0 or the maximum, sosend_generic will allow the sockbuf to exceed its nominal maximum size by at most one packet of size less than the max. I don't think that's a serious problem. In fact, I'm not even positive that FreeBSD guarantees a socket will always stay within its nominal size limit.
sys/sys/sockbuf.h Add the SB_STOP flag and adjust sbspace()
sys/sys/unpcb.h Delete the obsolete unp_cc and unp_mbcnt fields from struct unpcb.
sys/kern/uipc_usrreq.c Adjust uipc_rcvd, uipc_send, and uipc_sense to use the SB_STOP backpressure mechanism. Removing obsolete unpcb fields from db_show_unpcb.
tests/sys/kern/unix_seqpacket_test.c Clear expected failures from ATF.
Obtained from: DragonFly BSD PR: kern/185812 Reviewed by: silence from freebsd-net@ and rwatson@ MFC after: 3 weeks Sponsored by: Spectra Logic Corporation
|
#
262915 |
|
07-Mar-2014 |
asomers |
Partial revert of change 262914. I screwed up subversion syntax with perforce syntax and committed some unrelated files. Only devd files should've been committed.
Reported by: imp Pointy hat to: asomers MFC after: 3 weeks X-MFC-With: r262914
|
#
262914 |
|
07-Mar-2014 |
asomers |
sbin/devd/devd.8 sbin/devd/devd.cc Add a -q flag to devd that will suppress syslog logging at LOG_NOTICE or below.
Requested by: ian@ and imp@ MFC after: 3 weeks Sponsored by: Spectra Logic Corporation
|
#
262867 |
|
06-Mar-2014 |
asomers |
Fix PR kern/185813 "SOCK_SEQPACKET AF_UNIX sockets with asymmetrical buffers drop packets". It was caused by a check for the space available in a sockbuf, but it was checking the wrong sockbuf.
sys/sys/sockbuf.h sys/kern/uipc_sockbuf.c Add sbappendaddr_nospacecheck_locked(), which is just like sbappendaddr_locked but doesn't validate the receiving socket's space. Factor out common code into sbappendaddr_locked_internal(). We shouldn't simply make sbappendaddr_locked check the space and then call sbappendaddr_nospacecheck_locked, because that would cause the O(n) function m_length to be called twice.
sys/kern/uipc_usrreq.c Use sbappendaddr_nospacecheck_locked for SOCK_SEQPACKET sockets, because the receiving sockbuf's size limit is irrelevant.
tests/sys/kern/unix_seqpacket_test.c Now that 185813 is fixed, pipe_128k_8k fails intermittently due to 185812. Make it fail every time by adding a usleep after starting the writer thread and before starting the reader thread in test_pipe. That gives the writer time to fill up its send buffer. Also, clear the expected failure message due to 185813. It actually said "185812", but that was a typo.
PR: kern/185813 Reviewed by: silence from freebsd-net@ and rwatson@ MFC after: 3 weeks Sponsored by: Spectra Logic Corporation
|
#
256185 |
|
09-Oct-2013 |
glebius |
- Substitute sbdrop_internal() with sbcut_internal(). The latter doesn't free mbufs, but return chain of free mbufs to a caller. Caller can either reuse them or return to allocator in a batch manner. - Implement sbdrop()/sbdrop_locked() as a wrapper around sbcut_internal(). - Expose sbcut_locked() for outside usage.
Sponsored by: Netflix Sponsored by: Nginx, Inc. Approved by: re (marius)
|
#
255138 |
|
01-Sep-2013 |
davide |
Fix socket buffer timeouts precision using the new sbintime_t KPI instead of relying on the tvtohz() workaround. The latter has been introduced lately by jhb@ (r254699) in order to have a fix that can be backported to STABLE.
Reported by: Vitja Makarov <vitja.makarov at gmail dot com> Reviewed by: jhb (earlier version)
|
#
225169 |
|
25-Aug-2011 |
bz |
Increase the defaults for the maximum socket buffer limit, and the maximum TCP send and receive buffer limits from 256kB to 2MB.
For sb_max_adj we need to add the cast as already used in the sysctl handler to not overflow the type doing the maths.
Note that this is just the defaults. They will allow more memory to be consumed per socket/connection if needed but not change the default "idle" memory consumption. All values are still tunable by sysctls.
Suggested by: gnn Discussed on: arch (Mar and Aug 2011) MFC after: 3 weeks Approved by: re (kib)
|
#
193272 |
|
01-Jun-2009 |
jhb |
Rework socket upcalls to close some races with setup/teardown of upcalls. - Each socket upcall is now invoked with the appropriate socket buffer locked. It is not permissible to call soisconnected() with this lock held; however, so socket upcalls now return an integer value. The two possible values are SU_OK and SU_ISCONNECTED. If an upcall returns SU_ISCONNECTED, then the soisconnected() will be invoked on the socket after the socket buffer lock is dropped. - A new API is provided for setting and clearing socket upcalls. The API consists of soupcall_set() and soupcall_clear(). - To simplify locking, each socket buffer now has a separate upcall. - When a socket upcall returns SU_ISCONNECTED, the upcall is cleared from the receive socket buffer automatically. Note that a SO_SND upcall should never return SU_ISCONNECTED. - All this means that accept filters should now return SU_ISCONNECTED instead of calling soisconnected() directly. They also no longer need to explicitly clear the upcall on the new socket. - The HTTP accept filter still uses soupcall_set() to manage its internal state machine, but other accept filters no longer have any explicit knowlege of socket upcall internals aside from their return value. - The various RPC client upcalls currently drop the socket buffer lock while invoking soreceive() as a temporary band-aid. The plan for the future is to add a new flag to allow soreceive() to be called with the socket buffer locked. - The AIO callback for socket I/O is now also invoked with the socket buffer locked. Previously sowakeup() would drop the socket buffer lock only to call aio_swake() which immediately re-acquired the socket buffer lock for the duration of the function call.
Discussed with: rwatson, rmacklem
|
#
181066 |
|
31-Jul-2008 |
kmacy |
move sockbuf locking macros in to sockbuf.h
|
#
180972 |
|
29-Jul-2008 |
kmacy |
remove redundant ifdef ... lol
|
#
180971 |
|
29-Jul-2008 |
kmacy |
fix build by forward declaring thread and hiding socket buffer definitions from user code
|
#
180970 |
|
29-Jul-2008 |
cognet |
Unbreak the build by protecting kernel-only functions with #ifdef _KERNEL.
|
#
180948 |
|
29-Jul-2008 |
kmacy |
Factor sockbuf, sockopt, and sockstate out of socketvar.h in to separate headers.
Reviewed by: rwatson MFC after: 3 days
|