History log of /freebsd-current/sys/netinet/cc/cc_newreno.c
Revision Date Author Comments
# f74352fb 24-Feb-2024 Richard Scheffenegger <rscheff@FreeBSD.org>

tcp: use enum for all congestion control signals

Facilitate easier troubleshooting by enumerating
all congestion control signals. Typecast the
enum to int, when a congestion control module uses
private signals.

No external change.

Reviewed By: glebius, tuexen, #transport
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D43838


# fcea1cc9 14-Feb-2024 Richard Scheffenegger <rscheff@FreeBSD.org>

tcp: fix RTO ssthresh for non-6675 pipe calculation

Follow up on D43768 to properly deal with the non-default
pipe calculation. When CC_RTO is processed, the timeout
will have already pulled back snd_nxt. Further, snd_fack
is not pulled along with snd_una.

Reviewed By: tuexen, #transport
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D43876


# 32a6df57 08-Feb-2024 Richard Scheffenegger <rscheff@FreeBSD.org>

tcp: calculate ssthresh on RTO according to RFC5681

per RFC5681, only adjust ssthresh on the initital
retransmission timeout. Since RTO often happens
during loss recovery, while cwnd no longer tracks
all data in flight, calculcate pipe properly.

Reviewed By: tuexen, #transport
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D43768


# fdafd315 24-Nov-2023 Warner Losh <imp@FreeBSD.org>

sys: Automated cleanup of cdefs and other formatting

Apply the following automated changes to try to eliminate
no-longer-needed sys/cdefs.h includes as well as now-empty
blank lines in a row.

Remove /^#if.*\n#endif.*\n#include\s+<sys/cdefs.h>.*\n/
Remove /\n+#include\s+<sys/cdefs.h>.*\n+#if.*\n#endif.*\n+/
Remove /\n+#if.*\n#endif.*\n+/
Remove /^#if.*\n#endif.*\n/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/types.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/param.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/capsicum.h>/

Sponsored by: Netflix


# 685dc743 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: one-line .c pattern

Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/


# 4d846d26 10-May-2023 Warner Losh <imp@FreeBSD.org>

spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD

The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.

Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix


# 69c7c811 16-Mar-2023 Randall Stewart <rrs@FreeBSD.org>

Move access to tcp's t_logstate into inline functions and provide new tracepoint and bbpoint capabilities.

The TCP stacks have long accessed t_logstate directly, but in order to do tracepoints and the new bbpoints
we need to move to using the new inline functions. This adds them and moves rack to now use
the tcp_tracepoints.

Reviewed by: tuexen, gallatin
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D38831


# e2e088ae 14-Dec-2022 Randall Stewart <rrs@FreeBSD.org>

Rack cannot be loaded without cc_newreno compiled into the kernel.

Right now rack will fail to load due to its hack in accessing symbol names
in cc_newreno. This was fine when newreno was always compiled into the
kernel but now ... not so much. Instead lets fix up rack to use the socket
option queries to get the information it wants and set the parameters. We
also fix the CC parameter so they are always settable.

Reviewed by: tuexen
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D37622


# 9eb0e832 08-Nov-2022 Gleb Smirnoff <glebius@FreeBSD.org>

tcp: provide macros to access inpcb and socket from a tcpcb

There should be no functional changes with this commit.

Reviewed by: rscheff
Differential revision: https://reviews.freebsd.org/D37123


# 0fdc2472 25-Sep-2022 Michael Tuexen <tuexen@FreeBSD.org>

tcp: make RACK loadable again using the default configuration

Without this patch, loading the RACK stack required the newreno
CC module to be compiled into the kernel. This is not the case
anymore since CUBIC is the default now.

Reviewed by: rscheff@
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D36707


# 653cf466 01-Apr-2022 Randall Stewart <rrs@FreeBSD.org>

hystart++ may not properly exit CSS back to slowstart.

In the changes to get hystart++ into cubic an inadvertent line
was removed in the conditional to figure out if you need to exit
hystart++ back to slowstart. The line of course is the most crucial
one (the others are valid but not critical) i.e. is the new rtt
less than the point where we entered hystart++. Without the line
we end up bouncing in and out of CSS.

Reported By: Reese Enghardt
Sponsored By: Netflix Inc.


# a9696510 07-Feb-2022 Randall Stewart <rrs@FreeBSD.org>

tcp: Add hystart++ to our cubic implementation.

As promised to the transport call on 11/4/22 here is an implementation
of hystart++ for cubic. It also cleans up the tcp_congestion function
to have a better name. Common variables are moved into the general
cc.h structure so that both cubic and newreno can use them for
hystart++

Reviewed by: Michael Tuexen, Richard Scheffenegger
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D33035


# b8d60729 11-Nov-2021 Randall Stewart <rrs@FreeBSD.org>

tcp: Congestion control cleanup.

NOTE: HEADS UP read the note below if your kernel config is not including GENERIC!!

This patch does a bit of cleanup on TCP congestion control modules. There were some rather
interesting surprises that one could get i.e. where you use a socket option to change
from one CC (say cc_cubic) to another CC (say cc_vegas) and you could in theory get
a memory failure and end up on cc_newreno. This is not what one would expect. The
new code fixes this by requiring a cc_data_sz() function so we can malloc with M_WAITOK
and pass in to the init function preallocated memory. The CC init is expected in this
case *not* to fail but if it does and a module does break the
"no fail with memory given" contract we do fall back to the CC that was in place at the time.

This also fixes up a set of common newreno utilities that can be shared amongst other
CC modules instead of the other CC modules reaching into newreno and executing
what they think is a "common and understood" function. Lets put these functions in
cc.c and that way we have a common place that is easily findable by future developers or
bug fixers. This also allows newreno to evolve and grow support for its features i.e. ABE
and HYSTART++ without having to dance through hoops for other CC modules, instead
both newreno and the other modules just call into the common functions if they desire
that behavior or roll there own if that makes more sense.

Note: This commit changes the kernel configuration!! If you are not using GENERIC in
some form you must add a CC module option (one of CC_NEWRENO, CC_VEGAS, CC_CUBIC,
CC_CDG, CC_CHD, CC_DCTCP, CC_HTCP, CC_HD). You can have more than one defined
as well if you desire. Note that if you create a kernel configuration that does not
define a congestion control module and includes INET or INET6 the kernel compile will
break. Also you need to define a default, generic adds 'options CC_DEFAULT=\"newreno\"
but you can specify any string that represents the name of the CC module (same names
that show up in the CC module list under net.inet.tcp.cc). If you fail to add the
options CC_DEFAULT in your kernel configuration the kernel build will also break.

Reviewed by: Michael Tuexen
Sponsored by: Netflix Inc.
RELNOTES:YES
Differential Revision: https://reviews.freebsd.org/D32693


# b15b0535 25-Oct-2021 Michael Tuexen <tuexen@FreeBSD.org>

tcp: allow new reno functions to be called from other CC modules

Some new reno functions use the internal data, but are also called
from functions of other CC modules. Ensure that in this case, the
internal data is not accessed.

Reported by: syzbot+1d219ea351caa5109d4b@syzkaller.appspotmail.com
Reported by: syzbot+b08144f8cad9c67258c5@syzkaller.appspotmail.com
Reviewed by: rrs
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D32649


# 4e4c84f8 22-Oct-2021 Randall Stewart <rrs@FreeBSD.org>

tcp: Add hystart-plus to cc_newreno and rack.

TCP Hystart draft version -03:
https://datatracker.ietf.org/doc/html/draft-ietf-tcpm-hystartplusplus

Is a new version of hystart that allows one to carefully exit slow start if the RTT
spikes too much. The newer version has a slower-slow-start so to speak that then
kicks in for five round trips. To see if you exited too early, if not into congestion avoidance.
This commit will add that feature to our newreno CC and add the needed bits in rack to
be able to enable it.

Reviewed by: tuexen
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D32373


# fa3746be 11-Jun-2021 Michael Tuexen <tuexen@FreeBSD.org>

tcp: fix two bugs in new reno

* Completely initialise the CC module specific data
* Use beta_ecn in case of an ECN event whenever ABE is enabled
or it is requested by the stack.

Reviewed by: rscheff, rrs
MFC after: 3 days
Sponsored by: Netflix, Inc.


# c358f185 26-May-2021 Richard Scheffenegger <rscheff@FreeBSD.org>

tcp: Use local CC data only in the correct context

Most CC algos do use local data, and when calling
newreno_cong_signal from there, the latter misinterprets
the data as its own struct, leading to incorrect behavior.

Reported by: chengc_netapp.com
Reviewed By: chengc_netapp.com, tuexen, #transport
MFC after: 3 days
Sponsored By: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D30470


# 5d8fd932 06-May-2021 Randall Stewart <rrs@FreeBSD.org>

This brings into sync FreeBSD with the netflix versions of rack and bbr.
This fixes several breakages (panics) since the tcp_lro code was
committed that have been reported. Quite a few new features are
now in rack (prefecting of DGP -- Dynamic Goodput Pacing among the
largest). There is also support for ack-war prevention. Documents
comming soon on rack..

Sponsored by: Netflix
Reviewed by: rscheff, mtuexen
Differential Revision: https://reviews.freebsd.org/D30036


# d1de2b05 17-Apr-2021 Richard Scheffenegger <rscheff@FreeBSD.org>

tcp: Rename rfc6675_pipe to sack.revised, and enable by default

As full support of RFC6675 is in place, deprecating
net.inet.tcp.rfc6675_pipe and enabling by default
net.inet.tcp.sack.revised.

Reviewed By: #transport, kbowling, rrs
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D28702


# 39a12f01 24-Oct-2020 Richard Scheffenegger <rscheff@FreeBSD.org>

tcp: move cwnd and ssthresh updates into cc modules

This will pave the way of setting ssthresh differently in TCP CUBIC, according
to RFC8312 section 4.7.

No functional change, only code movement.

Submitted by: chengc_netapp.com
Reviewed by: rrs, tuexen, rscheff
MFC after: 2 weeks
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D26807


# 1567c937 25-Sep-2020 Richard Scheffenegger <rscheff@FreeBSD.org>

TCP newreno: improve after_idle ssthresh

Adjust ssthresh in after_idle to the maximum of
the prior ssthresh, or 3/4 of the prior cwnd. See
RFC2861 section 2 for an in depth explanation for
the rationale around this.

As newreno is the default "fall-through" reaction,
most tcp variants will benefit from this.

Reviewed by: tuexen
MFC after: 2 weeks
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D22438


# 66ba9aaf 20-Jul-2020 Richard Scheffenegger <rscheff@FreeBSD.org>

Add MODULE_VERSION to TCP loadable congestion control modules.

Without versioning information, using preexisting loader /
linker code is not easily possible when another module may
have dependencies on pre-loaded modules, and also doesn't
allow the automatic loading of dependent modules.

No functional change of the actual modules.

Reviewed by: tuexen (mentor), rgrimes (mentor)
Approved by: tuexen (mentor), rgrimes (mentor)
MFC after: 2 weeks
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D25744


# 7029da5c 26-Feb-2020 Pawel Biernacki <kaktus@FreeBSD.org>

Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)

r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE. All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT

Approved by: kib (mentor, blanket)
Commented by: kib, gallatin, melifaro
Differential Revision: https://reviews.freebsd.org/D23718


# 5cc11a89 09-May-2019 Michael Tuexen <tuexen@FreeBSD.org>

Prevent cwnd to collapse down to 1 MSS after exiting recovery.

This is descrined in RFC 6582, which updates RFC 3782.

Submitted by: Richard Scheffenegger
Reviewed by: lstewart@
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D17614


# 7dc90a1d 25-Jan-2019 Michael Tuexen <tuexen@FreeBSD.org>

Fix a bug in the restart window computation of TCP New Reno

When implementing support for IW10, an update in the computation
of the restart window used after an idle phase was missed. To
minimize code duplication, implement the logic in tcp_compute_initwnd()
and call it. This fixes a bug in NewReno, which was not aware of
IW10.

Submitted by: Richard Scheffenegger
Reviewed by: tuexen@
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D18940


# 855acb84 15-Dec-2018 Brooks Davis <brooks@FreeBSD.org>

Fix bugs in plugable CC algorithm and siftr sysctls.

Use the sysctl_handle_int() handler to write out the old value and read
the new value into a temporary variable. Use the temporary variable
for any checks of values rather than using the CAST_PTR_INT() macro on
req->newptr. The prior usage read directly from userspace memory if the
sysctl() was called correctly. This is unsafe and doesn't work at all on
some architectures (at least i386.)

In some cases, the code could also be tricked into reading from kernel
memory and leaking limited information about the contents or crashing
the system. This was true for CDG, newreno, and siftr on all platforms
and true for i386 in all cases. The impact of this bug is largest in
VIMAGE jails which have been configured to allow writing to these
sysctls.

Per discussion with the security officer, we will not be issuing an
advisory for this issue as root access and a non-default config are
required to be impacted.

Reviewed by: markj, bz
Discussed with: gordon (security officer)
MFC after: 3 days
Security: kernel information leak, local DoS (both require root)
Differential Revision: https://reviews.freebsd.org/D18443


# 5f901c92 24-Jul-2018 Andrew Turner <andrew@FreeBSD.org>

Use the new VNET_DEFINE_STATIC macro when we are defining static VNET
variables.

Reviewed by: bz
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D16147


# 22699887 21-Jul-2018 Matt Macy <mmacy@FreeBSD.org>

NULL out cc_data in pluggable TCP {cc}_cb_destroy

When ABE was added (rS331214) to NewReno and leak fixed (rS333699) , it now has
a destructor (newreno_cb_destroy) for per connection state. Other congestion
controls may allocate and free cc_data on entry and exit, but the field is
never explicitly NULLed if moving back to NewReno which only internally
allocates stateful data (no entry contstructor) resulting in a situation where
newreno_cb_destory might be called on a junk pointer.

- NULL out cc_data in the framework after calling {cc}_cb_destroy
- free(9) checks for NULL so there is no need to perform not NULL checks
before calling free.
- Improve a comment about NewReno in tcp_ccalgounload

This is the result of a debugging session from Jason Wolfe, Jason Eggleston,
and mmacy@ and very helpful insight from lstewart@.

Submitted by: Kevin Bowling
Reviewed by: lstewart
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D16282


# 9891578a 16-May-2018 Lawrence Stewart <lstewart@FreeBSD.org>

Plug a memory leak and potential NULL-pointer dereference introduced in r331214.

Each TCP connection that uses the system default cc_newreno(4) congestion
control algorithm module leaks a "struct newreno" (8 bytes of memory) at
connection initialisation time. The NULL-pointer dereference is only germane
when using the ABE feature, which is disabled by default.

While at it:

- Defer the allocation of memory until it is actually needed given that ABE is
optional and disabled by default.

- Document the ENOMEM errno in getsockopt(2)/setsockopt(2).

- Document ENOMEM and ENOBUFS in tcp(4) as being synonymous given that they are
used interchangeably throughout the code.

- Fix a few other nits also accidentally omitted from the original patch.

Reported by: Harsh Jain on freebsd-net@
Tested by: tjh@
Differential Revision: https://reviews.freebsd.org/D15358


# 370efe5a 19-Mar-2018 Lawrence Stewart <lstewart@FreeBSD.org>

Add support for the experimental Internet-Draft "TCP Alternative Backoff with
ECN (ABE)" proposal to the New Reno congestion control algorithm module.
ABE reduces the amount of congestion window reduction in response to
ECN-signalled congestion relative to the loss-inferred congestion response.

More details about ABE can be found in the Internet-Draft:
https://tools.ietf.org/html/draft-ietf-tcpm-alternativebackoff-ecn

The implementation introduces four new sysctls:

- net.inet.tcp.cc.abe defaults to 0 (disabled) and can be set to non-zero to
enable ABE for ECN-enabled TCP connections.

- net.inet.tcp.cc.newreno.beta and net.inet.tcp.cc.newreno.beta_ecn set the
multiplicative window decrease factor, specified as a percentage, applied to
the congestion window in response to a loss-based or ECN-based congestion
signal respectively. They default to the values specified in the draft i.e.
beta=50 and beta_ecn=80.

- net.inet.tcp.cc.abe_frlossreduce defaults to 0 (disabled) and can be set to
non-zero to enable the use of standard beta (50% by default) when repairing
loss during an ECN-signalled congestion recovery episode. It enables a more
conservative congestion response and is provided for the purposes of
experimentation as a result of some discussion at IETF 100 in Singapore.

The values of beta and beta_ecn can also be set per-connection by way of the
TCP_CCALGOOPT TCP-level socket option and the new CC_NEWRENO_BETA or
CC_NEWRENO_BETA_ECN CC algo sub-options.

Submitted by: Tom Jones <tj@enoti.me>
Tested by: Tom Jones <tj@enoti.me>, Grenville Armitage <garmitage@swin.edu.au>
Relnotes: Yes
Differential Revision: https://reviews.freebsd.org/D11616


# fe267a55 27-Nov-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

sys: general adoption of SPDX licensing ID tags.

Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

No functional change intended.


# 43053c12 25-Jul-2017 Sean Bruno <sbruno@FreeBSD.org>

Revert r307901 - Inform CC modules about loss events.

This was discussed between various transport@ members and it was
requested to be reverted and discussed.

Submitted by: Kevin Bowling <kevin.bowling@kev009.com>
Reported by: lawrence
Reviewed by: hiren
Sponsored by: Limelight Networks


# 5d53981a 25-Jul-2017 Sean Bruno <sbruno@FreeBSD.org>

Revert r308180 - Set slow start threshold more accurrately on loss ...

This was discussed between various transport@ members and it was
requested to be reverted and discussed.

Submitted by: kevin
Reported by: lawerence
Reviewed by: hiren


# e04310d5 01-Nov-2016 Hiren Panchasara <hiren@FreeBSD.org>

Set slow start threshold more accurately on loss to be flightsize/2 instead of
cwnd/2 as recommended by RFC5681. (spotted by mmacy at nextbsd dot org)

Restore pre-r307901 behavior of aligning ssthresh/cwnd on mss boundary. (spotted
by slawa at zxy dot spb dot ru)

Tested by: dim, Slawa <slawa at zxy dot spb dot ru>
MFC after: 1 month
X-MFC with: r307901
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D8349


# 4e7f7553 24-Oct-2016 Hiren Panchasara <hiren@FreeBSD.org>

FreeBSD tcp stack used to inform respective congestion control module about the
loss event but not use or obay the recommendations i.e. values set by it in some
cases.

Here is an attempt to solve that confusion by following relevant RFCs/drafts.
Stack only sets congestion window/slow start threshold values when there is no
CC module availalbe to take that action. All CC modules are inspected and
updated when needed to take appropriate action on loss.

tcp_stacks/fastpath module has been updated to adapt these changes.

Note: Probably, the most significant change would be to not bring congestion
window down to 1MSS on a loss signaled by 3-duplicate acks and letting
respective CC decide that value.

In collaboration with: Matt Macy <mmacy at nextbsd dot org>
Discussed on: transport@ mailing list
Reviewed by: jtl
MFC after: 1 month
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D8225


# dd13b7d3 24-Oct-2016 Hiren Panchasara <hiren@FreeBSD.org>

Undo r307899. It needs a bit more work and proper commit log.


# 95d82360 24-Oct-2016 Hiren Panchasara <hiren@FreeBSD.org>

In Collaboration with: Matt Macy <mmacy at nextbsd dot com>
Reviewed by: jtl
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D8225


# 4b7b743c 25-Aug-2016 Lawrence Stewart <lstewart@FreeBSD.org>

Pass the number of segments coalesced by LRO up the stack by repurposing the
tso_segsz pkthdr field during RX processing, and use the information in TCP for
more correct accounting and as a congestion control input. This is only a start,
and an audit of other uses for the data is left as future work.

Reviewed by: gallatin, rrs
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D7564


# 4644fda3 27-Jan-2016 Gleb Smirnoff <glebius@FreeBSD.org>

Rename netinet/tcp_cc.h to netinet/cc/cc.h.

Discussed with: lstewart


# 2de3e790 21-Jan-2016 Gleb Smirnoff <glebius@FreeBSD.org>

- Rename cc.h to more meaningful tcp_cc.h.
- Declare it a kernel only include, which it already is.
- Don't include tcp.h implicitly from tcp_cc.h


# a934d061 09-Dec-2015 Hiren Panchasara <hiren@FreeBSD.org>

Add an option to use rfc6675 based pipe/inflight bytes calculation in newreno.

MFC after: 3 weeks
Sponsored by: Limelight Networks


# 891b8ed4 12-Apr-2011 Lawrence Stewart <lstewart@FreeBSD.org>

Use the full and proper company name for Swinburne University of Technology
throughout the source tree.

Requested by: Grenville Armitage, Director of CAIA at Swinburne University of
Technology
MFC after: 3 days


# 03f0843b 01-Feb-2011 Lawrence Stewart <lstewart@FreeBSD.org>

Algorithm modules can define their own private congestion signal types in the
top 8 bits of the 32 bit signal bit field space for internal use. These private
signals should not be leaked outside of a module.

Given that many algorithm modules use the NewReno hook functions to simplify
their implementation, the obvious place such a leak would show up is in the
NewReno cong_signal hook function.

- Show the full number of significant bits in the signal type definitions in
<netinet/cc.h>.

- Add a bitmask to simplify figuring out if a given signal is in the private or
public bit range.

- Add a sanity check in newreno_cong_signal() to ensure private signals are not
being leaked into the hook function.

Sponsored by: FreeBSD Foundation
Discussed with: David Hayes <dahayes at swin edu au>
MFC after: 1 week
X-MFC with: r215166


# 74a5a194 01-Dec-2010 Lawrence Stewart <lstewart@FreeBSD.org>

General cleanup of the NewReno CC module (no functional changes):

- Remove superfluous includes and unhelpful comments.

- Alphabetically order functions.

- Make functions static.

Sponsored by: FreeBSD Foundation
MFC after: 9 weeks
X-MFC with: r215166


# 2ea8da28 01-Dec-2010 Lawrence Stewart <lstewart@FreeBSD.org>

- Reinstantiate the after_idle hook call in tcp_output(), which got lost
somewhere along the way due to mismerging r211464 in our development tree.

- Capture the essence of r211464 in NewReno's after_idle() hook. We don't
use V_ss_fltsz/V_ss_fltsz_local yet which needs to be revisited.

Sponsored by: FreeBSD Foundation
Submitted by: David Hayes <dahayes at swin edu au>
MFC after: 9 weeks
X-MFC with: r215166


# dbc42409 11-Nov-2010 Lawrence Stewart <lstewart@FreeBSD.org>

This commit marks the first formal contribution of the "Five New TCP Congestion
Control Algorithms for FreeBSD" FreeBSD Foundation funded project. More details
about the project are available at: http://caia.swin.edu.au/freebsd/5cc/

- Add a KPI and supporting infrastructure to allow modular congestion control
algorithms to be used in the net stack. Algorithms can maintain per-connection
state if required, and connections maintain their own algorithm pointer, which
allows different connections to concurrently use different algorithms. The
TCP_CONGESTION socket option can be used with getsockopt()/setsockopt() to
programmatically query or change the congestion control algorithm respectively
from within an application at runtime.

- Integrate the framework with the TCP stack in as least intrusive a manner as
possible. Care was also taken to develop the framework in a way that should
allow integration with other congestion aware transport protocols (e.g. SCTP)
in the future. The hope is that we will one day be able to share a single set
of congestion control algorithm modules between all congestion aware transport
protocols.

- Introduce a new congestion recovery (TF_CONGRECOVERY) state into the TCP stack
and use it to decouple the meaning of recovery from a congestion event and
recovery from packet loss (TF_FASTRECOVERY) a la RFC2581. ECN and delay based
congestion control protocols don't generally need to recover from packet loss
and need a different way to note a congestion recovery episode within the
stack.

- Remove the net.inet.tcp.newreno sysctl, which simplifies some portions of code
and ensures the stack always uses the appropriate mechanisms for recovering
from packet loss during a congestion recovery episode.

- Extract the NewReno congestion control algorithm from the TCP stack and
massage it into module form. NewReno is always built into the kernel and will
remain the default algorithm for the forseeable future. Implementations of
additional different algorithms will become available in the near future.

- Bump __FreeBSD_version to 900025 and note in UPDATING that rebuilding code
that relies on the size of "struct tcpcb" is required.

Many thanks go to the Cisco University Research Program Fund at Community
Foundation Silicon Valley and the FreeBSD Foundation. Their support of our work
at the Centre for Advanced Internet Architectures, Swinburne University of
Technology is greatly appreciated.

In collaboration with: David Hayes <dahayes at swin edu au> and
Grenville Armitage <garmitage at swin edu au>
Sponsored by: Cisco URP, FreeBSD Foundation
Reviewed by: rpaulo
Tested by: David Hayes (and many others over the years)
MFC after: 3 months