#
267654 |
|
19-Jun-2014 |
gjb |
Copy stable/9 to releng/9.3 as part of the 9.3-RELEASE cycle.
Approved by: re (implicit) Sponsored by: The FreeBSD Foundation |
#
252794 |
|
05-Jul-2013 |
andre |
MFC r242257:
Remove bogus 'else' in #ifdef that prevented the rttvar from being reset tcp_timer_rexmt() on retransmit for IPv6 sessions.
MFC r242260:
When retransmitting SYN in TCPS_SYN_SENT state use TCPTV_RTOBASE, the default retransmit timeout, as base to calculate the backoff time until next try instead of the TCP_REXMTVAL() macro which only works correctly when we already have measured an actual RTT+RTTVAR.
MFC r242263, r242264:
Add SACK_PERMIT to the list of TCP options that are switched off after retransmitting a SYN three times.
MFC r242267:
If the user has closed the socket then drop a persisting connection after a much reduced timeout.
Typically web servers close their sockets quickly under the assumption that the TCP connections goes away as well. That is not entirely true however. If the peer closed the window we're going to wait for a long time with lots of data in the send buffer.
|
#
252788 |
|
05-Jul-2013 |
andre |
MFC r226447:
Remove the ss_fltsz and ss_fltsz_local sysctl's which have long been superseded by the RFC3390 initial CWND sizing.
Also remove the remnants of TCP_METRICS_CWND which used the TCP hostcache to set the initial CWND in a non-RFC compliant way.
MFC r242249:
Adjust the initial default CWND upon connection establishment to the new and increased values specified by RFC5681 Section 3.1.
The even larger initial CWND per RFC3390, if enabled, is not affected.
MFC r242250:
When SYN or SYN/ACK had to be retransmitted RFC5681 requires us to reduce the initial CWND to one segment. This reduction got lost some time ago due to a change in initialization ordering.
Additionally in tcp_timer_rexmt() avoid entering fast recovery when we're still in TCPS_SYN_SENT state.
MFC r242255:
Allow arbitrary MSS sizes and don't mind about the cluster size anymore. We've got more cluster sizes for quite some time now and the orginally imposed limits and the previously codified thoughts on efficiency gains are no longer true.
|
#
252555 |
|
03-Jul-2013 |
np |
MFC/backport core kernel and userspace parts of r237263 (TCP_OFFLOAD rework). MFC r237563, r239511, r243603, r245915, r245916, r245919, r245921, r245922, r245924, r245925, r245932, r245934 too.
Build tested with make universe.
r237263: - Updated TOE support in the kernel. ...
r237563: Fix clang warning when compiling iw_cxgb.
r239511: Correctly handle the case where an inp has already been dropped by the time the TOE driver reports that an active open failed. toe_connect_failed is supposed to handle this but it should be provided the inpcb instead of the tcpcb which may no longer be around.
r243603: Make sure that tcp_timer_activate() correctly sees TCP_OFFLOAD (or not).
r245915: Heed SO_NO_OFFLOAD.
r245916: Teach toe_4tuple_check() to deal with IPv6 4-tuples too.
r245919: Add TCP_OFFLOAD hook in syncache_respond for IPv6 too, just like the one that exists for IPv4.
r245921: There is no need to call into the TOE driver twice in pru_rcvd (tod_rcvd and then tod_output right after that).
r245922: Avoid NULL dereference in nd6_storelladdr when no mbuf is provided. It is called this way from a couple of places in the OFED code. (toecore calls it too but that's going to change shortly).
r245924: Move lle_event to if_llatbl.h
lle_event replaced arp_update_event after the ARP rewrite and ended up in if_ether.h simply because arp_update_event used to be there too. IPv6 neighbor discovery is going to grow lle_event support and this is a good time to move it to if_llatbl.h.
The two in-tree consumers of this event - OFED and toecore - are not affected.
r245925: Generate lle_event in the IPv6 neighbor discovery code too.
r245932: Teach toe_l2_resolve to resolve IPv6 destinations too.
r245934: Add checks for SO_NO_OFFLOAD in a couple of places that I missed earlier in r245915.
|
#
247498 |
|
28-Feb-2013 |
jhb |
MFC 245238: Add an option to not drop options from the third retransmitted SYN. If the SYNs (or SYN/ACK replies) are dropped due to network congestion, then the remote end of the connection may act as if options such as window scaling are enabled but the local end will think they are not. This can result in very slow data transfers in the case of window scaling disagreements.
Note that the unlike HEAD the existing behavior is preserved by default, but it can be disabled by setting the net.inet.tcp.rexmit_drop_options sysctl to zero.
|
#
239982 |
|
01-Sep-2012 |
trociny |
MFC r239075:
In tcp timers, check INP_DROPPED flag a little later, after callout_deactivate(), so if INP_DROPPED is set we return with the timer active flag cleared.
For me this fixes negative keep timer values reported by `netstat -x' for connections in CLOSE state.
|
#
232945 |
|
13-Mar-2012 |
glebius |
Merge 231025 from head: Add new socket options: TCP_KEEPINIT, TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT, that allow to control initial timeout, idle time, idle re-send interval and idle send count on a per-socket basis.
Reviewed by: andre, bz, lstewart
|
#
226574 |
|
20-Oct-2011 |
np |
MFC r226318:
Make sure the inp wasn't dropped when rexmt let go of the inp and pcbinfo locks.
Approved by: re (kib)
|
#
225736 |
|
22-Sep-2011 |
kensmith |
Copy head to stable/9 as part of 9.0-RELEASE release cycle.
Approved by: re (implicit)
|
#
222488 |
|
30-May-2011 |
rwatson |
Decompose the current single inpcbinfo lock into two locks:
- The existing ipi_lock continues to protect the global inpcb list and inpcb counter. This lock is now relegated to a small number of allocation and free operations, and occasional operations that walk all connections (including, awkwardly, certain UDP multicast receive operations -- something to revisit).
- A new ipi_hash_lock protects the two inpcbinfo hash tables for looking up connections and bound sockets, manipulated using new INP_HASH_*() macros. This lock, combined with inpcb locks, protects the 4-tuple address space.
Unlike the current ipi_lock, ipi_hash_lock follows the individual inpcb connection locks, so may be acquired while manipulating a connection on which a lock is already held, avoiding the need to acquire the inpcbinfo lock preemptively when a binding change might later be required. As a result, however, lookup operations necessarily go through a reference acquire while holding the lookup lock, later acquiring an inpcb lock -- if required.
A new function in_pcblookup() looks up connections, and accepts flags indicating how to return the inpcb. Due to lock order changes, callers no longer need acquire locks before performing a lookup: the lookup routine will acquire the ipi_hash_lock as needed. In the future, it will also be able to use alternative lookup and locking strategies transparently to callers, such as pcbgroup lookup. New lookup flags are, supplementing the existing INPLOOKUP_WILDCARD flag:
INPLOOKUP_RLOCKPCB - Acquire a read lock on the returned inpcb INPLOOKUP_WLOCKPCB - Acquire a write lock on the returned inpcb
Callers must pass exactly one of these flags (for the time being).
Some notes:
- All protocols are updated to work within the new regime; especially, TCP, UDPv4, and UDPv6. pcbinfo ipi_lock acquisitions are largely eliminated, and global hash lock hold times are dramatically reduced compared to previous locking. - The TCP syncache still relies on the pcbinfo lock, something that we may want to revisit. - Support for reverting to the FreeBSD 7.x locking strategy in TCP input is no longer available -- hash lookup locks are now held only very briefly during inpcb lookup, rather than for potentially extended periods. However, the pcbinfo ipi_lock will still be acquired if a connection state might change such that a connection is added or removed. - Raw IP sockets continue to use the pcbinfo ipi_lock for protection, due to maintaining their own hash tables. - The interface in6_pcblookup_hash_locked() is maintained, which allows callers to acquire hash locks and perform one or more lookups atomically with 4-tuple allocation: this is required only for TCPv6, as there is no in6_pcbconnect_setup(), which there should be. - UDPv6 locking remains significantly more conservative than UDPv4 locking, which relates to source address selection. This needs attention, as it likely significantly reduces parallelism in this code for multithreaded socket use (such as in BIND). - In the UDPv4 and UDPv6 multicast cases, we need to revisit locking somewhat, as they relied on ipi_lock to stablise 4-tuple matches, which is no longer sufficient. A second check once the inpcb lock is held should do the trick, keeping the general case from requiring the inpcb lock for every inpcb visited. - This work reminds us that we need to revisit locking of the v4/v6 flags, which may be accessed lock-free both before and after this change. - Right now, a single lock name is used for the pcbhash lock -- this is undesirable, and probably another argument is required to take care of this (or a char array name field in the pcbinfo?).
This is not an MFC candidate for 8.x due to its impact on lookup and locking semantics. It's possible some of these issues could be worked around with compatibility wrappers, if necessary.
Reviewed by: bz Sponsored by: Juniper Networks, Inc.
|
#
221209 |
|
29-Apr-2011 |
jhb |
TCP reuses t_rxtshift to determine the backoff timer used for both the persist state and the retransmit timer. However, the code that implements "bad retransmit recovery" only checks t_rxtshift to see if an ACK has been received in during the first retransmit timeout window. As a result, if ticks has wrapped over to a negative value and a socket is in the persist state, it can incorrectly treat an ACK from the remote peer as a "bad retransmit recovery" and restore saved values such as snd_ssthresh and snd_cwnd. However, if the socket has never had a retransmit timeout, then these saved values will be zero, so snd_ssthresh and snd_cwnd will be set to 0.
If the socket is in fast recovery (this can be caused by excessive duplicate ACKs such as those fixed by 220794), then each ACK that arrives triggers either NewReno or SACK partial ACK handling which clamps snd_cwnd to be no larger than snd_ssthresh. In effect, the socket's send window is permamently stuck at 0 even though the remote peer is advertising a much larger window and pending data is only sent via TCP window probes (so one byte every few seconds).
Fix this by adding a new TCP pcb flag (TF_PREVVALID) that indicates that the various snd_*_prev fields in the pcb are valid and only perform "bad retransmit recovery" if this flag is set in the pcb. The flag is set on the first retransmit timeout that occurs and is cleared on subsequent retransmit timeouts or when entering the persist state.
Reviewed by: bz MFC after: 2 weeks
|
#
217126 |
|
07-Jan-2011 |
jhb |
Trim extra spaces before tabs.
|
#
216621 |
|
21-Dec-2010 |
jhb |
Fix a typo in a comment.
MFC after: 1 week
|
#
216101 |
|
01-Dec-2010 |
lstewart |
Pass NULL instead of 0 for the th pointer value. NULL != 0 on all platforms.
Submitted by: David Hayes <dahayes at swin edu au> MFC after: 9 weeks X-MFC with: r215166
|
#
215166 |
|
12-Nov-2010 |
lstewart |
This commit marks the first formal contribution of the "Five New TCP Congestion Control Algorithms for FreeBSD" FreeBSD Foundation funded project. More details about the project are available at: http://caia.swin.edu.au/freebsd/5cc/
- Add a KPI and supporting infrastructure to allow modular congestion control algorithms to be used in the net stack. Algorithms can maintain per-connection state if required, and connections maintain their own algorithm pointer, which allows different connections to concurrently use different algorithms. The TCP_CONGESTION socket option can be used with getsockopt()/setsockopt() to programmatically query or change the congestion control algorithm respectively from within an application at runtime.
- Integrate the framework with the TCP stack in as least intrusive a manner as possible. Care was also taken to develop the framework in a way that should allow integration with other congestion aware transport protocols (e.g. SCTP) in the future. The hope is that we will one day be able to share a single set of congestion control algorithm modules between all congestion aware transport protocols.
- Introduce a new congestion recovery (TF_CONGRECOVERY) state into the TCP stack and use it to decouple the meaning of recovery from a congestion event and recovery from packet loss (TF_FASTRECOVERY) a la RFC2581. ECN and delay based congestion control protocols don't generally need to recover from packet loss and need a different way to note a congestion recovery episode within the stack.
- Remove the net.inet.tcp.newreno sysctl, which simplifies some portions of code and ensures the stack always uses the appropriate mechanisms for recovering from packet loss during a congestion recovery episode.
- Extract the NewReno congestion control algorithm from the TCP stack and massage it into module form. NewReno is always built into the kernel and will remain the default algorithm for the forseeable future. Implementations of additional different algorithms will become available in the near future.
- Bump __FreeBSD_version to 900025 and note in UPDATING that rebuilding code that relies on the size of "struct tcpcb" is required.
Many thanks go to the Cisco University Research Program Fund at Community Foundation Silicon Valley and the FreeBSD Foundation. Their support of our work at the Centre for Advanced Internet Architectures, Swinburne University of Technology is greatly appreciated.
In collaboration with: David Hayes <dahayes at swin edu au> and Grenville Armitage <garmitage at swin edu au> Sponsored by: Cisco URP, FreeBSD Foundation Reviewed by: rpaulo Tested by: David Hayes (and many others over the years) MFC after: 3 months
|
#
205391 |
|
20-Mar-2010 |
kmacy |
- spread tcp timer callout load evenly across cpus if net.inet.tcp.per_cpu_timers is set to 1 - don't default to acquiring tcbinfo lock exclusively in rexmt
MFC after: 7 days
|
#
204830 |
|
07-Mar-2010 |
rwatson |
Locking the tcbinfo structure should not be necessary in tcp_timer_delack(), so don't.
MFC after: 1 week Reviewed by: bz Sponsored by: Juniper Networks
|
#
197244 |
|
16-Sep-2009 |
silby |
Add the ability to see TCP timers via netstat -x. This can be a useful feature when you have a seemingly stuck socket and want to figure out why it has not been closed yet.
No plans to MFC this, as it changes the netstat sysctl ABI.
Reviewed by: andre, rwatson, Eric Van Gyzen
|
#
196019 |
|
01-Aug-2009 |
rwatson |
Merge the remainder of kern_vimage.c and vimage.h into vnet.c and vnet.h, we now use jails (rather than vimages) as the abstraction for virtualization management, and what remained was specific to virtual network stacks. Minor cleanups are done in the process, and comments updated to reflect these changes.
Reviewed by: bz Approved by: re (vimage blanket)
|
#
195760 |
|
19-Jul-2009 |
rwatson |
Reimplement and/or implement vnet list locking by replacing a mostly unused custom mutex/condvar-based sleep locks with two locks: an rwlock (for non-sleeping use) and sxlock (for sleeping use). Either acquired for read is sufficient to stabilize the vnet list, but both must be acquired for write to modify the list.
Replace previous no-op read locking macros, used in various places in the stack, with actual locking to prevent race conditions. Callers must declare when they may perform unbounded sleeps or not when selecting how to lock.
Refactor vnet sysinits so that the vnet list and locks are initialized before kernel modules are linked, as the kernel linker will use them for modules loaded by the boot loader.
Update various consumers of these KPIs based on whether they may sleep or not.
Reviewed by: bz Approved by: re (kib)
|
#
195699 |
|
14-Jul-2009 |
rwatson |
Build on Jeff Roberson's linker-set based dynamic per-CPU allocator (DPCPU), as suggested by Peter Wemm, and implement a new per-virtual network stack memory allocator. Modify vnet to use the allocator instead of monolithic global container structures (vinet, ...). This change solves many binary compatibility problems associated with VIMAGE, and restores ELF symbols for virtualized global variables.
Each virtualized global variable exists as a "reference copy", and also once per virtual network stack. Virtualized global variables are tagged at compile-time, placing the in a special linker set, which is loaded into a contiguous region of kernel memory. Virtualized global variables in the base kernel are linked as normal, but those in modules are copied and relocated to a reserved portion of the kernel's vnet region with the help of a the kernel linker.
Virtualized global variables exist in per-vnet memory set up when the network stack instance is created, and are initialized statically from the reference copy. Run-time access occurs via an accessor macro, which converts from the current vnet and requested symbol to a per-vnet address. When "options VIMAGE" is not compiled into the kernel, normal global ELF symbols will be used instead and indirection is avoided.
This change restores static initialization for network stack global variables, restores support for non-global symbols and types, eliminates the need for many subsystem constructors, eliminates large per-subsystem structures that caused many binary compatibility issues both for monitoring applications (netstat) and kernel modules, removes the per-function INIT_VNET_*() macros throughout the stack, eliminates the need for vnet_symmap ksym(2) munging, and eliminates duplicate definitions of virtualized globals under VIMAGE_GLOBALS.
Bump __FreeBSD_version and update UPDATING.
Portions submitted by: bz Reviewed by: bz, zec Discussed with: gnn, jamie, jeff, jhb, julian, sam Suggested by: peter Approved by: re (kensmith)
|
#
194305 |
|
16-Jun-2009 |
jhb |
Trim extra sets of ()'s.
Requested by: bde
|
#
190948 |
|
11-Apr-2009 |
rwatson |
Update stats in struct tcpstat using two new macros, TCPSTAT_ADD() and TCPSTAT_INC(), rather than directly manipulating the fields across the kernel. This will make it easier to change the implementation of these statistics, such as using per-CPU versions of the data structures.
MFC after: 3 days
|
#
189848 |
|
15-Mar-2009 |
rwatson |
Correct a number of evolved problems with inp_vflag and inp_flags: certain flags that should have been in inp_flags ended up in inp_vflag, meaning that they were inconsistently locked, and in one case, interpreted. Move the following flags from inp_vflag to gaps in the inp_flags space (and clean up the inp_flags constants to make gaps more obvious to future takers):
INP_TIMEWAIT INP_SOCKREF INP_ONESBCAST INP_DROPPED
Some aspects of this change have no effect on kernel ABI at all, as these are UDP/TCP/IP-internal uses; however, netstat and sockstat detect INP_TIMEWAIT when listing TCP sockets, so any MFC will need to take this into account.
MFC after: 1 week (or after dependencies are MFC'd) Reviewed by: bz
|
#
187289 |
|
15-Jan-2009 |
lstewart |
Add TCP Appropriate Byte Counting (RFC 3465) support to kernel.
The new behaviour is on by default, and can be disabled by setting the net.inet.tcp.rfc3465 sysctl to 0 to obtain previous behaviour.
The patch changes struct tcpcb in sys/netinet/tcp_var.h which breaks the ABI. Bump __FreeBSD_version to 800061 accordingly. User space tools that rely on the size of struct tcpcb (e.g. sockstat) need to be recompiled.
Reviewed by: rpaulo, gnn Approved by: gnn, kmacy (mentors) Sponsored by: FreeBSD Foundation
|
#
185571 |
|
02-Dec-2008 |
bz |
Rather than using hidden includes (with cicular dependencies), directly include only the header files needed. This reduces the unneeded spamming of various headers into lots of files.
For now, this leaves us with very few modules including vnet.h and thus needing to depend on opt_route.h.
Reviewed by: brooks, gnn, des, zec, imp Sponsored by: The FreeBSD Foundation
|
#
183550 |
|
02-Oct-2008 |
zec |
Step 1.5 of importing the network stack virtualization infrastructure from the vimage project, as per plan established at devsummit 08/08: http://wiki.freebsd.org/Image/Notes200808DevSummit
Introduce INIT_VNET_*() initializer macros, VNET_FOREACH() iterator macros, and CURVNET_SET() context setting macros, all currently resolving to NOPs.
Prepare for virtualization of selected SYSCTL objects by introducing a family of SYSCTL_V_*() macros, currently resolving to their global counterparts, i.e. SYSCTL_V_INT() == SYSCTL_INT().
Move selected #defines from sys/sys/vimage.h to newly introduced header files specific to virtualized subsystems (sys/net/vnet.h, sys/netinet/vinet.h etc.).
All the changes are verified to have zero functional impact at this point in time by doing MD5 comparision between pre- and post-change object files(*).
(*) netipsec/keysock.c did not validate depending on compile time options.
Implemented by: julian, bz, brooks, zec Reviewed by: julian, bz, brooks, kris, rwatson, ... Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation
|
#
181803 |
|
17-Aug-2008 |
bz |
Commit step 1 of the vimage project, (network stack) virtualization work done by Marko Zec (zec@).
This is the first in a series of commits over the course of the next few weeks.
Mark all uses of global variables to be virtualized with a V_ prefix. Use macros to map them back to their global names for now, so this is a NOP change only.
We hope to have caught at least 85-90% of what is needed so we do not invalidate a lot of outstanding patches again.
Obtained from: //depot/projects/vimage-commit2/... Reviewed by: brooks, des, ed, mav, julian, jamie, kris, rwatson, zec, ... (various people I forgot, different versions) md5 (with a bit of help) Sponsored by: NLnet Foundation, The FreeBSD Foundation X-MFC after: never V_Commit_Message_Reviewed_By: more people than the patch
|
#
180631 |
|
20-Jul-2008 |
trhodes |
Document a few sysctls.
Reviewed by: rwatson
|
#
179487 |
|
02-Jun-2008 |
rwatson |
When allocating temporary storage to hold a TCP/IP packet header template, use an M_TEMP malloc(9) allocation rather than an mbuf with mtod(9) and dtom(9). This eliminates the last use of dtom(9) in TCP.
MFC after: 3 weeks
|
#
178285 |
|
17-Apr-2008 |
rwatson |
Convert pcbinfo and inpcb mutexes to rwlocks, and modify macros to explicitly select write locking for all use of the inpcb mutex. Update some pcbinfo lock assertions to assert locked rather than write-locked, although in practice almost all uses of the pcbinfo rwlock main exclusive, and all instances of inpcb lock acquisition are exclusive.
This change should introduce (ideally) little functional change. However, it lays the groundwork for significantly increased parallelism in the TCP/IP code.
MFC after: 3 months Tested by: kris (superset of committered patch)
|
#
172467 |
|
07-Oct-2007 |
silby |
Add FBSDID to all files in netinet so that people can more easily include file version information in bug reports.
Approved by: re (kensmith)
|
#
172312 |
|
24-Sep-2007 |
kib |
Revert rev. 1.94. After recent tcp backouts, tcp_close() may return NULL. Check the return value of tcp_close() being NULL before dereferencing it in #ifdef TCPDEBUG block.
Reviewed by: rwatson Approved by: re (gnn)
|
#
172309 |
|
24-Sep-2007 |
silby |
Two changes:
- Reintegrate the ANSI C function declaration change from tcp_timer.c rev 1.92
- Reorganize the tcpcb structure so that it has a single pointer to the "tcp_timer" structure which contains all of the tcp timer callouts. This change means that when the single tcp timer change is reintegrated, tcpcb will not change in size, and therefore the ABI between netstat and the kernel will not change.
Neither of these changes should have any functional impact.
Reviewed by: bmah, rrs Approved by: re (bmah)
|
#
172074 |
|
07-Sep-2007 |
rwatson |
Back out tcp_timer.c:1.93 and associated changes that reimplemented the many TCP timers as a single timer, but retain the API changes necessary to reintroduce this change. This will back out the source of at least two reported problems: lock leaks in certain timer edge cases, and TCP timers continuing to fire after a connection has closed (a bug previously fixed and then reintroduced with the timer rewrite).
In a follow-up commit, some minor restylings and comment changes performed after the TCP timer rewrite will be reapplied, and a further change to allow the TCP timer rewrite to be added back without disturbing the ABI. The new design is believed to be a good thing, but the outstanding issues are leading to significant stability/correctness problems that are holding up 7.0.
This patch was generated by silby, but is being committed by proxy due to poor network connectivity for silby this week.
Approved by: re (kensmith) Submitted by: silby Tested by: rwatson, kris Problems reported by: peter, kris, others
|
#
170464 |
|
09-Jun-2007 |
andre |
Handle a race condition on >2 core machines in tcp_timer() when a timer issues a shutdown and a simultaneous close on the socket happens. This race condition is inherent in the current socket/ inpcb life cycle system but can be handled well.
Reported by: kris Tested by: kris (on 8-core machine)
|
#
170024 |
|
27-May-2007 |
rwatson |
In tcp_timer_2msl(), tp can never become NULL, so don't check it for NULL before entering tcp_trace().
Found with: Coverity Prevent(tm) CID: 1840
|
#
169608 |
|
16-May-2007 |
andre |
Move TIME_WAIT related functions and timer handling from files other than repo copied tcp_subr.c into tcp_timewait.c#1.284:
tcp_input.c#1.350 tcp_timewait() -> tcp_twcheck()
tcp_timer.c#1.92 tcp_timer_2msl_reset() -> tcp_tw_2msl_reset() tcp_timer.c#1.92 tcp_timer_2msl_stop() -> tcp_tw_2msl_stop() tcp_timer.c#1.92 tcp_timer_2msl_tw() -> tcp_tw_2msl_scan()
This is a mechanical move with appropriate renames and making them static if used only locally.
The tcp_tw_2msl_scan() cleanup function is still run from the tcp_slowtimo() in tcp_timer.c.
|
#
169454 |
|
10-May-2007 |
rwatson |
Move universally to ANSI C function declarations, with relatively consistent style(9)-ish layout.
|
#
169309 |
|
06-May-2007 |
andre |
Fix two comments.
|
#
168615 |
|
11-Apr-2007 |
andre |
Change the TCP timer system from using the callout system five times directly to a merged model where only one callout, the next to fire, is registered.
Instead of callout_reset(9) and callout_stop(9) the new function tcp_timer_activate() is used which then internally manages the callout.
The single new callout is a mutex callout on inpcb simplifying the locking a bit.
tcp_timer() is the called function which handles all race conditions in one place and then dispatches the individual timer functions.
Reviewed by: rwatson (earlier version)
|
#
168364 |
|
04-Apr-2007 |
andre |
Retire unused TCP_SACK_DEBUG.
|
#
167785 |
|
21-Mar-2007 |
andre |
ANSIfy function declarations and remove register keywords for variables. Consistently apply style to all function declarations.
|
#
167721 |
|
19-Mar-2007 |
andre |
Match up SYSCTL declaration style.
|
#
167036 |
|
26-Feb-2007 |
mohans |
Reap FIN_WAIT_2 connections marked SOCANTRCVMORE faster. This mitigate potential issues where the peer does not close, potentially leaving thousands of connections in FIN_WAIT_2. This is controlled by a new sysctl fast_finwait2_recycle, which is disabled by default.
Reviewed by: gnn, silby.
|
#
162111 |
|
07-Sep-2006 |
ru |
Back when we had T/TCP support, we used to apply different timeouts for TCP and T/TCP connections in the TIME_WAIT state, and we had two separate timed wait queues for them. Now that is has gone, the timeout is always 2*MSL again, and there is no reason to keep two queues (the first was unused anyway!).
Also, reimplement the remaining queue using a TAILQ (it was technically impossible before, with two queues).
|
#
162108 |
|
07-Sep-2006 |
ru |
Remove a microoptimization for i386 that was a micropessimization for amd64.
|
#
162064 |
|
06-Sep-2006 |
glebius |
o Backout rev. 1.125 of in_pcb.c. It appeared to behave extremely bad under high load. For example with 40k sockets and 25k tcptw entries, connect() syscall can run for seconds. Debugging showed that it iterates the cycle millions times and purges thousands of tcptw entries at a time. Besides practical unusability this change is architecturally wrong. First, in_pcblookup_local() is used in connect() and bind() syscalls. No stale entries purging shouldn't be done here. Second, it is a layering violation. o Return back the tcptw purging cycle to tcp_timer_2msl_tw(), that was removed in rev. 1.78 by rwatson. The commit log of this revision tells nothing about the reason cycle was removed. Now we need this cycle, since major cleaner of stale tcptw structures is removed. o Disable probably necessary, but now unused tcp_twrecycleable() function.
Reviewed by: ru
|
#
161226 |
|
11-Aug-2006 |
mohans |
Fixes an edge case bug in timewait handling where ticks rolling over causing the timewait expiry to be exactly 0 corrupts the timewait queues (and that entry). Reviewed by: silby
|
#
159199 |
|
03-Jun-2006 |
rwatson |
When entering a timer on a tcpcb, don't continue processing if it has been dropped. This prevents a bug introduced during the socket/pcb refcounting work from occuring, in which occasionally the retransmit timer may fire after a connection has been reset, resulting in the resulting R|A TCP packet having a source port of 0, as the port reservation has been released.
While here, fixing up some RUNLOCK->WUNLOCK bugs.
MFC after: 1 month
|
#
158644 |
|
16-May-2006 |
glebius |
- Backout one line from 1.78. The tp can be freed by tcp_drop(). - Style next line.
Coverity ID: 912
|
#
158304 |
|
05-May-2006 |
rwatson |
Only return (tw) from tcp_twclose() if reuse is passed, otherwise return NULL. In principle this shouldn't change the behavior, but avoids returning a potentially invalid/inappropriate pointer to the caller.
Found with: Coverity Prevent (tm) Submitted by: pjd MFC after: 3 months
|
#
157376 |
|
01-Apr-2006 |
rwatson |
Update TCP for infrastructural changes to the socket/pcb refcount model, pru_abort(), pru_detach(), and in_pcbdetach():
- Universally support and enforce the invariant that so_pcb is never NULL, converting dozens of unnecessary NULL checks into assertions, and eliminating dozens of unnecessary error handling cases in protocol code.
- In some cases, eliminate unnecessary pcbinfo locking, as it is no longer required to ensure so_pcb != NULL. For example, the receive code no longer requires the pcbinfo lock, and the send code only requires it if building a new connection on an otherwise unconnected socket triggered via sendto() with an address. This should significnatly reduce tcbinfo lock contention in the receive and send cases.
- In order to support the invariant that so_pcb != NULL, it is now necessary for the TCP code to not discard the tcpcb any time a connection is dropped, but instead leave the tcpcb until the socket is shutdown. This case is handled by setting INP_DROPPED, to substitute for using a NULL so_pcb to indicate that the connection has been dropped. This requires the inpcb lock, but not the pcbinfo lock.
- Unlike all other protocols in the tree, TCP may need to retain access to the socket after the file descriptor has been closed. Set SS_PROTOREF in tcp_detach() in order to prevent the socket from being freed, and add a flag, INP_SOCKREF, so that the TCP code knows whether or not it needs to free the socket when the connection finally does close. The typical case where this occurs is if close() is called on a TCP socket before all sent data in the send socket buffer has been transmitted or acknowledged. If INP_SOCKREF is found when the connection is dropped, we release the inpcb, tcpcb, and socket instead of flagging INP_DROPPED.
- Abort and detach protocol switch methods no longer return failures, nor attempt to free sockets, as the socket layer does this.
- Annotate the existence of a long-standing race in the TCP timer code, in which timers are stopped but not drained when the socket is freed, as waiting for drain may lead to deadlocks, or have to occur in a context where waiting is not permitted. This race has been handled by testing to see if the tcpcb pointer in the inpcb is NULL (and vice versa), which is not normally permitted, but may be true of a inpcb and tcpcb have been freed. Add a counter to test how often this race has actually occurred, and a large comment for each instance where we compare potentially freed memory with NULL. This will have to be fixed in the near future, but requires is to further address how to handle the timer shutdown shutdown issue.
- Several TCP calls no longer potentially free the passed inpcb/tcpcb, so no longer need to return a pointer to indicate whether the argument passed in is still valid.
- Un-macroize debugging and locking setup for various protocol switch methods for TCP, as it lead to more obscurity, and as locking becomes more customized to the methods, offers less benefit.
- Assert copyright on tcp_usrreq.c due to significant modifications that have been made as part of this work.
These changes significantly modify the memory management and connection logic of our TCP implementation, and are (as such) High Risk Changes, and likely to contain serious bugs. Please report problems to the current@ mailing list ASAP, ideally with simple test cases, and optionally, packet traces.
MFC after: 3 months
|
#
157136 |
|
25-Mar-2006 |
rwatson |
Explicitly assert socket pointer is non-NULL in tcp_input() so as to provide better debugging information.
Prefer explicit comparison to NULL for tcpcb pointers rather than treating them as booleans.
MFC after: 1 month
|
#
155758 |
|
16-Feb-2006 |
andre |
Make sysctl_msec_to_ticks(SYSCTL_HANDLER_ARGS) generally available instead of being private to tcp_timer.c.
Sponsored by: TCP/IP Optimization Fundraise 2005 MFC after: 3 days
|
#
148156 |
|
19-Jul-2005 |
rwatson |
Remove no-op spl's and most comment references to spls, as TCP locking is believed to be basically done (modulo any remaining bugs).
MFC after: 3 days
|
#
146463 |
|
20-May-2005 |
ps |
Replace t_force with a t_flag (TF_FORCEDATA).
Submitted by: Raja Mukerji. Reviewed by: Mohan, Silby, Andre Opperman.
|
#
139823 |
|
06-Jan-2005 |
imp |
/* -> /*- for license, minor formatting changes
|
#
139220 |
|
22-Dec-2004 |
rwatson |
Remove the now unused tcp_canceltimers() function. tcpcb timers are now stopped as part of tcp_discardcb().
MFC after: 2 weeks
|
#
139219 |
|
22-Dec-2004 |
rwatson |
Remove an annotation of a minor race relating to the update of multiple MIB entries using sysctl in short order, which might result in unexpected values for tcp_maxidle being generated by tcp_slowtimo. In practice, this will not happen, or at least, doesn't require an explicit comment.
MFC after: 2 weeks
|
#
138416 |
|
05-Dec-2004 |
rwatson |
Assert the tcptw inpcb lock in tcp_timer_2msl_reset(), as fields in the tcptw undergo non-atomic read-modify-writes.
MFC after: 2 weeks
|
#
138025 |
|
23-Nov-2004 |
rwatson |
tcp_timewait() performs multiple non-atomic reads on the tcptw structure, so assert the inpcb lock associated with the tcptw. Also assert the tcbinfo lock, as tcp_timewait() may call tcp_twclose() or tcp_2msl_rest(), which require it. Since tcp_timewait() is already called with that lock from tcp_input(), this doesn't change current locking, merely documents reasons for it.
In tcp_twstart(), assert the tcbinfo lock, as tcp_timer_2msl_rest() is called, which requires that lock.
In tcp_twclose(), assert the tcbinfo lock, as tcp_timer_2msl_stop() is called, which requires that lock.
Document the locking strategy for the time wait queues in tcp_timer.c, which consists of protecting the time wait queues in the same manner as the tcbinfo structure (using the tcbinfo lock).
In tcp_timer_2msl_reset(), assert the tcbinfo lock, as the time wait queues are modified.
In tcp_timer_2msl_stop(), assert the tcbinfo lock, as the time wait queues may be modified.
In tcp_timer_2msl_tw(), assert the tcbinfo lock, as the time wait queues may be modified.
MFC after: 2 weeks
|
#
138024 |
|
23-Nov-2004 |
rwatson |
De-spl tcp_slowtimo; tcp_maxidle assignment is subject to possible but unlikely races that could be corrected by having tcp_keepcnt and tcp_keepintvl modifications go through handler functions via sysctl, but probably is not worth doing. Updates to multiple sysctls within evaluation of a single addition are unlikely.
Annotate that tcp_canceltimers() is currently unused.
De-spl tcp_timer_delack().
De-spl tcp_timer_2msl().
MFC after: 2 weeks
|
#
137139 |
|
02-Nov-2004 |
andre |
Remove RFC1644 T/TCP support from the TCP side of the network stack.
A complete rationale and discussion is given in this message and the resulting discussion:
http://docs.freebsd.org/cgi/mid.cgi?4177C8AD.6060706
Note that this commit removes only the functional part of T/TCP from the tcp_* related functions in the kernel. Other features introduced with RFC1644 are left intact (socket layer changes, sendmsg(2) on connection oriented protocols) and are meant to be reused by a simpler and less intrusive reimplemention of the previous T/TCP functionality.
Discussed on: -arch
|
#
133874 |
|
16-Aug-2004 |
rwatson |
White space cleanup for netinet before branch:
- Trailing tab/space cleanup - Remove spurious spaces between or before tabs
This change avoids touching files that Andre likely has in his working set for PFIL hooks changes for IPFW/DUMMYNET.
Approved by: re (scottl) Submitted by: Xin LI <delphij@frontfree.net>
|
#
130989 |
|
23-Jun-2004 |
ps |
Add support for TCP Selective Acknowledgements. The work for this originated on RELENG_4 and was ported to -CURRENT.
The scoreboarding code was obtained from OpenBSD, and many of the remaining changes were inspired by OpenBSD, but not taken directly from there.
You can enable/disable sack using net.inet.tcp.do_sack. You can also limit the number of sack holes that all senders can have in the scoreboard with net.inet.tcp.sackhole_limit.
Reviewed by: gnn Obtained from: Yahoo! (Mohan Srinivasan, Jayanth Vijayaraghavan)
|
#
128019 |
|
07-Apr-2004 |
imp |
Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999 and email from Peter Wemm, Alan Cox and Robert Watson.
Approved by: core, peter, alc, rwatson
|
#
122922 |
|
20-Nov-2003 |
andre |
Introduce tcp_hostcache and remove the tcp specific metrics from the routing table. Move all usage and references in the tcp stack from the routing table metrics to the tcp hostcache.
It caches measured parameters of past tcp sessions to provide better initial start values for following connections from or to the same source or destination. Depending on the network parameters to/from the remote host this can lead to significant speedups for new tcp connections after the first one because they inherit and shortcut the learning curve.
tcp_hostcache is designed for multiple concurrent access in SMP environments with high contention and is hash indexed by remote ip address.
It removes significant locking requirements from the tcp stack with regard to the routing table.
Reviewed by: sam (mentor), bms Reviewed by: -net, -current, core@kame.net (IPv6 parts) Approved by: re (scottl)
|
#
122326 |
|
08-Nov-2003 |
sam |
use local values instead of chasing pointers
Supported by: FreeBSD Foundation
|
#
117650 |
|
15-Jul-2003 |
hsu |
Unify the "send high" and "recover" variables as specified in the lastest rev of the spec. Use an explicit flag for Fast Recovery. [1]
Fix bug with exiting Fast Recovery on a retransmit timeout diagnosed by Lu Guohan. [2]
Reviewed by: Thomas Henderson <thomas.r.henderson@boeing.com> Reported and tested by: Lu Guohan <lguohan00@mails.tsinghua.edu.cn> [2] Approved by: Thomas Henderson <thomas.r.henderson@boeing.com>, Sally Floyd <floyd@acm.org> [1]
|
#
115824 |
|
04-Jun-2003 |
hsu |
Compensate for decreasing the minimum retransmit timeout.
Reviewed by: jlemon
|
#
112009 |
|
08-Mar-2003 |
jlemon |
Remove a panic(); if the zone allocator can't provide more timewait structures, reuse the oldest one. Also move the expiry timer from a per-structure callout to the tcp slow timer.
Sponsored by: DARPA, NAI Labs
|
#
111145 |
|
19-Feb-2003 |
jlemon |
Add a TCP TIMEWAIT state which uses less space than a fullblown TCP control block. Allow the socket and tcpcb structures to be freed earlier than inpcb. Update code to understand an inp w/o a socket.
Reviewed by: hsu, silby, jayanth Sponsored by: DARPA, NAI Labs
|
#
111144 |
|
19-Feb-2003 |
jlemon |
Convert tcp_fillheaders(tp, ...) -> tcpip_fillheaders(inp, ...) so the routine does not require a tcpcb to operate. Since we no longer keep template mbufs around, move pseudo checksum out of this routine, and merge it with the length update.
Sponsored by: DARPA, NAI Labs
|
#
109175 |
|
13-Jan-2003 |
hsu |
Fix NewReno.
Reviewed by: Tom Henderson <thomas.r.henderson@boeing.com>
|
#
108265 |
|
24-Dec-2002 |
hsu |
Validate inp to prevent an use after free.
|
#
102967 |
|
05-Sep-2002 |
bde |
Include <sys/mutex.h> and its prerequisite <sys/lock.h> instead of depending on namespace pollution 4 layers deep in <netinet/in_pcb.h>.
Removed unused includes. Sorted includes.
|
#
100420 |
|
20-Jul-2002 |
jdp |
Fix overflows in intermediate calculations in sysctl_msec_to_ticks(). At hz values of 1000 and above the overflows caused net.inet.tcp.keepidle to be reported as negative.
MFC after: 3 days
|
#
100335 |
|
18-Jul-2002 |
dillon |
Introduce two new sysctl's:
net.inet.tcp.rexmit_min (default 3 ticks equiv)
This sysctl is the retransmit timer RTO minimum, specified in milliseconds. This value is designed for algorithmic stability only.
net.inet.tcp.rexmit_slop (default 200ms)
This sysctl is the retransmit timer RTO slop which is added to every retransmit timeout and is designed to handle protocol stack overheads and delayed ack issues.
Note that the *original* code applied a 1-second RTO minimum but never applied real slop to the RTO calculation, so any RTO calculation over one second would have no slop and thus not account for protocol stack overheads (TCP timestamps are not a measure of protocol turnaround!). Essentially, the original code made the RTO calculation almost completely irrelevant.
Please note that the 200ms slop is debateable. This commit is not meant to be a line in the sand, and if the community winds up deciding that increasing it is the correct solution then it's easy to do. Note that larger values will destroy performance on lossy networks while smaller values may result in a greater number of unnecessary retransmits.
|
#
98102 |
|
10-Jun-2002 |
hsu |
Lock up inpcb.
Submitted by: Jennifer Yang <yangjihui@yahoo.com>
|
#
97658 |
|
31-May-2002 |
tanimura |
Back out my lats commit of locking down a socket, it conflicts with hsu's work.
Requested by: hsu
|
#
96972 |
|
20-May-2002 |
tanimura |
Lock down a socket, milestone 1.
o Add a mutex (sb_mtx) to struct sockbuf. This protects the data in a socket buffer. The mutex in the receive buffer also protects the data in struct socket.
o Determine the lock strategy for each members in struct socket.
o Lock down the following members:
- so_count - so_options - so_linger - so_state
o Remove *_locked() socket APIs. Make the following socket APIs touching the members above now require a locked socket:
- sodisconnect() - soisconnected() - soisconnecting() - soisdisconnected() - soisdisconnecting() - sofree() - soref() - sorele() - sorwakeup() - sotryfree() - sowakeup() - sowwakeup()
Reviewed by: alfred
|
#
87499 |
|
07-Dec-2001 |
rwatson |
o Our currenty userland boot code (due to rc.conf and rc.network) always enables TCP keepalives using the net.inet.tcp.always_keepalive by default. Synchronize the kernel default with the userland default.
|
#
82122 |
|
21-Aug-2001 |
silby |
Much delayed but now present: RFC 1948 style sequence numbers
In order to ensure security and functionality, RFC 1948 style initial sequence number generation has been implemented. Barring any major crypographic breakthroughs, this algorithm should be unbreakable. In addition, the problems with TIME_WAIT recycling which affect our currently used algorithm are not present.
Reviewed by: jesper
|
#
79413 |
|
08-Jul-2001 |
silby |
Temporary feature: Runtime tuneable tcp initial sequence number generation scheme. Users may now select between the currently used OpenBSD algorithm and the older random positive increment method.
While the OpenBSD algorithm is more secure, it also breaks TIME_WAIT handling; this is causing trouble for an increasing number of folks.
To switch between generation schemes, one sets the sysctl net.inet.tcp.tcp_seq_genscheme. 0 = random positive increments, 1 = the OpenBSD algorithm. 1 is still the default.
Once a secure _and_ compatible algorithm is implemented, this sysctl will be removed.
Reviewed by: jlemon Tested by: numerous subscribers of -net
|
#
78642 |
|
23-Jun-2001 |
silby |
Eliminate the allocation of a tcp template structure for each connection. The information contained in a tcptemp can be reconstructed from a tcpcb when needed.
Previously, tcp templates required the allocation of one mbuf per connection. On large systems, this change should free up a large number of mbufs.
Reviewed by: bmilekic, jlemon, ru MFC after: 2 weeks
|
#
77539 |
|
31-May-2001 |
jesper |
Disable rfc1323 and rfc1644 TCP extensions if we havn't got any response to our third SYN to work-around some broken terminal servers (most of which have hopefully been retired) that have bad VJ header compression code which trashes TCP segments containing unknown-to-them TCP options.
PR: kern/1689 Submitted by: jesper Reviewed by: wollman MFC after: 2 weeks
|
#
75733 |
|
20-Apr-2001 |
jesper |
Say goodbye to TCP_COMPAT_42
Reviewed by: wollman Requested by: wollman
|
#
75620 |
|
17-Apr-2001 |
kris |
Note that the previous commit also restored some historical behaviour in the TCP_COMPAT_42 case (e.g. choosing '1' as the initial sequence number at boot-time, instead of randomizing it). TCP_COMPAT_42 is the repository for old security holes, too :-)
|
#
75619 |
|
17-Apr-2001 |
kris |
Randomize the TCP initial sequence numbers more thoroughly.
Obtained from: OpenBSD Reviewed by: jesper, peter, -developers
|
#
73110 |
|
26-Feb-2001 |
jlemon |
Use more aggressive retransmit timeouts for the initial SYN packet. As we currently drop the connection after 4 retransmits + 2 ICMP errors, this allows initial connection attempts to be dropped much faster.
|
#
66552 |
|
02-Oct-2000 |
jlemon |
If TCPDEBUG is defined, we could dereference a tp which was freed.
|
#
65906 |
|
15-Sep-2000 |
jlemon |
It is possible for a TCP callout to be removed from the timing wheel, but have a network interrupt arrive and deactivate the timeout before the callout routine runs. Check for this case in the callout routine; it should only run if the callout is active and not on the wheel.
|
#
62573 |
|
04-Jul-2000 |
phk |
Previous commit changing SYSCTL_HANDLER_ARGS violated KNF.
Pointed out by: bde
|
#
62454 |
|
03-Jul-2000 |
phk |
Style police catches up with rev 1.26 of src/sys/sys/sysctl.h:
Sanitize SYSCTL_HANDLER_ARGS so that simplistic tools can grog our sources:
-sysctl_vm_zone SYSCTL_HANDLER_ARGS +sysctl_vm_zone (SYSCTL_HANDLER_ARGS)
|
#
60067 |
|
06-May-2000 |
jlemon |
Implement TCP NewReno, as documented in RFC 2582. This allows better recovery for multiple packet losses in a single window. The algorithm can be toggled via the sysctl net.inet.tcp.newreno, which defaults to "on".
Submitted by: Jayanth Vijayaraghavan <jayanth@yahoo-inc.com>
|
#
55679 |
|
09-Jan-2000 |
shin |
tcp updates to support IPv6. also a small patch to sys/nfs/nfs_socket.c, as max_hdr size change.
Reviewed by: freebsd-arch, cvs-committers Obtained from: KAME project
|
#
50705 |
|
31-Aug-1999 |
jlemon |
Simplify, and return an error if the user attempts to set a TCP time value which results in < 1 tick.
Suggested by: bde
|
#
50682 |
|
31-Aug-1999 |
jlemon |
Add a SYSCTL_PROC so that TCP timer values are now expressed to the user in ms, while they are stored internally as ticks. Note that there probably are rounding bogons here, especially on the alpha.
|
#
50673 |
|
30-Aug-1999 |
jlemon |
Restructure TCP timeout handling:
- eliminate the fast/slow timeout lists for TCP and instead use a callout entry for each timer. - increase the TCP timer granularity to HZ - implement "bad retransmit" recovery, as presented in "On Estimating End-to-End Network Path Properties", by Allman and Paxson.
Submitted by: jlemon, wollmann
|
#
50477 |
|
27-Aug-1999 |
peter |
$Id$ -> $FreeBSD$
|
#
46381 |
|
03-May-1999 |
billf |
Add sysctl descriptions to many SYSCTL_XXXs
PR: kern/11197 Submitted by: Adrian Chadd <adrian@FreeBSD.org> Reviewed by: billf(spelling/style/minor nits) Looked at by: bde(style)
|
#
35419 |
|
24-Apr-1998 |
dg |
Ensure that TCP_REXMTVAL doesn't return a value less than t_rttmin. This is believed to have been broken with the Brakmo/Peterson srtt calculation changes. The result of this bug is that TCP connections could time out extremely quickly (in 12 seconds). Also backed out jdp's partial fix for this problem in rev 1.17 of tcp_timer.c as it is obsoleted by this commit. Bug was pointed out by Kevin Lehey <kml@roller.nas.nasa.gov>.
PR: 6068
|
#
35056 |
|
06-Apr-1998 |
phk |
Remove the last traces of TUBA.
Inspired by: PR kern/3317
|
#
33846 |
|
26-Feb-1998 |
dg |
Changes to support the addition of a new sysctl variable: net.inet.tcp.delack_enabled Which defaults to 1 and can be set to 0 to disable TCP delayed-ack processing (i.e. all acks are immediate).
|
#
32752 |
|
25-Jan-1998 |
eivind |
Make TCP_COMPAT_42 a new style option.
|
#
29514 |
|
16-Sep-1997 |
joerg |
Make TCPDEBUG a new-style option.
|
#
27845 |
|
02-Aug-1997 |
bde |
Removed unused #includes.
|
#
22975 |
|
22-Feb-1997 |
peter |
Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not ready for it yet.
|
#
21673 |
|
14-Jan-1997 |
jkh |
Make the long-awaited change from $Id$ to $FreeBSD$
This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long.
Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.
|
#
18280 |
|
13-Sep-1996 |
pst |
Make the misnamed tcp initial keepalive timer value (which is really the time, in seconds, that state for non-established TCP sessions stays about) a sysctl modifyable variable.
[part 1 of two commits, I just realized I can't play with the indices as I was typing this commit message.]
|
#
17138 |
|
12-Jul-1996 |
dg |
Fixed two bugs in previous commit: be sure to include tcp_debug.h when TCPDEBUG is defined, and fix typo in TCPDEBUG2() macro.
|
#
17096 |
|
11-Jul-1996 |
wollman |
Modify the kernel to use the new pr_usrreqs interface rather than the old pr_usrreq mechanism which was poorly designed and error-prone. This commit renames pr_usrreq to pr_ousrreq so that old code which depended on it would break in an obvious manner. This commit also implements the new interface for TCP, although the old function is left as an example (#ifdef'ed out). This commit ALSO fixes a longstanding bug in the TCP timer processing (introduced by davidg on 1995/04/12) which caused timer processing on a TCB to always stop after a single timer had expired (because it misinterpreted the return value from tcp_usrreq() to indicate that the TCB had been deleted). Finally, some code related to polling has been deleted from if.c because it is not relevant t -current and doesn't look at all like my current code.
|
#
16099 |
|
03-Jun-1996 |
jdp |
Fix a bug in the handling of the "persist" state which, under certain circumstances, caused perfectly good connections to be dropped. This happened for connections over a LAN, where the retransmit timer calculation TCP_REXMTVAL(tp) returned 0. If sending was blocked by flow control for long enough, the old code dropped the connection, even though timely replies were being received for all window probes.
Reviewed by: W. Richard Stevens <rstevens@noao.edu>
|
#
15262 |
|
15-Apr-1996 |
dg |
Two fixes from Rich Stevens:
1) Set the persist timer to help time-out connections in the CLOSING state. 2) Honor the keep-alive timer in the CLOSING state.
This fixes problems with connections getting "stuck" due to incompletion of the final connection shutdown which can be a BIG problem on busy WWW servers.
|
#
15039 |
|
04-Apr-1996 |
phk |
Add a sysctl (net.inet.tcp.always_keepalive: 0) that when set will force keepalive on all tcp sessions. Setsockopt(2) cannot override this setting. Maybe another one is needed that just changes the default for SO_KEEPALIVE ? Requested by: Joe Greco <jgreco@brasil.moneng.mei.com>
|
#
14546 |
|
11-Mar-1996 |
dg |
Move or add #include <queue.h> in preparation for upcoming struct socket changes.
|
#
13229 |
|
04-Jan-1996 |
olah |
Reverse the modification which caused the annoying m_copydata crash: set the TF_ACKNOW flag when the REXMT timer goes off to force a retransmission. In certain situations pulling snd_nxt back to snd_una is not sufficient.
|
#
12296 |
|
14-Nov-1995 |
phk |
New style sysctl & staticize alot of stuff.
|
#
12172 |
|
09-Nov-1995 |
phk |
Start adding new style sysctl here too.
|
#
12046 |
|
03-Nov-1995 |
olah |
Setting the TF_ACKNOW flag was redundant in the REXMT timeout because tcp_output() checks for the condition snd_nxt == snd_una.
Reviewed by: davidg, wollman, olah Suggested by: Richard Stevens
|
#
11150 |
|
03-Oct-1995 |
wollman |
Finish 4.4-Lite-2 merge: randomize TCP initial sequence numbers to make ISS-guessing spoofing attacks harder.
|
#
9773 |
|
29-Jul-1995 |
dg |
Add connection drop capability for persist timeouts.
Reviewed by: Andras Olah Obtained from: 4.4BSD-lite2 via W. Richard Stevens
|
#
8876 |
|
30-May-1995 |
rgrimes |
Remove trailing whitespace.
|
#
7770 |
|
12-Apr-1995 |
dg |
Fixed bug I introduced when changing PCB list to use 4.4BSD style queue macros. Basically, detect 'tp' going away differently.
|
#
7684 |
|
08-Apr-1995 |
dg |
Implemented PCB hashing. Includes new functions in_pcbinshash, in_pcbrehash, and in_pcblookuphash.
|
#
6475 |
|
15-Feb-1995 |
wollman |
Transaction TCP support now standard. Hack away!
|
#
6283 |
|
09-Feb-1995 |
wollman |
Merge Transaction TCP, courtesy of Andras Olah <olah@cs.utwente.nl> and Bob Braden <braden@isi.edu>.
NB: This has not had David's TCP ACK hack re-integrated. It is not clear what the correct solution to this problem is, if any. If a better solution doesn't pop up in response to this message, I'll put David's code back in (or he's welcome to do so himself).
|
#
1817 |
|
02-Aug-1994 |
dg |
Added $Id$
|
#
1542 |
|
24-May-1994 |
rgrimes |
This commit was generated by cvs2svn to compensate for changes in r1541, which included commits to RCS files with non-trunk default branches.
|
#
1541 |
|
24-May-1994 |
rgrimes |
BSD 4.4 Lite Kernel Sources
|