#
285830 |
|
23-Jul-2015 |
gjb |
- Copy stable/10@285827 to releng/10.2 in preparation for 10.2-RC1 builds. - Update newvers.sh to reflect RC1. - Update __FreeBSD_version to reflect 10.2. - Update default pkg(8) configuration to use the quarterly branch.[1]
Discussed with: re, portmgr [1] Approved by: re (implicit) Sponsored by: The FreeBSD Foundation |
#
284496 |
|
17-Jun-2015 |
hselasky |
MFC r280991: Extend fixes made in r278103 and r38754 by copying the complete packet header and not only partial flags and fields. Firewalls can attach classification tags to the outgoing mbufs which should be copied to all the new fragments. Else only the first fragment will be let through by the firewall. This can easily be tested by sending a large ping packet through a firewall. It was also discovered that VLAN related flags and fields should be copied for packets traversing through VLANs. This is all handled by "m_dup_pkthdr()".
Regarding the MAC policy check in ip_fragment(), the tag provided by the originating mbuf is copied instead of using the default one provided by m_gethdr().
Tested by: Karim Fodil-Lemelin <fodillemlinkarim at gmail.com> Sponsored by: Mellanox Technologies PR: 7802
|
#
281955 |
|
24-Apr-2015 |
hiren |
MFC r275358 r275483 r276982 - Removing M_FLOWID by hps@
r275358: Start process of removing the use of the deprecated "M_FLOWID" flag from the FreeBSD network code. The flag is still kept around in the "sys/mbuf.h" header file, but does no longer have any users. Instead the "m_pkthdr.rsstype" field in the mbuf structure is now used to decide the meaning of the "m_pkthdr.flowid" field. To modify the "m_pkthdr.rsstype" field please use the existing "M_HASHTYPE_XXX" macros as defined in the "sys/mbuf.h" header file.
This patch introduces new behaviour in the transmit direction. Previously network drivers checked if "M_FLOWID" was set in "m_flags" before using the "m_pkthdr.flowid" field. This check has now now been replaced by checking if "M_HASHTYPE_GET(m)" is different from "M_HASHTYPE_NONE". In the future more hashtypes will be added, for example hashtypes for hardware dedicated flows.
"M_HASHTYPE_OPAQUE" indicates that the "m_pkthdr.flowid" value is valid and has no particular type. This change removes the need for an "if" statement in TCP transmit code checking for the presence of a valid flowid value. The "if" statement mentioned above is now a direct variable assignment which is then later checked by the respective network drivers like before.
r275483: Remove M_FLOWID from SCTP code.
r276982: Remove no longer used "M_FLOWID" flag from mbuf.h and update the netisr manpage.
Note: The FreeBSD version has been bumped.
Reviewed by: hps, tuexen Sponsored by: Limelight Networks
|
#
280552 |
|
25-Mar-2015 |
hselasky |
MFC r279281: Fix a special case in ip_fragment() to produce a more sensible chain of packets. When the data payload length excluding any headers, of an outgoing IPv4 packet exceeds PAGE_SIZE bytes, a special case in ip_fragment() can kick in to optimise the outgoing payload(s). The code which was added in r98849 as part of zero copy socket support assumes that the beginning of any MTU sized payload is aligned to where a MBUF's "m_data" pointer points. This is not always the case and can sometimes cause large IPv4 packets, as part of ping replies, to be split more than needed.
Instead of iterating the MBUFs to figure out how much data is in the current chain, use the value already in the "m_pkthdr.len" field of the first MBUF in the chain.
Reviewed by: ken @ Differential Revision: https://reviews.freebsd.org/D1893 Sponsored by: Mellanox Technologies
|
#
278514 |
|
10-Feb-2015 |
hselasky |
Append to the MFC of r278103 that we also pass along the M_FLOWID flag.
Sponsored by: Mellanox Technologies
|
#
278511 |
|
10-Feb-2015 |
hselasky |
MFC r278103: The flowid and hashtype should be copied from the originating packet when fragmenting IP packets to preserve the order of the packets in a stream. Else the resulting fragments can be sent out of order when the hardware supports multiple transmit rings.
Sponsored by: Mellanox Technologies
|
#
268956 |
|
21-Jul-2014 |
np |
MFC r268450 (by glebius). The leak affects stable/10 too.
In several cases in ip_output() we obtain reference on ifa. Do not leak it.
|
#
267736 |
|
22-Jun-2014 |
tuexen |
MFC r265691:
For some UDP packets (for example with 200 byte payload) and IP options, the IP header and the UDP header are not in the same mbuf. Add code to in_delayed_cksum() to deal with this case.
MFC r265713:
Use KASSERTs as suggested by glebius@
|
#
263478 |
|
21-Mar-2014 |
glebius |
Merge r262763, r262767, r262771, r262806 from head: - Remove rt_metrics_lite and simply put its members into rtentry. - Use counter(9) for rt_pksent (former rt_rmx.rmx_pksent). This removes another cache trashing ++ from packet forwarding path. - Create zini/fini methods for the rtentry UMA zone. Via initialize mutex and counter in them. - Fix reporting of rmx_pksent to routing socket. - Fix netstat(1) to report "Use" both in kvm(3) and sysctl(3) mode.
|
#
263334 |
|
19-Mar-2014 |
glebius |
Merge r262747: remove extraneous ifa_ref()/ifa_free().
|
#
262743 |
|
04-Mar-2014 |
glebius |
Merge r261582, r261601, r261610, r261613, r261627, r261640, r261641, r261823, r261825, r261859, r261875, r261883, r261911, r262027, r262028, r262029, r262030, r262162 from head.
Large flowtable revamp. See commit messages for merged revisions for details.
Sponsored by: Netflix
|
#
261545 |
|
06-Feb-2014 |
ae |
MFC r260702 (by melifaro): Fix ipfw fwd for IPv4 traffic broken by r249894.
Problem case: Original lookup returns route with GW set, so gw points to rte->rt_gateway. After that we're changing dst and performing lookup another time. Since fwd host is most probably directly reachable, resulting rte does not contain rt_gateway, so gw is not set. Finally, we end with packet transmitted to proper interface but wrong link-layer address.
|
#
260319 |
|
05-Jan-2014 |
glebius |
Merge r260188 from head: Fix regression from r249894. Now we pass "gw" as argument to if_output method, thus for multicast case we need it to point at "dst".
PR: 185395
|
#
256281 |
|
10-Oct-2013 |
gjb |
Copy head (r256279) to stable/10 as part of the 10.0-RELEASE cycle.
Approved by: re (implicit) Sponsored by: The FreeBSD Foundation
|
#
254889 |
|
25-Aug-2013 |
markj |
Implement the ip, tcp, and udp DTrace providers. The probe definitions use dynamic translation so that their arguments match the definitions for these providers in Solaris and illumos. Thus, existing scripts for these providers should work unmodified on FreeBSD.
Tested by: gnn, hiren MFC after: 1 month
|
#
254523 |
|
19-Aug-2013 |
andre |
Add m_clrprotoflags() to clear protocol specific mbuf flags at up and downwards layer crossings.
Consistently use it within IP, IPv6 and ethernet protocols.
Discussed with: trociny, glebius
|
#
254517 |
|
19-Aug-2013 |
andre |
Remove unused M_FRAG, M_FIRSTFRAG and M_LASTFRAG tagging from ip_fragment(). There wasn't any real driver (and hardware) support for it. Modern hardware does full fragmentation/segmentation offload instead.
|
#
252710 |
|
04-Jul-2013 |
trociny |
In r227207, to fix the issue with possible NULL inp_socket pointer dereferencing, when checking for SO_REUSEPORT option (and SO_REUSEADDR for multicast), INP_REUSEPORT flag was introduced to cache the socket option. It was decided then that one flag would be enough to cache both SO_REUSEPORT and SO_REUSEADDR: when processing SO_REUSEADDR setsockopt(2), it was checked if it was called for a multicast address and INP_REUSEPORT was set accordingly.
Unfortunately that approach does not work when setsockopt(2) is called before binding to a multicast address: the multicast check fails and INP_REUSEPORT is not set.
Fix this by adding INP_REUSEADDR flag to unconditionally cache SO_REUSEADDR.
PR: 179901 Submitted by: Michael Gmelin freebsd grem.de (initial version) Reviewed by: rwatson MFC after: 1 week
|
#
249925 |
|
26-Apr-2013 |
glebius |
Add const qualifier to the dst parameter of the ifnet if_output method.
|
#
249894 |
|
25-Apr-2013 |
glebius |
Introduce a pointer to const variable gw, which points either at the same place as dst, or to the sockaddr in the routing table.
The const constraint of gw makes us safe from modifing routing table accidentially. And "onstantness" of dst allows us to remove several bandaids, when we switched it back at &ro->ro_dst, now it always points there.
Reviewed by: rrs
|
#
249848 |
|
24-Apr-2013 |
rrs |
This fixes the issue with the "randomly changing" default route. What it was is there are two places in ip_output.c where we do a goto again. One place was fine, it copies out the new address and then resets dst = ro->rt_dst; But the other place does *not* do that, which means earlier when we found the gateway, we have dst pointing there aka dst = ro->rt_gateway is done.. then we do a goto again.. bam now we clobber the default route.
The fix is just to move the again so we are always doing dst = &ro->rt_dst; in the again loop.
PR: 174749,157796 MFC after: 1 week
|
#
248324 |
|
15-Mar-2013 |
glebius |
Use m_get/m_gethdr instead of compat macros.
Sponsored by: Nginx, Inc.
|
#
243882 |
|
05-Dec-2012 |
glebius |
Mechanically substitute flags from historic mbuf allocator with malloc(9) flags within sys.
Exceptions:
- sys/contrib not touched - sys/mbuf.h edited manually
|
#
243624 |
|
27-Nov-2012 |
andre |
Remove unused and unnecessary CSUM_IP_FRAGS checksumming capability. Checksumming the IP header of fragments is no different from doing normal IP headers.
Discussed with: yongari MFC after: 1 week
|
#
242463 |
|
02-Nov-2012 |
ae |
Remove the recently added sysctl variable net.pfil.forward. Instead, add protocol specific mbuf flags M_IP_NEXTHOP and M_IP6_NEXTHOP. Use them to indicate that the mbuf's chain contains the PACKET_TAG_IPFORWARD tag. And do a tag lookup only when this flag is set.
Suggested by: andre
|
#
242161 |
|
26-Oct-2012 |
glebius |
o Remove last argument to ip_fragment(), and obtain all needed information on checksums directly from mbuf flags. This simplifies code. o Clear CSUM_IP from the mbuf in ip_fragment() if we did checksums in hardware. Some driver may not announce CSUM_IP in theur if_hwassist, although try to do checksums if CSUM_IP set on mbuf. Example is em(4). o While here, consistently use CSUM_IP instead of its alias CSUM_DELAY_IP. After this change CSUM_DELAY_IP vanishes from the stack.
Submitted by: Sebastian Kuzminsky <seb lineratesystems.com>
|
#
242079 |
|
25-Oct-2012 |
ae |
Remove the IPFIREWALL_FORWARD kernel option and make possible to turn on the related functionality in the runtime via the sysctl variable net.pfil.forward. It is turned off by default.
Sponsored by: Yandex LLC Discussed with: net@ MFC after: 2 weeks
|
#
241913 |
|
22-Oct-2012 |
glebius |
Switch the entire IPv4 stack to keep the IP packet header in network byte order. Any host byte order processing is done in local variables and host byte order values are never[1] written to a packet.
After this change a packet processed by the stack isn't modified at all[2] except for TTL.
After this change a network stack hacker doesn't need to scratch his head trying to figure out what is the byte order at the given place in the stack.
[1] One exception still remains. The raw sockets convert host byte order before pass a packet to an application. Probably this would remain for ages for compatibility.
[2] The ip_input() still subtructs header len from ip->ip_len, but this is planned to be fixed soon.
Reviewed by: luigi, Maxim Dounin <mdounin mdounin.ru> Tested by: ray, Olivier Cochard-Labbe <olivier cochard.me>
|
#
241547 |
|
14-Oct-2012 |
glebius |
Fix a miss from r241344: in ip_mloopback() we need to go to net byte order prior to calling in_delayed_cksum().
Reported by: Olivier Cochard-Labbe <olivier cochard.me>
|
#
241344 |
|
08-Oct-2012 |
glebius |
After r241245 it appeared that in_delayed_cksum(), which still expects host byte order, was sometimes called with net byte order. Since we are moving towards net byte order throughout the stack, the function was converted to expect net byte order, and its consumers fixed appropriately: - ip_output(), ipfilter(4) not changed, since already call in_delayed_cksum() with header in net byte order. - divert(4), ng_nat(4), ipfw_nat(4) now don't need to swap byte order there and back. - mrouting code and IPv6 ipsec now need to switch byte order there and back, but I hope, this is temporary solution. - In ipsec(4) shifted switch to net byte order prior to in_delayed_cksum(). - pf_route() catches up on r241245 changes to ip_output().
|
#
241245 |
|
06-Oct-2012 |
glebius |
A step in resolving mess with byte ordering for AF_INET. After this change:
- All packets in NETISR_IP queue are in net byte order. - ip_input() is entered in net byte order and converts packet to host byte order right _after_ processing pfil(9) hooks. - ip_output() is entered in host byte order and converts packet to net byte order right _before_ processing pfil(9) hooks. - ip_fragment() accepts and emits packet in net byte order. - ip_forward(), ip_mloopback() use host byte order (untouched actually). - ip_fastforward() no longer modifies packet at all (except ip_ttl). - Swapping of byte order there and back removed from the following modules: pf(4), ipfw(4), enc(4), if_bridge(4). - Swapping of byte order added to ipfilter(4), based on __FreeBSD_version - __FreeBSD_version bumped. - pfil(9) manual page updated.
Reviewed by: ray, luigi, eri, melifaro Tested by: glebius (LE), ray (BE)
|
#
238573 |
|
18-Jul-2012 |
glebius |
Plug a reference leak: before doing 'goto again' we need to unref ia->ia_ifa if there is any.
Submitted by: Andrey Zonov <andrey zonov.org>
|
#
238092 |
|
04-Jul-2012 |
glebius |
When ip_output()/ip6_output() is supplied a struct route *ro argument, it skips FLOWTABLE lookup. However, the non-NULL ro has dual meaning here: it may be supplied to provide route, and it may be supplied to store and return to caller the route that ip_output()/ip6_output() finds. In the latter case skipping FLOWTABLE lookup is pessimisation.
The difference between struct route filled by FLOWTABLE and filled by rtalloc() family is that the former doesn't hold a reference on its rtentry. Reference is hold by flow entry, and it is about to be released in future. Thus, route filled by FLOWTABLE shouldn't be passed to RTFREE() macro.
- Introduce new flag for struct route/route_in6, that marks route not holding a reference on rtentry. - Introduce new macro RO_RTFREE() that cleans up a struct route depending on its kind. - All callers to ip_output()/ip6_output() that do supply non-NULL but empty route should use RO_RTFREE() to free results of lookup. - ip_output()/ip6_output() now do FLOWTABLE lookup always when ro->ro_rt == NULL.
Tested by: tuexen (SCTP part)
|
#
236959 |
|
12-Jun-2012 |
tuexen |
Add a IP_RECVTOS socket option to receive for received UDP/IPv4 packets a cmsg of type IP_RECVTOS which contains the TOS byte. Much like IP_RECVTTL does for TTL. This allows to implement a protocol on top of UDP and implementing ECN.
MFC after: 3 days
|
#
227207 |
|
06-Nov-2011 |
trociny |
Cache SO_REUSEPORT socket option in inpcb-layer in order to avoid inp_socket->so_options dereference when we may not acquire the lock on the inpcb.
This fixes the crash due to NULL pointer dereference in in_pcbbind_setup() when inp_socket->so_options in a pcb returned by in_pcblookup_local() was checked.
Reported by: dave jones <s.dave.jones@gmail.com>, Arnaud Lacombe <lacombar@gmail.com> Suggested by: rwatson Glanced by: rwatson Tested by: dave jones <s.dave.jones@gmail.com>
|
#
220619 |
|
14-Apr-2011 |
bz |
The mbuf_frag_size always was and is file local and not queried from base user space tools via kvm. Mark it static.
MFC after: 3 days
|
#
216857 |
|
31-Dec-2010 |
bz |
Try to catch a possible divide-by-zero as early as possible if "mtu" is 0 (also test for negative MTUs if checking it anyway). An MTU of 0 is arguably a bug elsewhere, but this at least gives us some more debugging hints.
Sponsored by: ISPsystem (Early 2010) MFC after: 1 week
|
#
213101 |
|
24-Sep-2010 |
attilio |
IP_BINDANY is not correctly handled in getsockopt() case. Fix it by specifying the correct bits.
Sponsored by: Sandvine Incorporated Reviewed by: bz, emaste, rstone Obtained from: Sandvine Incorporated MFC after: 10 days
|
#
208553 |
|
25-May-2010 |
qingli |
This patch fixes the problem where proxy ARP entries cannot be added over the if_ng interface.
MFC after: 3 days
|
#
205104 |
|
12-Mar-2010 |
rrs |
The proper fix for the delayed SCTP checksum is to have the delayed function take an argument as to the offset to the SCTP header. This allows it to work for V4 and V6. This of course means changing all callers of the function to either pass the header len, if they have it, or create it (ip_hl << 2 or sizeof(ip6_hdr)). PR: 144529 MFC after: 2 weeks
|
#
205066 |
|
12-Mar-2010 |
kmacy |
- restructure flowtable to support ipv6 - add a name argument to flowtable_alloc for printing with ddb commands - extend ddb commands to print destination address or 4-tuples - don't parse ports in ulp header if FL_HASH_ALL is not passed - add kern_flowtable_insert to enable more generic use of flowtable (e.g. system calls for adding entries) - don't hash loopback addresses - cleanup whitespace - keep statistics per-cpu for per-cpu flowtables to avoid cache line contention - add sysctls to accumulate stats and report aggregate
MFC after: 7 days
|
#
204902 |
|
09-Mar-2010 |
qingli |
One of the advantages of enabling ECMP (a.k.a RADIX_MPATH) is to allow for connection load balancing across interfaces. Currently the address alias handling method is colliding with the ECMP code. For example, when two interfaces are configured on the same prefix, only one prefix route is installed. So connection load balancing among the available interfaces is not possible.
The other advantage of ECMP is for failover. The issue with the current code, is that the interface link-state is not reflected in the route entry. For example, if there are two interfaces on the same prefix, the cable on one interface is unplugged, new and existing connections should switch over to the other interface. This is not done today and packets go into a black hole.
Also, there is a small bug in the kernel where deleting ECMP routes in the userland will always return an error even though the command is successfully executed.
MFC after: 5 days
|
#
201141 |
|
28-Dec-2009 |
bz |
Make the compiler happy after r201125: - + remove two unnecessary initializations in ip_output; + + remove one unnecessary initializations in ip_output;
|
#
201131 |
|
28-Dec-2009 |
luigi |
introduce a local variable rte acting as a cache of ro->ro_rt within ip_output, achieving (in random order of importance): - a reduction of the number of 'r's in the source code; - improved legibility; - a reduction of 64 bytes in the .text
|
#
201125 |
|
28-Dec-2009 |
luigi |
+ remove an unused #define print_ip; + remove two unnecessary initializations in ip_output; + localize 'len'; + introduce a temporary variable n to count the number of fragments, the compiler seems unable to identify a common subexpression (written 3 times, used twice); + document some assumptions on ip_len and ip_hl
|
#
199102 |
|
09-Nov-2009 |
trasz |
Remove ifdefed out part of code, which seems to have originated a decade ago in OpenBSD. As it is now, there is no way for this to be useful, since IPsec is free to forward packets via whatever interface it wants, so checking capabilities of the interface passed from ip_output (fetched from the routing table) serves no purpose.
Discussed with: sam@
|
#
197952 |
|
11-Oct-2009 |
julian |
Virtualize the pfil hooks so that different jails may chose different packet filters. ALso allows ipfw to be enabled on on ejail and disabled on another. In 8.0 it's a global setting.
Sitting aroung in tree waiting to commit for: 2 months MFC after: 2 months
|
#
196608 |
|
28-Aug-2009 |
qingli |
Do not try to free the rt_lle entry of the cached route in ip_output() if the cached route was not initialized from the flow-table. The rt_lle entry is invalid unless it has been initialized through the flow-table.
Reviewed by: kmacy, rwatson MFC after: immediately
|
#
196368 |
|
18-Aug-2009 |
kmacy |
- change the interface to flowtable_lookup so that we don't rely on the mbuf for obtaining the fib index - check that a cached flow corresponds to the same fib index as the packet for which we are doing the lookup - at interface detach time flush any flows referencing stale rtentrys associated with the interface that is going away (fixes reported panics) - reduce the time between cleans in case the cleaner is running at the time the eventhandler is called and the wakeup is missed less time will elapse before the eventhandler returns - separate per-vnet initialization from global initialization (pointed out by jeli@)
Reviewed by: sam@ Approved by: re@
|
#
196234 |
|
14-Aug-2009 |
qingli |
In function ip_output(), the cached route is flushed when there is a mismatch between the cached entry and the intended destination. The cached rtentry{} is flushed but the associated llentry{} is not. This causes the wrong destination MAC address being used in the output packets. The fix is to flush the llentry{} when rtentry{} is cleared.
Reviewed by: kmacy, rwatson Approved by: re
|
#
196019 |
|
01-Aug-2009 |
rwatson |
Merge the remainder of kern_vimage.c and vimage.h into vnet.c and vnet.h, we now use jails (rather than vimages) as the abstraction for virtualization management, and what remained was specific to virtual network stacks. Minor cleanups are done in the process, and comments updated to reflect these changes.
Reviewed by: bz Approved by: re (vimage blanket)
|
#
195699 |
|
14-Jul-2009 |
rwatson |
Build on Jeff Roberson's linker-set based dynamic per-CPU allocator (DPCPU), as suggested by Peter Wemm, and implement a new per-virtual network stack memory allocator. Modify vnet to use the allocator instead of monolithic global container structures (vinet, ...). This change solves many binary compatibility problems associated with VIMAGE, and restores ELF symbols for virtualized global variables.
Each virtualized global variable exists as a "reference copy", and also once per virtual network stack. Virtualized global variables are tagged at compile-time, placing the in a special linker set, which is loaded into a contiguous region of kernel memory. Virtualized global variables in the base kernel are linked as normal, but those in modules are copied and relocated to a reserved portion of the kernel's vnet region with the help of a the kernel linker.
Virtualized global variables exist in per-vnet memory set up when the network stack instance is created, and are initialized statically from the reference copy. Run-time access occurs via an accessor macro, which converts from the current vnet and requested symbol to a per-vnet address. When "options VIMAGE" is not compiled into the kernel, normal global ELF symbols will be used instead and indirection is avoided.
This change restores static initialization for network stack global variables, restores support for non-global symbols and types, eliminates the need for many subsystem constructors, eliminates large per-subsystem structures that caused many binary compatibility issues both for monitoring applications (netstat) and kernel modules, removes the per-function INIT_VNET_*() macros throughout the stack, eliminates the need for vnet_symmap ksym(2) munging, and eliminates duplicate definitions of virtualized globals under VIMAGE_GLOBALS.
Bump __FreeBSD_version and update UPDATING.
Portions submitted by: bz Reviewed by: bz, zec Discussed with: gnn, jamie, jeff, jhb, julian, sam Suggested by: peter Approved by: re (kensmith)
|
#
194760 |
|
23-Jun-2009 |
rwatson |
Modify most routines returning 'struct ifaddr *' to return references rather than pointers, requiring callers to properly dispose of those references. The following routines now return references:
ifaddr_byindex ifa_ifwithaddr ifa_ifwithbroadaddr ifa_ifwithdstaddr ifa_ifwithnet ifaof_ifpforaddr ifa_ifwithroute ifa_ifwithroute_fib rt_getifa rt_getifa_fib IFP_TO_IA ip_rtaddr in6_ifawithifp in6ifa_ifpforlinklocal in6ifa_ifpwithaddr in6_ifadd carp_iamatch6 ip6_getdstifaddr
Remove unused macro which didn't have required referencing:
IFP_TO_IA6
This closes many small races in which changes to interface or address lists while an ifaddr was in use could lead to use of freed memory (etc). In a few cases, add missing if_addr_list locking required to safely acquire references.
Because of a lack of deep copying support, we accept a race in which an in6_ifaddr pointed to by mbuf tags and extracted with ip6_getdstifaddr() doesn't hold a reference while in transmit. Once we have mbuf tag deep copy support, this can be fixed.
Reviewed by: bz Obtained from: Apple, Inc. (portions) MFC after: 6 weeks (portions)
|
#
194660 |
|
22-Jun-2009 |
zec |
V_irtualize flowtable state.
This change should make options VIMAGE kernel builds usable again, to some extent at least.
Note that the size of struct vnet_inet has changed, though in accordance with one-bump-per-day policy we didn't update the __FreeBSD_version number, given that it has already been touched by r194640 a few hours ago. Reviewed by: bz Approved by: julian (mentor)
|
#
194076 |
|
12-Jun-2009 |
bz |
Move the kernel option FLOWTABLE chacking from the header file to the actual implementation. Remove the accessor functions for the compiled out case, just returning "unavail" values. Remove the kernel conditional from the header file as it is no longer needed, only leaving the externs. Hide the improperly virtualized SYSCTL/TUNABLE for the flowtable size under the kernel option as well.
Reviewed by: rwatson
|
#
193550 |
|
05-Jun-2009 |
pjd |
Only four out of nine arguments for ip_ipsec_output() are actually used. Kill unused arguments except for 'ifp' as it might be used in the future for detecting IPsec-capable interfaces.
|
#
193511 |
|
05-Jun-2009 |
rwatson |
Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERIC and used in a large number of files, but also because an increasing number of incorrect uses of MAC calls were sneaking in due to copy-and-paste of MAC-aware code without the associated opt_mac.h include.
Discussed with: pjd
|
#
193217 |
|
01-Jun-2009 |
pjd |
- Rename IP_NONLOCALOK IP socket option to IP_BINDANY, to be more consistent with OpenBSD (and BSD/OS originally). We can't easly do it SOL_SOCKET option as there is no more space for more SOL_SOCKET options, but this option also fits better as an IP socket option, it seems. - Implement this functionality also for IPv6 and RAW IP sockets. - Always compile it in (don't use additional kernel options). - Remove sysctl to turn this functionality on and off. - Introduce new privilege - PRIV_NETINET_BINDANY, which allows to use this functionality (currently only unjail root can use it).
Discussed with: julian, adrian, jhb, rwatson, kmacy
|
#
192528 |
|
21-May-2009 |
rwatson |
Consolidate and clean up the first section of ip_output.c in light of the last year or two's work on routing:
- Combine iproute initialization and flowtable lookup blocks, eliminating unnecessary tests for known-zero'd iproute fields.
- Add a comment indicating (a) why the route entry returned by the flowtable is considered stable and (b) that the flowtable lookup must occur after the setup of the mbuf flow ID.
- Assert the inpcb lock before any use of inpcb fields.
Reviewed by: kmacy
|
#
191621 |
|
28-Apr-2009 |
trasz |
Don't require packet to match a route (any route; this information wasn't used anyway, so a typical workaround was to add a dummy route) if it's going to be sent through IPSec tunnel.
Reviewed by: bz
|
#
191259 |
|
19-Apr-2009 |
kmacy |
- Allocate a small flowtable in ip_input.c (changeable by tuneable) - Use for accelerating ip_output
|
#
191148 |
|
16-Apr-2009 |
kmacy |
Change if_output to take a struct route as its fourth argument in order to allow passing a cached struct llentry * down to L2
Reviewed by: rwatson
|
#
190951 |
|
11-Apr-2009 |
rwatson |
Update stats in struct ipstat using four new macros, IPSTAT_ADD(), IPSTAT_INC(), IPSTAT_SUB(), and IPSTAT_DEC(), rather than directly manipulating the fields across the kernel. This will make it easier to change the implementation of these statistics, such as using per-CPU versions of the data structures.
MFC after: 3 days
|
#
190880 |
|
10-Apr-2009 |
kmacy |
Import "flowid" support for serializing flows across transmit queues
Reviewed by: rwatson and jeli
|
#
189359 |
|
04-Mar-2009 |
bms |
In ip_output(), do not acquire the IN_MULTI_LOCK(), and do not attempt to perform a group lookup. This is a socket layer lock, and the bottom half of IP really has no business taking it.
Use the value of the in_mcast_loop sysctl to determine if we should loop back by default, in the absence of any multicast socket options. Because the check on group membership is now deferred to the input path, an m_copym() is now required.
This should increase multicast send performance where the source has not requested loopback, although this has not been benchmarked or measured.
It is also a necessary change for IN_MULTI_LOCK to become non-recursive, which is required in order to implement IGMPv3 in a thread-safe way.
|
#
189106 |
|
27-Feb-2009 |
bz |
For all files including net/vnet.h directly include opt_route.h and net/route.h.
Remove the hidden include of opt_route.h and net/route.h from net/vnet.h.
We need to make sure that both opt_route.h and net/route.h are included before net/vnet.h because of the way MRT figures out the number of FIBs from the kernel option. If we do not, we end up with the default number of 1 when including net/vnet.h and array sizes are wrong.
This does not change the list of files which depend on opt_route.h but we can identify them now more easily.
|
#
188306 |
|
08-Feb-2009 |
bz |
Try to remove/assimilate as much of formerly IPv4/6 specific (duplicate) code in sys/netipsec/ipsec.c and fold it into common, INET/6 independent functions.
The file local functions ipsec4_setspidx_inpcb() and ipsec6_setspidx_inpcb() were 1:1 identical after the change in r186528. Rename to ipsec_setspidx_inpcb() and remove the duplicate.
Public functions ipsec[46]_get_policy() were 1:1 identical. Remove one copy and merge in the factored out code from ipsec_get_policy() into the other. The public function left is now called ipsec_get_policy() and callers were adapted.
Public functions ipsec[46]_set_policy() were 1:1 identical. Rename file local ipsec_set_policy() function to ipsec_set_policy_internal(). Remove one copy of the public functions, rename the other to ipsec_set_policy() and adapt callers.
Public functions ipsec[46]_hdrsiz() were logically identical (ignoring one questionable assert in the v6 version). Rename the file local ipsec_hdrsiz() to ipsec_hdrsiz_internal(), the public function to ipsec_hdrsiz(), remove the duplicate copy and adapt the callers. The v6 version had been unused anyway. Cleanup comments.
Public functions ipsec[46]_in_reject() were logically identical apart from statistics. Move the common code into a file local ipsec46_in_reject() leaving vimage+statistics in small AF specific wrapper functions. Note: unfortunately we already have a public ipsec_in_reject().
Reviewed by: sam Discussed with: rwatson (renaming to *_internal) MFC after: 26 days X-MFC: keep wrapper functions for public symbols?
|
#
188066 |
|
03-Feb-2009 |
rrs |
Adds support for SCTP checksum offload. This means we, like TCP and UDP, move the checksum calculation into the IP routines when there is no hardware support we call into the normal SCTP checksum routine.
The next round of SCTP updates will use this functionality. Of course the IGB driver needs a few updates to support the new intel controller set that actually does SCTP csum offload too.
Reviewed by: gnn, rwatson, kmacy
|
#
186961 |
|
09-Jan-2009 |
adrian |
Fix indentation; add FALLTHROUGH.
Thanks Max!
|
#
186955 |
|
09-Jan-2009 |
adrian |
Implement a new IP option (not compiled/enabled by default) to allow applications to specify a non-local IP address when bind()'ing a socket to a local endpoint.
This allows applications to spoof the client IP address of connections if (obviously!) they somehow are able to receive the traffic normally destined to said clients.
This patch doesn't include any changes to ipfw or the bridging code to redirect the client traffic through the PCB checks so TCP gets a shot at it. The normal behaviour is that packets with a non-local destination IP address are not handled locally. This can be dealth with some IPFW hackery; modifications to IPFW to make this less hacky will occur in subsequent commmits.
Thanks to Julian Elischer and others at Ironport. This work was approved and donated before Cisco acquired them.
Obtained from: Julian Elischer and others MFC after: 2 weeks
|
#
186717 |
|
03-Jan-2009 |
rwatson |
Allow the IP_MINTTL socket option to be set to 0 so that it can be disabled entirely, which is its default state before set to a non-zero value.
PR: 128790 Submitted by: Nick Hilliard <nick at foobar dot org> MFC after: 3 weeks
|
#
186119 |
|
15-Dec-2008 |
qingli |
This main goals of this project are: 1. separating L2 tables (ARP, NDP) from the L3 routing tables 2. removing as much locking dependencies among these layers as possible to allow for some parallelism in the search operations 3. simplify the logic in the routing code,
The most notable end result is the obsolescent of the route cloning (RTF_CLONING) concept, which translated into code reduction in both IPv4 ARP and IPv6 NDP related modules, and size reduction in struct rtentry{}. The change in design obsoletes the semantics of RTF_CLONING, RTF_WASCLONE and RTF_LLINFO routing flags. The userland applications such as "arp" and "ndp" have been modified to reflect those changes. The output from "netstat -r" shows only the routing entries.
Quite a few developers have contributed to this project in the past: Glebius Smirnoff, Luigi Rizzo, Alessandro Cerri, and Andre Oppermann. And most recently:
- Kip Macy revised the locking code completely, thus completing the last piece of the puzzle, Kip has also been conducting active functional testing - Sam Leffler has helped me improving/refactoring the code, and provided valuable reviews - Julian Elischer setup the perforce tree for me and has helped me maintaining that branch before the svn conversion
|
#
185895 |
|
10-Dec-2008 |
zec |
Conditionally compile out V_ globals while instantiating the appropriate container structures, depending on VIMAGE_GLOBALS compile time option.
Make VIMAGE_GLOBALS a new compile-time option, which by default will not be defined, resulting in instatiations of global variables selected for V_irtualization (enclosed in #ifdef VIMAGE_GLOBALS blocks) to be effectively compiled out. Instantiate new global container structures to hold V_irtualized variables: vnet_net_0, vnet_inet_0, vnet_inet6_0, vnet_ipsec_0, vnet_netgraph_0, and vnet_gif_0.
Update the VSYM() macro so that depending on VIMAGE_GLOBALS the V_ macros resolve either to the original globals, or to fields inside container structures, i.e. effectively
#ifdef VIMAGE_GLOBALS #define V_rt_tables rt_tables #else #define V_rt_tables vnet_net_0._rt_tables #endif
Update SYSCTL_V_*() macros to operate either on globals or on fields inside container structs.
Extend the internal kldsym() lookups with the ability to resolve selected fields inside the virtualization container structs. This applies only to the fields which are explicitly registered for kldsym() visibility via VNET_MOD_DECLARE() and vnet_mod_register(), currently this is done only in sys/net/if.c.
Fix a few broken instances of MODULE_GLOBAL() macro use in SCTP code, and modify the MODULE_GLOBAL() macro to resolve to V_ macros, which in turn result in proper code being generated depending on VIMAGE_GLOBALS.
De-virtualize local static variables in sys/contrib/pf/net/pf_subr.c which were prematurely V_irtualized by automated V_ prepending scripts during earlier merging steps. PF virtualization will be done separately, most probably after next PF import.
Convert a few variable initializations at instantiation to initialization in init functions, most notably in ipfw. Also convert TUNABLE_INT() initializers for V_ variables to TUNABLE_FETCH_INT() in initializer functions.
Discussed at: devsummit Strassburg Reviewed by: bz, julian Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation
|
#
185571 |
|
02-Dec-2008 |
bz |
Rather than using hidden includes (with cicular dependencies), directly include only the header files needed. This reduces the unneeded spamming of various headers into lots of files.
For now, this leaves us with very few modules including vnet.h and thus needing to depend on opt_route.h.
Reviewed by: brooks, gnn, des, zec, imp Sponsored by: The FreeBSD Foundation
|
#
185348 |
|
26-Nov-2008 |
zec |
Merge more of currently non-functional (i.e. resolving to whitespace) macros from p4/vimage branch.
Do a better job at enclosing all instantiations of globals scheduled for virtualization in #ifdef VIMAGE_GLOBALS blocks.
De-virtualize and mark as const saorder_state_alive and saorder_state_any arrays from ipsec code, given that they are never updated at runtime, so virtualizing them would be pointless.
Reviewed by: bz, julian Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation
|
#
185101 |
|
19-Nov-2008 |
julian |
Fix a scope problem in the multiple routing table code that stopped the SO_SETFIB socket option from working correctly.
Obtained from: Ironport MFC after: 3 days
|
#
185088 |
|
19-Nov-2008 |
zec |
Change the initialization methodology for global variables scheduled for virtualization.
Instead of initializing the affected global variables at instatiation, assign initial values to them in initializer functions. As a rule, initialization at instatiation for such variables should never be introduced again from now on. Furthermore, enclose all instantiations of such global variables in #ifdef VIMAGE_GLOBALS blocks.
Essentialy, this change should have zero functional impact. In the next phase of merging network stack virtualization infrastructure from p4/vimage branch, the new initialization methology will allow us to switch between using global variables and their counterparts residing in virtualization containers with minimum code churn, and in the long run allow us to intialize multiple instances of such container structures.
Discussed at: devsummit Strassburg Reviewed by: bz, julian Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation
|
#
183550 |
|
02-Oct-2008 |
zec |
Step 1.5 of importing the network stack virtualization infrastructure from the vimage project, as per plan established at devsummit 08/08: http://wiki.freebsd.org/Image/Notes200808DevSummit
Introduce INIT_VNET_*() initializer macros, VNET_FOREACH() iterator macros, and CURVNET_SET() context setting macros, all currently resolving to NOPs.
Prepare for virtualization of selected SYSCTL objects by introducing a family of SYSCTL_V_*() macros, currently resolving to their global counterparts, i.e. SYSCTL_V_INT() == SYSCTL_INT().
Move selected #defines from sys/sys/vimage.h to newly introduced header files specific to virtualized subsystems (sys/net/vnet.h, sys/netinet/vinet.h etc.).
All the changes are verified to have zero functional impact at this point in time by doing MD5 comparision between pre- and post-change object files(*).
(*) netipsec/keysock.c did not validate depending on compile time options.
Implemented by: julian, bz, brooks, zec Reviewed by: julian, bz, brooks, kris, rwatson, ... Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation
|
#
182463 |
|
29-Aug-2008 |
gnn |
Fix a bug whereby multicast packets that are looped back locally wind up with the incorrect checksum on the wire when transmitted via devices that do checksum offloading.
PR: kern/119635 Reviewed by: rwatson MFC after: 5 days
|
#
181966 |
|
21-Aug-2008 |
rwatson |
Remove comments and #ifdef notyet'd code relating to directly dispatching the IP multicast input code from the output path; we don't allow reentrance of the input path from the IP output path, it must use the netisr due to potential lock recursion.
MFC after: 3 days
|
#
181803 |
|
17-Aug-2008 |
bz |
Commit step 1 of the vimage project, (network stack) virtualization work done by Marko Zec (zec@).
This is the first in a series of commits over the course of the next few weeks.
Mark all uses of global variables to be virtualized with a V_ prefix. Use macros to map them back to their global names for now, so this is a NOP change only.
We hope to have caught at least 85-90% of what is needed so we do not invalidate a lot of outstanding patches again.
Obtained from: //depot/projects/vimage-commit2/... Reviewed by: brooks, des, ed, mav, julian, jamie, kris, rwatson, zec, ... (various people I forgot, different versions) md5 (with a bit of help) Sponsored by: NLnet Foundation, The FreeBSD Foundation X-MFC after: never V_Commit_Message_Reviewed_By: more people than the patch
|
#
178888 |
|
09-May-2008 |
julian |
Add code to allow the system to handle multiple routing tables. This particular implementation is designed to be fully backwards compatible and to be MFC-able to 7.x (and 6.x)
Currently the only protocol that can make use of the multiple tables is IPv4 Similar functionality exists in OpenBSD and Linux.
From my notes:
-----
One thing where FreeBSD has been falling behind, and which by chance I have some time to work on is "policy based routing", which allows different packet streams to be routed by more than just the destination address.
Constraints: ------------
I want to make some form of this available in the 6.x tree (and by extension 7.x) , but FreeBSD in general needs it so I might as well do it in -current and back port the portions I need.
One of the ways that this can be done is to have the ability to instantiate multiple kernel routing tables (which I will now refer to as "Forwarding Information Bases" or "FIBs" for political correctness reasons). Which FIB a particular packet uses to make the next hop decision can be decided by a number of mechanisms. The policies these mechanisms implement are the "Policies" referred to in "Policy based routing".
One of the constraints I have if I try to back port this work to 6.x is that it must be implemented as a EXTENSION to the existing ABIs in 6.x so that third party applications do not need to be recompiled in timespan of the branch.
This first version will not have some of the bells and whistles that will come with later versions. It will, for example, be limited to 16 tables in the first commit. Implementation method, Compatible version. (part 1) ------------------------------- For this reason I have implemented a "sufficient subset" of a multiple routing table solution in Perforce, and back-ported it to 6.x. (also in Perforce though not always caught up with what I have done in -current/P4). The subset allows a number of FIBs to be defined at compile time (8 is sufficient for my purposes in 6.x) and implements the changes needed to allow IPV4 to use them. I have not done the changes for ipv6 simply because I do not need it, and I do not have enough knowledge of ipv6 (e.g. neighbor discovery) needed to do it.
Other protocol families are left untouched and should there be users with proprietary protocol families, they should continue to work and be oblivious to the existence of the extra FIBs.
To understand how this is done, one must know that the current FIB code starts everything off with a single dimensional array of pointers to FIB head structures (One per protocol family), each of which in turn points to the trie of routes available to that family.
The basic change in the ABI compatible version of the change is to extent that array to be a 2 dimensional array, so that instead of protocol family X looking at rt_tables[X] for the table it needs, it looks at rt_tables[Y][X] when for all protocol families except ipv4 Y is always 0. Code that is unaware of the change always just sees the first row of the table, which of course looks just like the one dimensional array that existed before.
The entry points rtrequest(), rtalloc(), rtalloc1(), rtalloc_ign() are all maintained, but refer only to the first row of the array, so that existing callers in proprietary protocols can continue to do the "right thing". Some new entry points are added, for the exclusive use of ipv4 code called in_rtrequest(), in_rtalloc(), in_rtalloc1() and in_rtalloc_ign(), which have an extra argument which refers the code to the correct row.
In addition, there are some new entry points (currently called rtalloc_fib() and friends) that check the Address family being looked up and call either rtalloc() (and friends) if the protocol is not IPv4 forcing the action to row 0 or to the appropriate row if it IS IPv4 (and that info is available). These are for calling from code that is not specific to any particular protocol. The way these are implemented would change in the non ABI preserving code to be added later.
One feature of the first version of the code is that for ipv4, the interface routes show up automatically on all the FIBs, so that no matter what FIB you select you always have the basic direct attached hosts available to you. (rtinit() does this automatically).
You CAN delete an interface route from one FIB should you want to but by default it's there. ARP information is also available in each FIB. It's assumed that the same machine would have the same MAC address, regardless of which FIB you are using to get to it.
This brings us as to how the correct FIB is selected for an outgoing IPV4 packet.
Firstly, all packets have a FIB associated with them. if nothing has been done to change it, it will be FIB 0. The FIB is changed in the following ways.
Packets fall into one of a number of classes.
1/ locally generated packets, coming from a socket/PCB. Such packets select a FIB from a number associated with the socket/PCB. This in turn is inherited from the process, but can be changed by a socket option. The process in turn inherits it on fork. I have written a utility call setfib that acts a bit like nice..
setfib -3 ping target.example.com # will use fib 3 for ping.
It is an obvious extension to make it a property of a jail but I have not done so. It can be achieved by combining the setfib and jail commands.
2/ packets received on an interface for forwarding. By default these packets would use table 0, (or possibly a number settable in a sysctl(not yet)). but prior to routing the firewall can inspect them (see below). (possibly in the future you may be able to associate a FIB with packets received on an interface.. An ifconfig arg, but not yet.)
3/ packets inspected by a packet classifier, which can arbitrarily associate a fib with it on a packet by packet basis. A fib assigned to a packet by a packet classifier (such as ipfw) would over-ride a fib associated by a more default source. (such as cases 1 or 2).
4/ a tcp listen socket associated with a fib will generate accept sockets that are associated with that same fib.
5/ Packets generated in response to some other packet (e.g. reset or icmp packets). These should use the FIB associated with the packet being reponded to.
6/ Packets generated during encapsulation. gif, tun and other tunnel interfaces will encapsulate using the FIB that was in effect withthe proces that set up the tunnel. thus setfib 1 ifconfig gif0 [tunnel instructions] will set the fib for the tunnel to use to be fib 1.
Routing messages would be associated with their process, and thus select one FIB or another. messages from the kernel would be associated with the fib they refer to and would only be received by a routing socket associated with that fib. (not yet implemented)
In addition Netstat has been edited to be able to cope with the fact that the array is now 2 dimensional. (It looks in system memory using libkvm (!)). Old versions of netstat see only the first FIB.
In addition two sysctls are added to give: a) the number of FIBs compiled in (active) b) the default FIB of the calling process.
Early testing experience: -------------------------
Basically our (IronPort's) appliance does this functionality already using ipfw fwd but that method has some drawbacks.
For example, It can't fully simulate a routing table because it can't influence the socket's choice of local address when a connect() is done.
Testing during the generating of these changes has been remarkably smooth so far. Multiple tables have co-existed with no notable side effects, and packets have been routes accordingly.
ipfw has grown 2 new keywords:
setfib N ip from anay to any count ip from any to any fib N
In pf there seems to be a requirement to be able to give symbolic names to the fibs but I do not have that capacity. I am not sure if it is required.
SCTP has interestingly enough built in support for this, called VRFs in Cisco parlance. it will be interesting to see how that handles it when it suddenly actually does something.
Where to next: --------------------
After committing the ABI compatible version and MFCing it, I'd like to proceed in a forward direction in -current. this will result in some roto-tilling in the routing code.
Firstly: the current code's idea of having a separate tree per protocol family, all of the same format, and pointed to by the 1 dimensional array is a bit silly. Especially when one considers that there is code that makes assumptions about every protocol having the same internal structures there. Some protocols don't WANT that sort of structure. (for example the whole idea of a netmask is foreign to appletalk). This needs to be made opaque to the external code.
My suggested first change is to add routing method pointers to the 'domain' structure, along with information pointing the data. instead of having an array of pointers to uniform structures, there would be an array pointing to the 'domain' structures for each protocol address domain (protocol family), and the methods this reached would be called. The methods would have an argument that gives FIB number, but the protocol would be free to ignore it.
When the ABI can be changed it raises the possibilty of the addition of a fib entry into the "struct route". Currently, the structure contains the sockaddr of the desination, and the resulting fib entry. To make this work fully, one could add a fib number so that given an address and a fib, one can find the third element, the fib entry.
Interaction with the ARP layer/ LL layer would need to be revisited as well. Qing Li has been working on this already.
This work was sponsored by Ironport Systems/Cisco
Reviewed by: several including rwatson, bz and mlair (parts each) Obtained from: Ironport systems/Cisco
|
#
178319 |
|
19-Apr-2008 |
rwatson |
In ip_output(), allow a read lock as well as a write lock when asserting a lock on the passed inpcb.
MFC after: 3 months
|
#
178285 |
|
17-Apr-2008 |
rwatson |
Convert pcbinfo and inpcb mutexes to rwlocks, and modify macros to explicitly select write locking for all use of the inpcb mutex. Update some pcbinfo lock assertions to assert locked rather than write-locked, although in practice almost all uses of the pcbinfo rwlock main exclusive, and all instances of inpcb lock acquisition are exclusive.
This change should introduce (ideally) little functional change. However, it lays the groundwork for significantly increased parallelism in the TCP/IP code.
MFC after: 3 months Tested by: kris (superset of committered patch)
|
#
178167 |
|
13-Apr-2008 |
qingli |
This patch provides the back end support for equal-cost multi-path (ECMP) for both IPv4 and IPv6. Previously, multipath route insertion is disallowed. For example,
route add -net 192.103.54.0/24 10.9.44.1 route add -net 192.103.54.0/24 10.9.44.2
The second route insertion will trigger an error message of "add net 192.103.54.0/24: gateway 10.2.5.2: route already in table"
Multiple default routes can also be inserted. Here is the netstat output:
default 10.2.5.1 UGS 0 3074 bge0 => default 10.2.5.2 UGS 0 0 bge0
When multipath routes exist, the "route delete" command requires a specific gateway to be specified or else an error message would be displayed. For example,
route delete default
would fail and trigger the following error message:
"route: writing to routing socket: No such process" "delete net default: not in table"
On the other hand,
route delete default 10.2.5.2
would be successful: "delete net default: gateway 10.2.5.2"
One does not have to specify a gateway if there is only a single route for a particular destination.
I need to perform more testings on address aliases and multiple interfaces that have the same IP prefixes. This patch as it stands today is not yet ready for prime time. Therefore, the ECMP code fragments are fully guarded by the RADIX_MPATH macro. Include the "options RADIX_MPATH" in the kernel configuration to enable this feature.
Reviewed by: robert, sam, gnn, julian, kmacy
|
#
177599 |
|
25-Mar-2008 |
ru |
Replaced the misleading uses of a historical artefact M_TRYWAIT with M_WAIT. Removed dead code that assumed that M_TRYWAIT can return NULL; it's not true since the advent of MBUMA.
Reviewed by: arch
There are ongoing disputes as to whether we want to switch to directly using UMA flags M_WAITOK/M_NOWAIT for mbuf(9) allocation.
|
#
175892 |
|
02-Feb-2008 |
bz |
Rather than passing around a cached 'priv', pass in an ucred to ipsec*_set_policy and do the privilege check only if needed.
Try to assimilate both ip*_ctloutput code blocks calling ipsec*_set_policy.
Reviewed by: rwatson
|
#
172930 |
|
24-Oct-2007 |
rwatson |
Merge first in a series of TrustedBSD MAC Framework KPI changes from Mac OS X Leopard--rationalize naming for entry points to the following general forms:
mac_<object>_<method/action> mac_<object>_check_<method/action>
The previous naming scheme was inconsistent and mostly reversed from the new scheme. Also, make object types more consistent and remove spaces from object types that contain multiple parts ("posix_sem" -> "posixsem") to make mechanical parsing easier. Introduce a new "netinet" object type for certain IPv4/IPv6-related methods. Also simplify, slightly, some entry point names.
All MAC policy modules will need to be recompiled, and modules not updates as part of this commit will need to be modified to conform to the new KPI.
Sponsored by: SPARTA (original patches against Mac OS X) Obtained from: TrustedBSD Project, Apple Computer
|
#
172467 |
|
07-Oct-2007 |
silby |
Add FBSDID to all files in netinet so that people can more easily include file version information in bug reports.
Approved by: re (kensmith)
|
#
171167 |
|
03-Jul-2007 |
gnn |
Commit the change from FAST_IPSEC to IPSEC. The FAST_IPSEC option is now deprecated, as well as the KAME IPsec code. What was FAST_IPSEC is now IPSEC.
Approved by: re Sponsored by: Secure Computing
|
#
171133 |
|
01-Jul-2007 |
gnn |
Commit IPv6 support for FAST_IPSEC to the tree. This commit includes only the kernel files, the rest of the files will follow in a second commit.
Reviewed by: bz Approved by: re Supported by: Secure Computing
|
#
170613 |
|
12-Jun-2007 |
bms |
Import rewrite of IPv4 socket multicast layer to support source-specific and protocol-independent host mode multicast. The code is written to accomodate IPv6, IGMPv3 and MLDv2 with only a little additional work.
This change only pertains to FreeBSD's use as a multicast end-station and does not concern multicast routing; for an IGMPv3/MLDv2 router implementation, consider the XORP project.
The work is based on Wilbert de Graaf's IGMPv3 code drop for FreeBSD 4.6, which is available at: http://www.kloosterhof.com/wilbert/igmpv3.html
Summary * IPv4 multicast socket processing is now moved out of ip_output.c into a new module, in_mcast.c. * The in_mcast.c module implements the IPv4 legacy any-source API in terms of the protocol-independent source-specific API. * Source filters are lazy allocated as the common case does not use them. They are part of per inpcb state and are covered by the inpcb lock. * struct ip_mreqn is now supported to allow applications to specify multicast joins by interface index in the legacy IPv4 any-source API. * In UDP, an incoming multicast datagram only requires that the source port matches the 4-tuple if the socket was already bound by source port. An unbound socket SHOULD be able to receive multicasts sent from an ephemeral source port. * The UDP socket multicast filter mode defaults to exclusive, that is, sources present in the per-socket list will be blocked from delivery. * The RFC 3678 userland functions have been added to libc: setsourcefilter, getsourcefilter, setipv4sourcefilter, getipv4sourcefilter. * Definitions for IGMPv3 are merged but not yet used. * struct sockaddr_storage is now referenced from <netinet/in.h>. It is therefore defined there if not already declared in the same way as for the C99 types. * The RFC 1724 hack (specify 0.0.0.0/8 addresses to IP_MULTICAST_IF which are then interpreted as interface indexes) is now deprecated. * A patch for the Rhyolite.com routed in the FreeBSD base system is available in the -net archives. This only affects individuals running RIPv1 or RIPv2 via point-to-point and/or unnumbered interfaces. * Make IPv6 detach path similar to IPv4's in code flow; functionally same. * Bump __FreeBSD_version to 700048; see UPDATING.
This work was financially supported by another FreeBSD committer.
Obtained from: p4://bms_netdev Submitted by: Wilbert de Graaf (original work) Reviewed by: rwatson (locking), silence from fenner, net@ (but with encouragement)
|
#
169454 |
|
10-May-2007 |
rwatson |
Move universally to ANSI C function declarations, with relatively consistent style(9)-ish layout.
|
#
167831 |
|
23-Mar-2007 |
bms |
Purge two redundant case labels.
|
#
167141 |
|
01-Mar-2007 |
bms |
Fix undirected broadcast sends for the case where SO_DONTROUTE has also been set at the socket layer, in our somewhat convoluted IPv4 source selection logic in ip_output().
IP_ONESBCAST is actually a special case of SO_DONTROUTE, as 255.255.255.255 must always be delivered on a local link with a TTL of 1.
If IP_ONESBCAST has been set at the socket layer, also perform destination interface lookup for point-to-point interfaces based on the destination address of the link; previously it was not possible to use the option with such interfaces; also, the destination/broadcast address fields map to the same field within struct ifnet, which doesn't help matters.
One more valid fix going forward for these issues is to treat 255.255.255.255 as a destination in its own right in the forwarding trie. Other implementations do this. It fits with the use of multiple paths, though it then becomes necessary to specify interface preference. This hack will eventually go away when that comes to pass.
Reviewed by: andre MFC after: 1 week
|
#
165082 |
|
10-Dec-2006 |
bms |
Back out revision 1.264.
Fixing the IP accounting issue, if we plan to do so, needs to be better thought out; the 'fix' introduces a hash lookup and a possible kernel panic.
Reported by: Mark Tinguely
|
#
164033 |
|
06-Nov-2006 |
rwatson |
Sweep kernel replacing suser(9) calls with priv(9) calls, assigning specific privilege names to a broad range of privileges. These may require some future tweaking.
Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>
|
#
163606 |
|
22-Oct-2006 |
rwatson |
Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h begun with a repo-copy of mac.h to mac_framework.h. sys/mac.h now contains the userspace and user<->kernel API and definitions, with all in-kernel interfaces moved to mac_framework.h, which is now included across most of the kernel instead.
This change is the first step in a larger cleanup and sweep of MAC Framework interfaces in the kernel, and will not be MFC'd.
Obtained from: TrustedBSD Project Sponsored by: SPARTA
|
#
162798 |
|
29-Sep-2006 |
andre |
Remove stone-aged and irrelevant "#ifndef notdef".
|
#
162615 |
|
25-Sep-2006 |
bms |
Account for output IP datagrams on the ifaddr where they originated from, *not* the first ifaddr on the ifp. This is similar to what NetBSD does.
PR: kern/72936 Submitted by: alfred Reviewed by: andre
|
#
162231 |
|
11-Sep-2006 |
andre |
Fix a NULL pointer dereference of ro->ro_rt->rt_flags by checking for the validity of ro->ro_rt first. This prevents crashing on any non-normally routed IP packet.
Coverity CID: 162 (incorrectly, it was re-introduced by previous commit)
|
#
162205 |
|
10-Sep-2006 |
jmg |
make use of the host route's mtu for processing. This means we can now support a network w/ split mtu's by assigning each host route the correct mtu. an aspiring programmer could write a daemon to probe hosts and find out if they support a larger mtu.
|
#
162084 |
|
06-Sep-2006 |
andre |
First step of TSO (TCP segmentation offload) support in our network stack.
o add IFCAP_TSO[46] for drivers to announce this capability for IPv4 and IPv6 o add CSUM_TSO flag to mbuf pkthdr csum_flags field o add tso_segsz field to mbuf pkthdr o enhance ip_output() packet length check to allow for large TSO packets o extend tcp_maxmtu[46]() with a flag pointer to pass interface capabilities o adjust all callers of tcp_maxmtu[46]() accordingly
Discussed on: -current, -net Sponsored by: TCP/IP Optimization Fundraise 2005
|
#
162068 |
|
06-Sep-2006 |
andre |
Fix the socket option IP_ONESBCAST by giving it its own case in ip_output() and skip over the normal IP processing.
Add a supporting function ifa_ifwithbroadaddr() to verify and validate the supplied subnet broadcast address.
PR: kern/99558 Tested by: Andrey V. Elsukov <bu7cher-at-yandex.ru> Sponsored by: TCP/IP Optimization Fundraise 2005 MFC after: 3 days
|
#
161380 |
|
17-Aug-2006 |
julian |
Remove the IPFIREWALL_FORWARD_EXTENDED option and make it on by default as it always was in older versions of FreeBSD. This option is pointless as it is needed in just about every interesting usage of forward that I have ever seen. It doesn't make the system any safer and just wastes huge amounts of develper time when the system doesn't behave as expected when code is moved from 4.x to 6.x It doesn't make the system any safer and just wastes huge amounts of develper time when the system doesn't behave as expected when code is moved from 4.x to 6.x or 7.x Reviewed by: glebius MFC after: 1 week
|
#
160027 |
|
29-Jun-2006 |
glebius |
Fix URL to Bellovin's paper.
Submitted by: Anton Yuzhaninov <citrin rambler-co.ru>
|
#
158799 |
|
21-May-2006 |
maxim |
o Add missed error check: in ip_ctloutput() sooptcopyin() returns a result but we never examine it.
Reviewed by: rwatson MFC after: 2 weeks
|
#
158563 |
|
14-May-2006 |
bms |
Fix a long-standing limitation in IPv4 multicast group membership.
By making the imo_membership array a dynamically allocated vector, this minimizes disruption to existing IPv4 multicast code. This change breaks the ABI for the kernel module ip_mroute.ko, and may cause a small amount of churn for folks working on the IGMPv3 merge.
Previously, sockets were subject to a compile-time limitation on the number of IPv4 group memberships, which was hard-coded to 20. The imo_membership relationship, however, is 1:1 with regards to a tuple of multicast group address and interface address. Users who ran routing protocols such as OSPF ran into this limitation on machines with a large system interface tree.
|
#
155201 |
|
02-Feb-2006 |
csjp |
Somewhat re-factor the read/write locking mechanism associated with the packet filtering mechanisms to use the new rwlock(9) locking API:
- Drop the variables stored in the phil_head structure which were specific to conditions and the home rolled read/write locking mechanism. - Drop some includes which were used for condition variables - Drop the inline functions, and convert them to macros. Also, move these macros into pfil.h - Move pfil list locking macros intp phil.h as well - Rename ph_busy_count to ph_nhooks. This variable will represent the number of IN/OUT hooks registered with the pfil head structure - Define PFIL_HOOKED macro which evaluates to true if there are any hooks to be ran by pfil_run_hooks - In the IP/IP6 stacks, change the ph_busy_count comparison to use the new PFIL_HOOKED macro. - Drop optimization in pfil_run_hooks which checks to see if there are any hooks to be ran, and returns if not. This check is already performed by the IP stacks when they call:
if (!PFIL_HOOKED(ph)) goto skip_hooks;
- Drop in assertion which makes sure that the number of hooks never drops below 0 for good measure. This in theory should never happen, and if it does than there are problems somewhere - Drop special logic around PFIL_WAITOK because rw_wlock(9) does not sleep - Drop variables which support home rolled read/write locking mechanism from the IPFW firewall chain structure. - Swap out the read/write firewall chain lock internal to use the rwlock(9) API instead of our home rolled version - Convert the inlined functions to macros
Reviewed by: mlaier, andre, glebius Thanks to: jhb for the new locking API
|
#
155179 |
|
01-Feb-2006 |
andre |
Move the IPSEC related code blocks to their own file to unclutter and signifincantly improve the readability of ip_input() and ip_output() again.
The resulting IPSEC hooks in ip_input() and ip_output() may be used later on for making IPSEC loadable.
This move is mostly mechanical and should preserve current IPSEC behaviour as-is. Nothing shall prevent improvements in the way IPSEC interacts with the IPv4 stack.
Discussed with: bz, gnn, rwatson; (earlier version)
|
#
154526 |
|
18-Jan-2006 |
andre |
In in_delayed_cksum() we can't perform a m_pullup() as it may change the mbuf pointer and we don't have any way of passing it back to the callers. Instead just fail silently without updating the checksum but leaving the mbuf+chain intact.
A search in our GNATS database did not turn up any match for the existing warning message when this case is encountered.
Found by: Coverity Prevent(tm) Coverity ID: CID779 Sponsored by: TCP/IP Optimization Fundraise 2005 MFC after: 3 days
|
#
154520 |
|
18-Jan-2006 |
andre |
Prevent dereferencing a NULL route pointer when trying to update the route MTU.
This bug is very difficult to reach and not remotely exploitable.
Found by: Coverity Prevent(tm) Coverity ID: CID162 Sponsored by: TCP/IP Optimization Fundraise 2005 MFC after: 3 days
|
#
153164 |
|
06-Dec-2005 |
glebius |
When we drop packet due to no space in output interface output queue, also increase the ifp->if_snd.ifq_drops.
PR: 72440 Submitted by: ikob
|
#
152592 |
|
18-Nov-2005 |
andre |
Consolidate all IP Options handling functions into ip_options.[ch] and include ip_options.h into all files making use of IP Options functions.
From ip_input.c rev 1.306: ip_dooptions(struct mbuf *m, int pass) save_rte(m, option, dst) ip_srcroute(m0) ip_stripoptions(m, mopt)
From ip_output.c rev 1.249: ip_insertoptions(m, opt, phlen) ip_optcopy(ip, jp) ip_pcbopts(struct inpcb *inp, int optname, struct mbuf *m)
No functional changes in this commit.
Discussed with: rwatson Sponsored by: TCP/IP Optimization Fundraise 2005
|
#
152583 |
|
18-Nov-2005 |
andre |
Purge layer specific mbuf flags on layer crossings to avoid confusing upper or lower layers.
Sponsored by: TCP/IP Optimization Fundraise 2005
|
#
151967 |
|
02-Nov-2005 |
andre |
Retire MT_HEADER mbuf type and change its users to use MT_DATA.
Having an additional MT_HEADER mbuf type is superfluous and redundant as nothing depends on it. It only adds a layer of confusion. The distinction between header mbuf's and data mbuf's is solely done through the m->m_flags M_PKTHDR flag.
Non-native code is not changed in this commit. For compatibility MT_HEADER is mapped to MT_DATA.
Sponsored by: TCP/IP Optimization Fundraise 2005
|
#
150594 |
|
26-Sep-2005 |
andre |
Implement IP_DONTFRAG IP socket option enabling the Don't Fragment flag on IP packets. Currently this option is only repected on udp and raw ip sockets. On tcp sockets the DF flag is controlled by the path MTU discovery option.
Sending a packet larger than the MTU size of the egress interface returns an EMSGSIZE error.
Discussed with: rwatson Sponsored by: TCP/IP Optimization Fundraise 2005
|
#
149635 |
|
30-Aug-2005 |
andre |
Use the correct mbuf type for MGET().
|
#
149371 |
|
22-Aug-2005 |
andre |
Add socketoption IP_MINTTL. May be used to set the minimum acceptable TTL a packet must have when received on a socket. All packets with a lower TTL are silently dropped. Works on already connected/connecting and listening sockets for RAW/UDP/TCP.
This option is only really useful when set to 255 preventing packets from outside the directly connected networks reaching local listeners on sockets.
Allows userland implementation of 'The Generalized TTL Security Mechanism (GTSM)' according to RFC3682. Examples of such use include the Cisco IOS BGP implementation command "neighbor ttl-security".
MFC after: 2 weeks Sponsored by: TCP/IP Optimization Fundraise 2005
|
#
148903 |
|
09-Aug-2005 |
rwatson |
Add helper function ip_findmoptions(), which accepts an inpcb, and attempts to atomically return either an existing set of IP multicast options for the PCB, or a newlly allocated set with default values. The inpcb is returned locked. This function may sleep.
Call ip_moptions() to acquire a reference to a PCB's socket options, and perform the update of the options while holding the PCB lock. Release the lock before returning.
Remove garbage collection of multicast options when values return to the default, as this complicates locking substantially. Most applications allocate a socket either to be multicast, or not, and don't tend to keep around sockets that have previously been used for multicast, then used for unicast.
This closes a number of race conditions involving multiple threads or processes modifying the IP multicast state of a socket simultaenously.
MFC after: 7 days
|
#
148682 |
|
03-Aug-2005 |
rwatson |
Introduce in_multi_mtx, which will protect IPv4-layer multicast address lists, as well as accessor macros. For now, this is a recursive mutex due code sequences where IPv4 multicast calls into IGMP calls into ip_output(), which then tests for a multicast forwarding case.
For support macros in in_var.h to check multicast address lists, assert that in_multi_mtx is held.
Acquire in_multi_mtx around iteration over the IPv4 multicast address lists, such as in ip_input() and ip_output().
Acquire in_multi_mtx when manipulating the IPv4 layer multicast addresses, as well as over the manipulation of ifnet multicast address lists in order to keep the two layers in sync.
Lock down accesses to IPv4 multicast addresses in IGMP, or assert the lock when performing IGMP join/leave events.
Eliminate spl's associated with IPv4 multicast addresses, portions of IGMP that weren't previously expunged by IGMP locking.
Add in_multi_mtx, igmp_mtx, and if_addr_mtx lock order to hard-coded lock order in WITNESS, in that order.
Problem reported by: Ed Maste <emaste at phaedrus dot sandvine dot ca> MFC after: 10 days
|
#
147785 |
|
05-Jul-2005 |
rwatson |
Eliminate MAC entry point mac_create_mbuf_from_mbuf(), which is redundant with respect to existing mbuf copy label routines. Expose a new mac_copy_mbuf() routine at the top end of the Framework and use that; use the existing mpo_copy_mbuf_label() routine on the bottom end.
Obtained from: TrustedBSD Project Sponsored by: SPARTA, SPAWAR Approved by: re (scottl)
|
#
147256 |
|
10-Jun-2005 |
brooks |
Stop embedding struct ifnet at the top of driver softcs. Instead the struct ifnet or the layer 2 common structure it was embedded in have been replaced with a struct ifnet pointer to be filled by a call to the new function, if_alloc(). The layer 2 common structure is also allocated via if_alloc() based on the interface type. It is hung off the new struct ifnet member, if_l2com.
This change removes the size of these structures from the kernel ABI and will allow us to better manage them as interfaces come and go.
Other changes of note: - Struct arpcom is no longer referenced in normal interface code. Instead the Ethernet address is accessed via the IFP2ENADDR() macro. To enforce this ac_enaddr has been renamed to _ac_enaddr. - The second argument to ether_ifattach is now always the mac address from driver private storage rather than sometimes being ac_enaddr.
Reviewed by: sobomax, sam
|
#
142248 |
|
22-Feb-2005 |
andre |
Bring back the full packet destination manipulation for 'ipfw fwd' with the kernel compile time option:
options IPFIREWALL_FORWARD_EXTENDED
This option has to be specified in addition to IPFIRWALL_FORWARD.
With this option even packets targeted for an IP address local to the host can be redirected. All restrictions to ensure proper behaviour for locally generated packets are turned off. Firewall rules have to be carefully crafted to make sure that things like PMTU discovery do not break.
Document the two kernel options.
PR: kern/71910 PR: kern/73129 MFC after: 1 week
|
#
140675 |
|
23-Jan-2005 |
alc |
Correctly move the packet header in ip_insertoptions().
Reported by: Anupam Chanda Reviewed by: sam@ MFC after: 2 weeks
|
#
139823 |
|
07-Jan-2005 |
imp |
/* -> /*- for license, minor formatting changes
|
#
139310 |
|
25-Dec-2004 |
rwatson |
Remove an errant blank line apparently introduced in ip_output.c:1.194.
|
#
138408 |
|
05-Dec-2004 |
rwatson |
Pass the inpcb reference into ip_getmoptions() rather than just the inp->inp_moptions pointer, so that ip_getmoptions() can perform necessary locking when doing non-atomic reads.
Lock the inpcb by default to copy any data to local variables, then unlock before performing sooptcopyout().
MFC after: 2 weeks
|
#
138404 |
|
05-Dec-2004 |
rwatson |
Push the inpcb argument into ip_setmoptions() when setting IP multicast socket options, so that it is available for locking.
|
#
138397 |
|
05-Dec-2004 |
rwatson |
Start working through inpcb locking for ip_ctloutput() by cleaning up modifications to the inpcb IP options mbuf:
- Lock the inpcb before passing it into ip_pcbopts() in order to prevent simulatenous reads and read-modify-writes that could result in races. - Pass the inpcb reference into ip_pcbopts() instead of the option chain pointer in the inpcb. - Assert the inpcb lock in ip_pcbots. - Convert one or two uses of a pointer as a boolean or an integer comparison to a comparison with NULL for readability.
|
#
135920 |
|
29-Sep-2004 |
mlaier |
Add an additional struct inpcb * argument to pfil(9) in order to enable passing along socket information. This is required to work around a LOR with the socket code which results in an easy reproducible hard lockup with debug.mpsafenet=1. This commit does *not* fix the LOR, but enables us to do so later. The missing piece is to turn the filter locking into a leaf lock and will follow in a seperate (later) commit.
This will hopefully be MT5'ed in order to fix the problem for RELENG_5 in forseeable future.
Suggested by: rwatson A lot of work by: csjp (he'd be even more helpful w/o mentor-reviews ;) Reviewed by: rwatson, csjp Tested by: -pf, -ipfw, LINT, csjp and myself MFC after: 3 days
LOR IDs: 14 - 17 (not fixed yet)
|
#
135160 |
|
13-Sep-2004 |
andre |
Make comments more clear for the packet changed cases after pfil hooks.
|
#
134852 |
|
06-Sep-2004 |
jmg |
revert comment from rev1.158 now that rev1.225 backed it out..
MFC after: 3 days
|
#
134385 |
|
27-Aug-2004 |
andre |
In the case the destination of a packet was changed by the packet filter to point to a local IP address; and the packet was sourced from this host we fill in the m_pkthdr.rcvif with a pointer to the loopback interface.
Before the function ifunit("lo0") was used to obtain the ifp. However this is sub-optimal from a performance point of view and might be dangerous if the loopback interface has been renamed. Use the global variable 'loif' instead which always points to the loopback interface.
Submitted by: brooks
|
#
134383 |
|
27-Aug-2004 |
andre |
Always compile PFIL_HOOKS into the kernel and remove the associated kernel compile option. All FreeBSD packet filters now use the PFIL_HOOKS API and thus it becomes a standard part of the network stack.
If no hooks are connected the entire packet filter hooks section and related activities are jumped over. This removes any performance impact if no hooks are active.
Both OpenBSD and DragonFlyBSD have integrated PFIL_HOOKS permanently as well.
|
#
134172 |
|
22-Aug-2004 |
mlaier |
Allow early drop for non-ALTQ enabled queues in an ALTQ-enabled kernel. Previously the early drop was disabled unconditionally for ALTQ-enabled kernels.
This should give some benefit for the normal gateway + LAN-server case with a busy LAN leg and an ALTQ managed uplink.
Reviewed and style help from: cperciva, pjd
|
#
133922 |
|
18-Aug-2004 |
peter |
Make the kernel compile again if you are not using PFIL_HOOKS
|
#
133920 |
|
17-Aug-2004 |
andre |
Convert ipfw to use PFIL_HOOKS. This is change is transparent to userland and preserves the ipfw ABI. The ipfw core packet inspection and filtering functions have not been changed, only how ipfw is invoked is different.
However there are many changes how ipfw is and its add-on's are handled:
In general ipfw is now called through the PFIL_HOOKS and most associated magic, that was in ip_input() or ip_output() previously, is now done in ipfw_check_[in|out]() in the ipfw PFIL handler.
IPDIVERT is entirely handled within the ipfw PFIL handlers. A packet to be diverted is checked if it is fragmented, if yes, ip_reass() gets in for reassembly. If not, or all fragments arrived and the packet is complete, divert_packet is called directly. For 'tee' no reassembly attempt is made and a copy of the packet is sent to the divert socket unmodified. The original packet continues its way through ip_input/output().
ipfw 'forward' is done via m_tag's. The ipfw PFIL handlers tag the packet with the new destination sockaddr_in. A check if the new destination is a local IP address is made and the m_flags are set appropriately. ip_input() and ip_output() have some more work to do here. For ip_input() the m_flags are checked and a packet for us is directly sent to the 'ours' section for further processing. Destination changes on the input path are only tagged and the 'srcrt' flag to ip_forward() is set to disable destination checks and ICMP replies at this stage. The tag is going to be handled on output. ip_output() again checks for m_flags and the 'ours' tag. If found, the packet will be dropped back to the IP netisr where it is going to be picked up by ip_input() again and the directly sent to the 'ours' section. When only the destination changes, the route's 'dst' is overwritten with the new destination from the forward m_tag. Then it jumps back at the route lookup again and skips the firewall check because it has been marked with M_SKIP_FIREWALL. ipfw 'forward' has to be compiled into the kernel with 'option IPFIREWALL_FORWARD' to enable it.
DUMMYNET is entirely handled within the ipfw PFIL handlers. A packet for a dummynet pipe or queue is directly sent to dummynet_io(). Dummynet will then inject it back into ip_input/ip_output() after it has served its time. Dummynet packets are tagged and will continue from the next rule when they hit the ipfw PFIL handlers again after re-injection.
BRIDGING and IPFW_ETHER are not changed yet and use ipfw_chk() directly as they did before. Later this will be changed to dedicated ETHER PFIL_HOOKS.
More detailed changes to the code:
conf/files Add netinet/ip_fw_pfil.c.
conf/options Add IPFIREWALL_FORWARD option.
modules/ipfw/Makefile Add ip_fw_pfil.c.
net/bridge.c Disable PFIL_HOOKS if ipfw for bridging is active. Bridging ipfw is still directly invoked to handle layer2 headers and packets would get a double ipfw when run through PFIL_HOOKS as well.
netinet/ip_divert.c Removed divert_clone() function. It is no longer used.
netinet/ip_dummynet.[ch] Neither the route 'ro' nor the destination 'dst' need to be stored while in dummynet transit. Structure members and associated macros are removed.
netinet/ip_fastfwd.c Removed all direct ipfw handling code and replace it with the new 'ipfw forward' handling code.
netinet/ip_fw.h Removed 'ro' and 'dst' from struct ip_fw_args.
netinet/ip_fw2.c (Re)moved some global variables and the module handling.
netinet/ip_fw_pfil.c New file containing the ipfw PFIL handlers and module initialization.
netinet/ip_input.c Removed all direct ipfw handling code and replace it with the new 'ipfw forward' handling code. ip_forward() does not longer require the 'next_hop' struct sockaddr_in argument. Disable early checks if 'srcrt' is set.
netinet/ip_output.c Removed all direct ipfw handling code and replace it with the new 'ipfw forward' handling code.
netinet/ip_var.h Add ip_reass() as general function. (Used from ipfw PFIL handlers for IPDIVERT.)
netinet/raw_ip.c Directly check if ipfw and dummynet control pointers are active.
netinet/tcp_input.c Rework the 'ipfw forward' to local code to work with the new way of forward tags.
netinet/tcp_sack.c Remove include 'opt_ipfw.h' which is not needed here.
sys/mbuf.h Remove m_claim_next() macro which was exclusively for ipfw 'forward' and is no longer needed.
Approved by: re (scottl)
|
#
133720 |
|
14-Aug-2004 |
dwmalone |
Get rid of the RANDOM_IP_ID option and make it a sysctl. NetBSD have already done this, so I have styled the patch on their work:
1) introduce a ip_newid() static inline function that checks the sysctl and then decides if it should return a sequential or random IP ID.
2) named the sysctl net.inet.ip.random_id
3) IPv6 flow IDs and fragment IDs are now always random. Flow IDs and frag IDs are significantly less common in the IPv6 world (ie. rarely generated per-packet), so there should be smaller performance concerns.
The sysctl defaults to 0 (sequential IP IDs).
Reviewed by: andre, silby, mlaier, ume Based on: NetBSD MFC after: 2 months
|
#
133481 |
|
11-Aug-2004 |
andre |
Consistently use NULL for pointer comparisons.
|
#
133389 |
|
09-Aug-2004 |
andre |
Make a comment that "ipfw forward" is not SMP and PREEMPTION safe.
|
#
133074 |
|
03-Aug-2004 |
andre |
o Delayed checksums are now calculated in divert_packet() for diverted packets Remove the XXX-escaped code that did it in ip_output()'s IPHACK section.
|
#
131012 |
|
24-Jun-2004 |
rwatson |
In ip_ctloutput(), acquire the inpcb lock around some of the basic inpcb flag and status updates.
|
#
130416 |
|
13-Jun-2004 |
mlaier |
Link ALTQ to the build and break with ABI for struct ifnet. Please recompile your (network) modules as well as any userland that might make sense of sizeof(struct ifnet). This does not change the queueing yet. These changes will follow in a seperate commit. Same with the driver changes, which need case by case evaluation.
__FreeBSD_version bump will follow.
Tested-by: (i386)LINT
|
#
129126 |
|
11-May-2004 |
maxim |
o Calculate a number of bytes to copy (cnt) correctly:
+----+-+-+-+-+----+----+- - - - - - - - - - - - -+----+ | | |C| | | | | | | | IP |N|O|L|P| | IP | | IP | | #1 |O|D|E|T| | #2 | | #n | | |P|E|N|R| | | | | +----+-+-+-+-+----+----+- - - - - - - - - - - - -+----+ ^ ^<---- cnt - (IPOPT_MINOFF - 1) ---->| | | src | +-- cp[IPOPT_OFF + 1] + sizeof(struct in_addr) | dst +-- cp[IPOPT_OFF + 1]
PR: kern/66386 Submitted by: Andrei Iltchenko MFC after: 3 weeks
|
#
128829 |
|
02-May-2004 |
darrenr |
Rename m_claim_next_hop() to m_claim_next(), as suggested by Max Laier.
|
#
128816 |
|
02-May-2004 |
darrenr |
Rename ip_claim_next_hop() to m_claim_next_hop(), give it an extra arg (the type of tag to claim) and push it out of ip_var.h into mbuf.h alongside all of the other macros that work ok mbuf's and tag's.
|
#
128210 |
|
14-Apr-2004 |
luigi |
In an effort to simplify the routing code, try to deprecate rtalloc() in favour of rtalloc_ign(), which is what would end up being called anyways.
There are 25 more instances of rtalloc() in net*/ and about 10 instances of rtalloc_ign()
|
#
128019 |
|
07-Apr-2004 |
imp |
Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999 and email from Peter Wemm, Alan Cox and Robert Watson.
Approved by: core, peter, alc, rwatson
|
#
128003 |
|
07-Apr-2004 |
ru |
Fixed a bug in previous revision: compute the payload checksum before we convert ip_len into a network byte order; in_delayed_cksum() still expects it in host byte order.
The symtom was the ``in_cksum_skip: out of data by %d'' complaints from the kernel.
To add to the previous commit log. These fixes make tcpdump(1) happy by not complaining about UDP/TCP checksum being bad for looped back IP multicast when multicast router is deactivated.
Reported by: Vsevolod Lobko
|
#
127396 |
|
25-Mar-2004 |
ru |
Untangle IP multicast routing interaction with delayed payload checksums.
Compute the payload checksum for a locally originated IP multicast where God intended, in ip_mloopback(), rather than doing it in ip_output() and only when multicast router is active. This is more correct as we do not fool ip_input() that the packet has the correct payload checksum when in fact it does not (when multicast router is inactive). This is also more efficient if we don't join the multicast group we send to, thus allowing the hardware to checksum the payload.
|
#
126486 |
|
02-Mar-2004 |
mlaier |
Two minor follow-ups on the MT_TAG removal: ifp is now passed explicitly to ether_demux; no need to look it up again. Make mtag a global var in ip_input.
Noticed by: rwatson Approved by: bms(mentor)
|
#
126239 |
|
25-Feb-2004 |
mlaier |
Re-remove MT_TAGs. The problems with dummynet have been fixed now.
Tested by: -current, bms(mentor), me Approved by: bms(mentor), sam
|
#
125952 |
|
18-Feb-2004 |
mlaier |
Backout MT_TAG removal (i.e. bring back MT_TAGs) for now, as dummynet is not working properly with the patch in place.
Approved by: bms(mentor)
|
#
125875 |
|
16-Feb-2004 |
ume |
don't update outgoing ifp, if ipsec tunnel mode encapsulation was not made.
Obtained from: KAME
|
#
125784 |
|
13-Feb-2004 |
mlaier |
This set of changes eliminates the use of MT_TAG "pseudo mbufs", replacing them mostly with packet tags (one case is handled by using an mbuf flag since the linkage between "caller" and "callee" is direct and there's no need to incur the overhead of a packet tag).
This is (mostly) work from: sam
Silence from: -arch Approved by: bms(mentor), sam, rwatson
|
#
125680 |
|
11-Feb-2004 |
bms |
Initial import of RFC 2385 (TCP-MD5) digest support.
This is the first of two commits; bringing in the kernel support first. This can be enabled by compiling a kernel with options TCP_SIGNATURE and FAST_IPSEC.
For the uninitiated, this is a TCP option which provides for a means of authenticating TCP sessions which came into being before IPSEC. It is still relevant today, however, as it is used by many commercial router vendors, particularly with BGP, and as such has become a requirement for interconnect at many major Internet points of presence.
Several parts of the TCP and IP headers, including the segment payload, are digested with MD5, including a shared secret. The PF_KEY interface is used to manage the secrets using security associations in the SADB.
There is a limitation here in that as there is no way to map a TCP flow per-port back to an SPI without polluting tcpcb or using the SPD; the code to do the latter is unstable at this time. Therefore this code only supports per-host keying granularity.
Whilst FAST_IPSEC is mutually exclusive with KAME IPSEC (and thus IPv6), TCP_SIGNATURE applies only to IPv4. For the vast majority of prospective users of this feature, this will not pose any problem.
This implementation is output-only; that is, the option is honoured when responding to a host initiating a TCP session, but no effort is made [yet] to authenticate inbound traffic. This is, however, sufficient to interwork with Cisco equipment.
Tested with a Cisco 2501 running IOS 12.0(27), and Quagga 0.96.4 with local patches. Patches for tcpdump to validate TCP-MD5 sessions are also available from me upon request.
Sponsored by: sentex.net
|
#
125396 |
|
03-Feb-2004 |
ume |
pass pcb rather than so. it is expected that per socket policy works again.
|
#
124247 |
|
08-Jan-2004 |
andre |
Do not set the ip_id to zero when DF is set on packet and restore the general pre-randomid behaviour.
Setting the ip_id to zero causes several problems with packet reassembly when a device along the path removes the DF bit for some reason.
Other BSD and Linux have found and fixed the same issues.
PR: kern/60889 Tested by: Richard Wendland <richard@wendland.org.uk> Approved by: re (scottl)
|
#
122922 |
|
20-Nov-2003 |
andre |
Introduce tcp_hostcache and remove the tcp specific metrics from the routing table. Move all usage and references in the tcp stack from the routing table metrics to the tcp hostcache.
It caches measured parameters of past tcp sessions to provide better initial start values for following connections from or to the same source or destination. Depending on the network parameters to/from the remote host this can lead to significant speedups for new tcp connections after the first one because they inherit and shortcut the learning curve.
tcp_hostcache is designed for multiple concurrent access in SMP environments with high contention and is hash indexed by remote ip address.
It removes significant locking requirements from the tcp stack with regard to the routing table.
Reviewed by: sam (mentor), bms Reviewed by: -net, -current, core@kame.net (IPv6 parts) Approved by: re (scottl)
|
#
122921 |
|
20-Nov-2003 |
andre |
Remove RTF_PRCLONING from routing table and adjust users of it accordingly. The define is left intact for ABI compatibility with userland.
This is a pre-step for the introduction of tcp_hostcache. The network stack remains fully useable with this change.
Reviewed by: sam (mentor), bms Reviewed by: -net, -current, core@kame.net (IPv6 parts) Approved by: re (scottl)
|
#
122708 |
|
14-Nov-2003 |
andre |
Remove the global one-level rtcache variable and associated complex locking and rework ip_rtaddr() to do its own rtlookup. Adopt all its callers to this and make ip_output() callable with NULL rt pointer.
Reviewed by: sam (mentor)
|
#
122702 |
|
14-Nov-2003 |
andre |
Introduce ip_fastforward and remove ip_flow.
Short description of ip_fastforward:
o adds full direct process-to-completion IPv4 forwarding code o handles ip fragmentation incl. hw support (ip_flow did not) o sends icmp needfrag to source if DF is set (ip_flow did not) o supports ipfw and ipfilter (ip_flow did not) o supports divert, ipfw fwd and ipfilter nat (ip_flow did not) o returns anything it can't handle back to normal ip_input
Enable with sysctl -w net.inet.ip.fastforwarding=1
Reviewed by: sam (mentor)
|
#
122588 |
|
12-Nov-2003 |
andre |
Do not fragment a packet with hardware assistance if it has the DF bit set.
Reviewed by: sam (mentor)
|
#
122330 |
|
08-Nov-2003 |
sam |
assert optional inpcb is passed in locked
Supported by: FreeBSD Foundation
|
#
122062 |
|
04-Nov-2003 |
ume |
- cleanup SP refcnt issue. - share policy-on-socket for listening socket. - don't copy policy-on-socket at all. secpolicy no longer contain spidx, which saves a lot of memory. - deep-copy pcb policy if it is an ipsec policy. assign ID field to all SPD entries. make it possible for racoon to grab SPD entry on pcb. - fixed the order of searching SA table for packets. - fixed to get a security association header. a mode is always needed to compare them. - fixed that the incorrect time was set to sadb_comb_{hard|soft}_usetime. - disallow port spec for tunnel mode policy (as we don't reassemble). - an user can define a policy-id. - clear enc/auth key before freeing. - fixed that the kernel crashed when key_spdacquire() was called because key_spdacquire() had been implemented imcopletely. - preparation for 64bit sequence number. - maintain ordered list of SA, based on SA id. - cleanup secasvar management; refcnt is key.c responsibility; alloc/free is keydb.c responsibility. - cleanup, avoid double-loop. - use hash for spi-based lookup. - mark persistent SP "persistent". XXX in theory refcnt should do the right thing, however, we have "spdflush" which would touch all SPs. another solution would be to de-register persistent SPs from sptree. - u_short -> u_int16_t - reduce kernel stack usage by auto variable secasindex. - clarify function name confusion. ipsec_*_policy -> ipsec_*_pcbpolicy. - avoid variable name confusion. (struct inpcbpolicy *)pcb_sp, spp (struct secpolicy **), sp (struct secpolicy *) - count number of ipsec encapsulations on ipsec4_output, so that we can tell ip_output() how to handle the packet further. - When the value of the ul_proto is ICMP or ICMPV6, the port field in "src" of the spidx specifies ICMP type, and the port field in "dst" of the spidx specifies ICMP code. - avoid from applying IPsec transport mode to the packets when the kernel forwards the packets.
Tested by: nork Obtained from: KAME
|
#
121972 |
|
03-Nov-2003 |
rwatson |
Note that when ip_output() is called from ip_forward(), it will already have its options inserted, so the opt argument to ip_output() must be NULL.
|
#
120727 |
|
04-Oct-2003 |
sam |
Locking for updates to routing table entries. Each rtentry gets a mutex that covers updates to the contents. Note this is separate from holding a reference and/or locking the routing table itself.
Other/related changes:
o rtredirect loses the final parameter by which an rtentry reference may be returned; this was never used and added unwarranted complexity for locking. o minor style cleanups to routing code (e.g. ansi-fy function decls) o remove the logic to bump the refcnt on the parent of cloned routes, we assume the parent will remain as long as the clone; doing this avoids a circularity in locking during delete o convert some timeouts to MPSAFE callouts
Notes:
1. rt_mtx in struct rtentry is guarded by #ifdef _KERNEL as user-level applications cannot/do-no know about mutex's. Doing this requires that the mutex be the last element in the structure. A better solution is to introduce an externalized version of struct rtentry but this is a major task because of the intertwining of rtentry and other data structures that are visible to user applications. 2. There are known LOR's that are expected to go away with forthcoming work to eliminate many held references. If not these will be resolved prior to release. 3. ATM changes are untested.
Sponsored by: FreeBSD Foundation Obtained from: BSD/OS (partly)
|
#
120386 |
|
23-Sep-2003 |
sam |
o update PFIL_HOOKS support to current API used by netbsd o revamp IPv4+IPv6+bridge usage to match API changes o remove pfil_head instances from protosw entries (no longer used) o add locking o bump FreeBSD version for 3rd party modules
Heavy lifting by: "Max Laier" <max@love2party.net> Supported by: FreeBSD Foundation Obtained from: NetBSD (bits of pfil.h and pfil.c)
|
#
119644 |
|
01-Sep-2003 |
silby |
Implement MBUF_STRESS_TEST mark II.
Changes from the original implementation:
- Fragmentation is handled by the function m_fragment, which can be called from whereever fragmentation is needed. Note that this function is wrapped in #ifdef MBUF_STRESS_TEST to discourage non-testing use.
- m_fragment works slightly differently from the old fragmentation code in that it allocates a seperate mbuf cluster for each fragment. This defeats dma_map_load_mbuf/buffer's feature of coalescing adjacent fragments. While that is a nice feature in practice, it nerfed the usefulness of mbuf_stress_test.
- Add two modes of random fragmentation. Chains with fragments all of the same random length and chains with fragments that are each uniquely random in length may now be requested.
|
#
119178 |
|
20-Aug-2003 |
bms |
Add the IP_ONESBCAST option, to enable undirected IP broadcasts to be sent on specific interfaces. This is required by aodvd, and may in future help us in getting rid of the requirement for BPF from our import of isc-dhcp.
Suggested by: fenestro Obtained from: BSD/OS Reviewed by: mini, sam Approved by: jake (mentor)
|
#
118622 |
|
07-Aug-2003 |
hsu |
1. Basic PIM kernel support Disabled by default. To enable it, the new "options PIM" must be added to the kernel configuration file (in addition to MROUTING):
options MROUTING # Multicast routing options PIM # Protocol Independent Multicast
2. Add support for advanced multicast API setup/configuration and extensibility.
3. Add support for kernel-level PIM Register encapsulation. Disabled by default. Can be enabled by the advanced multicast API.
4. Implement a mechanism for "multicast bandwidth monitoring and upcalls".
Submitted by: Pavlin Radoslavov <pavlin@icir.org>
|
#
117765 |
|
19-Jul-2003 |
silby |
Minor fix to the MBUF_STRESS_TEST code so that it keeps pkthdr.len consistant at all times. (Some debugging code I'm working on is tripped otherwise.)
MFC after: 3 days
|
#
115471 |
|
31-May-2003 |
wollman |
Don't generate an ip_id for packets with the DF bit set; ip_id is only meaningful for fragments. Also don't bother to byte-swap the ip_id when we do generate it; it is only used at the receiver as a nonce. I tried several different permutations of this code with no measurable difference to each other or to the unmodified version, so I've settled on the one for which gcc seems to generate the best code. (If anyone cares to microoptimize this differently for an architecture where it actually matters, feel free.)
Suggested by: Steve Bellovin's paper in IMW'02
|
#
114258 |
|
29-Apr-2003 |
mdodd |
IP_RECVTTL socket option.
Reviewed by: Stuart Cheshire <cheshire@apple.com>
|
#
113384 |
|
12-Apr-2003 |
silby |
Rename MBUF_FRAG_TEST to MBUF_STRESS_TEST as it will be extended to include more than just frag tests.
|
#
113255 |
|
08-Apr-2003 |
des |
Introduce an M_ASSERTPKTHDR() macro which performs the very common task of asserting that an mbuf has a packet header. Use it instead of hand- rolled versions wherever applicable.
Submitted by: Hiten Pandya <hiten@unixdaemons.com>
|
#
113074 |
|
04-Apr-2003 |
des |
Replace memcpy() and ovbcopy() with bcopy(); ditch some caddr_t usage.
|
#
112985 |
|
02-Apr-2003 |
mdodd |
Back out support for RFC3514.
RFC3514 poses an unacceptale risk to compliant systems.
|
#
112983 |
|
02-Apr-2003 |
mdodd |
- Use the correct constant define. - Add a missing break.
|
#
112973 |
|
02-Apr-2003 |
mdodd |
Sync constant define with NetBSD.
Requested by: Tom Spindler <dogcow@babymeat.com>
|
#
112929 |
|
01-Apr-2003 |
mdodd |
Implement support for RFC 3514 (The Security Flag in the IPv4 Header). (See: ftp://ftp.rfc-editor.org/in-notes/rfc3514.txt)
This fulfills the host requirements for userland support by way of the setsockopt() IP_EVIL_INTENT message.
There are three sysctl tunables provided to govern system behavior.
net.inet.ip.rfc3514:
Enables support for rfc3514. As this is an Informational RFC and support is not yet widespread this option is disabled by default.
net.inet.ip.hear_no_evil
If set the host will discard all received evil packets.
net.inet.ip.speak_no_evil
If set the host will discard all transmitted evil packets.
The IP statistics counter 'ips_evil' (available via 'netstat') provides information on the number of 'evil' packets recieved.
For reference, the '-E' option to 'ping' has been provided to demonstrate and test the implementation.
|
#
112650 |
|
25-Mar-2003 |
mux |
Try to make the MBUF_FRAG_TEST code work better.
- Don't try to fragment the packet if it's smaller than mbuf_frag_size. - Preserve the size of the mbuf chain which is modified by m_split(). - Check that m_split() didn't return NULL. - Make it so we don't end up with two M_PKTHDR mbuf in the chain. - Use m->m_pkthdr.len instead of m->m_len so that we fragment the whole chain and not just the first mbuf. - Fix a nearby style bug and rework the logic of the loops so that it's more clear.
This is still not quite right, because we're clearly abusing m_split() to do something it was not designed for, but at least it works now. We should probably move this code into a m_fragment() function when it's correct.
|
#
112591 |
|
25-Mar-2003 |
silby |
Add the MBUF_FRAG_TEST option. When compiled in, this option allows you to tell ip_output to fragment all outgoing packets into mbuf fragments of size net.inet.ip.mbuf_frag_size bytes. This is an excellent way to test if network drivers can properly handle long mbuf chains being passed to them.
net.inet.ip.mbuf_frag_size defaults to 0 (no fragmentation) so that you can at least boot before your network driver dies. :)
|
#
111186 |
|
20-Feb-2003 |
jlemon |
Remove unused variables in the IPSEC case.
Submitted by: Lars Eggert <larse@ISI.EDU>
|
#
111145 |
|
19-Feb-2003 |
jlemon |
Add a TCP TIMEWAIT state which uses less space than a fullblown TCP control block. Allow the socket and tcpcb structures to be freed earlier than inpcb. Update code to understand an inp w/o a socket.
Reviewed by: hsu, silby, jayanth Sponsored by: DARPA, NAI Labs
|
#
111119 |
|
19-Feb-2003 |
imp |
Back out M_* changes, per decision of the TRB.
Approved by: trb
|
#
110074 |
|
30-Jan-2003 |
sam |
FAST_IPSEC bandaid: act like KAME and ignore ENOENT error codes from ipsec4_process_packet; they happen when a packet is dropped because an SA acquire is initiated
Submitted by: Doug Ambrisko <ambrisko@verniernetworks.com>
|
#
109623 |
|
21-Jan-2003 |
alfred |
Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0. Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
|
#
108533 |
|
01-Jan-2003 |
schweikh |
Correct typos, mostly s/ a / an / where appropriate. Some whitespace cleanup, especially in troff files.
|
#
107112 |
|
20-Nov-2002 |
luigi |
Back out the ip_fragment() code -- it is not urgent to have it in now, I will put it back in in a better form after 5.0 is out.
Requested by: sam, rwatson, luigi (on second thought) Approved by: re
|
#
107020 |
|
17-Nov-2002 |
luigi |
Move the ip_fragment code from ip_output() to a separate function, so that it can be reused elsewhere (there is a number of places where it can be useful). This also trims some 200 lines from the body of ip_output(), which helps readability a bit.
(This change was discussed a few weeks ago on the mailing lists, Julian agreed, silence from others. It is not a functional change, so i expect it to be ok to commit it now but i am happy to back it out if there are objections).
While at it, fix some function headers and replace m_copy() with m_copypacket() where applicable.
MFC after: 1 week
|
#
106968 |
|
15-Nov-2002 |
luigi |
Massive cleanup of the ip_mroute code.
No functional changes, but:
+ the mrouting module now should behave the same as the compiled-in version (it did not before, some of the rsvp code was not loaded properly); + netinet/ip_mroute.c is now truly optional; + removed some redundant/unused code; + changed many instances of '0' to NULL and INADDR_ANY as appropriate; + removed several static variables to make the code more SMP-friendly; + fixed some minor bugs in the mrouting code (mostly, incorrect return values from functions).
This commit is also a prerequisite to the addition of support for PIM, which i would like to put in before DP2 (it does not change any of the existing APIs, anyways).
Note, in the process we found out that some device drivers fail to properly handle changes in IFF_ALLMULTI, leading to interesting behaviour when a multicast router is started. This bug is not corrected by this commit, and will be fixed with a separate commit.
Detailed changes: -------------------- netinet/ip_mroute.c all the above. conf/files make ip_mroute.c optional net/route.c fix mrt_ioctl hook netinet/ip_input.c fix ip_mforward hook, move rsvp_input() here together with other rsvp code, and a couple of indentation fixes. netinet/ip_output.c fix ip_mforward and ip_mcast_src hooks netinet/ip_var.h rsvp function hooks netinet/raw_ip.c hooks for mrouting and rsvp functions, plus interface cleanup. netinet/ip_mroute.h remove an unused and optional field from a struct
Most of the code is from Pavlin Radoslavov and the XORP project
Reviewed by: sam MFC after: 1 week
|
#
106678 |
|
08-Nov-2002 |
sam |
correct fast ipsec logic: compare destination ip address against the contents of the SA, not the SP
Submitted by: "Doug Ambrisko" <ambrisko@verniernetworks.com>
|
#
105586 |
|
20-Oct-2002 |
phk |
Fix two instances of variant struct definitions in sys/netinet:
Remove the never completed _IP_VHL version, it has not caught on anywhere and it would make us incompatible with other BSD netstacks to retain this version.
Add a CTASSERT protecting sizeof(struct ip) == 20.
Don't let the size of struct ipq depend on the IPDIVERT option.
This is a functional no-op commit.
Approved by: re
|
#
105199 |
|
16-Oct-2002 |
sam |
Tie new "Fast IPsec" code into the build. This involves the usual configuration stuff as well as conditional code in the IPv4 and IPv6 areas. Everything is conditional on FAST_IPSEC which is mutually exclusive with IPSEC (KAME IPsec implmentation).
As noted previously, don't use FAST_IPSEC with INET6 at the moment.
Reviewed by: KAME, rwatson Approved by: silence Supported by: Vernier Networks
|
#
105194 |
|
16-Oct-2002 |
sam |
Replace aux mbufs with packet tags:
o instead of a list of mbufs use a list of m_tag structures a la openbsd o for netgraph et. al. extend the stock openbsd m_tag to include a 32-bit ABI/module number cookie o for openbsd compatibility define a well-known cookie MTAG_ABI_COMPAT and use this in defining openbsd-compatible m_tag_find and m_tag_get routines o rewrite KAME use of aux mbufs in terms of packet tags o eliminate the most heavily used aux mbufs by adding an additional struct inpcb parameter to ip_output and ip6_output to allow the IPsec code to locate the security policy to apply to outbound packets o bump __FreeBSD_version so code can be conditionalized o fixup ipfilter's call to ip_output based on __FreeBSD_version
Reviewed by: julian, luigi (silent), -arch, -net, darren Approved by: julian, silence from everyone else Obtained from: openbsd (mostly) MFC after: 1 month
|
#
103852 |
|
23-Sep-2002 |
maxim |
Slightly rearrange a code in rev. 1.164:
o Move len initialization closer to place of its first usage. o Compare len with 0 to improve readability. o Explicitly zero out phlen in ip_insertoptions() in failure case.
Suggested by: jhb Reviewed by: jhb MFC after: 2 weeks
|
#
103478 |
|
17-Sep-2002 |
maxim |
In rare cases when there is no room for ip options ip_insertoptions() can fail and corrupt a header length. Initialize len and check what ip_insertoptions() returns.
Reviewed by: archie, silence on -net MFC after: 5 days
|
#
101096 |
|
31-Jul-2002 |
rwatson |
Introduce support for Mandatory Access Control and extensible kernel access control.
When fragmenting an IP datagram, invoke an appropriate MAC entry point so that MAC labels may be copied (...) to the individual IP fragment mbufs by MAC policies.
When IP options are inserted into an IP datagram when leaving a host, preserve the label if we need to reallocate the mbuf for alignment or size reasons.
Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
#
99891 |
|
12-Jul-2002 |
luigi |
Avoid dereferencing a null pointer in ro_rt.
This was always broken in HEAD (the offending statement was introduced in rev. 1.123 for HEAD, while RELENG_4 included this fix (in rev. 1.99.2.12 for RELENG_4) and I inadvertently deleted it in 1.99.2.30.
So I am also restoring these two lines in RELENG_4 now. We might need another few things from 1.99.2.30.
|
#
98904 |
|
27-Jun-2002 |
mux |
Warning fixes for 64 bits platforms. With this last fix, I can build a GENERIC sparc64 kernel with -Werror.
Reviewed by: luigi
|
#
98849 |
|
26-Jun-2002 |
ken |
At long last, commit the zero copy sockets code.
MAKEDEV: Add MAKEDEV glue for the ti(4) device nodes.
ti.4: Update the ti(4) man page to include information on the TI_JUMBO_HDRSPLIT and TI_PRIVATE_JUMBOS kernel options, and also include information about the new character device interface and the associated ioctls.
man9/Makefile: Add jumbo.9 and zero_copy.9 man pages and associated links.
jumbo.9: New man page describing the jumbo buffer allocator interface and operation.
zero_copy.9: New man page describing the general characteristics of the zero copy send and receive code, and what an application author should do to take advantage of the zero copy functionality.
NOTES: Add entries for ZERO_COPY_SOCKETS, TI_PRIVATE_JUMBOS, TI_JUMBO_HDRSPLIT, MSIZE, and MCLSHIFT.
conf/files: Add uipc_jumbo.c and uipc_cow.c.
conf/options: Add the 5 options mentioned above.
kern_subr.c: Receive side zero copy implementation. This takes "disposable" pages attached to an mbuf, gives them to a user process, and then recycles the user's page. This is only active when ZERO_COPY_SOCKETS is turned on and the kern.ipc.zero_copy.receive sysctl variable is set to 1.
uipc_cow.c: Send side zero copy functions. Takes a page written by the user and maps it copy on write and assigns it kernel virtual address space. Removes copy on write mapping once the buffer has been freed by the network stack.
uipc_jumbo.c: Jumbo disposable page allocator code. This allocates (optionally) disposable pages for network drivers that want to give the user the option of doing zero copy receive.
uipc_socket.c: Add kern.ipc.zero_copy.{send,receive} sysctls that are enabled if ZERO_COPY_SOCKETS is turned on.
Add zero copy send support to sosend() -- pages get mapped into the kernel instead of getting copied if they meet size and alignment restrictions.
uipc_syscalls.c:Un-staticize some of the sf* functions so that they can be used elsewhere. (uipc_cow.c)
if_media.c: In the SIOCGIFMEDIA ioctl in ifmedia_ioctl(), avoid calling malloc() with M_WAITOK. Return an error if the M_NOWAIT malloc fails.
The ti(4) driver and the wi(4) driver, at least, call this with a mutex held. This causes witness warnings for 'ifconfig -a' with a wi(4) or ti(4) board in the system. (I've only verified for ti(4)).
ip_output.c: Fragment large datagrams so that each segment contains a multiple of PAGE_SIZE amount of data plus headers. This allows the receiver to potentially do page flipping on receives.
if_ti.c: Add zero copy receive support to the ti(4) driver. If TI_PRIVATE_JUMBOS is not defined, it now uses the jumbo(9) buffer allocator for jumbo receive buffers.
Add a new character device interface for the ti(4) driver for the new debugging interface. This allows (a patched version of) gdb to talk to the Tigon board and debug the firmware. There are also a few additional debugging ioctls available through this interface.
Add header splitting support to the ti(4) driver.
Tweak some of the default interrupt coalescing parameters to more useful defaults.
Add hooks for supporting transmit flow control, but leave it turned off with a comment describing why it is turned off.
if_tireg.h: Change the firmware rev to 12.4.11, since we're really at 12.4.11 plus fixes from 12.4.13.
Add defines needed for debugging.
Remove the ti_stats structure, it is now defined in sys/tiio.h.
ti_fw.h: 12.4.11 firmware.
ti_fw2.h: 12.4.11 firmware, plus selected fixes from 12.4.13, and my header splitting patches. Revision 12.4.13 doesn't handle 10/100 negotiation properly. (This firmware is the same as what was in the tree previously, with the addition of header splitting support.)
sys/jumbo.h: Jumbo buffer allocator interface.
sys/mbuf.h: Add a new external mbuf type, EXT_DISPOSABLE, to indicate that the payload buffer can be thrown away / flipped to a userland process.
socketvar.h: Add prototype for socow_setup.
tiio.h: ioctl interface to the character portion of the ti(4) driver, plus associated structure/type definitions.
uio.h: Change prototype for uiomoveco() so that we'll know whether the source page is disposable.
ufs_readwrite.c:Update for new prototype of uiomoveco().
vm_fault.c: In vm_fault(), check to see whether we need to do a page based copy on write fault.
vm_object.c: Add a new function, vm_object_allocate_wait(). This does the same thing that vm_object allocate does, except that it gives the caller the opportunity to specify whether it should wait on the uma_zalloc() of the object structre.
This allows vm objects to be allocated while holding a mutex. (Without generating WITNESS warnings.)
vm_object_allocate() is implemented as a call to vm_object_allocate_wait() with the malloc flag set to M_WAITOK.
vm_object.h: Add prototype for vm_object_allocate_wait().
vm_page.c: Add page-based copy on write setup, clear and fault routines.
vm_page.h: Add page based COW function prototypes and variable in the vm_page structure.
Many thanks to Drew Gallatin, who wrote the zero copy send and receive code, and to all the other folks who have tested and reviewed this code over the years.
|
#
98666 |
|
23-Jun-2002 |
luigi |
fix bad indentation and whitespace resulting from cut&paste
|
#
98613 |
|
22-Jun-2002 |
luigi |
Remove (almost all) global variables that were used to hold packet forwarding state ("annotations") during ip processing. The code is considerably cleaner now.
The variables removed by this change are:
ip_divert_cookie used by divert sockets ip_fw_fwd_addr used for transparent ip redirection last_pkt used by dynamic pipes in dummynet
Removal of the first two has been done by carrying the annotations into volatile structs prepended to the mbuf chains, and adding appropriate code to add/remove annotations in the routines which make use of them, i.e. ip_input(), ip_output(), tcp_input(), bdg_forward(), ether_demux(), ether_output_frame(), div_output().
On passing, remove a bug in divert handling of fragmented packet. Now it is the fragment at offset 0 which sets the divert status of the whole packet, whereas formerly it was the last incoming fragment to decide.
Removal of last_pkt required a change in the interface of ip_fw_chk() and dummynet_io(). On passing, use the same mechanism for dummynet annotations and for divert/forward annotations.
option IPFIREWALL_FORWARD is effectively useless, the code to implement it is very small and is now in by default to avoid the obfuscation of conditionally compiled code.
NOTES: * there is at least one global variable left, sro_fwd, in ip_output(). I am not sure if/how this can be removed.
* I have deliberately avoided gratuitous style changes in this commit to avoid cluttering the diffs. Minor stule cleanup will likely be necessary
* this commit only focused on the IP layer. I am sure there is a number of global variables used in the TCP and maybe UDP stack.
* despite the number of files touched, there are absolutely no API's or data structures changed by this commit (except the interfaces of ip_fw_chk() and dummynet_io(), which are internal anyways), so an MFC is quite safe and unintrusive (and desirable, given the improved readability of the code).
MFC after: 10 days
|
#
97074 |
|
21-May-2002 |
arr |
- Change the newly turned INVARIANTS #ifdef blocks (they were changed from DIAGNOSTIC yesterday) into KASSERT()'s as these help to increase code readability.
|
#
97020 |
|
20-May-2002 |
arr |
- Turn a few DIAGNOSTIC into INVARIANTS since they are really sanity checks.
|
#
96245 |
|
09-May-2002 |
luigi |
Cleanup the interface to ip_fw_chk, two of the input arguments were totally useless and have been removed.
ip_input.c, ip_output.c: Properly initialize the "ip" pointer in case the firewall does an m_pullup() on the packet.
Remove some debugging code forgotten long ago.
ip_fw.[ch], bridge.c: Prepare the grounds for matching MAC header fields in bridged packets, so we can have 'etherfw' functionality without a lot of kernel and userland bloat.
|
#
93593 |
|
01-Apr-2002 |
jhb |
Change the suser() API to take advantage of td_ucred as well as do a general cleanup of the API. The entire API now consists of two functions similar to the pre-KSE API. The suser() function takes a thread pointer as its only argument. The td_ucred member of this thread must be valid so the only valid thread pointers are curthread and a few kernel threads such as thread0. The suser_cred() function takes a pointer to a struct ucred as its first argument and an integer flag as its second argument. The flag is currently only used for the PRISON_ROOT flag.
Discussed on: smp@
|
#
92960 |
|
22-Mar-2002 |
ru |
Prevent icmp_reflect() from calling ip_output() with a NULL route pointer which will then result in the allocated route's reference count never being decremented. Just flood ping the localhost and watch refcnt of the 127.0.0.1 route with netstat(1).
Submitted by: jayanth
Back out ip_output.c,v 1.143 and ip_mroute.c,v 1.69 that allowed ip_output() to be called with a NULL route pointer. The previous paragraph shows why this was a bad idea in the first place.
MFC after: 0 days
|
#
92723 |
|
19-Mar-2002 |
alfred |
Remove __P.
|
#
90868 |
|
18-Feb-2002 |
mike |
o Move NTOHL() and associated macros into <sys/param.h>. These are deprecated in favor of the POSIX-defined lowercase variants. o Change all occurrences of NTOHL() and associated marcros in the source tree to use the lowercase function variants. o Add missing license bits to sparc64's <machine/endian.h>. Approved by: jake o Clean up <machine/endian.h> files. o Remove unused __uint16_swap_uint32() from i386's <machine/endian.h>. o Remove prototypes for non-existent bswapXX() functions. o Include <machine/endian.h> in <arpa/inet.h> to define the POSIX-required ntohl() family of functions. o Do similar things to expose the ntohl() family in libstand, <netinet/in.h>, and <sys/param.h>. o Prepend underscores to the ntohl() family to help deal with complexities associated with having MD (asm and inline) versions, and having to prevent exposure of these functions in other headers that happen to make use of endian-specific defines. o Create weak aliases to the canonical function name to help deal with third-party software forgetting to include an appropriate header. o Remove some now unneeded pollution from <sys/types.h>. o Add missing <arpa/inet.h> includes in userland.
Tested on: alpha, i386 Reviewed by: bde, jake, tmm
|
#
90698 |
|
15-Feb-2002 |
ru |
Moved the 127/8 check below so that IPF redirects have a chance of working.
MFC after: 1 day
|
#
89624 |
|
21-Jan-2002 |
ume |
- Check the address family of the destination cached in a PCB. - Clear the cached destination before getting another cached route. Otherwise, garbage in the padding space (which might be filled in if it was used for IPv4) could annoy rtalloc.
Obtained from: KAME
|
#
89614 |
|
21-Jan-2002 |
ru |
RFC1122 requires that addresses of the form { 127, <any> } MUST NOT appear outside a host.
PR: 30792, 33996 Obtained from: ip_input.c MFC after: 1 week
|
#
88931 |
|
05-Jan-2002 |
fenner |
Pre-calculate the checksum for multicast packets sourced on a multicast router. This is overkill; it should be possible to delay to hardware interfaces and only pre-calculate when forwarding to a tunnel.
|
#
88593 |
|
28-Dec-2001 |
julian |
Fix ipfw fwd so that it acts as the docs say when forwarding an incoming packet to another machine.
Obtained from: Vicor Production tree MFC after: 3 weeks
|
#
88190 |
|
19-Dec-2001 |
yar |
Don't try to free a NULL route when doing IPFIREWALL_FORWARD. An old route will be NULL at that point if a packet were initially routed to an interface (using the IP_ROUTETOIF flag.)
Submitted by: Igor Timkin <ivt@gamma.ru>
|
#
87916 |
|
14-Dec-2001 |
jlemon |
whitespace and style fixes recovered from -stable.
|
#
87167 |
|
01-Dec-2001 |
ru |
Allow for ip_output() to be called with a NULL route pointer. This fixes a panic I introduced yesterday in ip_icmp.c,v 1.64.
|
#
86047 |
|
04-Nov-2001 |
luigi |
MFS: sync the ipfw/dummynet/bridge code with the one recently merged into stable (mostly , but not only, formatting and comments changes).
|
#
85741 |
|
30-Oct-2001 |
wpaul |
Fix a (long standing?) bug in ip_output(): if ip_insertoptions() is called and ip_output() encounters an error and bails (i.e. host unreachable), we will leak an mbuf. This is because the code calls m_freem(m0) after jumping to the bad: label at the end of the function, when it should be calling m_freem(m). (m0 is the original mbuf list _without_ the options mbuf prepended.)
Obtained from: NetBSD
|
#
85732 |
|
30-Oct-2001 |
jlemon |
When dropping a packet because there is no room in the queue (which itself is somewhat bogus), update the statistics to indicate something was dropped.
PR: 13740
|
#
84516 |
|
05-Oct-2001 |
ps |
Make it so dummynet and bridge can be loaded as modules.
Submitted by: billf
|
#
84102 |
|
29-Sep-2001 |
jlemon |
Add a hash table that contains the list of internet addresses, and use this in place of the in_ifaddr list when appropriate. This improves performance on hosts which have a large number of IP aliases.
|
#
84101 |
|
29-Sep-2001 |
jlemon |
Centralize satosin(), sintosa() and ifatoia() macros in <netinet/in.h> Remove local definitions.
|
#
84058 |
|
27-Sep-2001 |
luigi |
Two main changes here: + implement "limit" rules, which permit to limit the number of sessions between certain host pairs (according to masks). These are a special type of stateful rules, which might be of interest in some cases. See the ipfw manpage for details.
+ merge the list pointers and ipfw rule descriptors in the kernel, so the code is smaller, faster and more readable. This patch basically consists in replacing "foo->rule->bar" with "rule->bar" all over the place. I have been willing to do this for ages!
MFC after: 1 week
|
#
83934 |
|
25-Sep-2001 |
brooks |
Make faith loadable, unloadable, and clonable.
|
#
83366 |
|
12-Sep-2001 |
julian |
KSE Milestone 2 Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to teh previousl -current except that there is a thread associated with each process.
Sorry john! (your next MFC will be a doosie!)
Reviewed by: peter@freebsd.org, dillon@freebsd.org
X-MFC after: ha ha ha ha
|
#
83130 |
|
06-Sep-2001 |
jlemon |
Wrap array accesses in macros, which also happen to be lvalues:
ifnet_addrs[i - 1] -> ifaddr_byindex(i) ifindex2ifnet[i] -> ifnet_byindex(i)
This is intended to ease the conversion to SMPng.
|
#
81111 |
|
03-Aug-2001 |
dcs |
MFS: Avoid dropping fragments in the absence of an interface address.
Noticed by: fenner Submitted by: iedowse Not committed to current by: iedowse ;-)
|
#
80211 |
|
23-Jul-2001 |
ru |
Avoid a NULL pointer derefence introduced in rev. 1.129.
Problem noticed by: bde, gcc(1) Panic caught by: mjacob Patch tested by: mjacob
|
#
79934 |
|
19-Jul-2001 |
ru |
Backout non-functional changes from revision 1.128.
Not objected to by: dcs
|
#
79830 |
|
17-Jul-2001 |
dcs |
Skip the route checking in the case of multicast packets with known interfaces.
Reviewed by: people at that channel Approved by: silence on -net
|
#
78064 |
|
11-Jun-2001 |
ume |
Sync with recent KAME. This work was based on kame-20010528-freebsd43-snap.tgz and some critical problem after the snap was out were fixed. There are many many changes since last KAME merge.
TODO: - The definitions of SADB_* in sys/net/pfkeyv2.h are still different from RFC2407/IANA assignment because of binary compatibility issue. It should be fixed under 5-CURRENT. - ip6po_m member of struct ip6_pktopts is no longer used. But, it is still there because of binary compatibility issue. It should be removed under 5-CURRENT.
Reviewed by: itojun Obtained from: KAME MFC after: 3 weeks
|
#
77574 |
|
01-Jun-2001 |
kris |
Add ``options RANDOM_IP_ID'' which randomizes the ID field of IP packets. This closes a minor information leak which allows a remote observer to determine the rate at which the machine is generating packets, since the default behaviour is to increment a counter for each packet sent.
Reviewed by: -net Obtained from: OpenBSD
|
#
74213 |
|
13-Mar-2001 |
ru |
RFC768 (UDP) requires that "if the computed checksum is zero, it is transmitted as all ones". This got broken after introduction of delayed checksums as follows. Some guys (including Jonathan) think that it is allowed to transmit all ones in place of a zero checksum for TCP the same way as for UDP. (The discussion still takes place on -net.) Thus, the 0 -> 0xffff checksum fixup was first moved from udp_output() (see udp_usrreq.c, 1.64 -> 1.65) to in_cksum_skip() (see sys/i386/i386/in_cksum.c, 1.17 -> 1.18, INVERT expression). Besides that I disagree that it is valid for TCP, there was no real problem until in_cksum.c,v 1.20, where the in_cksum() was made just a special version of in_cksum_skip(). The side effect was that now every incoming IP datagram failed to pass the checksum test (in_cksum() returned 0xffff when it should actually return zero). It was fixed next day in revision 1.21, by removing the INVERT expression. The latter also broke the 0 -> 0xffff fixup for UDP checksums.
Before this change: : tcpdump: listening on lo0 : 127.0.0.1.33005 > 127.0.0.1.33006: udp 0 (ttl 64, id 1) : 4500 001c 0001 0000 4011 7cce 7f00 0001 : 7f00 0001 80ed 80ee 0008 0000
After this change: : tcpdump: listening on lo0 : 127.0.0.1.33005 > 127.0.0.1.33006: udp 0 (ttl 64, id 1) : 4500 001c 0001 0000 4011 7cce 7f00 0001 : 7f00 0001 80ed 80ee 0008 ffff
|
#
74111 |
|
11-Mar-2001 |
iedowse |
In ip_output(), initialise `ia' in the case where the packet has come from a dummynet pipe. Without this, the code which increments the per-ifaddr stats can dereference an uninitialised pointer. This should make dummynet usable again.
Reported by: "Dmitry A. Yanko" <fm@astral.ntu-kpi.kiev.ua> Reviewed by: luigi, joe
|
#
73102 |
|
26-Feb-2001 |
asmodai |
Remove conditionals for vax support. People who care much about this are welcomed to try 2.11BSD. :)
Noticed by: luigi Reviewed by: jesper
|
#
72012 |
|
04-Feb-2001 |
phk |
Another round of the <sys/queue.h> FOREACH transmogriffer.
Created with: sed(1) Reviewed by: md5(1)
|
#
71999 |
|
04-Feb-2001 |
phk |
Mechanical change to use <sys/queue.h> macro API instead of fondling implementation details.
Created with: sed(1) Reviewed by: md5(1)
|
#
71909 |
|
02-Feb-2001 |
luigi |
MFS: bridge/ipfw/dummynet fixes (bridge.c will be committed separately)
|
#
71613 |
|
25-Jan-2001 |
luigi |
Pass up errors returned by dummynet. The same should be done with divert.
|
#
70254 |
|
21-Dec-2000 |
bmilekic |
* Rename M_WAIT mbuf subsystem flag to M_TRYWAIT. This is because calls with M_WAIT (now M_TRYWAIT) may not wait forever when nothing is available for allocation, and may end up returning NULL. Hopefully we now communicate more of the right thing to developers and make it very clear that it's necessary to check whether calls with M_(TRY)WAIT also resulted in a failed allocation. M_TRYWAIT basically means "try harder, block if necessary, but don't necessarily wait forever." The time spent blocking is tunable with the kern.ipc.mbuf_wait sysctl. M_WAIT is now deprecated but still defined for the next little while.
* Fix a typo in a comment in mbuf.h
* Fix some code that was actually passing the mbuf subsystem's M_WAIT to malloc(). Made it pass M_WAITOK instead. If we were ever to redefine the value of the M_WAIT flag, this could have became a big problem.
|
#
68150 |
|
01-Nov-2000 |
joe |
It's no longer true that "nobody uses ia beyond here"; it's now used to keep address based if_data statistics in.
Submitted by: ru
|
#
67893 |
|
29-Oct-2000 |
phk |
Move suser() and suser_xxx() prototypes and a related #define from <sys/proc.h> to <sys/systm.h>.
Correctly document the #includes needed in the manpage.
Add one now needed #include of <sys/systm.h>. Remove the consequent 48 unused #includes of <sys/proc.h>.
|
#
67833 |
|
29-Oct-2000 |
joe |
Count per-address statistics for IP fragments.
Requested by: ru Obtained from: BSD/OS
|
#
67375 |
|
20-Oct-2000 |
ru |
Save a few CPU cycles in IP fragmentation code.
|
#
67334 |
|
19-Oct-2000 |
joe |
Augment the 'ifaddr' structure with a 'struct if_data' to keep statistics on a per network address basis.
Teach the IPv4 and IPv6 input/output routines to log packets/bytes against the network address connected to the flow.
Teach netstat to display the per-address stats for IP protocols when 'netstat -i' is evoked, instead of displaying the per-interface stats.
|
#
65837 |
|
14-Sep-2000 |
ru |
Follow BSD/OS and NetBSD, keep the ip_id field in network order all the time.
Requested by: wollman
|
#
65327 |
|
01-Sep-2000 |
ru |
Fixed broken ICMP error generation, unified conversion of IP header fields between host and network byte order. The details:
o icmp_error() now does not add IP header length. This fixes the problem when icmp_error() is called from ip_forward(). In this case the ip_len of the original IP datagram returned with ICMP error was wrong.
o icmp_error() expects all three fields, ip_len, ip_id and ip_off in host byte order, so DTRT and convert these fields back to network byte order before sending a message. This fixes the problem described in PR 16240 and PR 20877 (ip_id field was returned in host byte order).
o ip_ttl decrement operation in ip_forward() was moved down to make sure that it does not corrupt the copy of original IP datagram passed later to icmp_error().
o A copy of original IP datagram in ip_forward() was made a read-write, independent copy. This fixes the problem I first reported to Garrett Wollman and Bill Fenner and later put in audit trail of PR 16240: ip_output() (not always) converts fields of original datagram to network byte order, but because copy (mcopy) and its original (m) most likely share the same mbuf cluster, ip_output()'s manipulations on original also corrupted the copy.
o ip_output() now expects all three fields, ip_len, ip_off and (what is significant) ip_id in host byte order. It was a headache for years that ip_id was handled differently. The only compatibility issue here is the raw IP socket interface with IP_HDRINCL socket option set and a non-zero ip_id field, but ip.4 manual page was unclear on whether in this case ip_id field should be in host or network byte order.
|
#
64060 |
|
31-Jul-2000 |
darrenr |
activate pfil_hooks and covert ipfilter to use it
|
#
62587 |
|
04-Jul-2000 |
itojun |
sync with kame tree as of july00. tons of bug fixes/improvements.
API changes: - additional IPv6 ioctls - IPsec PF_KEY API was changed, it is mandatory to upgrade setkey(8). (also syntax change)
|
#
61183 |
|
02-Jun-2000 |
jlemon |
Add boundary checks against IP options.
Obtained from: OpenBSD
|
#
60910 |
|
25-May-2000 |
jlemon |
Mark the checksum as complete when looping back multicast packets.
Submitted by: Jeff Gibbons <jgibbons@n2.net>
|
#
60889 |
|
24-May-2000 |
archie |
Just need to pass the address family to if_simloop(), not the whole sockaddr.
|
#
60765 |
|
21-May-2000 |
jlemon |
Compute the checksum before handing the packet off to IPFilter.
Tested by: Cy Schubert <Cy.Schubert@uumail.gov.bc.ca>
|
#
58936 |
|
02-Apr-2000 |
shin |
Move htons() ip_len to after the in_delayed_cksum() call. This should stop cksum error messages on IPsec communication which was reported on freebsd-current.
Reviewed by: jlemon
|
#
58895 |
|
01-Apr-2000 |
jlemon |
Calculate any delayed checksums before handing an mbuf off to a divert socket. This fixes a problem with ppp/natd.
Reviewed by: bsd (Brian Dean, gotta love that login name)
|
#
58806 |
|
30-Mar-2000 |
jlemon |
If `ipfw fwd' loops an mbuf back to ip_input from ip_output and the mbuf is marked for delayed checksums, then additionally mark the packet as having it's checksums computed. This allows us to bypass computing/checking the checksum entirely, which isn't really needeed as the packet has never hit the wire.
Reviewed by: green
|
#
58698 |
|
27-Mar-2000 |
jlemon |
Add support for offloading IP/TCP/UDP checksums to NIC hardware which supports them.
|
#
57855 |
|
09-Mar-2000 |
shin |
Initialize mbuf pointer at getting ipsec policy. Without this, kernel will panic at getsockopt() of IPSEC_POLICY. Also make compilable libipsec/test-policy.c which tries getsockopt() of IPSEC_POLICY.
Approved by: jkh
Submitted by: sakane@kame.net
|
#
57401 |
|
23-Feb-2000 |
guido |
Remove option IPFILTER_KLD. In case you wanted to kldload ipfilter, the module would only work in kernels built with this option.
Approved by: jkh
|
#
57114 |
|
10-Feb-2000 |
luigi |
Support the net.inet.ip.fw.enable variable, part of the recent ipfw modifications.
Approved-by: jordan
|
#
55777 |
|
10-Jan-2000 |
ru |
MGETHDR() does not initialize m_pkthdr.rcvif, do it here.
This fixes page fault panic observed when diverting packets with IP options (e.g. ping -R remoteIP over natd).
PR: kern/8596, kern/11199
|
#
55632 |
|
09-Jan-2000 |
shin |
enable IPsec over DUMMYNET again
Submitted by: luigi Reviewed by: luigi
|
#
55598 |
|
08-Jan-2000 |
luigi |
Cleanup dummynet call interface so it should now work on the Alpha as well. Also (probably) fix a bug introduced during the IPv6 import.
|
#
55009 |
|
22-Dec-1999 |
shin |
IPSEC support in the kernel. pr_input() routines prototype is also changed to support IPSEC and IPV6 chained protocol headers.
Reviewed by: freebsd-arch, cvs-committers Obtained from: KAME project
|
#
54175 |
|
06-Dec-1999 |
archie |
Miscellaneous fixes/cleanups relating to ipfw and divert(4):
- Implement 'ipfw tee' (finally) - Divert packets by calling new function divert_packet() directly instead of going through protosw[]. - Replace kludgey global variable 'ip_divert_port' with a function parameter to divert_packet() - Replace kludgey global variable 'frag_divert_port' with a function parameter to ip_reass() - style(9) fixes
Reviewed by: julian, green
|
#
50477 |
|
28-Aug-1999 |
peter |
$Id$ -> $FreeBSD$
|
#
46420 |
|
04-May-1999 |
luigi |
Free the dummynet descriptor in ip_dummynet, not in the called routines. The descriptor contains parameters which could be used within those routines (eg. ip_output() ).
On passing, add IPPROTO_PGM entry to netinet/in.h
|
#
46393 |
|
04-May-1999 |
luigi |
forgot passing the right pointer to dst to dummynet_io(). (-stable and releng2 were already safe). Debugged-By: phk
|
#
45869 |
|
20-Apr-1999 |
peter |
Tidy up some stray / unused stuff in the IPFW package and friends. - unifdef -DCOMPAT_IPFW (this was on by default already) - remove traces of in-kernel ip_nat package, it was never committed. - Make IPFW and DUMMYNET initialize themselves rather than depend on compiled-in hooks in ip_init(). This means they initialize the same way both in-kernel and as kld modules. (IPFW initializes now :-)
|
#
44797 |
|
16-Mar-1999 |
luigi |
Fix a dummynet bug caused by passing a bad next hop address (the symptom was the msg "arp failure -- host is not on local network" that some user have seen on multihomed machines. Bug tracked down by Emmanuel Duros
|
#
44154 |
|
19-Feb-1999 |
luigi |
avoid panic with pkts larger than MTU and DF set coming out of a pipe.
|
#
41990 |
|
21-Dec-1998 |
luigi |
Restore 1.82->1.83 change deleted by mistake< per Bruce suggestion
|
#
41793 |
|
14-Dec-1998 |
luigi |
Last bits (i think) of dummynet for -current.
|
#
41059 |
|
10-Nov-1998 |
peter |
add #include <sys/kernel.h> where it's needed by MALLOC_DEFINE()
|
#
38754 |
|
02-Sep-1998 |
wollman |
Properly fragment multicast packets.
PR: 7802 Submitted by: Steve McCanne <mccanne@cs.berkeley.edu>
|
#
38482 |
|
23-Aug-1998 |
wollman |
Yow! Completely change the way socket options are handled, eliminating another specialized mbuf type in the process. Also clean up some of the cruft surrounding IPFW, multicast routing, RSVP, and other ill-explored corners.
|
#
37996 |
|
01-Aug-1998 |
peter |
Fix a compile error if IPFIREWALL_FORWARD active without IPDIVERT.
|
#
37624 |
|
13-Jul-1998 |
bde |
Fixed some longs that should have been fixed-sized types.
|
#
37413 |
|
06-Jul-1998 |
julian |
Don't expect the new code to be used without the right option file being included.
|
#
37412 |
|
06-Jul-1998 |
julian |
Fix braino in switching to TAILQ macro.
|
#
37409 |
|
06-Jul-1998 |
julian |
Support for IPFW based transparent forwarding. Any packet that can be matched by a ipfw rule can be redirected transparently to another port or machine. Redirection to another port mostly makes sense with tcp, where a session can be set up between a proxy and an unsuspecting client. Redirection to another machine requires that the other machine also be expecting to receive the forwarded packets, as their headers will not have been modified.
/sbin/ipfw must be recompiled!!!
Reviewed by: Peter Wemm <peter@freebsd.org> Submitted by: Chrisy Luke <chrisy@flix.net>
|
#
37094 |
|
21-Jun-1998 |
bde |
Removed unused includes.
|
#
36995 |
|
15-Jun-1998 |
julian |
fix another typo
|
#
36992 |
|
14-Jun-1998 |
julian |
Try narrow down the culprit sending undefined packet types through the loopback
|
#
36908 |
|
12-Jun-1998 |
julian |
Go through the loopback code with a broom.. Remove lots'o'hacks. looutput is now static.
Other callers who want to use loopback to allow shortcutting should call the special entrypoint for this, if_simloop(), which is specifically designed for this purpose. Using looutput for this purpose was problematic, particularly with bpf and trying to keep track of whether one should be using the charateristics of the loopback interface or the interface (e.g. if_ethersubr.c) that was requesting the loopback. There was a whole class of errors due to this mis-use each of which had hacks to cover them up.
Consists largly of hack removal :-)
|
#
36710 |
|
06-Jun-1998 |
julian |
Make sure the default value of a dummy variable is 0 so that it doesn't do anything.
|
#
36708 |
|
06-Jun-1998 |
julian |
Fix wrong data type for a pointer.
|
#
36707 |
|
06-Jun-1998 |
julian |
clean up the changes made to ipfw over the last weeks (should make the ipfw lkm work again)
|
#
36678 |
|
05-Jun-1998 |
julian |
Reverse the default sense of the IPFW/DIVERT reinjection code so that the new behaviour is now default. Solves the "infinite loop in diversion" problem when more than one diversion is active. Man page changes follow.
The new code is in -stable as the NON default option.
|
#
36369 |
|
25-May-1998 |
julian |
Add optional code to change the way that divert and ipfw work together. Prior to this change, Accidental recursion protection was done by the diverted daemon feeding back the divert port number it got the packet on, as the port number on a sendto(). IPFW knew not to redivert a packet to this port (again). Processing of the ruleset started at the beginning again, skipping that divert port.
The new semantic (which is how we should have done it the first time) is that the port number in the sendto() is the rule number AFTER which processing should restart, and on a recvfrom(), the port number is the rule number which caused the diversion. This is much more flexible, and also more intuitive. If the user uses the same sockaddr received when resending, processing resumes at the rule number following that that caused the diversion. The user can however select to resume rule processing at any rule. (0 is restart at the beginning)
To enable the new code use
option IPFW_DIVERT_RESTART
This should become the default as soon as people have looked at it a bit
|
#
34746 |
|
21-Mar-1998 |
peter |
Make this compile.. There are some unpleasing hacks in here. A major unifdef session is sorely tempting but would destroy any remaining chance of tracking the original sources.
|
#
33678 |
|
20-Feb-1998 |
bde |
Don't depend on "implicit int".
|
#
33134 |
|
06-Feb-1998 |
eivind |
Back out DIAGNOSTIC changes.
|
#
33108 |
|
04-Feb-1998 |
eivind |
Turn DIAGNOSTIC into a new-style option.
|
#
31017 |
|
07-Nov-1997 |
phk |
Rename some local variables to avoid shadowing other local variables.
Found by: -Wshadow
|
#
30966 |
|
05-Nov-1997 |
joerg |
Make IPDIVERT a supported option. Alas, in_var.h depends on it, i hope i've found out all files that actually depend on this dependancy. IMHO, it's not very good practice to change the size of internal structs depending on kernel options.
|
#
30354 |
|
12-Oct-1997 |
phk |
Last major round (Unless Bruce thinks of somthing :-) of malloc changes.
Distribute all but the most fundamental malloc types. This time I also remembered the trick to making things static: Put "static" in front of them.
A couple of finer points by: bde
|
#
30309 |
|
11-Oct-1997 |
phk |
Distribute and statizice a lot of the malloc M_* types.
Substantial input from: bde
|
#
27845 |
|
02-Aug-1997 |
bde |
Removed unused #includes.
|
#
26359 |
|
02-Jun-1997 |
julian |
Submitted by: Whistle Communications (archie Cobbs)
these are quite extensive additions to the ipfw code. they include a change to the API because the old method was broken, but the user view is kept the same.
The new code allows a particular match to skip forward to a particular line number, so that blocks of rules can be used without checking all the intervening rules. There are also many more ways of rejecting connections especially TCP related, and many many more ...
see the man page for a complete description.
|
#
25516 |
|
06-May-1997 |
fenner |
Pull up the IP header in ip_mloopback(). This makes sure that the operations on the header inside ip_mloopback() are performed on a private copy instead of a shared cluster.
PR: kern/3410
|
#
25201 |
|
27-Apr-1997 |
wollman |
The long-awaited mega-massive-network-code- cleanup. Part I.
This commit includes the following changes: 1) Old-style (pr_usrreq()) protocols are no longer supported, the compatibility glue for them is deleted, and the kernel will panic on boot if any are compiled in.
2) Certain protocol entry points are modified to take a process structure, so they they can easily tell whether or not it is possible to sleep, and also to access credentials.
3) SS_PRIV is no more, and with it goes the SO_PRIVSTATE setsockopt() call. Protocols should use the process pointer they are now passed.
4) The PF_LOCAL and PF_ROUTE families have been updated to use the new style, as has the `raw' skeleton family.
5) PF_LOCAL sockets now obey the process's umask when creating a socket in the filesystem.
As a result, LINT is now broken. I'm hoping that some enterprising hacker with a bit more time will either make the broken bits work (should be easy for netipx) or dike them out.
|
#
24590 |
|
03-Apr-1997 |
darrenr |
Resolve conflicts created by import.
|
#
24570 |
|
03-Apr-1997 |
dg |
Reorganize elements of the inpcb struct to take better advantage of cache lines. Removed the struct ip proto since only a couple of chars were actually being used in it. Changed the order of compares in the PCB hash lookup to take advantage of partial cache line fills (on PPro).
Discussed-with: wollman
|
#
23221 |
|
28-Feb-1997 |
fenner |
Fix a comment and some commented-out code in ip_mloopback to reflect how multicast loopback really works.
|
#
22975 |
|
22-Feb-1997 |
peter |
Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not ready for it yet.
|
#
22927 |
|
19-Feb-1997 |
darrenr |
change IP Filter hooks to match new 3.1.8 patches for FreeBSD
|
#
22531 |
|
10-Feb-1997 |
darrenr |
Add IP Filter hooks (from patches).
|
#
22212 |
|
02-Feb-1997 |
brian |
Reset ip_divert_ignore to zero immediately after use - also, set it in the first place, independent of whether sin->sin_port is set.
The result is that diverted packets that are being forwarded will be diverted once and only once on the way in (ip_input()) and again, once and only once on the way out (ip_output()) - twice in total. ICMP packets that don't contain a port will now also be diverted.
|
#
21673 |
|
14-Jan-1997 |
jkh |
Make the long-awaited change from $Id$ to $FreeBSD$
This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long.
Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.
|
#
20407 |
|
13-Dec-1996 |
wollman |
Convert the interface address and IP interface address structures to TAILQs. Fix places which referenced these for no good reason that I can see (the references remain, but were fixed to compile again; they are still questionable).
|
#
19622 |
|
11-Nov-1996 |
fenner |
Add the IP_RECVIF socket option, which supplies a packet's incoming interface using a sockaddr_dl.
Fix the other packet-information socket options (SO_TIMESTAMP, IP_RECVDSTADDR) to work for multicast UDP and raw sockets as well. (They previously only worked for unicast UDP).
|
#
19113 |
|
22-Oct-1996 |
sos |
Changed args to the nat functions.
|
#
18797 |
|
07-Oct-1996 |
wollman |
All three files: make COMPAT_IPFW==0 case work again. ip_input.c: - delete some dusty code - _IP_VHL - use fast inline header checksum when possible
|
#
17758 |
|
21-Aug-1996 |
sos |
Add hooks for an IP NAT module, much like the firewall stuff... Move the sockopt definitions for the firewall code from ip_fw.h to in.h where it belongs.
|
#
17072 |
|
10-Jul-1996 |
julian |
Adding changes to ipfw and the kernel to support ip packet diversion.. This stuff should not be too destructive if the IPDIVERT is not compiled in.. be aware that this changes the size of the ip_fw struct so ipfw needs to be recompiled to use it.. more changes coming to clean this up.
|
#
16206 |
|
08-Jun-1996 |
bde |
Changed some memcpy()'s back to bcopy()'s.
gcc only inlines memcpy()'s whose count is constant and didn't inline these. I want memcpy() in the kernel go away so that it's obvious that it doesn't need to be optimized. Now it is only used for one struct copy in si.c.
|
#
15869 |
|
22-May-1996 |
wollman |
Conditionalize calls to IPFW code on COMPAT_IPFW. This is done slightly unconventionally: If COMPAT_IPFW is not defined, or if it is defined to 1, enable; otherwise, disable.
This means that these changes actually have no effect on anyone at the moment. (It just makes it easier for me to keep my code in sync.) In the future, the `not defined' part of the hack should be eliminated, but doing this now would require everyone to change their config files.
The same conditionals need to be made in ip_input.c as well for this to ave any useful effect, but I'm not ready to do that right now.
|
#
15850 |
|
21-May-1996 |
peter |
Fix an embarresing error on my part that made the IP_PORTRANGE options return a failure code (even though it worked). This commit brought to you by the 'C' keyword "break".. :-)
|
#
15652 |
|
06-May-1996 |
wollman |
Add three new route flags to help determine what sort of address the destination represents. For IP:
- Iff it is a host route, RTF_LOCAL and RTF_BROADCAST indicate local (belongs to this host) and broadcast addresses, respectively.
- For all routes, RTF_MULTICAST is set if the destination is multicast.
The RTF_BROADCAST flag is used by ip_output() to eliminate a call to in_broadcast() in a common case; this gives about 1% in our packet-generation experiments. All three flags might be used (although they aren't now) to determine whether a packet can be forwarded; a given host route can represent a forwardable address if:
(rt->rt_flags & (RTF_HOST | RTF_LOCAL | RTF_BROADCAST | RTF_MULTICAST)) == RTF_HOST
Obviously, one still has to do all the work if a host route is not present, but this code allows one to cache the results of such a lookup if rtalloc1() is called without masking RTF_PRCLONING.
|
#
15335 |
|
21-Apr-1996 |
bde |
Fixed in-line IP header checksumming. It was performed on the wrong header in one case.
|
#
15295 |
|
18-Apr-1996 |
wollman |
Three speed-ups in the output path (two small, one substantial):
1) Require all callers to pass a valid route pointer to ip_output() so that we don't have to check and allocate one off the stack as was done before. This eliminates one test and some stack bloat from the common (UDP and TCP) case.
2) Perform the IP header checksum in-line if it's of the usual length. This results in about a 5% speed-up in my packet-generation test.
3) Use ip_vhl field rather than ip_v and ip_hl bitfields.
|
#
15026 |
|
03-Apr-1996 |
phk |
Add feature for tcp "established". Change interface between netinet and ip_fw to be more general, and thus hopefully also support other ip filtering implementations.
|
#
14823 |
|
26-Mar-1996 |
fenner |
Add missing splx(s) in IP_MULTICAST_IF
Submitted by: Jim Binkley <jrb@cs.pdx.edu>
|
#
14611 |
|
13-Mar-1996 |
pst |
Fix ip option processing for raw IP sockets. This whole thing is a compromise between ignoring options specified in the setsockopt call if IP_HDRINCL is set (the UCB choice when VJ's code was brought in) vs allowing them (what everyone else did, and what is assumed by programs everywhere...sigh).
Also perform some checking of the passed down packet to avoid running off the end of a mbuf chain.
Reviewed by: fenner
|
#
14546 |
|
11-Mar-1996 |
dg |
Move or add #include <queue.h> in preparation for upcoming struct socket changes.
|
#
14230 |
|
24-Feb-1996 |
phk |
The new firewall functionality: Filter on the direction (in/out). Filter on fragment/not fragment.
|
#
14209 |
|
23-Feb-1996 |
phk |
Big sweep over the IPFIREWALL and IPACCT code.
Close the ip-fragment hole. Waste less memory. Rewrite to contemporary more readable style. Kill separate IPACCT facility, use "accept" rules in IPFIREWALL. Filter incoming >and< outgoing packets. Replace "policy" by sticky "deny all" rule. Rules have numbers used for ordering and deletion. Remove "rerorder" code entirely. Count packet & bytecount matches for rules.
Code in -current & -stable is now the same.
|
#
14195 |
|
22-Feb-1996 |
peter |
Make the default behavior of local port assignment match traditional systems (my last change did not mix well with some firewall configurations). As much as I dislike firewalls, this is one thing I I was not prepared to break by default.. :-)
Allow the user to nominate one of three ranges of port numbers as candidates for selecting a local address to replace a zero port number. The ranges are selected via a setsockopt(s, IPPROTO_IP, IP_PORTRANGE, &arg) call. The three ranges are: default, high (to bypass firewalls) and low (to get a port below 1024).
The default and high port ranges are sysctl settable under sysctl net.inet.ip.portrange.*
This code also fixes a potential deadlock if the system accidently ran out of local port addresses. It'd drop into an infinite while loop.
The secure port selection (for root) should reduce overheads and increase reliability of rlogin/rlogind/rsh/rshd if they are modified to take advantage of it.
Partly suggested by: pst Reviewed by: wollman
|
#
12934 |
|
19-Dec-1995 |
wollman |
Added a comment about why trying to make a one-behind cache for the route in ip_output() is a bad idea.
|
#
12635 |
|
05-Dec-1995 |
wollman |
Path MTU Discovery is now standard.
|
#
12296 |
|
14-Nov-1995 |
phk |
New style sysctl & staticize alot of stuff.
|
#
11537 |
|
16-Oct-1995 |
wollman |
The ability to administratively change the MTU of an interface presents a few new wrinkles for MTU discovery which tcp_output() had better be prepared to handle. ip_output() is also modified to do something helpful in this case, since it has already calculated the information we need.
|
#
9728 |
|
26-Jul-1995 |
wollman |
Fix test for determining when RSVP is inactive in a router. (In this case, multicast options are not passed to ip_mforward().) The previous version had a wrong test, thus causing RSVP mrouters to forward RSVP messages in violation of the spec.
|
#
9386 |
|
02-Jul-1995 |
joerg |
Slightly modify my previous change to return EINVAL instead of EFAULT.
Submitted by: Peter Wemm
|
#
9383 |
|
01-Jul-1995 |
joerg |
I saw a very low-key commit message on the netbsd mailing lists and figured out what the problem was.. Anyway, I rate it as "highly serious".
Submitted by: peter@haywire.DIALix.COM (Peter Wemm)
|
#
9209 |
|
13-Jun-1995 |
wollman |
Kernel side of 3.5 multicast routing code, based on work by Bill Fenner and other work done here. The LKM support is probably broken, but it still compiles and will be fixed later.
|
#
8876 |
|
30-May-1995 |
rgrimes |
Remove trailing whitespace.
|
#
8384 |
|
09-May-1995 |
dg |
Replaced some bcopy()'s with memcpy()'s so that gcc while inline/optimize.
|
#
8090 |
|
26-Apr-1995 |
pst |
Cleanup loopback interface support. Reviewed by: wollman
|
#
7684 |
|
09-Apr-1995 |
dg |
Implemented PCB hashing. Includes new functions in_pcbinshash, in_pcbrehash, and in_pcblookuphash.
|
#
7191 |
|
20-Mar-1995 |
wollman |
This should be splimp() rather than splnet() since ifaddrs might go away as a result of link-layer processing.
|
#
7190 |
|
20-Mar-1995 |
wollman |
Fix race conditions involved in setting IP multicast options. This should fix Dennis Fortin's problem for good, if I've got it figured out right.
(The problem was that a `struct ifaddr' could get deleted out from under the current requester, thus leaving him with an invalid interface pointer and causing even more bogus accesses.)
|
#
7090 |
|
16-Mar-1995 |
bde |
Add and move declarations to fix all of the warnings from `gcc -Wimplicit' (except in netccitt, netiso and netns) and most of the warnings from `gcc -Wnested-externs'. Fix all the bugs found. There were no serious ones.
|
#
5543 |
|
12-Jan-1995 |
ugen |
Actual firewall change. 1) Firewall is not subdivided on forwarding / blocking chains anymore.Actually only one chain left-it was the blocking one. 2) LKM support.ip_fwdef.c is function pointers definition and goes into kernel along with all INET stuff.
|
#
5105 |
|
13-Dec-1994 |
wollman |
Call rtalloc_ign() so that protocol cloning will not occur at the IP layer.
|
#
5085 |
|
12-Dec-1994 |
ugen |
Add match by interface from which packet arrived (via) Handle right fragmented packets. Remove checking option from kernel..
|
#
4523 |
|
16-Nov-1994 |
jkh |
Ugen J.S.Antsilevich's latest, happiest, IP firewall code. Poul: Please take this into BETA. It's non-intrusive, and a rather substantial improvement over what was there before.
|
#
2754 |
|
14-Sep-1994 |
wollman |
Shuffle some functions and variables around to make it possible for multicast routing to be implemented as an LKM. (There's still a bit of work to do in this area.)
|
#
2628 |
|
09-Sep-1994 |
wollman |
Disable IPMULTICAST_VIF socket option when MROUTING is not defined, since it doesn'tmake any sense for non-routers. CVS:
|
#
2531 |
|
06-Sep-1994 |
wollman |
Initial get-the-easy-case-working upgrade of the multicast code to something more recent than the ancient 1.2 release contained in 4.4. This code has the following advantages as compared to previous versions (culled from the README file for the SunOS release):
- True multicast delivery - Configurable rate-limiting of forwarded multicast traffic on each physical interface or tunnel, using a token-bucket limiter. - Simplistic classification of packets for prioritized dropping. - Administrative scoping of multicast address ranges. - Faster detection of hosts leaving groups. - Support for multicast traceroute (code not yet available). - Support for RSVP, the Resource Reservation Protocol.
What still needs to be done:
- The multicast forwarder needs testing. - The multicast routing daemon needs to be ported. - Network interface drivers need to have the `#ifdef MULTICAST' goop ripped out of them. - The IGMP code should probably be bogon-tested.
Some notes about the porting process:
In some cases, the Berkeley people decided to incorporate functionality from later releases of the multicast code, but then had to do things differently. As a result, if you look at Deering's patches, and then look at our code, it is not always obvious whether the patch even applies. Let the reader beware.
I ran ip_mroute.c through several passes of `unifdef' to get rid of useless grot, and to permanently enable the RSVP support, which we will include as standard.
Ported by: Garrett Wollman Submitted by: Steve Deering and Ajit Thyagarajan (among others)
|
#
2112 |
|
18-Aug-1994 |
wollman |
Fix up some sloppy coding practices:
- Delete redundant declarations. - Add -Wredundant-declarations to Makefile.i386 so they don't come back. - Delete sloppy COMMON-style declarations of uninitialized data in header files. - Add a few prototypes. - Clean up warnings resulting from the above.
NB: ioconf.c will still generate a redundant-declaration warning, which is unavoidable unless somebody volunteers to make `config' smarter.
|
#
1817 |
|
02-Aug-1994 |
dg |
Added $Id$
|
#
1813 |
|
01-Aug-1994 |
dg |
fixed bug where large amounts of unidirectional UDP traffic would fill the interface output queue and further udp packets would be fragmented and only partially sent - keeping the output queue full and jamming the network, but not actually getting any real work done (because you can't send just 'part' of a udp packet - if you fragment it, you must send the whole thing). The fix involves adding a check to make sure that the output queue has sufficient space for all of the fragments.
|
#
1549 |
|
25-May-1994 |
rgrimes |
The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch.
Reviewed by: Rodney W. Grimes Submitted by: John Dyson and David Greenman
|
#
1542 |
|
24-May-1994 |
rgrimes |
This commit was generated by cvs2svn to compensate for changes in r1541, which included commits to RCS files with non-trunk default branches.
|
#
1541 |
|
24-May-1994 |
rgrimes |
BSD 4.4 Lite Kernel Sources
|