#
341260 |
|
29-Nov-2018 |
markj |
MFC r340483 (by jtl): Add some additional length checks to the IPv4 fragmentation code.
|
#
331722 |
|
29-Mar-2018 |
eadler |
Revert r330897:
This was intended to be a non-functional change. It wasn't. The commit message was thus wrong. In addition it broke arm, and merged crypto related code.
Revert with prejudice.
This revert skips files touched in r316370 since that commit was since MFCed. This revert also skips files that require $FreeBSD$ property changes.
Thank you to those who helped me get out of this mess including but not limited to gonzo, kevans, rgrimes.
Requested by: gjb (re)
|
#
330897 |
|
14-Mar-2018 |
eadler |
Partial merge of the SPDX changes
These changes are incomplete but are making it difficult to determine what other changes can/should be merged.
No objections from: pfg
|
#
302408 |
|
07-Jul-2016 |
gjb |
Copy head@r302406 to stable/11 as part of the 11.0-RELEASE cycle. Prune svn:mergeinfo from the new branch, as nothing has been merged here.
Additional commits post-branch will follow.
Approved by: re (implicit) Sponsored by: The FreeBSD Foundation |
#
301114 |
|
01-Jun-2016 |
bz |
The pr_destroy field does not allow us to run the teardown code in a specific order. VNET_SYSUNINITs however are doing exactly that. Thus remove the VIMAGE conditional field from the domain(9) protosw structure and replace it with VNET_SYSUNINITs. This also allows us to change some order and to make the teardown functions file local static. Also convert divert(4) as it uses the same mechanism ip(4) and ip6(4) use internally.
Slightly reshuffle the SI_SUB_* fields in kernel.h and add a new ones, e.g., for pfil consumers (firewalls), partially for this commit and for others to come.
Reviewed by: gnn, tuexen (sctp), jhb (kernel.h) Obtained from: projects/vnet MFC after: 2 weeks X-MFC: do not remove pr_destroy Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D6652
|
#
292015 |
|
09-Dec-2015 |
melifaro |
Make in_arpinput(), inp_lookup_mcast_ifp(), icmp_reflect(), ip_dooptions(), icmp6_redirect_input(), in6_lltable_rtcheck(), in6p_lookup_mcast_ifp() and in6_selecthlim() use new routing api.
Eliminate now-unused ip_rtaddr(). Fix lookup key fib6_lookup_nh_basic() which was lost diring merge. Make fib6_lookup_nh_basic() and fib6_lookup_nh_extended() always return IPv6 destination address with embedded scope. Currently rw_gateway has it scope embedded, do the same for non-gatewayed destinations.
Sponsored by: Yandex LLC
|
#
285677 |
|
18-Jul-2015 |
luigi |
fix a typo in a comment
|
#
280971 |
|
01-Apr-2015 |
glebius |
o Use new function ip_fillid() in all places throughout the kernel, where we want to create a new IP datagram. o Add support for RFC6864, which allows to set IP ID for atomic IP datagrams to any value, to improve performance. The behaviour is controlled by net.inet.ip.rfc6864 sysctl knob, which is enabled by default. o In case if we generate IP ID, use counter(9) to improve performance. o Gather all code related to IP ID into ip_id.c.
Differential Revision: https://reviews.freebsd.org/D2177 Reviewed by: adrian, cy, rpaulo Tested by: Emeric POUPON <emeric.poupon stormshield.eu> Sponsored by: Netflix Sponsored by: Nginx, Inc. Relnotes: yes
|
#
280759 |
|
27-Mar-2015 |
fabient |
On multi CPU systems, we may emit successive packets with the same id. Fix the race by using an atomic operation.
Differential Revision: https://reviews.freebsd.org/D2141 Obtained from: emeric.poupon@stormshield.eu MFC after: 1 week Sponsored by: Stormshield
|
#
271291 |
|
08-Sep-2014 |
adrian |
Add a flag to ip_output() - IP_NODEFAULTFLOWID - which prevents it from overriding an existing flowid/flowtype field in the outbound mbuf with the inp_flowid/inp_flowtype details.
The upcoming RSS UDP support calculates a valid RSS value for outbound mbufs and since it may change per send, it doesn't cache it in the inpcb. So overriding it here would be wrong.
Differential Revision: https://reviews.freebsd.org/D527 Reviewed by: grehan
|
#
270008 |
|
15-Aug-2014 |
kevlo |
Change pr_output's prototype to avoid the need for explicit casts. This is a follow up to r269699.
Phabric: D564 Reviewed by: jhb
|
#
269699 |
|
08-Aug-2014 |
kevlo |
Merge 'struct ip6protosw' and 'struct protosw' into one. Now we have only one protocol switch structure that is shared between ipv4 and ipv6.
Phabric: D476 Reviewed by: jhb
|
#
263091 |
|
12-Mar-2014 |
glebius |
Since both netinet/ and netinet6/ call into netipsec/ and netpfil/, the protocol specific mbuf flags are shared between them.
- Move all M_FOO definitions into a single place: netinet/in6.h, to avoid future clashes. - Resolve clash between M_DECRYPTED and M_SKIP_FIREWALL which resulted in a failure of operation of IPSEC and packet filters.
Thanks to Nicolas and Georgios for all the hard work on bisecting, testing and finally finding the root of the problem.
PR: kern/186755 PR: kern/185876 In collaboration with: Georgios Amanakis <gamanakis gmail.com> In collaboration with: Nicolas DEFFAYET <nicolas-ml deffayet.com> Sponsored by: Nginx, Inc.
|
#
254519 |
|
19-Aug-2013 |
andre |
Move the global M_SKIP_FIREWALL mbuf flags to a protocol layer specific flag instead. The flag is only used within the IP and IPv6 layer 3 protocols.
Because some firewall packages treat IPv4 and IPv6 packets the same the flag should have the same value for both.
Discussed with: trociny, glebius
|
#
254518 |
|
19-Aug-2013 |
andre |
Move ip_reassemble()'s use of the global M_FRAG mbuf flag to a protocol layer specific flag instead. The flag is only relevant while the packet stays in the IP reassembly queue.
Discussed with: trociny, glebius
|
#
253083 |
|
09-Jul-2013 |
ae |
Use new macros to implement ipstat and tcpstat using PCPU counters. Change interface of kread_counters() similar ot kread() in the netstat(1).
|
#
249411 |
|
12-Apr-2013 |
ae |
Reflect removing of the counter_u64_subtract() function in the macro.
|
#
249276 |
|
08-Apr-2013 |
glebius |
Merge from projects/counters: TCP/IP stats.
Convert 'struct ipstat' and 'struct tcpstat' to counter(9).
This speeds up IP forwarding at extreme packet rates, and makes accounting more precise.
Sponsored by: Nginx, Inc.
|
#
242463 |
|
01-Nov-2012 |
ae |
Remove the recently added sysctl variable net.pfil.forward. Instead, add protocol specific mbuf flags M_IP_NEXTHOP and M_IP6_NEXTHOP. Use them to indicate that the mbuf's chain contains the PACKET_TAG_IPFORWARD tag. And do a tag lookup only when this flag is set.
Suggested by: andre
|
#
242161 |
|
26-Oct-2012 |
glebius |
o Remove last argument to ip_fragment(), and obtain all needed information on checksums directly from mbuf flags. This simplifies code. o Clear CSUM_IP from the mbuf in ip_fragment() if we did checksums in hardware. Some driver may not announce CSUM_IP in theur if_hwassist, although try to do checksums if CSUM_IP set on mbuf. Example is em(4). o While here, consistently use CSUM_IP instead of its alias CSUM_DELAY_IP. After this change CSUM_DELAY_IP vanishes from the stack.
Submitted by: Sebastian Kuzminsky <seb lineratesystems.com>
|
#
241406 |
|
10-Oct-2012 |
melifaro |
Do not check if found IPv4 rte is dynamic if net.inet.icmp.drop_redirect is enabled. This eliminates one mtx_lock() per each routing lookup thus improving performance in several cases (routing to directly connected interface or routing to default gateway).
Icmp redirects should not be used to provide routing direction nowadays, even for end hosts. Routers should not use them too (and this is explicitly restricted in IPv6, see RFC 4861, clause 8.2).
Current commit changes rnh_machaddr function to 'stock' rn_match (and back) for every AF_INET routing table in given VNET instance on drop_redirect sysctl change.
This change is part of bigger patch eliminating rte locking.
Sponsored by: Yandex LLC MFC after: 2 weeks
|
#
240099 |
|
04-Sep-2012 |
melifaro |
Introduce new link-layer PFIL hook V_link_pfil_hook. Merge ether_ipfw_chk() and part of bridge_pfil() into unified ipfw_check_frame() function called by PFIL. This change was suggested by rwatson? @ DevSummit.
Remove ipfw headers from ether/bridge code since they are unneeded now.
Note this thange introduce some (temporary) performance penalty since PFIL read lock has to be acquired for every link-level packet.
MFC after: 3 weeks
|
#
228969 |
|
29-Dec-2011 |
jhb |
Defer the work of freeing IPv4 multicast options from a socket to an asychronous task. This avoids tearing down multicast state including sending IGMP leave messages and reprogramming MAC filters while holding the per-protocol global pcbinfo lock that is used in the receive path of packet processing.
Reviewed by: rwatson MFC after: 1 month
|
#
223666 |
|
29-Jun-2011 |
ae |
Add new rule actions "call" and "return" to ipfw. They make possible to organize subroutines with rules.
The "call" action saves the current rule number in the internal stack and rules processing continues from the first rule with specified number (similar to skipto action). If later a rule with "return" action is encountered, the processing returns to the first rule with number of "call" rule saved in the stack plus one or higher.
Submitted by: Vadim Goncharov Discussed by: ipfw@, luigi@
|
#
220879 |
|
20-Apr-2011 |
bz |
MFp4 CH=191470:
Move the ipport_tick_callout and related functions from ip_input.c to in_pcb.c. The random source port allocation code has been merged and is now local to in_pcb.c only. Use a SYSINIT to get the callout started and no longer depend on initialization from the inet code, which would not work in an IPv6 only setup.
Reviewed by: gnn Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems MFC after: 4 days
|
#
212155 |
|
02-Sep-2010 |
bz |
MFp4 CH=183052 183053 183258:
In protosw we define pr_protocol as short, while on the wire it is an uint8_t. That way we can have "internal" protocols like DIVERT, SEND or gaps for modules (PROTO_SPACER). Switch ipproto_{un,}register to accept a short protocol number(*) and do an upfront check for valid boundries. With this we also consistently report EPROTONOSUPPORT for out of bounds protocols, as we did for proto == 0. This allows a caller to not error for this case, which is especially important if we want to automatically call these from domain handling.
(*) the functions have been without any in-tree consumer since the initial introducation, so this is considered save.
Implement ip6proto_{un,}register() similarly to their legacy IP counter parts to allow modules to hook up dynamically.
Reviewed by: philip, will MFC after: 1 week
|
#
207369 |
|
29-Apr-2010 |
bz |
MFP4: @176978-176982, 176984, 176990-176994, 177441
"Whitspace" churn after the VIMAGE/VNET whirls.
Remove the need for some "init" functions within the network stack, like pim6_init(), icmp_init() or significantly shorten others like ip6_init() and nd6_init(), using static initialization again where possible and formerly missed.
Move (most) variables back to the place they used to be before the container structs and VIMAGE_GLOABLS (before r185088) and try to reduce the diff to stable/7 and earlier as good as possible, to help out-of-tree consumers to update from 6.x or 7.x to 8 or 9.
This also removes some header file pollution for putatively static global variables.
Revert VIMAGE specific changes in ipfilter::ip_auth.c, that are no longer needed.
Reviewed by: jhb Discussed with: rwatson Sponsored by: The FreeBSD Foundation Sponsored by: CK Software GmbH MFC after: 6 days
|
#
204140 |
|
20-Feb-2010 |
bz |
Split up ip_drain() into an outer lock and iterator part and a "locked" version that will only handle a single network stack instance. The latter is called directly from ip_destroy().
Hook up an ip_destroy() function to release resources from the legacy IP network layer upon virtual network stack teardown.
Sponsored by: ISPsystem Reviewed by: rwatson MFC After: 5 days
|
#
201735 |
|
07-Jan-2010 |
luigi |
Following up on a request from Ermal Luci to make ip_divert work as a client of pf(4), make ip_divert not depend on ipfw.
This is achieved by moving to ip_var.h the struct ipfw_rule_ref (which is part of the mtag for all reinjected packets) and other declarations of global variables, and moving to raw_ip.c global variables for filter and divert hooks.
Note that names and locations could be made more generic (ipfw_rule_ref is really a generic reference robust to reconfigurations; the packet filter is not necessarily ipfw; filters and their clients are not necessarily limited to ipv4), but _right now_ most of this stuff works on ipfw and ipv4, so i don't feel like doing a gratuitous renaming, at least for the time being.
|
#
197952 |
|
11-Oct-2009 |
julian |
Virtualize the pfil hooks so that different jails may chose different packet filters. ALso allows ipfw to be enabled on on ejail and disabled on another. In 8.0 it's a global setting.
Sitting aroung in tree waiting to commit for: 2 months MFC after: 2 months
|
#
196039 |
|
02-Aug-2009 |
rwatson |
Many network stack subsystems use a single global data structure to hold all pertinent statatistics for the subsystem. These structures are sometimes "borrowed" by kernel modules that require a place to store statistics for similar events.
Add KPI accessor functions for statistics structures referenced by kernel modules so that they no longer encode certain specifics of how the data structures are named and stored. This change is intended to make it easier to move to per-CPU network stats following 8.0-RELEASE.
The following modules are affected by this change:
if_bridge if_cxgb if_gif ip_mroute ipdivert pf
In practice, most of these statistics consumers should, in fact, maintain their own statistics data structures rather than borrowing structures from the base network stack. However, that change is too agressive for this point in the release cycle.
Reviewed by: bz Approved by: re (kib)
|
#
195727 |
|
16-Jul-2009 |
rwatson |
Remove unused VNET_SET() and related macros; only VNET_GET() is ever actually used. Rename VNET_GET() to VNET() to shorten variable references.
Discussed with: bz, julian Reviewed by: bz Approved by: re (kensmith, kib)
|
#
195699 |
|
14-Jul-2009 |
rwatson |
Build on Jeff Roberson's linker-set based dynamic per-CPU allocator (DPCPU), as suggested by Peter Wemm, and implement a new per-virtual network stack memory allocator. Modify vnet to use the allocator instead of monolithic global container structures (vinet, ...). This change solves many binary compatibility problems associated with VIMAGE, and restores ELF symbols for virtualized global variables.
Each virtualized global variable exists as a "reference copy", and also once per virtual network stack. Virtualized global variables are tagged at compile-time, placing the in a special linker set, which is loaded into a contiguous region of kernel memory. Virtualized global variables in the base kernel are linked as normal, but those in modules are copied and relocated to a reserved portion of the kernel's vnet region with the help of a the kernel linker.
Virtualized global variables exist in per-vnet memory set up when the network stack instance is created, and are initialized statically from the reference copy. Run-time access occurs via an accessor macro, which converts from the current vnet and requested symbol to a per-vnet address. When "options VIMAGE" is not compiled into the kernel, normal global ELF symbols will be used instead and indirection is avoided.
This change restores static initialization for network stack global variables, restores support for non-global symbols and types, eliminates the need for many subsystem constructors, eliminates large per-subsystem structures that caused many binary compatibility issues both for monitoring applications (netstat) and kernel modules, removes the per-function INIT_VNET_*() macros throughout the stack, eliminates the need for vnet_symmap ksym(2) munging, and eliminates duplicate definitions of virtualized globals under VIMAGE_GLOBALS.
Bump __FreeBSD_version and update UPDATING.
Portions submitted by: bz Reviewed by: bz, zec Discussed with: gnn, jamie, jeff, jhb, julian, sam Suggested by: peter Approved by: re (kensmith)
|
#
194357 |
|
17-Jun-2009 |
bz |
Add the explicit include of vimage.h to another five .c files still missing it.
Remove the "hidden" kernel only include of vimage.h from ip_var.h added with the very first Vimage commit r181803 to avoid further kernel poisoning.
|
#
193731 |
|
08-Jun-2009 |
zec |
Introduce an infrastructure for dismantling vnet instances.
Vnet modules and protocol domains may now register destructor functions to clean up and release per-module state. The destructor mechanisms can be triggered by invoking "vimage -d", or a future equivalent command which will be provided via the new jail framework.
While this patch introduces numerous placeholder destructor functions, many of those are currently incomplete, thus leaking memory or (even worse) failing to stop all running timers. Many of such issues are already known and will be incrementaly fixed over the next weeks in smaller incremental commits.
Apart from introducing new fields in structs ifnet, domain, protosw and vnet_net, which requires the kernel and modules to be rebuilt, this change should have no impact on nooptions VIMAGE builds, since vnet destructors can only be called in VIMAGE kernels. Moreover, destructor functions should be in general compiled in only in options VIMAGE builds, except for kernel modules which can be safely kldunloaded at run time.
Bump __FreeBSD_version to 800097. Reviewed by: bz, julian Approved by: rwatson, kib (re), julian (mentor)
|
#
193502 |
|
05-Jun-2009 |
luigi |
More cleanup in preparation of ipfw relocation (no actual code change):
+ move ipfw and dummynet hooks declarations to raw_ip.c (definitions in ip_var.h) same as for most other global variables. This removes some dependencies from ip_input.c;
+ remove the IPFW_LOADED macro, just test ip_fw_chk_ptr directly;
+ remove the DUMMYNET_LOADED macro, just test ip_dn_io_ptr directly;
+ move ip_dn_ruledel_ptr to ip_fw2.c which is the only file using it;
To be merged together with rev 193497
MFC after: 5 days
|
#
190951 |
|
11-Apr-2009 |
rwatson |
Update stats in struct ipstat using four new macros, IPSTAT_ADD(), IPSTAT_INC(), IPSTAT_SUB(), and IPSTAT_DEC(), rather than directly manipulating the fields across the kernel. This will make it easier to change the implementation of these statistics, such as using per-CPU versions of the data structures.
MFC after: 3 days
|
#
189592 |
|
09-Mar-2009 |
bms |
Merge IGMPv3 and Source-Specific Multicast (SSM) to the FreeBSD IPv4 stack.
Diffs are minimized against p4. PCS has been used for some protocol verification, more widespread testing of recorded sources in Group-and-Source queries is needed. sizeof(struct igmpstat) has changed.
__FreeBSD_version is bumped to 800070.
|
#
185937 |
|
11-Dec-2008 |
bz |
Put a global variables, which were virtualized but formerly missed under VIMAGE_GLOBAL.
Start putting the extern declarations of the virtualized globals under VIMAGE_GLOBAL as the globals themsevles are already. This will help by the time when we are going to remove the globals entirely.
While there garbage collect a few dead externs from ip6_var.h.
Sponsored by: The FreeBSD Foundation
|
#
185895 |
|
10-Dec-2008 |
zec |
Conditionally compile out V_ globals while instantiating the appropriate container structures, depending on VIMAGE_GLOBALS compile time option.
Make VIMAGE_GLOBALS a new compile-time option, which by default will not be defined, resulting in instatiations of global variables selected for V_irtualization (enclosed in #ifdef VIMAGE_GLOBALS blocks) to be effectively compiled out. Instantiate new global container structures to hold V_irtualized variables: vnet_net_0, vnet_inet_0, vnet_inet6_0, vnet_ipsec_0, vnet_netgraph_0, and vnet_gif_0.
Update the VSYM() macro so that depending on VIMAGE_GLOBALS the V_ macros resolve either to the original globals, or to fields inside container structures, i.e. effectively
#ifdef VIMAGE_GLOBALS #define V_rt_tables rt_tables #else #define V_rt_tables vnet_net_0._rt_tables #endif
Update SYSCTL_V_*() macros to operate either on globals or on fields inside container structs.
Extend the internal kldsym() lookups with the ability to resolve selected fields inside the virtualization container structs. This applies only to the fields which are explicitly registered for kldsym() visibility via VNET_MOD_DECLARE() and vnet_mod_register(), currently this is done only in sys/net/if.c.
Fix a few broken instances of MODULE_GLOBAL() macro use in SCTP code, and modify the MODULE_GLOBAL() macro to resolve to V_ macros, which in turn result in proper code being generated depending on VIMAGE_GLOBALS.
De-virtualize local static variables in sys/contrib/pf/net/pf_subr.c which were prematurely V_irtualized by automated V_ prepending scripts during earlier merging steps. PF virtualization will be done separately, most probably after next PF import.
Convert a few variable initializations at instantiation to initialization in init functions, most notably in ipfw. Also convert TUNABLE_INT() initializers for V_ variables to TUNABLE_FETCH_INT() in initializer functions.
Discussed at: devsummit Strassburg Reviewed by: bz, julian Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation
|
#
185419 |
|
28-Nov-2008 |
zec |
Unhide declarations of network stack virtualization structs from underneath #ifdef VIMAGE blocks.
This change introduces some churn in #include ordering and nesting throughout the network stack and drivers but is not expected to cause any additional issues.
In the next step this will allow us to instantiate the virtualization container structures and switch from using global variables to their "containerized" counterparts.
Reviewed by: bz, julian Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation
|
#
182146 |
|
25-Aug-2008 |
julian |
Another V_ forgotten
|
#
181803 |
|
17-Aug-2008 |
bz |
Commit step 1 of the vimage project, (network stack) virtualization work done by Marko Zec (zec@).
This is the first in a series of commits over the course of the next few weeks.
Mark all uses of global variables to be virtualized with a V_ prefix. Use macros to map them back to their global names for now, so this is a NOP change only.
We hope to have caught at least 85-90% of what is needed so we do not invalidate a lot of outstanding patches again.
Obtained from: //depot/projects/vimage-commit2/... Reviewed by: brooks, des, ed, mav, julian, jamie, kris, rwatson, zec, ... (various people I forgot, different versions) md5 (with a bit of help) Sponsored by: NLnet Foundation, The FreeBSD Foundation X-MFC after: never V_Commit_Message_Reviewed_By: more people than the patch
|
#
178888 |
|
09-May-2008 |
julian |
Add code to allow the system to handle multiple routing tables. This particular implementation is designed to be fully backwards compatible and to be MFC-able to 7.x (and 6.x)
Currently the only protocol that can make use of the multiple tables is IPv4 Similar functionality exists in OpenBSD and Linux.
From my notes:
-----
One thing where FreeBSD has been falling behind, and which by chance I have some time to work on is "policy based routing", which allows different packet streams to be routed by more than just the destination address.
Constraints: ------------
I want to make some form of this available in the 6.x tree (and by extension 7.x) , but FreeBSD in general needs it so I might as well do it in -current and back port the portions I need.
One of the ways that this can be done is to have the ability to instantiate multiple kernel routing tables (which I will now refer to as "Forwarding Information Bases" or "FIBs" for political correctness reasons). Which FIB a particular packet uses to make the next hop decision can be decided by a number of mechanisms. The policies these mechanisms implement are the "Policies" referred to in "Policy based routing".
One of the constraints I have if I try to back port this work to 6.x is that it must be implemented as a EXTENSION to the existing ABIs in 6.x so that third party applications do not need to be recompiled in timespan of the branch.
This first version will not have some of the bells and whistles that will come with later versions. It will, for example, be limited to 16 tables in the first commit. Implementation method, Compatible version. (part 1) ------------------------------- For this reason I have implemented a "sufficient subset" of a multiple routing table solution in Perforce, and back-ported it to 6.x. (also in Perforce though not always caught up with what I have done in -current/P4). The subset allows a number of FIBs to be defined at compile time (8 is sufficient for my purposes in 6.x) and implements the changes needed to allow IPV4 to use them. I have not done the changes for ipv6 simply because I do not need it, and I do not have enough knowledge of ipv6 (e.g. neighbor discovery) needed to do it.
Other protocol families are left untouched and should there be users with proprietary protocol families, they should continue to work and be oblivious to the existence of the extra FIBs.
To understand how this is done, one must know that the current FIB code starts everything off with a single dimensional array of pointers to FIB head structures (One per protocol family), each of which in turn points to the trie of routes available to that family.
The basic change in the ABI compatible version of the change is to extent that array to be a 2 dimensional array, so that instead of protocol family X looking at rt_tables[X] for the table it needs, it looks at rt_tables[Y][X] when for all protocol families except ipv4 Y is always 0. Code that is unaware of the change always just sees the first row of the table, which of course looks just like the one dimensional array that existed before.
The entry points rtrequest(), rtalloc(), rtalloc1(), rtalloc_ign() are all maintained, but refer only to the first row of the array, so that existing callers in proprietary protocols can continue to do the "right thing". Some new entry points are added, for the exclusive use of ipv4 code called in_rtrequest(), in_rtalloc(), in_rtalloc1() and in_rtalloc_ign(), which have an extra argument which refers the code to the correct row.
In addition, there are some new entry points (currently called rtalloc_fib() and friends) that check the Address family being looked up and call either rtalloc() (and friends) if the protocol is not IPv4 forcing the action to row 0 or to the appropriate row if it IS IPv4 (and that info is available). These are for calling from code that is not specific to any particular protocol. The way these are implemented would change in the non ABI preserving code to be added later.
One feature of the first version of the code is that for ipv4, the interface routes show up automatically on all the FIBs, so that no matter what FIB you select you always have the basic direct attached hosts available to you. (rtinit() does this automatically).
You CAN delete an interface route from one FIB should you want to but by default it's there. ARP information is also available in each FIB. It's assumed that the same machine would have the same MAC address, regardless of which FIB you are using to get to it.
This brings us as to how the correct FIB is selected for an outgoing IPV4 packet.
Firstly, all packets have a FIB associated with them. if nothing has been done to change it, it will be FIB 0. The FIB is changed in the following ways.
Packets fall into one of a number of classes.
1/ locally generated packets, coming from a socket/PCB. Such packets select a FIB from a number associated with the socket/PCB. This in turn is inherited from the process, but can be changed by a socket option. The process in turn inherits it on fork. I have written a utility call setfib that acts a bit like nice..
setfib -3 ping target.example.com # will use fib 3 for ping.
It is an obvious extension to make it a property of a jail but I have not done so. It can be achieved by combining the setfib and jail commands.
2/ packets received on an interface for forwarding. By default these packets would use table 0, (or possibly a number settable in a sysctl(not yet)). but prior to routing the firewall can inspect them (see below). (possibly in the future you may be able to associate a FIB with packets received on an interface.. An ifconfig arg, but not yet.)
3/ packets inspected by a packet classifier, which can arbitrarily associate a fib with it on a packet by packet basis. A fib assigned to a packet by a packet classifier (such as ipfw) would over-ride a fib associated by a more default source. (such as cases 1 or 2).
4/ a tcp listen socket associated with a fib will generate accept sockets that are associated with that same fib.
5/ Packets generated in response to some other packet (e.g. reset or icmp packets). These should use the FIB associated with the packet being reponded to.
6/ Packets generated during encapsulation. gif, tun and other tunnel interfaces will encapsulate using the FIB that was in effect withthe proces that set up the tunnel. thus setfib 1 ifconfig gif0 [tunnel instructions] will set the fib for the tunnel to use to be fib 1.
Routing messages would be associated with their process, and thus select one FIB or another. messages from the kernel would be associated with the fib they refer to and would only be received by a routing socket associated with that fib. (not yet implemented)
In addition Netstat has been edited to be able to cope with the fact that the array is now 2 dimensional. (It looks in system memory using libkvm (!)). Old versions of netstat see only the first FIB.
In addition two sysctls are added to give: a) the number of FIBs compiled in (active) b) the default FIB of the calling process.
Early testing experience: -------------------------
Basically our (IronPort's) appliance does this functionality already using ipfw fwd but that method has some drawbacks.
For example, It can't fully simulate a routing table because it can't influence the socket's choice of local address when a connect() is done.
Testing during the generating of these changes has been remarkably smooth so far. Multiple tables have co-existed with no notable side effects, and packets have been routes accordingly.
ipfw has grown 2 new keywords:
setfib N ip from anay to any count ip from any to any fib N
In pf there seems to be a requirement to be able to give symbolic names to the fibs but I do not have that capacity. I am not sure if it is required.
SCTP has interestingly enough built in support for this, called VRFs in Cisco parlance. it will be interesting to see how that handles it when it suddenly actually does something.
Where to next: --------------------
After committing the ABI compatible version and MFCing it, I'd like to proceed in a forward direction in -current. this will result in some roto-tilling in the routing code.
Firstly: the current code's idea of having a separate tree per protocol family, all of the same format, and pointed to by the 1 dimensional array is a bit silly. Especially when one considers that there is code that makes assumptions about every protocol having the same internal structures there. Some protocols don't WANT that sort of structure. (for example the whole idea of a netmask is foreign to appletalk). This needs to be made opaque to the external code.
My suggested first change is to add routing method pointers to the 'domain' structure, along with information pointing the data. instead of having an array of pointers to uniform structures, there would be an array pointing to the 'domain' structures for each protocol address domain (protocol family), and the methods this reached would be called. The methods would have an argument that gives FIB number, but the protocol would be free to ignore it.
When the ABI can be changed it raises the possibilty of the addition of a fib entry into the "struct route". Currently, the structure contains the sockaddr of the desination, and the resulting fib entry. To make this work fully, one could add a fib number so that given an address and a fib, one can find the third element, the fib entry.
Interaction with the ARP layer/ LL layer would need to be revisited as well. Qing Li has been working on this already.
This work was sponsored by Ironport Systems/Cisco
Reviewed by: several including rwatson, bz and mlair (parts each) Obtained from: Ironport systems/Cisco
|
#
170613 |
|
12-Jun-2007 |
bms |
Import rewrite of IPv4 socket multicast layer to support source-specific and protocol-independent host mode multicast. The code is written to accomodate IPv6, IGMPv3 and MLDv2 with only a little additional work.
This change only pertains to FreeBSD's use as a multicast end-station and does not concern multicast routing; for an IGMPv3/MLDv2 router implementation, consider the XORP project.
The work is based on Wilbert de Graaf's IGMPv3 code drop for FreeBSD 4.6, which is available at: http://www.kloosterhof.com/wilbert/igmpv3.html
Summary * IPv4 multicast socket processing is now moved out of ip_output.c into a new module, in_mcast.c. * The in_mcast.c module implements the IPv4 legacy any-source API in terms of the protocol-independent source-specific API. * Source filters are lazy allocated as the common case does not use them. They are part of per inpcb state and are covered by the inpcb lock. * struct ip_mreqn is now supported to allow applications to specify multicast joins by interface index in the legacy IPv4 any-source API. * In UDP, an incoming multicast datagram only requires that the source port matches the 4-tuple if the socket was already bound by source port. An unbound socket SHOULD be able to receive multicasts sent from an ephemeral source port. * The UDP socket multicast filter mode defaults to exclusive, that is, sources present in the per-socket list will be blocked from delivery. * The RFC 3678 userland functions have been added to libc: setsourcefilter, getsourcefilter, setipv4sourcefilter, getipv4sourcefilter. * Definitions for IGMPv3 are merged but not yet used. * struct sockaddr_storage is now referenced from <netinet/in.h>. It is therefore defined there if not already declared in the same way as for the C99 types. * The RFC 1724 hack (specify 0.0.0.0/8 addresses to IP_MULTICAST_IF which are then interpreted as interface indexes) is now deprecated. * A patch for the Rhyolite.com routed in the FreeBSD base system is available in the -net archives. This only affects individuals running RIPv1 or RIPv2 via point-to-point and/or unnumbered interfaces. * Make IPv6 detach path similar to IPv4's in code flow; functionally same. * Bump __FreeBSD_version to 700048; see UPDATING.
This work was financially supported by another FreeBSD committer.
Obtained from: p4://bms_netdev Submitted by: Wilbert de Graaf (original work) Reviewed by: rwatson (locking), silence from fenner, net@ (but with encouragement)
|
#
168365 |
|
04-Apr-2007 |
andre |
Some local and style(9) cleanups.
|
#
162616 |
|
25-Sep-2006 |
bms |
Forced commit to note this change should be MFCed.
MFC after: 1 week
|
#
158563 |
|
14-May-2006 |
bms |
Fix a long-standing limitation in IPv4 multicast group membership.
By making the imo_membership array a dynamically allocated vector, this minimizes disruption to existing IPv4 multicast code. This change breaks the ABI for the kernel module ip_mroute.ko, and may cause a small amount of churn for folks working on the IGMPv3 merge.
Previously, sockets were subject to a compile-time limitation on the number of IPv4 group memberships, which was hard-coded to 20. The imo_membership relationship, however, is 1:1 with regards to a tuple of multicast group address and interface address. Users who ran routing protocols such as OSPF ran into this limitation on machines with a large system interface tree.
|
#
152608 |
|
19-Nov-2005 |
andre |
Move MAX_IPOPTLEN and struct ipoption back into ip_var.h as userland programs depend on it.
Pointed out by: le Sponsored by: TCP/IP Optimization Fundraise 2005
|
#
152592 |
|
18-Nov-2005 |
andre |
Consolidate all IP Options handling functions into ip_options.[ch] and include ip_options.h into all files making use of IP Options functions.
From ip_input.c rev 1.306: ip_dooptions(struct mbuf *m, int pass) save_rte(m, option, dst) ip_srcroute(m0) ip_stripoptions(m, mopt)
From ip_output.c rev 1.249: ip_insertoptions(m, opt, phlen) ip_optcopy(ip, jp) ip_pcbopts(struct inpcb *inp, int optname, struct mbuf *m)
No functional changes in this commit.
Discussed with: rwatson Sponsored by: TCP/IP Optimization Fundraise 2005
|
#
147744 |
|
02-Jul-2005 |
thompsa |
Check the alignment of the IP header before passing the packet up to the packet filter. This would cause a panic on architectures that require strict alignment such as sparc64 (tier1) and ia64/ppc (tier2).
This adds two new macros that check the alignment, these are compile time dependent on __NO_STRICT_ALIGNMENT which is set for i386 and amd64 where alignment isn't need so the cost is avoided.
IP_HDR_ALIGNED_P() IP6_HDR_ALIGNED_P()
Move bridge_ip_checkbasic()/bridge_ip6_checkbasic() up so that the alignment is checked for ipfw and dummynet too.
PR: ia64/81284 Obtained from: NetBSD Approved by: re (dwhite), mlaier (mentor)
|
#
139823 |
|
06-Jan-2005 |
imp |
/* -> /*- for license, minor formatting changes
|
#
139558 |
|
01-Jan-2005 |
silby |
Port randomization leads to extremely fast port reuse at high connection rates, which is causing problems for some users.
To retain the security advantage of random ports and ensure correct operation for high connection rate users, disable port randomization during periods of high connection rates.
Whenever the connection rate exceeds randomcps (10 by default), randomization will be disabled for randomtime (45 by default) seconds. These thresholds may be tuned via sysctl.
Many thanks to Igor Sysoev, who proved the necessity of this change and tested many preliminary versions of the patch.
MFC After: 20 seconds
|
#
136694 |
|
19-Oct-2004 |
andre |
Support for dynamically loadable and unloadable IP protocols in the ipmux.
With pr_proto_register() it has become possible to dynamically load protocols within the PF_INET domain. However the PF_INET domain has a second important structure called ip_protox[] that is derived from the 'struct protosw inetsw[]' and takes care of the de-multiplexing of the various protocols that ride on top of IP packets.
The functions ipproto_[un]register() allow to dynamically adjust the ip_protox[] array mux in a consistent and easy way. To register a protocol within ip_protox[] the existence of a corresponding and matching protocol definition in inetsw[] is required. The function does not allow to overwrite an already registered protocol. The unregister function simply replaces the mux slot with the default index pointer to IPPROTO_RAW as it was previously.
|
#
135274 |
|
15-Sep-2004 |
andre |
Remove the last two global variables that are used to store packet state while it travels through the IP stack. This wasn't much of a problem because IP source routing is disabled by default but when enabled together with SMP and preemption it would have very likely cross-corrupted the IP options in transit.
The IP source route options of a packet are now stored in a mtag instead of the global variable.
|
#
134383 |
|
27-Aug-2004 |
andre |
Always compile PFIL_HOOKS into the kernel and remove the associated kernel compile option. All FreeBSD packet filters now use the PFIL_HOOKS API and thus it becomes a standard part of the network stack.
If no hooks are connected the entire packet filter hooks section and related activities are jumped over. This removes any performance impact if no hooks are active.
Both OpenBSD and DragonFlyBSD have integrated PFIL_HOOKS permanently as well.
|
#
133920 |
|
17-Aug-2004 |
andre |
Convert ipfw to use PFIL_HOOKS. This is change is transparent to userland and preserves the ipfw ABI. The ipfw core packet inspection and filtering functions have not been changed, only how ipfw is invoked is different.
However there are many changes how ipfw is and its add-on's are handled:
In general ipfw is now called through the PFIL_HOOKS and most associated magic, that was in ip_input() or ip_output() previously, is now done in ipfw_check_[in|out]() in the ipfw PFIL handler.
IPDIVERT is entirely handled within the ipfw PFIL handlers. A packet to be diverted is checked if it is fragmented, if yes, ip_reass() gets in for reassembly. If not, or all fragments arrived and the packet is complete, divert_packet is called directly. For 'tee' no reassembly attempt is made and a copy of the packet is sent to the divert socket unmodified. The original packet continues its way through ip_input/output().
ipfw 'forward' is done via m_tag's. The ipfw PFIL handlers tag the packet with the new destination sockaddr_in. A check if the new destination is a local IP address is made and the m_flags are set appropriately. ip_input() and ip_output() have some more work to do here. For ip_input() the m_flags are checked and a packet for us is directly sent to the 'ours' section for further processing. Destination changes on the input path are only tagged and the 'srcrt' flag to ip_forward() is set to disable destination checks and ICMP replies at this stage. The tag is going to be handled on output. ip_output() again checks for m_flags and the 'ours' tag. If found, the packet will be dropped back to the IP netisr where it is going to be picked up by ip_input() again and the directly sent to the 'ours' section. When only the destination changes, the route's 'dst' is overwritten with the new destination from the forward m_tag. Then it jumps back at the route lookup again and skips the firewall check because it has been marked with M_SKIP_FIREWALL. ipfw 'forward' has to be compiled into the kernel with 'option IPFIREWALL_FORWARD' to enable it.
DUMMYNET is entirely handled within the ipfw PFIL handlers. A packet for a dummynet pipe or queue is directly sent to dummynet_io(). Dummynet will then inject it back into ip_input/ip_output() after it has served its time. Dummynet packets are tagged and will continue from the next rule when they hit the ipfw PFIL handlers again after re-injection.
BRIDGING and IPFW_ETHER are not changed yet and use ipfw_chk() directly as they did before. Later this will be changed to dedicated ETHER PFIL_HOOKS.
More detailed changes to the code:
conf/files Add netinet/ip_fw_pfil.c.
conf/options Add IPFIREWALL_FORWARD option.
modules/ipfw/Makefile Add ip_fw_pfil.c.
net/bridge.c Disable PFIL_HOOKS if ipfw for bridging is active. Bridging ipfw is still directly invoked to handle layer2 headers and packets would get a double ipfw when run through PFIL_HOOKS as well.
netinet/ip_divert.c Removed divert_clone() function. It is no longer used.
netinet/ip_dummynet.[ch] Neither the route 'ro' nor the destination 'dst' need to be stored while in dummynet transit. Structure members and associated macros are removed.
netinet/ip_fastfwd.c Removed all direct ipfw handling code and replace it with the new 'ipfw forward' handling code.
netinet/ip_fw.h Removed 'ro' and 'dst' from struct ip_fw_args.
netinet/ip_fw2.c (Re)moved some global variables and the module handling.
netinet/ip_fw_pfil.c New file containing the ipfw PFIL handlers and module initialization.
netinet/ip_input.c Removed all direct ipfw handling code and replace it with the new 'ipfw forward' handling code. ip_forward() does not longer require the 'next_hop' struct sockaddr_in argument. Disable early checks if 'srcrt' is set.
netinet/ip_output.c Removed all direct ipfw handling code and replace it with the new 'ipfw forward' handling code.
netinet/ip_var.h Add ip_reass() as general function. (Used from ipfw PFIL handlers for IPDIVERT.)
netinet/raw_ip.c Directly check if ipfw and dummynet control pointers are active.
netinet/tcp_input.c Rework the 'ipfw forward' to local code to work with the new way of forward tags.
netinet/tcp_sack.c Remove include 'opt_ipfw.h' which is not needed here.
sys/mbuf.h Remove m_claim_next() macro which was exclusively for ipfw 'forward' and is no longer needed.
Approved by: re (scottl)
|
#
133720 |
|
14-Aug-2004 |
dwmalone |
Get rid of the RANDOM_IP_ID option and make it a sysctl. NetBSD have already done this, so I have styled the patch on their work:
1) introduce a ip_newid() static inline function that checks the sysctl and then decides if it should return a sequential or random IP ID.
2) named the sysctl net.inet.ip.random_id
3) IPv6 flow IDs and fragment IDs are now always random. Flow IDs and frag IDs are significantly less common in the IPv6 world (ie. rarely generated per-packet), so there should be smaller performance concerns.
The sysctl defaults to 0 (sequential IP IDs).
Reviewed by: andre, silby, mlaier, ume Based on: NetBSD MFC after: 2 months
|
#
129017 |
|
06-May-2004 |
andre |
Provide the sysctl net.inet.ip.process_options to control the processing of IP options.
net.inet.ip.process_options=0 Ignore IP options and pass packets unmodified. net.inet.ip.process_options=1 Process all IP options (default). net.inet.ip.process_options=2 Reject all packets with IP options with ICMP filter prohibited message.
This sysctl affects packets destined for the local host as well as those only transiting through the host (routing).
IP options do not have any legitimate purpose anymore and are only used to circumvent firewalls or to exploit certain behaviours or bugs in TCP/IP stacks.
Reviewed by: sam (mentor)
|
#
128816 |
|
02-May-2004 |
darrenr |
Rename ip_claim_next_hop() to m_claim_next_hop(), give it an extra arg (the type of tag to claim) and push it out of ip_var.h into mbuf.h alongside all of the other macros that work ok mbuf's and tag's.
|
#
128019 |
|
07-Apr-2004 |
imp |
Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999 and email from Peter Wemm, Alan Cox and Robert Watson.
Approved by: core, peter, alc, rwatson
|
#
126239 |
|
25-Feb-2004 |
mlaier |
Re-remove MT_TAGs. The problems with dummynet have been fixed now.
Tested by: -current, bms(mentor), me Approved by: bms(mentor), sam
|
#
125952 |
|
17-Feb-2004 |
mlaier |
Backout MT_TAG removal (i.e. bring back MT_TAGs) for now, as dummynet is not working properly with the patch in place.
Approved by: bms(mentor)
|
#
125784 |
|
13-Feb-2004 |
mlaier |
This set of changes eliminates the use of MT_TAG "pseudo mbufs", replacing them mostly with packet tags (one case is handled by using an mbuf flag since the linkage between "caller" and "callee" is direct and there's no need to incur the overhead of a packet tag).
This is (mostly) work from: sam
Silence from: -arch Approved by: bms(mentor), sam, rwatson
|
#
122723 |
|
14-Nov-2003 |
andre |
Make ipstealth global as we need it in ip_fastforward too.
|
#
122708 |
|
14-Nov-2003 |
andre |
Remove the global one-level rtcache variable and associated complex locking and rework ip_rtaddr() to do its own rtlookup. Adopt all its callers to this and make ip_output() callable with NULL rt pointer.
Reviewed by: sam (mentor)
|
#
122524 |
|
12-Nov-2003 |
rwatson |
Modify the MAC Framework so that instead of embedding a (struct label) in various kernel objects to represent security data, we embed a (struct label *) pointer, which now references labels allocated using a UMA zone (mac_label.c). This allows the size and shape of struct label to be varied without changing the size and shape of these kernel objects, which become part of the frozen ABI with 5-STABLE. This opens the door for boot-time selection of the number of label slots, and hence changes to the bound on the number of simultaneous labeled policies at boot-time instead of compile-time. This also makes it easier to embed label references in new objects as required for locking/caching with fine-grained network stack locking, such as inpcb structures.
This change also moves us further in the direction of hiding the structure of kernel objects from MAC policy modules, not to mention dramatically reducing the number of '&' symbols appearing in both the MAC Framework and MAC policy modules, and improving readability.
While this results in minimal performance change with MAC enabled, it will observably shrink the size of a number of critical kernel data structures for the !MAC case, and should have a small (but measurable) performance benefit (i.e., struct vnode, struct socket) do to memory conservation and reduced cost of zeroing memory.
NOTE: Users of MAC must recompile their kernel and all MAC modules as a result of this change. Because this is an API change, third party MAC modules will also need to be updated to make less use of the '&' symbol.
Suggestions from: bmilekic Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories
|
#
122331 |
|
08-Nov-2003 |
sam |
divert socket fixups:
o pickup Giant in divert_packet to protect sbappendaddr since it can be entered through MPSAFE callouts or through ip_input when mpsafenet is 1 o add missing locking on output o add locking to abort and shutdown o add a ctlinput handler to invalidate held routing table references on an ICMP redirect (may not be needed)
Supported by: FreeBSD Foundation
|
#
121093 |
|
14-Oct-2003 |
sam |
Lock ip forwarding route cache. While we're at it, remove the global variable ipforward_rt by introducing an ip_forward_cacheinval() call to use to invalidate the cache.
Supported by: FreeBSD Foundation
|
#
120386 |
|
23-Sep-2003 |
sam |
o update PFIL_HOOKS support to current API used by netbsd o revamp IPv4+IPv6+bridge usage to match API changes o remove pfil_head instances from protosw entries (no longer used) o add locking o bump FreeBSD version for 3rd party modules
Heavy lifting by: "Max Laier" <max@love2party.net> Supported by: FreeBSD Foundation Obtained from: NetBSD (bits of pfil.h and pfil.c)
|
#
119178 |
|
20-Aug-2003 |
bms |
Add the IP_ONESBCAST option, to enable undirected IP broadcasts to be sent on specific interfaces. This is required by aodvd, and may in future help us in getting rid of the requirement for BPF from our import of isc-dhcp.
Suggested by: fenestro Obtained from: BSD/OS Reviewed by: mini, sam Approved by: jake (mentor)
|
#
118622 |
|
07-Aug-2003 |
hsu |
1. Basic PIM kernel support Disabled by default. To enable it, the new "options PIM" must be added to the kernel configuration file (in addition to MROUTING):
options MROUTING # Multicast routing options PIM # Protocol Independent Multicast
2. Add support for advanced multicast API setup/configuration and extensibility.
3. Add support for kernel-level PIM Register encapsulation. Disabled by default. Can be enabled by the advanced multicast API.
4. Implement a mechanism for "multicast bandwidth monitoring and upcalls".
Submitted by: Pavlin Radoslavov <pavlin@icir.org>
|
#
112985 |
|
02-Apr-2003 |
mdodd |
Back out support for RFC3514.
RFC3514 poses an unacceptale risk to compliant systems.
|
#
112929 |
|
01-Apr-2003 |
mdodd |
Implement support for RFC 3514 (The Security Flag in the IPv4 Header). (See: ftp://ftp.rfc-editor.org/in-notes/rfc3514.txt)
This fulfills the host requirements for userland support by way of the setsockopt() IP_EVIL_INTENT message.
There are three sysctl tunables provided to govern system behavior.
net.inet.ip.rfc3514:
Enables support for rfc3514. As this is an Informational RFC and support is not yet widespread this option is disabled by default.
net.inet.ip.hear_no_evil
If set the host will discard all received evil packets.
net.inet.ip.speak_no_evil
If set the host will discard all transmitted evil packets.
The IP statistics counter 'ips_evil' (available via 'netstat') provides information on the number of 'evil' packets recieved.
For reference, the '-E' option to 'ping' has been provided to demonstrate and test the implementation.
|
#
111244 |
|
22-Feb-2003 |
silby |
Add the ability to limit the number of IP fragments allowed per packet, and enable it by default, with a limit of 16.
At the same time, tweak maxfragpackets downward so that in the worst possible case, IP reassembly can use only 1/2 of all mbuf clusters.
MFC after: 3 days Reviewed by: hsu Liked by: bmah
|
#
107112 |
|
20-Nov-2002 |
luigi |
Back out the ip_fragment() code -- it is not urgent to have it in now, I will put it back in in a better form after 5.0 is out.
Requested by: sam, rwatson, luigi (on second thought) Approved by: re
|
#
107020 |
|
17-Nov-2002 |
luigi |
Move the ip_fragment code from ip_output() to a separate function, so that it can be reused elsewhere (there is a number of places where it can be useful). This also trims some 200 lines from the body of ip_output(), which helps readability a bit.
(This change was discussed a few weeks ago on the mailing lists, Julian agreed, silence from others. It is not a functional change, so i expect it to be ok to commit it now but i am happy to back it out if there are objections).
While at it, fix some function headers and replace m_copy() with m_copypacket() where applicable.
MFC after: 1 week
|
#
106968 |
|
15-Nov-2002 |
luigi |
Massive cleanup of the ip_mroute code.
No functional changes, but:
+ the mrouting module now should behave the same as the compiled-in version (it did not before, some of the rsvp code was not loaded properly); + netinet/ip_mroute.c is now truly optional; + removed some redundant/unused code; + changed many instances of '0' to NULL and INADDR_ANY as appropriate; + removed several static variables to make the code more SMP-friendly; + fixed some minor bugs in the mrouting code (mostly, incorrect return values from functions).
This commit is also a prerequisite to the addition of support for PIM, which i would like to put in before DP2 (it does not change any of the existing APIs, anyways).
Note, in the process we found out that some device drivers fail to properly handle changes in IFF_ALLMULTI, leading to interesting behaviour when a multicast router is started. This bug is not corrected by this commit, and will be fixed with a separate commit.
Detailed changes: -------------------- netinet/ip_mroute.c all the above. conf/files make ip_mroute.c optional net/route.c fix mrt_ioctl hook netinet/ip_input.c fix ip_mforward hook, move rsvp_input() here together with other rsvp code, and a couple of indentation fixes. netinet/ip_output.c fix ip_mforward and ip_mcast_src hooks netinet/ip_var.h rsvp function hooks netinet/raw_ip.c hooks for mrouting and rsvp functions, plus interface cleanup. netinet/ip_mroute.h remove an unused and optional field from a struct
Most of the code is from Pavlin Radoslavov and the XORP project
Reviewed by: sam MFC after: 1 week
|
#
105586 |
|
20-Oct-2002 |
phk |
Fix two instances of variant struct definitions in sys/netinet:
Remove the never completed _IP_VHL version, it has not caught on anywhere and it would make us incompatible with other BSD netstacks to retain this version.
Add a CTASSERT protecting sizeof(struct ip) == 20.
Don't let the size of struct ipq depend on the IPDIVERT option.
This is a functional no-op commit.
Approved by: re
|
#
105194 |
|
15-Oct-2002 |
sam |
Replace aux mbufs with packet tags:
o instead of a list of mbufs use a list of m_tag structures a la openbsd o for netgraph et. al. extend the stock openbsd m_tag to include a 32-bit ABI/module number cookie o for openbsd compatibility define a well-known cookie MTAG_ABI_COMPAT and use this in defining openbsd-compatible m_tag_find and m_tag_get routines o rewrite KAME use of aux mbufs in terms of packet tags o eliminate the most heavily used aux mbufs by adding an additional struct inpcb parameter to ip_output and ip6_output to allow the IPsec code to locate the security policy to apply to outbound packets o bump __FreeBSD_version so code can be conditionalized o fixup ipfilter's call to ip_output based on __FreeBSD_version
Reviewed by: julian, luigi (silent), -arch, -net, darren Approved by: julian, silence from everyone else Obtained from: openbsd (mostly) MFC after: 1 month
|
#
101920 |
|
15-Aug-2002 |
rwatson |
Perform a nested include of _label.h if #ifdef _KERNEL. This will satisfy consumers of ip_var.h that need a complete definition of struct ipq and don't include mac.h.
Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
#
100993 |
|
30-Jul-2002 |
rwatson |
Introduce support for Mandatory Access Control and extensible kernel access control.
Label IP fragment reassembly queues, permitting security features to be maintained on those objects. ipq_label will be used to manage the reassembly of fragments into IP datagrams using security properties. This permits policies to deny the reassembly of fragments, as well as influence the resulting label of a datagram following reassembly.
Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
#
100419 |
|
20-Jul-2002 |
rwatson |
Don't export 'struct ipq' from kernel, instead #ifdef _KERNEL. As kernel data structures pick up security and synchronization primitives, it becomes increasingly desirable not to arbitrarily export them via include files to userland, as the userland applications pick up new #include dependencies.
Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
#
98663 |
|
23-Jun-2002 |
luigi |
Remove ip_fw_fwd_addr (forgotten in previous commit) remove some extra whitespace.
|
#
98613 |
|
22-Jun-2002 |
luigi |
Remove (almost all) global variables that were used to hold packet forwarding state ("annotations") during ip processing. The code is considerably cleaner now.
The variables removed by this change are:
ip_divert_cookie used by divert sockets ip_fw_fwd_addr used for transparent ip redirection last_pkt used by dynamic pipes in dummynet
Removal of the first two has been done by carrying the annotations into volatile structs prepended to the mbuf chains, and adding appropriate code to add/remove annotations in the routines which make use of them, i.e. ip_input(), ip_output(), tcp_input(), bdg_forward(), ether_demux(), ether_output_frame(), div_output().
On passing, remove a bug in divert handling of fragmented packet. Now it is the fragment at offset 0 which sets the divert status of the whole packet, whereas formerly it was the last incoming fragment to decide.
Removal of last_pkt required a change in the interface of ip_fw_chk() and dummynet_io(). On passing, use the same mechanism for dummynet annotations and for divert/forward annotations.
option IPFIREWALL_FORWARD is effectively useless, the code to implement it is very small and is now in by default to avoid the obfuscation of conditionally compiled code.
NOTES: * there is at least one global variable left, sro_fwd, in ip_output(). I am not sure if/how this can be removed.
* I have deliberately avoided gratuitous style changes in this commit to avoid cluttering the diffs. Minor stule cleanup will likely be necessary
* this commit only focused on the IP layer. I am sure there is a number of global variables used in the TCP and maybe UDP stack.
* despite the number of files touched, there are absolutely no API's or data structures changed by this commit (except the interfaces of ip_fw_chk() and dummynet_io(), which are internal anyways), so an MFC is quite safe and unintrusive (and desirable, given the improved readability of the code).
MFC after: 10 days
|
#
92723 |
|
19-Mar-2002 |
alfred |
Remove __P.
|
#
87120 |
|
30-Nov-2001 |
ru |
- Make ip_rtaddr() global, and use it to look up the correct source address in icmp_reflect(). - Two new "struct icmpstat" members: icps_badaddr and icps_noroute.
PR: kern/31575 Obtained from: BSD/OS MFC after: 1 week
|
#
82884 |
|
03-Sep-2001 |
julian |
Patches from Keiichi SHIMA <keiichi@iij.ad.jp> to make ip use the standard protosw structure again.
Obtained from: Well, KAME I guess.
|
#
78094 |
|
11-Jun-2001 |
ume |
This is force commit to mention about previous commit.
- use 0/8 to specify interface index on multicast get/setsockopt - introduce ipstat.ips_badaddr
|
#
78064 |
|
11-Jun-2001 |
ume |
Sync with recent KAME. This work was based on kame-20010528-freebsd43-snap.tgz and some critical problem after the snap was out were fixed. There are many many changes since last KAME merge.
TODO: - The definitions of SADB_* in sys/net/pfkeyv2.h are still different from RFC2407/IANA assignment because of binary compatibility issue. It should be fixed under 5-CURRENT. - ip6po_m member of struct ip6_pktopts is no longer used. But, it is still there because of binary compatibility issue. It should be removed under 5-CURRENT.
Reviewed by: itojun Obtained from: KAME MFC after: 3 weeks
|
#
77574 |
|
01-Jun-2001 |
kris |
Add ``options RANDOM_IP_ID'' which randomizes the ID field of IP packets. This closes a minor information leak which allows a remote observer to determine the rate at which the machine is generating packets, since the default behaviour is to increment a counter for each packet sent.
Reviewed by: -net Obtained from: OpenBSD
|
#
74454 |
|
19-Mar-2001 |
ru |
Invalidate cached forwarding route (ipforward_rt) whenever a new route is added to the routing table, otherwise we may end up using the wrong route when forwarding.
PR: kern/10778 Reviewed by: silence on -net
|
#
74362 |
|
16-Mar-2001 |
phk |
<sys/queue.h> makeover.
|
#
62587 |
|
04-Jul-2000 |
itojun |
sync with kame tree as of july00. tons of bug fixes/improvements.
API changes: - additional IPv6 ioctls - IPsec PF_KEY API was changed, it is mandatory to upgrade setkey(8). (also syntax change)
|
#
60765 |
|
21-May-2000 |
jlemon |
Compute the checksum before handing the packet off to IPFilter.
Tested by: Cy Schubert <Cy.Schubert@uumail.gov.bc.ca>
|
#
55205 |
|
29-Dec-1999 |
peter |
Change #ifdef KERNEL to #ifdef _KERNEL in the public headers. "KERNEL" is an application space macro and the applications are supposed to be free to use it as they please (but cannot). This is consistant with the other BSD's who made this change quite some time ago. More commits to come.
|
#
55009 |
|
22-Dec-1999 |
shin |
IPSEC support in the kernel. pr_input() routines prototype is also changed to support IPSEC and IPV6 chained protocol headers.
Reviewed by: freebsd-arch, cvs-committers Obtained from: KAME project
|
#
54175 |
|
05-Dec-1999 |
archie |
Miscellaneous fixes/cleanups relating to ipfw and divert(4):
- Implement 'ipfw tee' (finally) - Divert packets by calling new function divert_packet() directly instead of going through protosw[]. - Replace kludgey global variable 'ip_divert_port' with a function parameter to divert_packet() - Replace kludgey global variable 'frag_divert_port' with a function parameter to ip_reass() - style(9) fixes
Reviewed by: julian, green
|
#
52904 |
|
05-Nov-1999 |
shin |
KAME related header files additions and merges. (only those which don't affect c source files so much)
Reviewed by: cvs-committers Obtained from: KAME project
|
#
50477 |
|
27-Aug-1999 |
peter |
$Id$ -> $FreeBSD$
|
#
38513 |
|
24-Aug-1998 |
dfr |
Re-implement tcp and ip fragment reassembly to not store pointers in the ip header which can't work on alpha since pointers are too big.
Reviewed by: Garrett Wollman <wollman@khavrinen.lcs.mit.edu>
|
#
38482 |
|
23-Aug-1998 |
wollman |
Yow! Completely change the way socket options are handled, eliminating another specialized mbuf type in the process. Also clean up some of the cruft surrounding IPFW, multicast routing, RSVP, and other ill-explored corners.
|
#
37625 |
|
13-Jul-1998 |
bde |
Removed a bogus forward struct declaration.
Cleaned up ifdefs.
|
#
37409 |
|
06-Jul-1998 |
julian |
Support for IPFW based transparent forwarding. Any packet that can be matched by a ipfw rule can be redirected transparently to another port or machine. Redirection to another port mostly makes sense with tcp, where a session can be set up between a proxy and an unsuspecting client. Redirection to another machine requires that the other machine also be expecting to receive the forwarded packets, as their headers will not have been modified.
/sbin/ipfw must be recompiled!!!
Reviewed by: Peter Wemm <peter@freebsd.org> Submitted by: Chrisy Luke <chrisy@flix.net>
|
#
36767 |
|
08-Jun-1998 |
bde |
Fixed pedantic semantics errors (bitfields not of type int, signed int or unsigned int (this doesn't change the struct layout, size or alignment in any of the files changed in this commit, at least for gcc on i386's. Using bitfields of type u_char may affect size and alignment but not packing)).
|
#
36707 |
|
06-Jun-1998 |
julian |
clean up the changes made to ipfw over the last weeks (should make the ipfw lkm work again)
|
#
36678 |
|
05-Jun-1998 |
julian |
Reverse the default sense of the IPFW/DIVERT reinjection code so that the new behaviour is now default. Solves the "infinite loop in diversion" problem when more than one diversion is active. Man page changes follow.
The new code is in -stable as the NON default option.
|
#
36369 |
|
25-May-1998 |
julian |
Add optional code to change the way that divert and ipfw work together. Prior to this change, Accidental recursion protection was done by the diverted daemon feeding back the divert port number it got the packet on, as the port number on a sendto(). IPFW knew not to redivert a packet to this port (again). Processing of the ruleset started at the beginning again, skipping that divert port.
The new semantic (which is how we should have done it the first time) is that the port number in the sendto() is the rule number AFTER which processing should restart, and on a recvfrom(), the port number is the rule number which caused the diversion. This is much more flexible, and also more intuitive. If the user uses the same sockaddr received when resending, processing resumes at the rule number following that that caused the diversion. The user can however select to resume rule processing at any rule. (0 is restart at the beginning)
To enable the new code use
option IPFW_DIVERT_RESTART
This should become the default as soon as people have looked at it a bit
|
#
36194 |
|
19-May-1998 |
pb |
Move (private) struct ipflow out of ip_var.h, to reduce dependencies (for ipfw for example) on internal implementation details. Add $Id$ where missing.
|
#
36193 |
|
19-May-1998 |
dg |
Moved #define of IPFLOW_HASHBITS to ip_flow.c where I think it belongs.
|
#
36192 |
|
19-May-1998 |
dg |
Added fast IP forwarding code by Matt Thomas <matt@3am-software.com> via NetBSD, ported to FreeBSD by Pierre Beyssac <pb@fasterix.freenix.org> and minorly tweaked by me. This is a standard part of FreeBSD, but must be enabled with: "sysctl -w net.inet.ip.fastforwarding=1" ...and of course forwarding must also be enabled. This should probably be modified to use the zone allocator for speed and space efficiency. The current algorithm also appears to lose if the number of active paths exceeds IPFLOW_MAX (256), in which case it wastes lots of time trying to figure out which cache entry to drop.
|
#
29179 |
|
07-Sep-1997 |
bde |
Some staticized variables were still declared to be extern.
|
#
26113 |
|
25-May-1997 |
peter |
Connect the ipdivert div_usrreqs struct to the ip proto switch table
|
#
25201 |
|
27-Apr-1997 |
wollman |
The long-awaited mega-massive-network-code- cleanup. Part I.
This commit includes the following changes: 1) Old-style (pr_usrreq()) protocols are no longer supported, the compatibility glue for them is deleted, and the kernel will panic on boot if any are compiled in.
2) Certain protocol entry points are modified to take a process structure, so they they can easily tell whether or not it is possible to sleep, and also to access credentials.
3) SS_PRIV is no more, and with it goes the SO_PRIVSTATE setsockopt() call. Protocols should use the process pointer they are now passed.
4) The PF_LOCAL and PF_ROUTE families have been updated to use the new style, as has the `raw' skeleton family.
5) PF_LOCAL sockets now obey the process's umask when creating a socket in the filesystem.
As a result, LINT is now broken. I'm hoping that some enterprising hacker with a bit more time will either make the broken bits work (should be easy for netipx) or dike them out.
|
#
22975 |
|
22-Feb-1997 |
peter |
Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not ready for it yet.
|
#
22900 |
|
18-Feb-1997 |
wollman |
Convert raw IP from mondo-switch-statement-from-Hell to pr_usrreqs. Collapse duplicates with udp_usrreq.c and tcp_usrreq.c (calling the generic routines in uipc_socket2.c and in_pcb.c). Calling sockaddr()_ or peeraddr() on a detached socket now traps, rather than harmlessly returning an error; this should never happen. Allow the raw IP buffer sizes to be controlled via sysctl.
|
#
22672 |
|
13-Feb-1997 |
wollman |
Provide PRC_IFDOWN and PRC_IFUP support for IP. Now, when an interface is administratively downed, all routes to that interface (including the interface route itself) which are not static will be deleted. When it comes back up, and addresses remaining will have their interface routes re-added. This solves the problem where, for example, an Ethernet interface is downed by traffic continues to flow by way of ARP entries.
|
#
21932 |
|
21-Jan-1997 |
wollman |
Count multicast packets received for groups of which we are not a member separately from generic ``can't forward'' packets. This would have helped me find the previous bug much faster.
|
#
21673 |
|
14-Jan-1997 |
jkh |
Make the long-awaited change from $Id$ to $FreeBSD$
This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long.
Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.
|
#
19669 |
|
12-Nov-1996 |
bde |
Forward-declare `struct inpcb' so that including this file doesn't cause lots of warnings.
Should be in 2.2. Previous version shouldn't have been in 2.2.
|
#
19622 |
|
11-Nov-1996 |
fenner |
Add the IP_RECVIF socket option, which supplies a packet's incoming interface using a sockaddr_dl.
Fix the other packet-information socket options (SO_TIMESTAMP, IP_RECVDSTADDR) to work for multicast UDP and raw sockets as well. (They previously only worked for unicast UDP).
|
#
19183 |
|
25-Oct-1996 |
fenner |
Don't allow reassembly to create packets bigger than IP_MAXPACKET, and count attempts to do so. Don't allow users to source packets bigger than IP_MAXPACKET. Make UDP length and ipovly's protocol length unsigned short.
Reviewed by: wollman Submitted by: (partly by) kml@nas.nasa.gov (Kevin Lahey)
|
#
19136 |
|
23-Oct-1996 |
wollman |
Give ip_len and ip_off more natural, unsigned types.
|
#
18940 |
|
15-Oct-1996 |
bde |
Forward-declared `struct route' for the KERNEL case so that <net/route.h> isn't a prerequisite.
Fixed style of ifdefs.
|
#
17072 |
|
10-Jul-1996 |
julian |
Adding changes to ipfw and the kernel to support ip packet diversion.. This stuff should not be too destructive if the IPDIVERT is not compiled in.. be aware that this changes the size of the ip_fw struct so ipfw needs to be recompiled to use it.. more changes coming to clean this up.
|
#
14824 |
|
26-Mar-1996 |
fenner |
Make rip_input() take the header length Move ipip_input() and rsvp_input() prototypes to ip_var.h Remove unused prototype for rip_ip_input() from ip_var.h Remove unused variable *opts from rip_output()
|
#
13765 |
|
30-Jan-1996 |
mpp |
Fix a bunch of spelling errors in the comment fields of a bunch of system include files.
|
#
12820 |
|
14-Dec-1995 |
phk |
Another mega commit to staticize things.
|
#
12635 |
|
05-Dec-1995 |
wollman |
Path MTU Discovery is now standard.
|
#
12296 |
|
14-Nov-1995 |
phk |
New style sysctl & staticize alot of stuff.
|
#
10942 |
|
21-Sep-1995 |
wollman |
Merge 4.4-Lite-2 by updating the version number.
Obtained from: 4.4BSD-Lite-2
|
#
10881 |
|
18-Sep-1995 |
wollman |
Initial back-end support for IP MTU discovery, gated on MTUDISC. The support for TCP has yet to be written.
|
#
9728 |
|
26-Jul-1995 |
wollman |
Fix test for determining when RSVP is inactive in a router. (In this case, multicast options are not passed to ip_mforward().) The previous version had a wrong test, thus causing RSVP mrouters to forward RSVP messages in violation of the spec.
|
#
9347 |
|
28-Jun-1995 |
dg |
Added function prototypes for ip_rsvp_vif_init, ip_rsvp_vif_done, and ip_rsvp_force_done.
|
#
9209 |
|
13-Jun-1995 |
wollman |
Kernel side of 3.5 multicast routing code, based on work by Bill Fenner and other work done here. The LKM support is probably broken, but it still compiles and will be fixed later.
|
#
8876 |
|
30-May-1995 |
rgrimes |
Remove trailing whitespace.
|
#
7090 |
|
16-Mar-1995 |
bde |
Add and move declarations to fix all of the warnings from `gcc -Wimplicit' (except in netccitt, netiso and netns) and most of the warnings from `gcc -Wnested-externs'. Fix all the bugs found. There were no serious ones.
|
#
7083 |
|
16-Mar-1995 |
wollman |
This set of patches enables IP multicasting to work under FreeBSD. I am submitting them as context diffs for the following files:
sys/netinet/ip_mroute.c sys/netinet/ip_var.h sys/netinet/raw_ip.c usr.sbin/mrouted/igmp.c usr.sbin/mrouted/prune.c
The routine rip_ip_input in raw_ip.c is suggested by Mark Tinguely (tinguely@plains.nodak.edu). I have been running mrouted with these patches for over a week and nothing has seemed seriously wrong. It is being run in two places on our network as a tunnel on one and a subnet querier on the other. The only problem I have run into is that mrouted on the tunnel must start up last or the pruning isn't done correctly and multicast packets flood your subnets.
Submitted by: Soochon Radee <slr@mitre.org>
|
#
6362 |
|
14-Feb-1995 |
phk |
YPfix
|
#
2754 |
|
14-Sep-1994 |
wollman |
Shuffle some functions and variables around to make it possible for multicast routing to be implemented as an LKM. (There's still a bit of work to do in this area.)
|
#
2531 |
|
06-Sep-1994 |
wollman |
Initial get-the-easy-case-working upgrade of the multicast code to something more recent than the ancient 1.2 release contained in 4.4. This code has the following advantages as compared to previous versions (culled from the README file for the SunOS release):
- True multicast delivery - Configurable rate-limiting of forwarded multicast traffic on each physical interface or tunnel, using a token-bucket limiter. - Simplistic classification of packets for prioritized dropping. - Administrative scoping of multicast address ranges. - Faster detection of hosts leaving groups. - Support for multicast traceroute (code not yet available). - Support for RSVP, the Resource Reservation Protocol.
What still needs to be done:
- The multicast forwarder needs testing. - The multicast routing daemon needs to be ported. - Network interface drivers need to have the `#ifdef MULTICAST' goop ripped out of them. - The IGMP code should probably be bogon-tested.
Some notes about the porting process:
In some cases, the Berkeley people decided to incorporate functionality from later releases of the multicast code, but then had to do things differently. As a result, if you look at Deering's patches, and then look at our code, it is not always obvious whether the patch even applies. Let the reader beware.
I ran ip_mroute.c through several passes of `unifdef' to get rid of useless grot, and to permanently enable the RSVP support, which we will include as standard.
Ported by: Garrett Wollman Submitted by: Steve Deering and Ajit Thyagarajan (among others)
|
#
2169 |
|
21-Aug-1994 |
paul |
Made idempotent.
Submitted by: Paul
|
#
2112 |
|
18-Aug-1994 |
wollman |
Fix up some sloppy coding practices:
- Delete redundant declarations. - Add -Wredundant-declarations to Makefile.i386 so they don't come back. - Delete sloppy COMMON-style declarations of uninitialized data in header files. - Add a few prototypes. - Clean up warnings resulting from the above.
NB: ioconf.c will still generate a redundant-declaration warning, which is unavoidable unless somebody volunteers to make `config' smarter.
|
#
1817 |
|
02-Aug-1994 |
dg |
Added $Id$
|
#
1542 |
|
24-May-1994 |
rgrimes |
This commit was generated by cvs2svn to compensate for changes in r1541, which included commits to RCS files with non-trunk default branches.
|
#
1541 |
|
24-May-1994 |
rgrimes |
BSD 4.4 Lite Kernel Sources
|