#
60d8dbbe |
|
18-Jan-2024 |
Kristof Provost <kp@FreeBSD.org> |
netinet: add a probe point for IP, IP6, ICMP, ICMP6, UDP and TCP stats counters When debugging network issues one common clue is an unexpectedly incrementing error counter. This is helpful, in that it gives us an idea of what might be going wrong, but often these counters may be incremented in different functions. Add a static probe point for them so that we can use dtrace to get futher information (e.g. a stack trace). For example: dtrace -n 'mib:ip:count: { printf("%d", arg0); stack(); }' This can be disabled by setting the following kernel option: options KDTRACE_NO_MIB_SDT Reviewed by: gallatin, tuexen (previous version), gnn (previous version) Sponsored by: Rubicon Communications, LLC ("Netgate") Differential Revision: https://reviews.freebsd.org/D43504
|
#
ffeab76b |
|
26-Jan-2024 |
Kristof Provost <kp@FreeBSD.org> |
pfil: PFIL_PASS never frees the mbuf pfil hooks (i.e. firewalls) may pass, modify or free the mbuf passed to them. (E.g. when rejecting a packet, or when gathering up packets for reassembly). If the hook returns PFIL_PASS the mbuf must still be present. Assert this in pfil_mem_common() and ensure that ipfilter follows this convention. pf and ipfw already did. Similarly, if the hook returns PFIL_DROPPED or PFIL_CONSUMED the mbuf must have been freed (or now be owned by the firewall for further processing, like packet scheduling or reassembly). This allows us to remove a few extraneous NULL checks. Suggested by: tuexen Reviewed by: tuexen, zlei Sponsored by: Rubicon Communications, LLC ("Netgate") Differential Revision: https://reviews.freebsd.org/D43617
|
#
29363fb4 |
|
23-Nov-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove ancient SCCS tags. Remove ancient SCCS tags from the tree, automated scripting, with two minor fixup to keep things compiling. All the common forms in the tree were removed with a perl script. Sponsored by: Netflix
|
#
685dc743 |
|
16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove $FreeBSD$: one-line .c pattern Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
|
#
5ab15157 |
|
24-May-2023 |
Doug Rabson <dfr@FreeBSD.org> |
netinet*: Fix redirects for connections from localhost Redirect rules use PFIL_IN and PFIL_OUT events to allow packet filter rules to change the destination address and port for a connection. Typically, the rule triggers on an input event when a packet is received by a router and the destination address and/or port is changed to implement the redirect. When a reply packet on this connection is output to the network, the rule triggers again, reversing the modification. When the connection is initiated on the same host as the packet filter, it is initially output via lo0 which queues it for input processing. This causes an input event on the lo0 interface, allowing redirect processing to rewrite the destination and create state for the connection. However, when the reply is received, no corresponding output event is generated; instead, the packet is delivered to the higher level protocol (e.g. tcp or udp) without reversing the redirect, the reply is not matched to the connection and the packet is dropped (for tcp, a connection reset is also sent). This commit fixes the problem by adding a second packet filter call in the input path. The second call happens right before the handoff to higher level processing and provides the missing output event to allow the redirect's reply processing to perform its rewrite. This extra processing is disabled by default and can be enabled using pfilctl: pfilctl link -o pf:default-out inet-local pfilctl link -o pf:default-out6 inet6-local PR: 268717 Reviewed-by: kp, melifaro MFC-after: 2 weeks Differential Revision: https://reviews.freebsd.org/D40256
|
#
bb55bb17 |
|
06-Mar-2023 |
Justin Hibbits <jhibbits@FreeBSD.org> |
inet6: Include if_private.h in one more netstack file ip6_input() and ip6_destroy() both directly reference ifnet members. This file was missed in 3d0d5b21 Fixes: 3d0d5b21 ("IfAPI: Explicitly include <net/if_private.h>...") Sponsored by: Juniper Networks, Inc.
|
#
fcb3f813 |
|
03-Oct-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
netinet*: remove PRC_ constants and streamline ICMP processing In the original design of the network stack from the protocol control input method pr_ctlinput was used notify the protocols about two very different kinds of events: internal system events and receival of an ICMP messages from outside. These events were coded with PRC_ codes. Today these methods are removed from the protosw(9) and are isolated to IPv4 and IPv6 stacks and are called only from icmp*_input(). The PRC_ codes now just create a shim layer between ICMP codes and errors or actions taken by protocols. - Change ipproto_ctlinput_t to pass just pointer to ICMP header. This allows protocols to not deduct it from the internal IP header. - Change ip6proto_ctlinput_t to pass just struct ip6ctlparam pointer. It has all the information needed to the protocols. In the structure, change ip6c_finaldst fields to sockaddr_in6. The reason is that icmp6_input() already has this address wrapped in sockaddr, and the protocols want this address as sockaddr. - For UDP tunneling control input, as well as for IPSEC control input, change the prototypes to accept a transparent union of either ICMP header pointer or struct ip6ctlparam pointer. - In icmp_input() and icmp6_input() do only validation of ICMP header and count bad packets. The translation of ICMP codes to errors/actions is done by protocols. - Provide icmp_errmap() and icmp6_errmap() as substitute to inetctlerrmap, inet6ctlerrmap arrays. - In protocol ctlinput methods either trust what icmp_errmap() recommend, or do our own logic based on the ICMP header. Differential revision: https://reviews.freebsd.org/D36731
|
#
53807a8a |
|
03-Oct-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
netinet*: use sparse C99 initializer for inetctlerrmap and mark those PRC_* codes, that are used. The rest are dead code. This is not a functional change, but illustrative to make easier review of following changes.
|
#
46ddeb6b |
|
03-Oct-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
netinet6: retire ip6protosw.h The netinet/ipprotosw.h and netinet6/ip6protosw.h were KAME relics, with the former removed in f0ffb944d25 in 2001 and the latter survived until today. It has been reduced down to only one useful declaration that moves to ip6_var.h Reviewed by: melifaro Differential revision: https://reviews.freebsd.org/D36726
|
#
24b96f35 |
|
03-Oct-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
netinet*: move ipproto_register() and co to ip_var.h and ip6_var.h This is a FreeBSD KPI and belongs to private header not netinet/in.h. Reviewed by: melifaro Differential revision: https://reviews.freebsd.org/D36723
|
#
dda6376b |
|
08-Sep-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
net: employ newly added pfil_mbuf_{in,out} where approriate Reviewed by: glebius Sponsored by: Rubicon Communications, LLC ("Netgate") Differential Revision: https://reviews.freebsd.org/D36454
|
#
223a73a1 |
|
06-Sep-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
net: remove stale altq_input reference Code setting it was removed in: commit 325fab802e1f40c992141f945d0788c0edfdb1a4 Author: Eric van Gyzen <vangyzen@FreeBSD.org> Date: Tue Dec 4 23:46:43 2018 +0000 altq: remove ALTQ3_COMPAT code Reviewed by: glebius, kp Sponsored by: Rubicon Communications, LLC ("Netgate") Differential Revision: https://reviews.freebsd.org/D36471
|
#
6080e073 |
|
17-Aug-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
ip6_input: explicitly include <sys/eventhandler.h> On most architectures/kernels it was included implicitly, but powerpc MPC85XX got broken. Fixes: 81a34d374ed6e5a7b14f24583bc8e3abfdc66306
|
#
81a34d37 |
|
17-Aug-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
protosw: retire pr_drain and use EVENTHANDLER(9) directly The method was called for two different conditions: 1) the VM layer is low on pages or 2) one of UMA zones of mbuf allocator exhausted. This change 2) into a new event handler, but all affected network subsystems modified to subscribe to both, so this change shall not bring functional changes under different low memory situations. There were three subsystems still using pr_drain: TCP, SCTP and frag6. The latter had its protosw entry for the only reason to register its pr_drain method. Reviewed by: tuexen, melifaro Differential revision: https://reviews.freebsd.org/D36164
|
#
78b1fc05 |
|
17-Aug-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
protosw: separate pr_input and pr_ctlinput out of protosw The protosw KPI historically has implemented two quite orthogonal things: protocols that implement a certain kind of socket, and protocols that are IPv4/IPv6 protocol. These two things do not make one-to-one correspondence. The pr_input and pr_ctlinput methods were utilized only in IP protocols. This strange duality required IP protocols that doesn't have a socket to declare protosw, e.g. carp(4). On the other hand developers of socket protocols thought that they need to define pr_input/pr_ctlinput always, which lead to strange dead code, e.g. div_input() or sdp_ctlinput(). With this change pr_input and pr_ctlinput as part of protosw disappear and IPv4/IPv6 get their private single level protocol switch table ip_protox[] and ip6_protox[] respectively, pointing at array of ipproto_input_t functions. The pr_ctlinput that was used for control input coming from the network (ICMP, ICMPv6) is now represented by ip_ctlprotox[] and ip6_ctlprotox[]. ipproto_register() becomes the only official way to register in the table. Those protocols that were always static and unlikely anybody is interested in making them loadable, are now registered by ip_init(), ip6_init(). An IP protocol that considers itself unloadable shall register itself within its own private SYSINIT(). Reviewed by: tuexen, melifaro Differential revision: https://reviews.freebsd.org/D36157
|
#
50fa27e7 |
|
09-Jul-2022 |
Alexander V. Chernikov <melifaro@FreeBSD.org> |
netinet6: fix interface handling for loopback traffic Currently, processing of IPv6 local traffic is partially broken: link-local connection fails and global unicast connect() takes 3 seconds to complete. This happens due to the combination of multiple factors. IPv6 code passes original interface "origifp" when passing traffic via loopack to retain the scope that is mandatory for the correct hadling of link-local traffic. First problem is that the logic of passing source interface is not working correcly for TCP connections, resulting in passing "origifp" on the first 2 connection attempts and lo0 on the subsequent ones. Second problem is that source address validation logic skips its checks iff the source interface is loopback, which doesn't cover "origifp" case. More detailed description is available at https://reviews.freebsd.org/D35732 Fix the first problem by untangling&simplifying ifp/origifp logic. Fix the second problem by switching source address validation check to using M_LOOP mbuf flag instead of interface type. PR: 265089 Reviewed by: ae, bz(previous version) Differential Revision: https://reviews.freebsd.org/D35732 MFC after: 2 weeks
|
#
0ed72537 |
|
04-Jul-2022 |
Alexander V. Chernikov <melifaro@FreeBSD.org> |
netinet6: perform out-of-bounds check for loX multicast statistics Currently, some per-mbuf multicast statistics is stored in the per-interface ip6stat.ip6s_m2m[] array of size 32 (IP6S_M2MMAX). Check that loopback ifindex falls within 0.. IP6S_M2MMAX-1 range to avoid silent data corruption. The latter cat happen with large number of VNETs. Reviewed by: glebius Differential Revision: https://reviews.freebsd.org/D35715 MFC after: 2 weeks
|
#
6890b588 |
|
17-May-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
sockbuf: improve sbcreatecontrol() o Constify memory pointer. Make length unsigned. o Make it never fail with M_WAITOK and assert that length is sane.
|
#
b46667c6 |
|
17-May-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
sockbuf: merge two versions of sbcreatecontrol() into one No functional change.
|
#
89128ff3 |
|
03-Jan-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
protocols: init with standard SYSINIT(9) or VNET_SYSINIT The historical BSD network stack loop that rolls over domains and over protocols has no advantages over more modern SYSINIT(9). While doing the sweep, split global and per-VNET initializers. Getting rid of pr_init allows to achieve several things: o Get rid of ifdef's that protect against double foo_init() when both INET and INET6 are compiled in. o Isolate initializers statically to the module they init. o Makes code easier to understand and maintain. Reviewed by: melifaro Differential revision: https://reviews.freebsd.org/D33537
|
#
1817be48 |
|
12-Nov-2021 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Add net.inet6.ip6.source_address_validation Drop packets arriving from the network that have our source IPv6 address. If maliciously crafted they can create evil effects like an RST exchange between two of our listening TCP ports. Such packets just can't be legitimate. Enable the tunable by default. Long time due for a modern Internet host. Reviewed by: melifaro, donner, kp Differential revision: https://reviews.freebsd.org/D32915
|
#
7045b160 |
|
28-Jul-2021 |
Roy Marples <roy@marples.name> |
socket: Implement SO_RERROR SO_RERROR indicates that receive buffer overflows should be handled as errors. Historically receive buffer overflows have been ignored and programs could not tell if they missed messages or messages had been truncated because of overflows. Since programs historically do not expect to get receive overflow errors, this behavior is not the default. This is really really important for programs that use route(4) to keep in sync with the system. If we loose a message then we need to reload the full system state, otherwise the behaviour from that point is undefined and can lead to chasing bogus bug reports. Reviewed by: philip (network), kbowling (transport), gbe (manpages) MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D26652
|
#
b1d63265 |
|
08-Mar-2021 |
Alexander V. Chernikov <melifaro@FreeBSD.org> |
Flush remaining routes from the routing table during VNET shutdown. Summary: This fixes rtentry leak for the cloned interfaces created inside the VNET. PR: 253998 Reported by: rashey at superbox.pl MFC after: 3 days Loopback teardown order is `SI_SUB_INIT_IF`, which happens after `SI_SUB_PROTO_DOMAIN` (route table teardown). Thus, any route table operations are too late to schedule. As the intent of the vnet teardown procedures to minimise the amount of effort by doing global cleanups instead of per-interface ones, address this by adding a relatively light-weight routing table cleanup function, `rib_flush_routes()`. It removes all remaining routes from the routing table and schedules the deletion, which will happen later, when `rtables_destroy()` waits for the current epoch to finish. Test Plan: ``` set_skip:set_skip_group_lo -> passed [0.053s] tail -n 200 /var/log/messages | grep rtentry ``` Reviewers: #network, kp, bz Reviewed By: kp Subscribers: imp, ae Differential Revision: https://reviews.freebsd.org/D29116
|
#
8268d82c |
|
15-Feb-2021 |
Alexander V. Chernikov <melifaro@FreeBSD.org> |
Remove per-packet ifa refcounting from IPv6 fast path. Currently ip6_input() calls in6ifa_ifwithaddr() for every local packet, in order to check if the target ip belongs to the local ifa in proper state and increase its counters. in6ifa_ifwithaddr() references found ifa. With epoch changes, both `ip6_input()` and all other current callers of `in6ifa_ifwithaddr()` do not need this reference anymore, as epoch provides stability guarantee. Given that, update `in6ifa_ifwithaddr()` to allow it to return ifa without referencing it, while preserving option for getting referenced ifa if so desired. MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D28648
|
#
924d1c9a |
|
08-Feb-2021 |
Alexander V. Chernikov <melifaro@FreeBSD.org> |
Revert "SO_RERROR indicates that receive buffer overflows should be handled as errors." Wrong version of the change was pushed inadvertenly. This reverts commit 4a01b854ca5c2e5124958363b3326708b913af71.
|
#
4a01b854 |
|
07-Feb-2021 |
Alexander V. Chernikov <melifaro@FreeBSD.org> |
SO_RERROR indicates that receive buffer overflows should be handled as errors. Historically receive buffer overflows have been ignored and programs could not tell if they missed messages or messages had been truncated because of overflows. Since programs historically do not expect to get receive overflow errors, this behavior is not the default. This is really really important for programs that use route(4) to keep in sync with the system. If we loose a message then we need to reload the full system state, otherwise the behaviour from that point is undefined and can lead to chasing bogus bug reports.
|
#
662c1305 |
|
01-Sep-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
net: clean up empty lines in .c and .h files
|
#
7029da5c |
|
26-Feb-2020 |
Pawel Biernacki <kaktus@FreeBSD.org> |
Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many) r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are still not MPSAFE (or already are but aren’t properly marked). Use it in preparation for a general review of all nodes. This is non-functional change that adds annotations to SYSCTL_NODE and SYSCTL_PROC nodes using one of the soon-to-be-required flags. Mark all obvious cases as MPSAFE. All entries that haven't been marked as MPSAFE before are by default marked as NEEDGIANT Approved by: kib (mentor, blanket) Commented by: kib, gallatin, melifaro Differential Revision: https://reviews.freebsd.org/D23718
|
#
74ff87cd |
|
06-Dec-2019 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
Update comment. Update the comment related to SIIT and v4mapped addresses being rejected by us when coming from the wire given we have supported IPv6-only kernels for a few years now. See also draft-itojun-v6ops-v4mapped-harmful. Suggested by: melifaro MFC after: 2 weeks
|
#
b745e762 |
|
06-Dec-2019 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
ip6_input: remove redundant v4mapped check In ip6_input() we apply the same v4mapped address check twice. The only case which skipps the first one is M_FASTFWD_OURS which should have passed the check on the firstinput pass and passed the firewall. Remove the 2nd redundant check. Reviewed by: kp, melifaro MFC after: 2 weeks Sponsored by: Netflix (originally) Differential Revision: https://reviews.freebsd.org/D22462
|
#
a4adf6cc |
|
30-Nov-2019 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
Fix m_pullup() problem after removing PULLDOWN_TESTs and KAME EXT_*macros. r354748-354750 replaced the KAME macros with m_pulldown() calls. Contrary to the rest of the network stack m_len checks before m_pulldown() were not put in placed (see r354748). Put these m_len checks in place for now (to go along with the style of the network stack since the initial commits). These are not put in for performance but to avoid an error scenario (even though it also will help performance at the moment as it avoid allocating an extra mbuf; not because of the unconditional function call). The observed error case went like this: (1) an mbuf with M_EXT arrives and we call m_pullup() unconditionally on it. (2) m_pullup() will call m_get() unless the requested length is larger than MHLEN (in which case it'll m_freem() the perfectly fine mbuf) and migrate the requested length of data and pkthdr into the new mbuf. (3) If m_get() succeeds, a further m_pullup() call going over MHLEN will fail. This was observed with failing auto-configuration as an RA packet of 200 bytes exceeded MHLEN and the m_pullup() called from nd6_ra_input() dropped the mbuf. (Re-)adding the m_len checks before m_pullup() calls avoids this problems with mbufs using external storage for now. MFC after: 3 weeks Sponsored by: Netflix
|
#
a61b5cfb |
|
15-Nov-2019 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
netinet6: Remove PULLDOWN_TESTs. Remove the KAME introduced PULLDOWN_TESTs which did not even have a compile-time option in sys/conf to turn them on for a custom kernel build. They made the code a lot harder to read or more complicated in a few cases. Convert the IP6_EXTHDR_CHECK() calls into FreeBSD looking code. Rather than throwing the packet away if it would not fit the KAME mbuf expectations, convert the macros to m_pullup() calls. Do not do any extra manual conditional checks upfront as to whether the m_len would suffice (*), simply let m_pullup() do its work (incl. an early check). Remove extra m_pullup() calls where earlier in the function or the only caller has already done the pullup. Discussed with: rwatson (*) Reviewed by: ae MFC after: 8 weeks Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D22334
|
#
a8fe77d8 |
|
12-Nov-2019 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
netinet*: update *mp to pass the proper value back In ip6_[direct_]input() we are looping over the extension headers to deal with the next header. We pass a pointer to an mbuf pointer to the handling functions. In certain cases the mbuf can be updated there and we need to pass the new one back. That missing in dest6_input() and route6_input(). In tcp6_input() we should also update it before we call tcp_input(). In addition to that mark the mbuf NULL all the times when we return that we are done with handling the packet and no next header should be checked (IPPROTO_DONE). This will eventually allow us to assert proper behaviour and catch the above kind of errors more easily, expecting *mp to always be set. This change is extracted from a larger patch and not an exhaustive change across the entire stack yet. PR: 240135 Reported by: prabhakar.lakhera gmail.com MFC after: 3 weeks Sponsored by: Netflix
|
#
503f4e47 |
|
07-Nov-2019 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
netinet*: variable cleanup In preparation for another change factor out various variable cleanups. These mainly include: (1) do not assign values to variables during declaration: this makes the code more readable and does allow for better grouping of variable declarations, (2) do not assign values to variables before need; e.g., if a variable is only used in the 2nd half of a function and we have multiple return paths before that, then do not set it before it is needed, and (3) try to avoid assigning the same value multiple times. MFC after: 3 weeks Sponsored by: Netflix
|
#
67a10c46 |
|
21-Oct-2019 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
frag6: fix vnet teardown leak When shutting down a VNET we did not cleanup the fragmentation hashes. This has multiple problems: (1) leak memory but also (2) leak on the global counters, which might eventually lead to a problem on a system starting and stopping a lot of vnets and dealing with a lot of IPv6 fragments that the counters/limits would be exhausted and processing would no longer take place. Unfortunately we do not have a useable variable to indicate when per-VNET initialization of frag6 has happened (or when destroy happened) so introduce a boolean to flag this. This is needed here as well as it was in r353635 for ip_reass.c in order to avoid tripping over the already destroyed locks if interfaces go away after the frag6 destroy. While splitting things up convert the TRY_LOCK to a LOCK operation in now frag6_drain_one(). The try-lock was derived from a manual hand-rolled implementation and carried forward all the time. We no longer can afford not to get the lock as that would mean we would continue to leak memory. Assert that all the buckets are empty before destroying to lock to ensure long-term stability of a clean shutdown. Reported by: hselasky Reviewed by: hselasky MFC after: 3 weeks Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D22054
|
#
e7a541b0 |
|
19-Sep-2019 |
Michael Tuexen <tuexen@FreeBSD.org> |
When processing an incoming IPv6 packet over the loopback interface which contains Hop-by-Hop options, the mbuf chain is potentially changed in ip6_hopopts_input(), called by ip6_input_hbh(). This can happen, because of the the use of IP6_EXTHDR_CHECK, which might call m_pullup(). So provide the updated pointer back to the called of ip6_input_hbh() to avoid using a freed mbuf chain in`ip6_input()`. Reviewed by: markj@ MFC after: 3 days Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D21664
|
#
0ecd976e |
|
02-Aug-2019 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
IPv6 cleanup: kernel Finish what was started a few years ago and harmonize IPv6 and IPv4 kernel names. We are down to very few places now that it is feasible to do the change for everything remaining with causing too much disturbance. Remove "aliases" for IPv6 names which confusingly could indicate that we are talking about a different data structure or field or have two fields, one for each address family. Try to follow common conventions used in FreeBSD. * Rename sin6p to sin6 as that is how it is spelt in most places. * Remove "aliases" (#defines) for: - in6pcb which really is an inpcb and nothing separate - sotoin6pcb which is sotoinpcb (as per above) - in6p_sp which is inp_sp - in6p_flowinfo which is inp_flow * Try to use ia6 for in6_addr rather than in6p. * With all these gone also rename the in6p variables to inp as that is what we call it in most of the network stack including parts of netinet6. The reasons behind this cleanup are that we try to further unify netinet and netinet6 code where possible and that people will less ignore one or the other protocol family when doing code changes as they may not have spotted places due to different names for the same thing. No functional changes. Discussed with: tuexen (SCTP changes) MFC after: 3 months Sponsored by: Netflix
|
#
b252313f |
|
31-Jan-2019 |
Gleb Smirnoff <glebius@FreeBSD.org> |
New pfil(9) KPI together with newborn pfil API and control utility. The KPI have been reviewed and cleansed of features that were planned back 20 years ago and never implemented. The pfil(9) internals have been made opaque to protocols with only returned types and function declarations exposed. The KPI is made more strict, but at the same time more extensible, as kernel uses same command structures that userland ioctl uses. In nutshell [KA]PI is about declaring filtering points, declaring filters and linking and unlinking them together. New [KA]PI makes it possible to reconfigure pfil(9) configuration: change order of hooks, rehook filter from one filtering point to a different one, disconnect a hook on output leaving it on input only, prepend/append a filter to existing list of filters. Now it possible for a single packet filter to provide multiple rulesets that may be linked to different points. Think of per-interface ACLs in Cisco or Juniper. None of existing packet filters yet support that, however limited usage is already possible, e.g. default ruleset can be moved to single interface, as soon as interface would pride their filtering points. Another future feature is possiblity to create pfil heads, that provide not an mbuf pointer but just a memory pointer with length. That would allow filtering at very early stages of a packet lifecycle, e.g. when packet has just been received by a NIC and no mbuf was yet allocated. Differential Revision: https://reviews.freebsd.org/D18951
|
#
62484790 |
|
14-Aug-2018 |
Andrey V. Elsukov <ae@FreeBSD.org> |
Restore ability to send ICMP and ICMPv6 redirects. It was lost when tryforward appeared. Now ip[6]_tryforward will be enabled only when sending redirects for corresponding IP version is disabled via sysctl. Otherwise will be used default forwarding function. PR: 221137 Submitted by: mckay@ MFC after: 2 weeks
|
#
6040822c |
|
30-Jul-2018 |
Alan Somers <asomers@FreeBSD.org> |
Make timespecadd(3) and friends public The timespecadd(3) family of macros were imported from NetBSD back in r35029. However, they were initially guarded by #ifdef _KERNEL. In the meantime, we have grown at least 28 syscalls that use timespecs in some way, leading many programs both inside and outside of the base system to redefine those macros. It's better just to make the definitions public. Our kernel currently defines two-argument versions of timespecadd and timespecsub. NetBSD, OpenBSD, and FreeDesktop.org's libbsd, however, define three-argument versions. Solaris also defines a three-argument version, but only in its kernel. This revision changes our definition to match the common three-argument version. Bump _FreeBSD_version due to the breaking KPI change. Discussed with: cem, jilles, ian, bde Differential Revision: https://reviews.freebsd.org/D14725
|
#
4f6c66cc |
|
23-May-2018 |
Matt Macy <mmacy@FreeBSD.org> |
UDP: further performance improvements on tx Cumulative throughput while running 64 netperf -H $DUT -t UDP_STREAM -- -m 1 on a 2x8x2 SKL went from 1.1Mpps to 2.5Mpps Single stream throughput increases from 910kpps to 1.18Mpps Baseline: https://people.freebsd.org/~mmacy/2018.05.11/udpsender2.svg - Protect read access to global ifnet list with epoch https://people.freebsd.org/~mmacy/2018.05.11/udpsender3.svg - Protect short lived ifaddr references with epoch https://people.freebsd.org/~mmacy/2018.05.11/udpsender4.svg - Convert if_afdata read lock path to epoch https://people.freebsd.org/~mmacy/2018.05.11/udpsender5.svg A fix for the inpcbhash contention is pending sufficient time on a canary at LLNW. Reviewed by: gallatin Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D15409
|
#
d7c5a620 |
|
18-May-2018 |
Matt Macy <mmacy@FreeBSD.org> |
ifnet: Replace if_addr_lock rwlock with epoch + mutex Run on LLNW canaries and tested by pho@ gallatin: Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5 based ConnectX 4-LX NIC, I see an almost 12% improvement in received packet rate, and a larger improvement in bytes delivered all the way to userspace. When the host receiving 64 streams of netperf -H $DUT -t UDP_STREAM -- -m 1, I see, using nstat -I mce0 1 before the patch: InMpps OMpps InGbs OGbs err TCP Est %CPU syscalls csw irq GBfree 4.98 0.00 4.42 0.00 4235592 33 83.80 4720653 2149771 1235 247.32 4.73 0.00 4.20 0.00 4025260 33 82.99 4724900 2139833 1204 247.32 4.72 0.00 4.20 0.00 4035252 33 82.14 4719162 2132023 1264 247.32 4.71 0.00 4.21 0.00 4073206 33 83.68 4744973 2123317 1347 247.32 4.72 0.00 4.21 0.00 4061118 33 80.82 4713615 2188091 1490 247.32 4.72 0.00 4.21 0.00 4051675 33 85.29 4727399 2109011 1205 247.32 4.73 0.00 4.21 0.00 4039056 33 84.65 4724735 2102603 1053 247.32 After the patch InMpps OMpps InGbs OGbs err TCP Est %CPU syscalls csw irq GBfree 5.43 0.00 4.20 0.00 3313143 33 84.96 5434214 1900162 2656 245.51 5.43 0.00 4.20 0.00 3308527 33 85.24 5439695 1809382 2521 245.51 5.42 0.00 4.19 0.00 3316778 33 87.54 5416028 1805835 2256 245.51 5.42 0.00 4.19 0.00 3317673 33 90.44 5426044 1763056 2332 245.51 5.42 0.00 4.19 0.00 3314839 33 88.11 5435732 1792218 2499 245.52 5.44 0.00 4.19 0.00 3293228 33 91.84 5426301 1668597 2121 245.52 Similarly, netperf reports 230Mb/s before the patch, and 270Mb/s after the patch Reviewed by: gallatin Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D15366
|
#
effaab88 |
|
23-Mar-2018 |
Kristof Provost <kp@FreeBSD.org> |
netpfil: Introduce PFIL_FWD flag Forwarded packets passed through PFIL_OUT, which made it difficult for firewalls to figure out if they were forwarding or producing packets. This in turn is an issue for pf for IPv6 fragment handling: it needs to call ip6_output() or ip6_forward() to handle the fragments. Figuring out which was difficult (and until now, incorrect). Having pfil distinguish the two removes an ugly piece of code from pf. Introduce a new variant of the netpfil callbacks with a flags variable, which has PFIL_FWD set for forwarded packets. This allows pf to reliably work out if a packet is forwarded. Reviewed by: ae, kevans Differential Revision: https://reviews.freebsd.org/D13715
|
#
68e0e5a6 |
|
05-Feb-2018 |
Andrey V. Elsukov <ae@FreeBSD.org> |
Modify ip6_get_prevhdr() to be able use it safely. Instead of returning pointer to the previous header, return its offset. In frag6_input() use m_copyback() and determined offset to store next header instead of accessing to it by pointer and assuming that the memory is contiguous. In rip6_input() use offset returned by ip6_get_prevhdr() instead of calculating it from pointers arithmetic, because IP header can belong to another mbuf in the chain. Reported by: Maxime Villard <max at m00nbsd dot net> Reviewed by: kp MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D14158
|
#
2164def6 |
|
29-Jan-2018 |
Andrey V. Elsukov <ae@FreeBSD.org> |
Do not skip scope zone violation check, when mbuf has M_FASTFWD_OURS flag. When mbuf has M_FASTFWD_OURS flag, this means that a destination address is our local, but we still need to pass scope zone violation check, because protocol level expects that IPv6 link-local addresses have embedded scope zone indexes. This should fix the problem, when ipfw is used to forward packets to local address and source address of a packet is IPv6 LLA. Reported by: sbruno MFC after: 3 weeks
|
#
51369649 |
|
20-Nov-2017 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
sys: further adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 3-Clause license. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, superceed or replace the license texts. Special thanks to Wind River for providing access to "The Duke of Highlander" tool: an older (2014) run over FreeBSD tree was useful as a starting point.
|
#
06193f0b |
|
07-Nov-2017 |
Konstantin Belousov <kib@FreeBSD.org> |
Use hardware timestamps to report packet timestamps for SO_TIMESTAMP and other similar socket options. Provide new control message SCM_TIME_INFO to supply information about timestamp. Currently it indicates that the timestamp was hardware-assisted and high-precision, for software timestamps the message is not returned. Reserved fields are added to ABI to report additional info about it, it is expected that raw hardware clock value might be useful for some applications. Reviewed by: gallatin (previous version), hselasky Sponsored by: Mellanox Technologies MFC after: 2 weeks X-Differential revision: https://reviews.freebsd.org/D12638
|
#
fbbd9655 |
|
28-Feb-2017 |
Warner Losh <imp@FreeBSD.org> |
Renumber copyright clause 4 Renumber cluase 4 to 3, per what everybody else did when BSD granted them permission to remove clause 3. My insistance on keeping the same numbering for legal reasons is too pedantic, so give up on that point. Submitted by: Jan Schaumann <jschauma@stevens.edu> Pull Request: https://github.com/freebsd/freebsd/pull/96
|
#
fcf59617 |
|
06-Feb-2017 |
Andrey V. Elsukov <ae@FreeBSD.org> |
Merge projects/ipsec into head/. Small summary ------------- o Almost all IPsec releated code was moved into sys/netipsec. o New kernel modules added: ipsec.ko and tcpmd5.ko. New kernel option IPSEC_SUPPORT added. It enables support for loading and unloading of ipsec.ko and tcpmd5.ko kernel modules. o IPSEC_NAT_T option was removed. Now NAT-T support is enabled by default. The UDP_ENCAP_ESPINUDP_NON_IKE encapsulation type support was removed. Added TCP/UDP checksum handling for inbound packets that were decapsulated by transport mode SAs. setkey(8) modified to show run-time NAT-T configuration of SA. o New network pseudo interface if_ipsec(4) added. For now it is build as part of ipsec.ko module (or with IPSEC kernel). It implements IPsec virtual tunnels to create route-based VPNs. o The network stack now invokes IPsec functions using special methods. The only one header file <netipsec/ipsec_support.h> should be included to declare all the needed things to work with IPsec. o All IPsec protocols handlers (ESP/AH/IPCOMP protosw) were removed. Now these protocols are handled directly via IPsec methods. o TCP_SIGNATURE support was reworked to be more close to RFC. o PF_KEY SADB was reworked: - now all security associations stored in the single SPI namespace, and all SAs MUST have unique SPI. - several hash tables added to speed up lookups in SADB. - SADB now uses rmlock to protect access, and concurrent threads can do SA lookups in the same time. - many PF_KEY message handlers were reworked to reflect changes in SADB. - SADB_UPDATE message was extended to support new PF_KEY headers: SADB_X_EXT_NEW_ADDRESS_SRC and SADB_X_EXT_NEW_ADDRESS_DST. They can be used by IKE daemon to change SA addresses. o ipsecrequest and secpolicy structures were cardinally changed to avoid locking protection for ipsecrequest. Now we support only limited number (4) of bundled SAs, but they are supported for both INET and INET6. o INPCB security policy cache was introduced. Each PCB now caches used security policies to avoid SP lookup for each packet. o For inbound security policies added the mode, when the kernel does check for full history of applied IPsec transforms. o References counting rules for security policies and security associations were changed. The proper SA locking added into xform code. o xform code was also changed. Now it is possible to unregister xforms. tdb_xxx structures were changed and renamed to reflect changes in SADB/SPDB, and changed rules for locking and refcounting. Reviewed by: gnn, wblock Obtained from: Yandex LLC Relnotes: yes Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D9352
|
#
339efd75 |
|
16-Jan-2017 |
Maxim Sobolev <sobomax@FreeBSD.org> |
Add a new socket option SO_TS_CLOCK to pick from several different clock sources to return timestamps when SO_TIMESTAMP is enabled. Two additional clock sources are: o nanosecond resolution realtime clock (equivalent of CLOCK_REALTIME); o nanosecond resolution monotonic clock (equivalent of CLOCK_MONOTONIC). In addition to this, this option provides unified interface to get bintime (equivalent of using SO_BINTIME), except it also supported with IPv6 where SO_BINTIME has never been supported. The long term plan is to depreciate SO_BINTIME and move everything to using SO_TS_CLOCK. Idea for this enhancement has been briefly discussed on the Net session during dev summit in Ottawa last June and the general input was positive. This change is believed to benefit network benchmarks/profiling as well as other scenarios where precise time of arrival measurement is necessary. There are two regression test cases as part of this commit: one extends unix domain test code (unix_cmsg) to test new SCM_XXX types and another one implementis totally new test case which exchanges UDP packets between two processes using both conventional methods (i.e. calling clock_gettime(2) before recv(2) and after send(2)), as well as using setsockopt()+recv() in receive path. The resulting delays are checked for sanity for all supported clock types. Reviewed by: adrian, gnn Differential Revision: https://reviews.freebsd.org/D9171
|
#
ad9f4d6a |
|
19-Dec-2016 |
Andrey V. Elsukov <ae@FreeBSD.org> |
ip[6]_tryforward does inbound and outbound packet firewall processing. This can lead to change of mbuf pointer (packet filter could do m_pullup(), NAT, etc). Also in case of change of destination address, tryforward can decide that packet should be handled by local system. In this case modified mbuf can be returned to the ip[6]_input(). To handle this correctly, check M_FASTFWD_OURS flag after return from ip[6]_tryforward. And if it is present, update variables that depend from mbuf pointer and skip another inbound firewall processing. No objection from: #network MFC after: 3 weeks Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D8764
|
#
8a030e9c |
|
12-Dec-2016 |
Andrey V. Elsukov <ae@FreeBSD.org> |
Modify IPv6 statistic accounting in ip6_input(). Add rcvif local variable to keep inbound interface pointer. Count ifs6_in_discard errors in all "goto bad" cases. Now it will count errors even if mbuf was freed. Modify all places where m->m_pkthdr.rcvif is used to use local rcvif variable. Obtained from: Yandex LLC MFC after: 1 month
|
#
5a1842a2 |
|
12-Dec-2016 |
Andrey V. Elsukov <ae@FreeBSD.org> |
Add ip6_tryforward() - a run to completion forwarding implementation for IPv6. It gets performance benefits from reduced number of checks. It doesn't copy mbuf to be able send ICMPv6 error message, because it keeps mbuf unchanged until the moment, when the route decision has been made. It doesn't do IPsec checks, and when some IPsec security policies present, ip6_input() uses normal slow path. Reviewed by: bz, gnn Obtained from: Yandex LLC MFC after: 1 month Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D8527
|
#
3e146575 |
|
21-Oct-2016 |
Michael Tuexen <tuexen@FreeBSD.org> |
Make ICMPv6 hard error handling for TCP consistent with the ICMPv4 handling. Ensure that: * Protocol unreachable errors are handled by indicating ECONNREFUSED to the TCP user for both IPv4 and IPv6. These were ignored for IPv6. * Communication prohibited errors are handled by indicating ECONNREFUSED to the TCP user for both IPv4 and IPv6. These were ignored for IPv6. * Hop Limited exceeded errors are handled by indicating EHOSTUNREACH to the TCP user for both IPv4 and IPv6. For IPv6 the TCP connected was dropped but errno wasn't set. Reviewed by: gallatin, rrs MFC after: 1 month Sponsored by: Netflix Differential Revision: 7904
|
#
7aeccebc |
|
15-Jul-2016 |
Andrey V. Elsukov <ae@FreeBSD.org> |
Add net.inet6.ip6.intr_queue_maxlen sysctl. It can be used to change netisr queue limit for IPv6 at runtime. Obtained from: Yandex LLC MFC after: 2 weeks Sponsored by: Yandex LLC
|
#
89856f7e |
|
21-Jun-2016 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
Get closer to a VIMAGE network stack teardown from top to bottom rather than removing the network interfaces first. This change is rather larger and convoluted as the ordering requirements cannot be separated. Move the pfil(9) framework to SI_SUB_PROTO_PFIL, move Firewalls and related modules to their own SI_SUB_PROTO_FIREWALL. Move initialization of "physical" interfaces to SI_SUB_DRIVERS, move virtual (cloned) interfaces to SI_SUB_PSEUDO. Move Multicast to SI_SUB_PROTO_MC. Re-work parts of multicast initialisation and teardown, not taking the huge amount of memory into account if used as a module yet. For interface teardown we try to do as many of them as we can on SI_SUB_INIT_IF, but for some this makes no sense, e.g., when tunnelling over a higher layer protocol such as IP. In that case the interface has to go along (or before) the higher layer protocol is shutdown. Kernel hhooks need to go last on teardown as they may be used at various higher layers and we cannot remove them before we cleaned up the higher layers. For interface teardown there are multiple paths: (a) a cloned interface is destroyed (inside a VIMAGE or in the base system), (b) any interface is moved from a virtual network stack to a different network stack ("vmove"), or (c) a virtual network stack is being shut down. All code paths go through if_detach_internal() where we, depending on the vmove flag or the vnet state, make a decision on how much to shut down; in case we are destroying a VNET the individual protocol layers will cleanup their own parts thus we cannot do so again for each interface as we end up with, e.g., double-frees, destroying locks twice or acquiring already destroyed locks. When calling into protocol cleanups we equally have to tell them whether they need to detach upper layer protocols ("ulp") or not (e.g., in6_ifdetach()). Provide or enahnce helper functions to do proper cleanup at a protocol rather than at an interface level. Approved by: re (hrs) Obtained from: projects/vnet Reviewed by: gnn, jhb Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D6747
|
#
4c105402 |
|
08-Jun-2016 |
Andrey V. Elsukov <ae@FreeBSD.org> |
Cleanup unneded include "opt_ipfw.h". It was used for conditional build IPFIREWALL_FORWARD support. But IPFIREWALL_FORWARD option was removed a long time ago.
|
#
484149de |
|
03-Jun-2016 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
Introduce a per-VNET flag to enable/disable netisr prcessing on that VNET. Add accessor functions to toggle the state per VNET. The base system (vnet0) will always enable itself with the normal registration. We will share the registered protocol handlers in all VNETs minimising duplication and management. Upon disabling netisr processing for a VNET drain the netisr queue from packets for that VNET. Update netisr consumers to (de)register on a per-VNET start/teardown using VNET_SYS(UN)INIT functionality. The change should be transparent for non-VIMAGE kernels. Reviewed by: gnn (, hiren) Obtained from: projects/vnet MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D6691
|
#
3f58662d |
|
01-Jun-2016 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
The pr_destroy field does not allow us to run the teardown code in a specific order. VNET_SYSUNINITs however are doing exactly that. Thus remove the VIMAGE conditional field from the domain(9) protosw structure and replace it with VNET_SYSUNINITs. This also allows us to change some order and to make the teardown functions file local static. Also convert divert(4) as it uses the same mechanism ip(4) and ip6(4) use internally. Slightly reshuffle the SI_SUB_* fields in kernel.h and add a new ones, e.g., for pfil consumers (firewalls), partially for this commit and for others to come. Reviewed by: gnn, tuexen (sctp), jhb (kernel.h) Obtained from: projects/vnet MFC after: 2 weeks X-MFC: do not remove pr_destroy Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D6652
|
#
9901091e |
|
22-Mar-2016 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
Mfp4 @180378: Factor out nd6 and in6_attach initialization to their own files. Also move destruction into those files though still called from the central initialization. Sponsored by: CK Software GmbH Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Reviewed by: gnn Differential Revision: https://reviews.freebsd.org/D5033
|
#
0ea826e0 |
|
22-Jan-2016 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
MFp4 @180892: With pr_destroy being gone, call ip6_destroy from an ordered NET_SYSUNINT. Make ip6_destroy() static as well. Sponsored by: The FreeBSD Foundation
|
#
1f12da0e |
|
22-Jan-2016 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
Just checkpoint the WIP in order to be able to make the tree update easier. Note: this is currently not in a usable state as certain teardown parts are not called and the DOMAIN rework is missing. More to come soon and find its way to head. Obtained from: P4 //depot/user/bz/vimage/... Sponsored by: The FreeBSD Foundation
|
#
ef91a976 |
|
25-Nov-2015 |
Andrey V. Elsukov <ae@FreeBSD.org> |
Overhaul if_enc(4) and make it loadable in run-time. Use hhook(9) framework to achieve ability of loading and unloading if_enc(4) kernel module. INET and INET6 code on initialization registers two helper hooks points in the kernel. if_enc(4) module uses these helper hook points and registers its hooks. IPSEC code uses these hhook points to call helper hooks implemented in if_enc(4).
|
#
aaa46574 |
|
06-Nov-2015 |
Adrian Chadd <adrian@FreeBSD.org> |
[netinet6]: Create a new IPv6 netisr which expects the frames to have been verified. This is required for fragments and encapsulated data (eg tunneling) to be redistributed to the RSS bucket based on the eventual IPv6 header and protocol (TCP, UDP, etc) header. * Add an mbuf tag with the state of IPv6 options parsing before the frame is queued into the direct dispatch handler; * Continue processing and complete the frame reception in the correct RSS bucket / netisr context. Testing results are in the phabricator review. Differential Revision: https://reviews.freebsd.org/D3563 Submitted by: Tiwei Bie <btw@mail.ustc.edu.cn>
|
#
68bb8d62 |
|
06-Sep-2015 |
Adrian Chadd <adrian@FreeBSD.org> |
Add support for receiving flowtype, flowid and RSS bucket information as part of recvmsg(). Submitted by: Tiwei Bie <btw@mail.ustc.edu.cn> Differential Revision: https://reviews.freebsd.org/D3562
|
#
0be18915 |
|
29-Aug-2015 |
Adrian Chadd <adrian@FreeBSD.org> |
Implement RSS hashing/re-hashing for IPv6 ingress packets. This mirrors the basic IPv4 implementation - IPv6 packets under RSS now are checked for a correct RSS hash and if one isn't provided, it's done in software. This only handles the initial receive - it doesn't yet handle reinjecting / rehashing packets after being decapsulated from various tunneling setups. That'll come in some follow-up work. For non-RSS users, this is almost a giant no-op. It does change a couple of ipv6 methods to use const mbuf * instead of mbuf * but it doesn't have any functional changes. So, the following now occurs: * If the NIC doesn't do any RSS hashing, it's all done in software. Single-queue, non-RSS NICs will now have the RX path distributed into multiple receive netisr queues. * If the NIC provides the wrong hash (eg only IPv6 hash when we needed an IPv6 TCP hash, or IPv6 UDP hash when we expected IPv6 hash) then the hash is recalculated. * .. if the hash is recalculated, it'll end up being injected into the correct netisr queue for v6 processing. Submitted by: Tiwei Bie <btw@mail.ustc.edu.cn> Differential Revision: https://reviews.freebsd.org/D3504
|
#
cc0a3c8c |
|
29-Jul-2015 |
Andrey V. Elsukov <ae@FreeBSD.org> |
Convert in_ifaddr_lock and in6_ifaddr_lock to rmlock. Both are used to protect access to IP addresses lists and they can be acquired for reading several times per packet. To reduce lock contention it is better to use rmlock here. Reviewed by: gnn (previous version) Obtained from: Yandex LLC Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D3149
|
#
8f1beb88 |
|
04-Mar-2015 |
Andrey V. Elsukov <ae@FreeBSD.org> |
Fix deadlock in IPv6 PCB code. When several threads are trying to send datagram to the same destination, but fragmentation is disabled and datagram size exceeds link MTU, ip6_output() calls pfctlinput2(PRC_MSGSIZE). It does notify all sockets wanted to know MTU to this destination. And since all threads hold PCB lock while sending, taking the lock for each PCB in the in6_pcbnotify() leads to deadlock. RFC 3542 p.11.3 suggests notify all application wanted to receive IPV6_PATHMTU ancillary data for each ICMPv6 packet too big message. But it doesn't require this, when we don't receive ICMPv6 message. Change ip6_notify_pmtu() function to be able use it directly from ip6_output() to notify only one socket, and to notify all sockets when ICMPv6 packet too big message received. PR: 197059 Differential Revision: https://reviews.freebsd.org/D1949 Reviewed by: no objection from #network Obtained from: Yandex LLC MFC after: 1 week Sponsored by: Yandex LLC
|
#
3e88eb90 |
|
08-Nov-2014 |
Andrey V. Elsukov <ae@FreeBSD.org> |
Remove ip6_getdstifaddr() and all functions to work with auxiliary data. It isn't safe to keep unreferenced ifaddrs. Use in6ifa_ifwithaddr() to determine ifaddr corresponding to destination address. Since currently we keep addresses with embedded scope zone, in6ifa_ifwithaddr is called with zero zoneid and marked with XXX. Also remove route and lle lookups from ip6_input. Use in6ifa_ifwithaddr() instead. Sponsored by: Yandex LLC
|
#
8c3cfe0b |
|
04-Nov-2014 |
Alexander V. Chernikov <melifaro@FreeBSD.org> |
Hide 'struct rtentry' and all its macro inside new header: net/route_internal.h The goal is to make its opaque for all code except route/rtsock and proto domain _rmx.
|
#
8f5a8818 |
|
07-Aug-2014 |
Kevin Lo <kevlo@FreeBSD.org> |
Merge 'struct ip6protosw' and 'struct protosw' into one. Now we have only one protocol switch structure that is shared between ipv4 and ipv6. Phabric: D476 Reviewed by: jhb
|
#
36d55f0f |
|
26-Apr-2014 |
Alexander V. Chernikov <melifaro@FreeBSD.org> |
Unify sa_equal() macro usage. MFC after: 2 weeks
|
#
52c57247 |
|
17-Apr-2014 |
Andrey V. Elsukov <ae@FreeBSD.org> |
Remove unused variable. PR: 173521 MFC after: 1 week Sponsored by: Yandex LLC
|
#
e4c77ca0 |
|
13-Feb-2014 |
Andrey V. Elsukov <ae@FreeBSD.org> |
Drop packets to multicast address whose scop field contains the reserved value 0. MFC after: 1 week Sponsored by: Yandex LLC
|
#
5d6d7e75 |
|
07-Feb-2014 |
Gleb Smirnoff <glebius@FreeBSD.org> |
o Revamp API between flowtable and netinet, netinet6. - ip_output() and ip_output6() simply call flowtable_lookup(), passing mbuf and address family. That's the only code under #ifdef FLOWTABLE in the protocols code now. o Revamp statistics gathering and export. - Remove hand made pcpu stats, and utilize counter(9). - Snapshot of statistics is available via 'netstat -rs'. - All sysctls are moved into net.flowtable namespace, since spreading them over net.inet isn't correct. o Properly separate at compile time INET and INET6 parts. o General cleanup. - Remove chain of multiple flowtables. We simply have one for IPv4 and one for IPv6. - Flowtables are allocated in flowtable.c, symbols are static. - With proper argument to SYSINIT() we no longer need flowtable_ready. - Hash salt doesn't need to be per-VNET. - Removed rudimentary debugging, which use quite useless in dtrace era. The runtime behavior of flowtable shouldn't be changed by this commit. Sponsored by: Netflix Sponsored by: Nginx, Inc.
|
#
54366c0b |
|
25-Nov-2013 |
Attilio Rao <attilio@FreeBSD.org> |
- For kernel compiled only with KDTRACE_HOOKS and not any lock debugging option, unbreak the lock tracing release semantic by embedding calls to LOCKSTAT_PROFILE_RELEASE_LOCK() direclty in the inlined version of the releasing functions for mutex, rwlock and sxlock. Failing to do so skips the lockstat_probe_func invokation for unlocking. - As part of the LOCKSTAT support is inlined in mutex operation, for kernel compiled without lock debugging options, potentially every consumer must be compiled including opt_kdtrace.h. Fix this by moving KDTRACE_HOOKS into opt_global.h and remove the dependency by opt_kdtrace.h for all files, as now only KDTRACE_FRAMES is linked there and it is only used as a compile-time stub [0]. [0] immediately shows some new bug as DTRACE-derived support for debug in sfxge is broken and it was never really tested. As it was not including correctly opt_kdtrace.h before it was never enabled so it was kept broken for a while. Fix this by using a protection stub, leaving sfxge driver authors the responsibility for fixing it appropriately [1]. Sponsored by: EMC / Isilon storage division Discussed with: rstone [0] Reported by: rstone [1] Discussed with: philip
|
#
76039bc8 |
|
26-Oct-2013 |
Gleb Smirnoff <glebius@FreeBSD.org> |
The r48589 promised to remove implicit inclusion of if_var.h soon. Prepare to this event, adding if_var.h to files that do need it. Also, include all includes that now are included due to implicit pollution via if_var.h Sponsored by: Netflix Sponsored by: Nginx, Inc.
|
#
7caf4ab7 |
|
15-Oct-2013 |
Gleb Smirnoff <glebius@FreeBSD.org> |
- Utilize counter(9) to accumulate statistics on interface addresses. Add four counters to struct ifaddr. This kills '+=' on a variables shared between processors for every packet. - Nuke struct if_data from struct ifaddr. - In ip_input() do not put a reference on ifaddr, instead update statistics right now in place and do IN_IFADDR_RUNLOCK(). These removes atomic(9) for every packet. [1] - To properly support NET_RT_IFLISTL sysctl used by getifaddrs(3), in rtsock.c fill if_data fields using counter_u64_fetch(). - Accidentially fix bug in COMPAT_32 version of NET_RT_IFLISTL, which took if_data not from the ifaddr, but from ifaddr's ifnet. [2] Submitted by: melifaro [1], pluknet[2] Sponsored by: Netflix Sponsored by: Nginx, Inc.
|
#
ca695e08 |
|
15-Oct-2013 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Remove useless check of ia6 against NULL, right after dereferencing it.
|
#
4d3dfd45 |
|
13-Sep-2013 |
Mikolaj Golub <trociny@FreeBSD.org> |
Unregister inet/inet6 pfil hooks on vnet destroy. Discussed with: andre Approved by: re (rodrigc)
|
#
57f60867 |
|
25-Aug-2013 |
Mark Johnston <markj@FreeBSD.org> |
Implement the ip, tcp, and udp DTrace providers. The probe definitions use dynamic translation so that their arguments match the definitions for these providers in Solaris and illumos. Thus, existing scripts for these providers should work unmodified on FreeBSD. Tested by: gnn, hiren MFC after: 1 month
|
#
a786f679 |
|
09-Jul-2013 |
Andrey V. Elsukov <ae@FreeBSD.org> |
Migrate structs ip6stat, icmp6stat and rip6stat to PCPU counters.
|
#
eca4d720 |
|
16-Apr-2013 |
Andrey V. Elsukov <ae@FreeBSD.org> |
Use IP6S_M2MMAX macro.
|
#
9cb8d207 |
|
09-Apr-2013 |
Andrey V. Elsukov <ae@FreeBSD.org> |
Use IP6STAT_INC/IP6STAT_DEC macros to update ip6 stats. MFC after: 1 week
|
#
10e5acc3 |
|
15-Mar-2013 |
Gleb Smirnoff <glebius@FreeBSD.org> |
- Use m_getcl() instead of hand allocating. - Do not calculate constant length values at run time, CTASSERT() their sanity. - Remove superfluous cleaning of mbuf fields after allocation. - Replace compat macros with function calls. Sponsored by: Nginx, Inc.
|
#
aa481811 |
|
14-Mar-2013 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Use m_getcl() instead of hand made allocation. Sponsored by: Nginx, Inc.
|
#
68eba526 |
|
15-Dec-2012 |
Andrey V. Elsukov <ae@FreeBSD.org> |
In additional to the tailq of IPv6 addresses add the hash table. For now use 256 buckets and fnv_hash function. Use xor'ed 32-bit s6_addr32 parts of in6_addr structure as a hash key. Update in6_localip and in6_is_addr_deprecated to use hash table for fastest lookup. Sponsored by: Yandex LLC Discussed with: dwmalone, glebius, bz
|
#
eb1b1807 |
|
05-Dec-2012 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Mechanically substitute flags from historic mbuf allocator with malloc(9) flags within sys. Exceptions: - sys/contrib not touched - sys/mbuf.h edited manually
|
#
73cb2f38 |
|
15-Nov-2012 |
Andrey V. Elsukov <ae@FreeBSD.org> |
Reduce the overhead of locking, use IF_AFDATA_RLOCK() when we are doing simple lookups. Sponsored by: Yandex LLC MFC after: 1 week
|
#
ffdbf9da |
|
01-Nov-2012 |
Andrey V. Elsukov <ae@FreeBSD.org> |
Remove the recently added sysctl variable net.pfil.forward. Instead, add protocol specific mbuf flags M_IP_NEXTHOP and M_IP6_NEXTHOP. Use them to indicate that the mbuf's chain contains the PACKET_TAG_IPFORWARD tag. And do a tag lookup only when this flag is set. Suggested by: andre
|
#
c1de64a4 |
|
25-Oct-2012 |
Andrey V. Elsukov <ae@FreeBSD.org> |
Remove the IPFIREWALL_FORWARD kernel option and make possible to turn on the related functionality in the runtime via the sysctl variable net.pfil.forward. It is turned off by default. Sponsored by: Yandex LLC Discussed with: net@ MFC after: 2 weeks
|
#
b36dcb9a |
|
12-Jun-2012 |
Michael Tuexen <tuexen@FreeBSD.org> |
Deliver IPV6_TCLASS, IPV6_HOPLIMIT and IPV6_PKTINFO cmsgs (if requested) on IPV6 sockets, which have been marked to be not IPV6_V6ONLY, for each received IPV4 packet. MFC after: 3 days
|
#
3df0e439 |
|
03-Jun-2012 |
Maksim Yevmenkin <emax@FreeBSD.org> |
Plug reference leak. Interface routes are refcounted as packets move through the stack, and there's garbage collection tied to it so that route changes can safely propagate while traffic is flowing. In our setup, we weren't changing or deleting any routes, but the refcounting logic in ip6_input() was wrong and caused a reference leak on every inbound V6 packet. This eventually caused a 32bit overflow, and the resulting 0 value caused the garbage collection to run on the active route. That then snowballed into the panic. Reviewed by: scottl MFC after: 3 days
|
#
ae145050 |
|
24-May-2012 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
MFp4 bz_ipv6_fast: Factor out Hop-By-Hop option processing. It's still not heavily used, it reduces the footprint of ip6_input() and makes ip6_input() more readable. Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems Reviewed by: gnn (as part of the whole) MFC After: 3 days
|
#
d3443481 |
|
24-May-2012 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
MFp4 bz_ipv6_fast: Hide the ip6aux functions. The only one referenced outside ip6_input.c is not compiled in yet (__notyet__) in route6.c (r235954). We do have accessor functions that should be used. Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems Reviewed by: gnn (as part of the whole) MFC After: 3 days X-MFC: KPI?
|
#
dead1956 |
|
02-Mar-2012 |
Hiroki Sato <hrs@FreeBSD.org> |
Allow to configure net.inet6.ip6.{accept_rtadv,no_radr} by the loader tunables as well because they have to be configured before interface initialization for AF_INET6.
|
#
81d5d46b |
|
03-Feb-2012 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
Add multi-FIB IPv6 support to the core network stack supplementing the original IPv4 implementation from r178888: - Use RT_DEFAULT_FIB in the IPv4 implementation where noticed. - Use rt*fib() KPI with explicit RT_DEFAULT_FIB where applicable in the NFS code. - Use the new in6_rt* KPI in TCP, gif(4), and the IPv6 network stack where applicable. - Split in6_rtqtimo() and in6_mtutimo() as done in IPv4 and equally prevent multiple initializations of callouts in in6_inithead(). - Use wrapper functions where needed to preserve the current KPI to ease MFCs. Use BURN_BRIDGES to indicate expected future cleanup. - Fix (related) comments (both technical or style). - Convert to rtinit() where applicable and only use custom loops where currently not possible otherwise. - Multicast group, most neighbor discovery address actions and faith(4) are locked to the default FIB. Individual IPv6 addresses will only appear in the default FIB, however redirect information and prefixes of connected subnets are automatically propagated to all FIBs by default (mimicking IPv4 behavior as closely as possible). Sponsored by: Cisco Systems, Inc.
|
#
137f91e8 |
|
05-Jan-2012 |
John Baldwin <jhb@FreeBSD.org> |
Convert all users of IF_ADDR_LOCK to use new locking macros that specify either a read lock or write lock. Reviewed by: bz MFC after: 2 weeks
|
#
8a006adb |
|
20-Aug-2011 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
Add support for IPv6 to ipfw fwd: Distinguish IPv4 and IPv6 addresses and optional port numbers in user space to set the option for the correct protocol family. Add support in the kernel for carrying the new IPv6 destination address and port. Add support to TCP and UDP for IPv6 and fix UDP IPv4 to not change the address in the IP header. Add support for IPv6 forwarding to a non-local destination. Add a regession test uitilizing VIMAGE to check all 20 possible combinations I could think of. Obtained from: David Dolson at Sandvine Incorporated (original version for ipfw fwd IPv6 support) Sponsored by: Sandvine Incorporated PR: bin/117214 MFC after: 4 weeks Approved by: re (kib)
|
#
86905204 |
|
08-Jun-2011 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
Add the missing call to ip6_ipsec_filtertunnel() to be able to control whether decapsulated IPsec packets will be passed to pfil again depending on the setting of the net.ip6.ipsec6.filtertunnel sysctl. PR: kern/157670 Submitted by: Manuel Kasper (mk neon1.net) MFC after: 2 weeks
|
#
6d79f3f6 |
|
27-Nov-2010 |
Rebecca Cran <brucec@FreeBSD.org> |
Fix more continuous/contiguous typos (cf. r215955)
|
#
a7d5f7eb |
|
19-Oct-2010 |
Jamie Gritton <jamie@FreeBSD.org> |
A new jail(8) with a configuration file, to replace the work currently done by /etc/rc.d/jail.
|
#
1b48d245 |
|
02-Sep-2010 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
MFp4 CH=183052 183053 183258: In protosw we define pr_protocol as short, while on the wire it is an uint8_t. That way we can have "internal" protocols like DIVERT, SEND or gaps for modules (PROTO_SPACER). Switch ipproto_{un,}register to accept a short protocol number(*) and do an upfront check for valid boundries. With this we also consistently report EPROTONOSUPPORT for out of bounds protocols, as we did for proto == 0. This allows a caller to not error for this case, which is especially important if we want to automatically call these from domain handling. (*) the functions have been without any in-tree consumer since the initial introducation, so this is considered save. Implement ip6proto_{un,}register() similarly to their legacy IP counter parts to allow modules to hook up dynamically. Reviewed by: philip, will MFC after: 1 week
|
#
101235dc |
|
21-Jul-2010 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
Since r186119 IP6 input counters for octets and packets were not working anymore. In addition more checks and operations were missing. In case lla_lookup results in a match, get the ifaddr to update the statistics counters, and check that the address is neither tentative, duplicate or otherwise invalid before accepting the packet. If ok, record the address information in the mbuf. [ as is done in case lla_lookup does not return a result and we go through the FIB ]. Reported by: remko Tested by: remko MFC after: 2 weeks
|
#
83e711ec |
|
16-May-2010 |
Kip Macy <kmacy@FreeBSD.org> |
allocate ipv6 flows from the ipv6 flow zone reported by: rrs@ MFC after: 3 days
|
#
94162961 |
|
13-May-2010 |
Kip Macy <kmacy@FreeBSD.org> |
do a proper fix Pointed out by: np@ MFC after: 3 days
|
#
fc21c49a |
|
13-May-2010 |
Kip Macy <kmacy@FreeBSD.org> |
fix compile error on some builds by doing the equivalent of an "extern VNET_DEFINE" without "__used" MFC after: 3 days
|
#
69381083 |
|
10-May-2010 |
Kip Macy <kmacy@FreeBSD.org> |
boot time size the flowtable MFC after: 3 days
|
#
77931dd5 |
|
09-May-2010 |
Kip Macy <kmacy@FreeBSD.org> |
Add flowtable support to IPv6 Tested by: qingli@ Reviewed by: qingli@ MFC after: 3 days
|
#
480d7c6c |
|
06-May-2010 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
MFC r207369: MFP4: @176978-176982, 176984, 176990-176994, 177441 "Whitspace" churn after the VIMAGE/VNET whirls. Remove the need for some "init" functions within the network stack, like pim6_init(), icmp_init() or significantly shorten others like ip6_init() and nd6_init(), using static initialization again where possible and formerly missed. Move (most) variables back to the place they used to be before the container structs and VIMAGE_GLOABLS (before r185088) and try to reduce the diff to stable/7 and earlier as good as possible, to help out-of-tree consumers to update from 6.x or 7.x to 8 or 9. This also removes some header file pollution for putatively static global variables. Revert VIMAGE specific changes in ipfilter::ip_auth.c, that are no longer needed. Reviewed by: jhb Discussed with: rwatson Sponsored by: The FreeBSD Foundation Sponsored by: CK Software GmbH
|
#
82cea7e6 |
|
29-Apr-2010 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
MFP4: @176978-176982, 176984, 176990-176994, 177441 "Whitspace" churn after the VIMAGE/VNET whirls. Remove the need for some "init" functions within the network stack, like pim6_init(), icmp_init() or significantly shorten others like ip6_init() and nd6_init(), using static initialization again where possible and formerly missed. Move (most) variables back to the place they used to be before the container structs and VIMAGE_GLOABLS (before r185088) and try to reduce the diff to stable/7 and earlier as good as possible, to help out-of-tree consumers to update from 6.x or 7.x to 8 or 9. This also removes some header file pollution for putatively static global variables. Revert VIMAGE specific changes in ipfilter::ip_auth.c, that are no longer needed. Reviewed by: jhb Discussed with: rwatson Sponsored by: The FreeBSD Foundation Sponsored by: CK Software GmbH MFC after: 6 days
|
#
2ae7ec29 |
|
07-Feb-2010 |
Julian Elischer <julian@FreeBSD.org> |
MFC of 197952 and 198075 Virtualize the pfil hooks so that different jails may chose different packet filters. ALso allows ipfw to be enabled on on ejail and disabled on another. In 8.0 it's a global setting. and Unbreak the VIMAGE build with IPSEC, broken with r197952 by virtualizing the pfil hooks. For consistency add the V_ to virtualize the pfil hooks in here as well.
|
#
3745cc73 |
|
08-Jan-2010 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
Replace several instances of 'if (!a & b)' with 'if (!(a &b))' in order to silence newer GCC versions.
|
#
0b4b0b0f |
|
10-Oct-2009 |
Julian Elischer <julian@FreeBSD.org> |
Virtualize the pfil hooks so that different jails may chose different packet filters. ALso allows ipfw to be enabled on on ejail and disabled on another. In 8.0 it's a global setting. Sitting aroung in tree waiting to commit for: 2 months MFC after: 2 months
|
#
a283298c |
|
12-Sep-2009 |
Hiroki Sato <hrs@FreeBSD.org> |
Improve flexibility of receiving Router Advertisement and automatic link-local address configuration: - Convert a sysctl net.inet6.ip6.accept_rtadv to one for the default value of a per-IF flag ND6_IFF_ACCEPT_RTADV, not a global knob. The default value of the sysctl is 0. - Add a new per-IF flag ND6_IFF_AUTO_LINKLOCAL and convert a sysctl net.inet6.ip6.auto_linklocal to one for its default value. The default value of the sysctl is 1. - Make ND6_IFF_IFDISABLED more robust. It can be used to disable IPv6 functionality of an interface now. - Receiving RA is allowed if ip6_forwarding==0 *and* ND6_IFF_ACCEPT_RTADV is set on that interface. The former condition will be revisited later to support a "host + router" box like IPv6 CPE router. The current behavior is compatible with the older releases of FreeBSD. - The ifconfig(8) now supports these ND6 flags as well as "nud", "prefer_source", and "disabled" in ndp(8). The ndp(8) now supports "auto_linklocal". Discussed with: bz and jinmei Reviewed by: bz MFC after: 3 days
|
#
4090e9b2 |
|
30-Aug-2009 |
Qing Li <qingli@FreeBSD.org> |
MFC r196569 When multiple interfaces exist in the system, with each interface having an IPv6 address assigned to it, and if an incoming packet received on one interface has a packet destination address that belongs to another interface, the routing table is consulted to determine how to reach this packet destination. Since the packet destination is an interface address, the route table will return a host route with the loopback interface as rt_ifp. The input code must recognize this fact, instead of using the loopback interface, the input code performs a search to find the right interface that owns the given IPv6 address. Reviewed by: bz, gnn, kmacy Approved by: re
|
#
7bcee7f3 |
|
26-Aug-2009 |
Qing Li <qingli@FreeBSD.org> |
When multiple interfaces exist in the system, with each interface having an IPv6 address assigned to it, and if an incoming packet received on one interface has a packet destination address that belongs to another interface, the routing table is consulted to determine how to reach this packet destination. Since the packet destination is an interface address, the route table will return a host route with the loopback interface as rt_ifp. The input code must recognize this fact, instead of using the loopback interface, the input code performs a search to find the right interface that owns the given IPv6 address. Reviewed by: bz, gnn, kmacy MFC after: immediately
|
#
530c0060 |
|
01-Aug-2009 |
Robert Watson <rwatson@FreeBSD.org> |
Merge the remainder of kern_vimage.c and vimage.h into vnet.c and vnet.h, we now use jails (rather than vimages) as the abstraction for virtualization management, and what remained was specific to virtual network stacks. Minor cleanups are done in the process, and comments updated to reflect these changes. Reviewed by: bz Approved by: re (vimage blanket)
|
#
0a4747d4 |
|
20-Jul-2009 |
Robert Watson <rwatson@FreeBSD.org> |
Garbage collect vnet module registrations that have neither constructors nor destructors, as there's no actual work to do. In most cases, the constructors weren't needed because of the existing protocol initialization functions run by net_init_domain() as part of VNET_MOD_NET, or they were eliminated when support for static initialization of virtualized globals was added. Garbage collect dependency references to modules without constructors or destructors, notably VNET_MOD_INET and VNET_MOD_INET6. Reviewed by: bz Approved by: re (vimage blanket)
|
#
1e77c105 |
|
16-Jul-2009 |
Robert Watson <rwatson@FreeBSD.org> |
Remove unused VNET_SET() and related macros; only VNET_GET() is ever actually used. Rename VNET_GET() to VNET() to shorten variable references. Discussed with: bz, julian Reviewed by: bz Approved by: re (kensmith, kib)
|
#
eddfbb76 |
|
14-Jul-2009 |
Robert Watson <rwatson@FreeBSD.org> |
Build on Jeff Roberson's linker-set based dynamic per-CPU allocator (DPCPU), as suggested by Peter Wemm, and implement a new per-virtual network stack memory allocator. Modify vnet to use the allocator instead of monolithic global container structures (vinet, ...). This change solves many binary compatibility problems associated with VIMAGE, and restores ELF symbols for virtualized global variables. Each virtualized global variable exists as a "reference copy", and also once per virtual network stack. Virtualized global variables are tagged at compile-time, placing the in a special linker set, which is loaded into a contiguous region of kernel memory. Virtualized global variables in the base kernel are linked as normal, but those in modules are copied and relocated to a reserved portion of the kernel's vnet region with the help of a the kernel linker. Virtualized global variables exist in per-vnet memory set up when the network stack instance is created, and are initialized statically from the reference copy. Run-time access occurs via an accessor macro, which converts from the current vnet and requested symbol to a per-vnet address. When "options VIMAGE" is not compiled into the kernel, normal global ELF symbols will be used instead and indirection is avoided. This change restores static initialization for network stack global variables, restores support for non-global symbols and types, eliminates the need for many subsystem constructors, eliminates large per-subsystem structures that caused many binary compatibility issues both for monitoring applications (netstat) and kernel modules, removes the per-function INIT_VNET_*() macros throughout the stack, eliminates the need for vnet_symmap ksym(2) munging, and eliminates duplicate definitions of virtualized globals under VIMAGE_GLOBALS. Bump __FreeBSD_version and update UPDATING. Portions submitted by: bz Reviewed by: bz, zec Discussed with: gnn, jamie, jeff, jhb, julian, sam Suggested by: peter Approved by: re (kensmith)
|
#
d1da0a06 |
|
25-Jun-2009 |
Robert Watson <rwatson@FreeBSD.org> |
Add address list locking for in6_ifaddrhead/ia_link: as with locking for in_ifaddrhead, we stick with an rwlock for the time being, which we will revisit in the future with a possible move to rmlocks. Some pieces of code require significant further reworking to be safe from all classes of writer-writer races. Reviewed by: bz MFC after: 6 weeks
|
#
80af0152 |
|
24-Jun-2009 |
Robert Watson <rwatson@FreeBSD.org> |
Convert netinet6 to using queue(9) rather than hand-crafted linked lists for the global IPv6 address list (in6_ifaddr -> in6_ifaddrhead). Adopt the code styles and conventions present in netinet where possible. Reviewed by: gnn, bz MFC after: 6 weeks (possibly not MFCable?)
|
#
8c0fec80 |
|
23-Jun-2009 |
Robert Watson <rwatson@FreeBSD.org> |
Modify most routines returning 'struct ifaddr *' to return references rather than pointers, requiring callers to properly dispose of those references. The following routines now return references: ifaddr_byindex ifa_ifwithaddr ifa_ifwithbroadaddr ifa_ifwithdstaddr ifa_ifwithnet ifaof_ifpforaddr ifa_ifwithroute ifa_ifwithroute_fib rt_getifa rt_getifa_fib IFP_TO_IA ip_rtaddr in6_ifawithifp in6ifa_ifpforlinklocal in6ifa_ifpwithaddr in6_ifadd carp_iamatch6 ip6_getdstifaddr Remove unused macro which didn't have required referencing: IFP_TO_IA6 This closes many small races in which changes to interface or address lists while an ifaddr was in use could lead to use of freed memory (etc). In a few cases, add missing if_addr_list locking required to safely acquire references. Because of a lack of deep copying support, we accept a race in which an in6_ifaddr pointed to by mbuf tags and extracted with ip6_getdstifaddr() doesn't hold a reference while in transmit. Once we have mbuf tag deep copy support, this can be fixed. Reviewed by: bz Obtained from: Apple, Inc. (portions) MFC after: 6 weeks (portions)
|
#
8d8bc018 |
|
08-Jun-2009 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
After r193232 rt_tables in vnet.h are no longer indirectly dependent on the ROUTETABLES kernel option thus there is no need to include opt_route.h anymore in all consumers of vnet.h and no longer depend on it for module builds. Remove the hidden include in flowtable.h as well and leave the two explicit #includes in ip_input.c and ip_output.c.
|
#
bc29160d |
|
08-Jun-2009 |
Marko Zec <zec@FreeBSD.org> |
Introduce an infrastructure for dismantling vnet instances. Vnet modules and protocol domains may now register destructor functions to clean up and release per-module state. The destructor mechanisms can be triggered by invoking "vimage -d", or a future equivalent command which will be provided via the new jail framework. While this patch introduces numerous placeholder destructor functions, many of those are currently incomplete, thus leaking memory or (even worse) failing to stop all running timers. Many of such issues are already known and will be incrementaly fixed over the next weeks in smaller incremental commits. Apart from introducing new fields in structs ifnet, domain, protosw and vnet_net, which requires the kernel and modules to be rebuilt, this change should have no impact on nooptions VIMAGE builds, since vnet destructors can only be called in VIMAGE kernels. Moreover, destructor functions should be in general compiled in only in options VIMAGE builds, except for kernel modules which can be safely kldunloaded at run time. Bump __FreeBSD_version to 800097. Reviewed by: bz, julian Approved by: rwatson, kib (re), julian (mentor)
|
#
d825c793 |
|
01-Jun-2009 |
Marko Zec <zec@FreeBSD.org> |
V_loif is not an array but a pure pointer, so treat it as such. Reviewed by: bz Approved by: julian (mentor)
|
#
d4b5cae4 |
|
01-Jun-2009 |
Robert Watson <rwatson@FreeBSD.org> |
Reimplement the netisr framework in order to support parallel netisr threads: - Support up to one netisr thread per CPU, each processings its own workstream, or set of per-protocol queues. Threads may be bound to specific CPUs, or allowed to migrate, based on a global policy. In the future it would be desirable to support topology-centric policies, such as "one netisr per package". - Allow each protocol to advertise an ordering policy, which can currently be one of: NETISR_POLICY_SOURCE: packets must maintain ordering with respect to an implicit or explicit source (such as an interface or socket). NETISR_POLICY_FLOW: make use of mbuf flow identifiers to place work, as well as allowing protocols to provide a flow generation function for mbufs without flow identifers (m2flow). Falls back on NETISR_POLICY_SOURCE if now flow ID is available. NETISR_POLICY_CPU: allow protocols to inspect and assign a CPU for each packet handled by netisr (m2cpuid). - Provide utility functions for querying the number of workstreams being used, as well as a mapping function from workstream to CPU ID, which protocols may use in work placement decisions. - Add explicit interfaces to get and set per-protocol queue limits, and get and clear drop counters, which query data or apply changes across all workstreams. - Add a more extensible netisr registration interface, in which protocols declare 'struct netisr_handler' structures for each registered NETISR_ type. These include name, handler function, optional mbuf to flow ID function, optional mbuf to CPU ID function, queue limit, and ordering policy. Padding is present to allow these to be expanded in the future. If no queue limit is declared, then a default is used. - Queue limits are now per-workstream, and raised from the previous IFQ_MAXLEN default of 50 to 256. - All protocols are updated to use the new registration interface, and with the exception of netnatm, default queue limits. Most protocols register as NETISR_POLICY_SOURCE, except IPv4 and IPv6, which use NETISR_POLICY_FLOW, and will therefore take advantage of driver- generated flow IDs if present. - Formalize a non-packet based interface between interface polling and the netisr, rather than having polling pretend to be two protocols. Provide two explicit hooks in the netisr worker for start and end events for runs: netisr_poll() and netisr_pollmore(), as well as a function, netisr_sched_poll(), to allow the polling code to schedule netisr execution. DEVICE_POLLING still embeds single-netisr assumptions in its implementation, so for now if it is compiled into the kernel, a single and un-bound netisr thread is enforced regardless of tunable configuration. In the default configuration, the new netisr implementation maintains the same basic assumptions as the previous implementation: a single, un-bound worker thread processes all deferred work, and direct dispatch is enabled by default wherever possible. Performance measurement shows a marginal performance improvement over the old implementation due to the use of batched dequeue. An rmlock is used to synchronize use and registration/unregistration using the framework; currently, synchronized use is disabled (replicating current netisr policy) due to a measurable 3%-6% hit in ping-pong micro-benchmarking. It will be enabled once further rmlock optimization has taken place. However, in practice, netisrs are rarely registered or unregistered at runtime. A new man page for netisr will follow, but since one doesn't currently exist, it hasn't been updated. This change is not appropriate for MFC, although the polling shutdown handler should be merged to 7-STABLE. Bump __FreeBSD_version. Reviewed by: bz
|
#
29dc7bc6 |
|
27-May-2009 |
Bruce M Simpson <bms@FreeBSD.org> |
Merge final round of MLD changes from p4: ip6_input.c, in6.h: * Add netinet6-specific mbuf flag M_RTALERT_MLD, shadowing M_PROTO6. * Always set this flag if HBH Router Alert option is present for MLD, even when not forwarding. icmp6.c: * In icmp6_input(), spell m->m_pkthdr.rcvif as ifp to be consistent. * Use scope ID for verifying input. Do not apply SSM filters here, no inpcb. * Check for M_RTALERT_MLD when validating MLD traffic, as we can't see IPv6 hop options outside of ip6_input(). in6_mcast.c: * Use KAME scope/zone ID in in6_multi. * Update net.inet6.ip6.mcast.filters implementation to use scope IDs for comparisons. * Fix scope ID treatment in multicast socket option processing. Scope IDs passed in from userland will be ignored as other less ambiguous APIs exist for specifying the link. * Tighten userland input checks in IPv6 SSM delta and full-state ops. * Source filter embedded scope IDs need to be revisited, for now just clear them and ignore them on input. * Adapt KAME behaviour of looking up the scope ID in the default zone for multicast leaves, when the interface is ambiguous. mld6.c: * Tighten origin checks on MLD traffic as per RFC3810 Section 6.2: * ip6_src MAY be the unspecified address for MLDv1 reports. * ip6_src MAY have link-local address scope for MLDv1 reports, MLDv1 queries, and MLDv2 queries. * Perform address field validation *before* accepting queries. * Use KAME scope/zone ID in query/report processing. * Break const correctness for mld_v1_input_report(), mld_v1_input_query() as we temporarily modify the input mbuf chain. * Clear the scope ID before handoff to userland MLD daemon. * Fix MLDv1 old querier present timer processing. With the protocol defaults, hosts should revert to MLDv2 after 260s. * Add net.inet6.mld.v1enable sysctl, default to on. ifmcstat.c: * Use sysctl by default; -K requests kvm(3) if so compiled. mld.4: * Connect man page to build. Tested using PCS.
|
#
f6dfe47a |
|
30-Apr-2009 |
Marko Zec <zec@FreeBSD.org> |
Permit buiding kernels with options VIMAGE, restricted to only a single active network stack instance. Turning on options VIMAGE at compile time yields the following changes relative to default kernel build: 1) V_ accessor macros for virtualized variables resolve to structure fields via base pointers, instead of being resolved as fields in global structs or plain global variables. As an example, V_ifnet becomes: options VIMAGE: ((struct vnet_net *) vnet_net)->_ifnet default build: vnet_net_0._ifnet options VIMAGE_GLOBALS: ifnet 2) INIT_VNET_* macros will declare and set up base pointers to be used by V_ accessor macros, instead of resolving to whitespace: INIT_VNET_NET(ifp->if_vnet); becomes struct vnet_net *vnet_net = (ifp->if_vnet)->mod_data[VNET_MOD_NET]; 3) Memory for vnet modules registered via vnet_mod_register() is now allocated at run time in sys/kern/kern_vimage.c, instead of per vnet module structs being declared as globals. If required, vnet modules can now request the framework to provide them with allocated bzeroed memory by filling in the vmi_size field in their vmi_modinfo structures. 4) structs socket, ifnet, inpcbinfo, tcpcb and syncache_head are extended to hold a pointer to the parent vnet. options VIMAGE builds will fill in those fields as required. 5) curvnet is introduced as a new global variable in options VIMAGE builds, always pointing to the default and only struct vnet. 6) struct sysctl_oid has been extended with additional two fields to store major and minor virtualization module identifiers, oid_v_subs and oid_v_mod. SYSCTL_V_* family of macros will fill in those fields accordingly, and store the offset in the appropriate vnet container struct in oid_arg1. In sysctl handlers dealing with virtualized sysctls, the SYSCTL_RESOLVE_V_ARG1() macro will compute the address of the target variable and make it available in arg1 variable for further processing. Unused fields in structs vnet_inet, vnet_inet6 and vnet_ipfw have been deleted. Reviewed by: bz, rwatson Approved by: julian (mentor)
|
#
33cde130 |
|
29-Apr-2009 |
Bruce M Simpson <bms@FreeBSD.org> |
Bite the bullet, and make the IPv6 SSM and MLDv2 mega-commit: import from p4 bms_netdev. Summary of changes: * Connect netinet6/in6_mcast.c to build. The legacy KAME KPIs are mostly preserved. * Eliminate now dead code from ip6_output.c. Don't do mbuf bingo, we are not going to do RFC 2292 style CMSG tricks for multicast options as they are not required by any current IPv6 normative reference. * Refactor transports (UDP, raw_ip6) to do own mcast filtering. SCTP, TCP unaffected by this change. * Add ip6_msource, in6_msource structs to in6_var.h. * Hookup mld_ifinfo state to in6_ifextra, allocate from domifattach path. * Eliminate IN6_LOOKUP_MULTI(), it is no longer referenced. Kernel consumers which need this should use in6m_lookup(). * Refactor IPv6 socket group memberships to use a vector (like IPv4). * Update ifmcstat(8) for IPv6 SSM. * Add witness lock order for IN6_MULTI_LOCK. * Move IN6_MULTI_LOCK out of lower ip6_output()/ip6_input() paths. * Introduce IP6STAT_ADD/SUB/INC/DEC as per rwatson's IPv4 cleanup. * Update carp(4) for new IPv6 SSM KPIs. * Virtualize ip6_mrouter socket. Changes mostly localized to IPv6 MROUTING. * Don't do a local group lookup in MROUTING. * Kill unused KAME prototypes in6_purgemkludge(), in6_restoremkludge(). * Preserve KAME DAD timer jitter behaviour in MLDv1 compatibility mode. * Bump __FreeBSD_version to 800084. * Update UPDATING. NOTE WELL: * This code hasn't been tested against real MLDv2 queriers (yet), although the on-wire protocol has been verified in Wireshark. * There are a few unresolved issues in the socket layer APIs to do with scope ID propagation. * There is a LOR present in ip6_output()'s use of in6_setscope() which needs to be resolved. See comments in mld6.c. This is believed to be benign and can't be avoided for the moment without re-introducing an indirect netisr. This work was mostly derived from the IGMPv3 implementation, and has been sponsored by a third party.
|
#
3f795dd3 |
|
23-Apr-2009 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
Compare protosw pointer with NULL. MFC after: 1 month
|
#
bfe1aba4 |
|
10-Apr-2009 |
Marko Zec <zec@FreeBSD.org> |
Introduce vnet module registration / initialization framework with dependency tracking and ordering enforcement. With this change, per-vnet initialization functions introduced with r190787 are no longer directly called from traditional initialization functions (which cc in most cases inlined to pre-r190787 code), but are instead registered via the vnet framework first, and are invoked only after all prerequisite modules have been initialized. In the long run, this framework should allow us to both initialize and dismantle multiple vnet instances in a correct order. The problem this change aims to solve is how to replay the initialization sequence of various network stack components, which have been traditionally triggered via different mechanisms (SYSINIT, protosw). Note that this initialization sequence was and still can be subtly different depending on whether certain pieces of code have been statically compiled into the kernel, loaded as modules by boot loader, or kldloaded at run time. The approach is simple - we record the initialization sequence established by the traditional mechanisms whenever vnet_mod_register() is called for a particular vnet module. The vnet_mod_register_multi() variant allows a single initializer function to be registered multiple times but with different arguments - currently this is only used in kern/uipc_domain.c by net_add_domain() with different struct domain * as arguments, which allows for protosw-registered initialization routines to be invoked in a correct order by the new vnet initialization framework. For the purpose of identifying vnet modules, each vnet module has to have a unique ID, which is statically assigned in sys/vimage.h. Dynamic assignment of vnet module IDs is not supported yet. A vnet module may specify a single prerequisite module at registration time by filling in the vmi_dependson field of its vnet_modinfo struct with the ID of the module it depends on. Unless specified otherwise, all vnet modules depend on VNET_MOD_NET (container for ifnet list head, rt_tables etc.), which thus has to and will always be initialized first. The framework will panic if it detects any unresolved dependencies before completing system initialization. Detection of unresolved dependencies for vnet modules registered after boot (kldloaded modules) is not provided. Note that the fact that each module can specify only a single prerequisite may become problematic in the long run. In particular, INET6 depends on INET being already instantiated, due to TCP / UDP structures residing in INET container. IPSEC also depends on INET, which will in turn additionally complicate making INET6-only kernel configs a reality. The entire registration framework can be compiled out by turning on the VIMAGE_GLOBALS kernel config option. Reviewed by: bz Approved by: julian (mentor)
|
#
1ed81b73 |
|
06-Apr-2009 |
Marko Zec <zec@FreeBSD.org> |
First pass at separating per-vnet initializer functions from existing functions for initializing global state. At this stage, the new per-vnet initializer functions are directly called from the existing global initialization code, which should in most cases result in compiler inlining those new functions, hence yielding a near-zero functional change. Modify the existing initializer functions which are invoked via protosw, like ip_init() et. al., to allow them to be invoked multiple times, i.e. per each vnet. Global state, if any, is initialized only if such functions are called within the context of vnet0, which will be determined via the IS_DEFAULT_VNET(curvnet) check (currently always true). While here, V_irtualize a few remaining global UMA zones used by net/netinet/netipsec networking code. While it is not yet clear to me or anybody else whether this is the right thing to do, at this stage this makes the code more readable, and makes it easier to track uncollected UMA-zone-backed objects on vnet removal. In the long run, it's quite possible that some form of shared use of UMA zone pools among multiple vnets should be considered. Bump __FreeBSD_version due to changes in layout of structs vnet_ipfw, vnet_inet and vnet_net. Approved by: julian (mentor)
|
#
33553d6e |
|
27-Feb-2009 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
For all files including net/vnet.h directly include opt_route.h and net/route.h. Remove the hidden include of opt_route.h and net/route.h from net/vnet.h. We need to make sure that both opt_route.h and net/route.h are included before net/vnet.h because of the way MRT figures out the number of FIBs from the kernel option. If we do not, we end up with the default number of 1 when including net/vnet.h and array sizes are wrong. This does not change the list of files which depend on opt_route.h but we can identify them now more easily.
|
#
09f8c3ff |
|
01-Feb-2009 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
Remove the single global unlocked route cache ip6_forward_rt from the inet6 stack along with statistics and make sure we properly free the rt in all cases. While the current situation is not better performance wise it prevents panics seen more often these days. After more inet6 and ipsec cleanup we should be able to improve the situation again passing the rt to ip6_forward directly. Leave the ip6_forward_rt entry in struct vinet6 but mark it for removal. PR: kern/128247, kern/131038 MFC after: 25 days Committed from: Bugathon #6 Tested by: Denis Ahrens <denis@h3q.com> (different initial version)
|
#
351c4745 |
|
30-Jan-2009 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
Remove 4 entirely unsued ip6 variables. Leave then in struct vinet6 to not break the ABI with kernel modules but mark them for removal so we can do it in one batch when the time is right. MFC after: 1 month
|
#
f5d35259 |
|
21-Dec-2008 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
Correct variable name in comment. MFC after: 4 weeks
|
#
099d0bd3 |
|
18-Dec-2008 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
Only unlock the llentry if it is actually valid. Reported by: ed
|
#
fc384fa5 |
|
15-Dec-2008 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
Another step assimilating IPv[46] PCB code - directly use the inpcb names rather than the following IPv6 compat macros: in6pcb,in6p_sp, in6p_ip6_nxt,in6p_flowinfo,in6p_vflag, in6p_flags,in6p_socket,in6p_lport,in6p_fport,in6p_ppcb and sotoin6pcb(). Apart from removing duplicate code in netipsec, this is a pure whitespace, not a functional change. Discussed with: rwatson Reviewed by: rwatson (version before review requested changes) MFC after: 4 weeks (set the timer and see then)
|
#
6e6b3f7c |
|
14-Dec-2008 |
Qing Li <qingli@FreeBSD.org> |
This main goals of this project are: 1. separating L2 tables (ARP, NDP) from the L3 routing tables 2. removing as much locking dependencies among these layers as possible to allow for some parallelism in the search operations 3. simplify the logic in the routing code, The most notable end result is the obsolescent of the route cloning (RTF_CLONING) concept, which translated into code reduction in both IPv4 ARP and IPv6 NDP related modules, and size reduction in struct rtentry{}. The change in design obsoletes the semantics of RTF_CLONING, RTF_WASCLONE and RTF_LLINFO routing flags. The userland applications such as "arp" and "ndp" have been modified to reflect those changes. The output from "netstat -r" shows only the routing entries. Quite a few developers have contributed to this project in the past: Glebius Smirnoff, Luigi Rizzo, Alessandro Cerri, and Andre Oppermann. And most recently: - Kip Macy revised the locking code completely, thus completing the last piece of the puzzle, Kip has also been conducting active functional testing - Sam Leffler has helped me improving/refactoring the code, and provided valuable reviews - Julian Elischer setup the perforce tree for me and has helped me maintaining that branch before the svn conversion
|
#
1b193af6 |
|
13-Dec-2008 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
Second round of putting global variables, which were virtualized but formerly missed under VIMAGE_GLOBAL. Put the extern declarations of the virtualized globals under VIMAGE_GLOBAL as the globals themsevles are already. This will help by the time when we are going to remove the globals entirely. Sponsored by: The FreeBSD Foundation
|
#
385195c0 |
|
10-Dec-2008 |
Marko Zec <zec@FreeBSD.org> |
Conditionally compile out V_ globals while instantiating the appropriate container structures, depending on VIMAGE_GLOBALS compile time option. Make VIMAGE_GLOBALS a new compile-time option, which by default will not be defined, resulting in instatiations of global variables selected for V_irtualization (enclosed in #ifdef VIMAGE_GLOBALS blocks) to be effectively compiled out. Instantiate new global container structures to hold V_irtualized variables: vnet_net_0, vnet_inet_0, vnet_inet6_0, vnet_ipsec_0, vnet_netgraph_0, and vnet_gif_0. Update the VSYM() macro so that depending on VIMAGE_GLOBALS the V_ macros resolve either to the original globals, or to fields inside container structures, i.e. effectively #ifdef VIMAGE_GLOBALS #define V_rt_tables rt_tables #else #define V_rt_tables vnet_net_0._rt_tables #endif Update SYSCTL_V_*() macros to operate either on globals or on fields inside container structs. Extend the internal kldsym() lookups with the ability to resolve selected fields inside the virtualization container structs. This applies only to the fields which are explicitly registered for kldsym() visibility via VNET_MOD_DECLARE() and vnet_mod_register(), currently this is done only in sys/net/if.c. Fix a few broken instances of MODULE_GLOBAL() macro use in SCTP code, and modify the MODULE_GLOBAL() macro to resolve to V_ macros, which in turn result in proper code being generated depending on VIMAGE_GLOBALS. De-virtualize local static variables in sys/contrib/pf/net/pf_subr.c which were prematurely V_irtualized by automated V_ prepending scripts during earlier merging steps. PF virtualization will be done separately, most probably after next PF import. Convert a few variable initializations at instantiation to initialization in init functions, most notably in ipfw. Also convert TUNABLE_INT() initializers for V_ variables to TUNABLE_FETCH_INT() in initializer functions. Discussed at: devsummit Strassburg Reviewed by: bz, julian Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation
|
#
4b79449e |
|
02-Dec-2008 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
Rather than using hidden includes (with cicular dependencies), directly include only the header files needed. This reduces the unneeded spamming of various headers into lots of files. For now, this leaves us with very few modules including vnet.h and thus needing to depend on opt_route.h. Reviewed by: brooks, gnn, des, zec, imp Sponsored by: The FreeBSD Foundation
|
#
44e33a07 |
|
19-Nov-2008 |
Marko Zec <zec@FreeBSD.org> |
Change the initialization methodology for global variables scheduled for virtualization. Instead of initializing the affected global variables at instatiation, assign initial values to them in initializer functions. As a rule, initialization at instatiation for such variables should never be introduced again from now on. Furthermore, enclose all instantiations of such global variables in #ifdef VIMAGE_GLOBALS blocks. Essentialy, this change should have zero functional impact. In the next phase of merging network stack virtualization infrastructure from p4/vimage branch, the new initialization methology will allow us to switch between using global variables and their counterparts residing in virtualization containers with minimum code churn, and in the long run allow us to intialize multiple instances of such container structures. Discussed at: devsummit Strassburg Reviewed by: bz, julian Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation
|
#
d7f03759 |
|
19-Oct-2008 |
Ulf Lilleengen <lulf@FreeBSD.org> |
- Import the HEAD csup code which is the basis for the cvsmode work.
|
#
8b615593 |
|
02-Oct-2008 |
Marko Zec <zec@FreeBSD.org> |
Step 1.5 of importing the network stack virtualization infrastructure from the vimage project, as per plan established at devsummit 08/08: http://wiki.freebsd.org/Image/Notes200808DevSummit Introduce INIT_VNET_*() initializer macros, VNET_FOREACH() iterator macros, and CURVNET_SET() context setting macros, all currently resolving to NOPs. Prepare for virtualization of selected SYSCTL objects by introducing a family of SYSCTL_V_*() macros, currently resolving to their global counterparts, i.e. SYSCTL_V_INT() == SYSCTL_INT(). Move selected #defines from sys/sys/vimage.h to newly introduced header files specific to virtualized subsystems (sys/net/vnet.h, sys/netinet/vinet.h etc.). All the changes are verified to have zero functional impact at this point in time by doing MD5 comparision between pre- and post-change object files(*). (*) netipsec/keysock.c did not validate depending on compile time options. Implemented by: julian, bz, brooks, zec Reviewed by: julian, bz, brooks, kris, rwatson, ... Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation
|
#
603724d3 |
|
17-Aug-2008 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
Commit step 1 of the vimage project, (network stack) virtualization work done by Marko Zec (zec@). This is the first in a series of commits over the course of the next few weeks. Mark all uses of global variables to be virtualized with a V_ prefix. Use macros to map them back to their global names for now, so this is a NOP change only. We hope to have caught at least 85-90% of what is needed so we do not invalidate a lot of outstanding patches again. Obtained from: //depot/projects/vimage-commit2/... Reviewed by: brooks, des, ed, mav, julian, jamie, kris, rwatson, zec, ... (various people I forgot, different versions) md5 (with a bit of help) Sponsored by: NLnet Foundation, The FreeBSD Foundation X-MFC after: never V_Commit_Message_Reviewed_By: more people than the patch
|
#
48d48eb9 |
|
16-Aug-2008 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
Fix a regression introduced in r179289 splitting up ip6_savecontrol() into v4-only vs. v6-only inp_flags processing. When ip6_savecontrol_v4() is called from ip6_savecontrol() we were not passing back the **mp thus the information will be missing in userland. Istead of going with a *** as suggested in the PR we are returning **mp now and passing in the v4only flag as a pointer argument. PR: kern/126349 Reviewed by: rwatson, dwmalone
|
#
59dd72d0 |
|
03-Jul-2008 |
Robert Watson <rwatson@FreeBSD.org> |
Remove NETISR_MPSAFE, which allows specific netisr handlers to be directly dispatched without Giant, and add NETISR_FORCEQUEUE, which allows specific netisr handlers to always be dispatched via a queue (deferred). Mark the usb and if_ppp netisr handlers as NETISR_FORCEQUEUE, and explicitly acquire Giant in those handlers. Previously, any netisr handler not marked NETISR_MPSAFE would necessarily run deferred and with Giant acquired. This change removes Giant scaffolding from the netisr infrastructure, but NETISR_FORCEQUEUE allows non-MPSAFE handlers to continue to force deferred dispatch so as to avoid lock order reversals between their acqusition of Giant and any calling context. It is likely we will be able to remove NETISR_FORCEQUEUE once IFF_NEEDSGIANT is removed, as non-MPSAFE usb and if_ppp drivers will no longer be supported. Reviewed by: bz MFC after: 1 month X-MFC note: We can't remove NETISR_MPSAFE from stable/7 for KPI reasons, but the rest can go back.
|
#
aaa37a7e |
|
03-Jul-2008 |
Robert Watson <rwatson@FreeBSD.org> |
Remove GIANT_REQUIRED from IPv6 input, forward, and frag6 code. The frag6 code is believed to be MPSAFE, and leaving aside the IPv6 route cache in forwarding, Giant appears not to adequately synchronize the data structures in the input or forwarding paths.
|
#
0a2fe173 |
|
02-Jul-2008 |
Robert Watson <rwatson@FreeBSD.org> |
Set the IPv6 netisr handler as NETISR_MPSAFE on the basis that, despite there still being some well-known races in mld6 and nd6, running with Giant over the netisr handler provides little or not additional synchronization that might cause mld6 and nd6 to behave better.
|
#
9a38ba81 |
|
24-May-2008 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
Factor out the v4-only vs. the v6-only inp_flags processing in ip6_savecontrol in preparation for udp_append() to no longer need an WLOCK as we will no longer be modifying socket options. Requested by: rwatson Reviewed by: gnn MFC after: 10 days
|
#
9233d8f3 |
|
08-Jan-2008 |
David E. O'Brien <obrien@FreeBSD.org> |
un-__P()
|
#
b48287a3 |
|
10-Dec-2007 |
David E. O'Brien <obrien@FreeBSD.org> |
Clean up VCS Ids.
|
#
2a463222 |
|
05-Jul-2007 |
Xin LI <delphij@FreeBSD.org> |
Space cleanup Approved by: re (rwatson)
|
#
1272577e |
|
05-Jul-2007 |
Xin LI <delphij@FreeBSD.org> |
ANSIfy[1] plus some style cleanup nearby. Discussed with: gnn, rwatson Submitted by: Karl Sj?dahl - dunceor <dunceor gmail com> [1] Approved by: re (rwatson)
|
#
b2630c29 |
|
02-Jul-2007 |
George V. Neville-Neil <gnn@FreeBSD.org> |
Commit the change from FAST_IPSEC to IPSEC. The FAST_IPSEC option is now deprecated, as well as the KAME IPsec code. What was FAST_IPSEC is now IPSEC. Approved by: re Sponsored by: Secure Computing
|
#
2cb64cb2 |
|
01-Jul-2007 |
George V. Neville-Neil <gnn@FreeBSD.org> |
Commit IPv6 support for FAST_IPSEC to the tree. This commit includes only the kernel files, the rest of the files will follow in a second commit. Reviewed by: bz Approved by: re Supported by: Secure Computing
|
#
7eefde2c |
|
14-May-2007 |
JINMEI Tatuya <jinmei@FreeBSD.org> |
handle IPv6 router alert option contained in an incoming packet per option value so that unrecognized options are ignored as specified in RFC2711. (packets containing an MLD router alert option are passed to the upper layer as before). Approved by: gnn (mentor), ume (mentor)
|
#
6be2e366 |
|
24-Feb-2007 |
Bruce M Simpson <bms@FreeBSD.org> |
Make IPv6 multicast forwarding dynamically loadable from a GENERIC kernel. It is built in the same module as IPv4 multicast forwarding, i.e. ip_mroute.ko, if and only if IPv6 support is enabled for loadable modules. Export IPv6 forwarding structs to userland netstat(1) via sysctl(9).
|
#
1d54aa3b |
|
11-Dec-2006 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
MFp4: 92972, 98913 + one more change In ip6_sprintf no longer use and return one of eight static buffers for printing/logging ipv6 addresses. The caller now has to hand in a sufficiently large buffer as first argument.
|
#
43bc7a9c |
|
04-Aug-2006 |
Brooks Davis <brooks@FreeBSD.org> |
With exception of the if_name() macro, all definitions in net_osdep.h were unused or already in if_var.h so add if_name() to if_var.h and remove net_osdep.h along with all references to it. Longer term we may want to kill off if_name() entierly since all modern BSDs have if_xname variables rendering it unnecessicary.
|
#
656faadc |
|
12-May-2006 |
Max Laier <mlaier@FreeBSD.org> |
Remove ip6fw. Since ipfw has full functional IPv6 support now and - in contrast to ip6fw - is properly lockes, it is time to retire ip6fw.
|
#
604afec4 |
|
01-Feb-2006 |
Christian S.J. Peron <csjp@FreeBSD.org> |
Somewhat re-factor the read/write locking mechanism associated with the packet filtering mechanisms to use the new rwlock(9) locking API: - Drop the variables stored in the phil_head structure which were specific to conditions and the home rolled read/write locking mechanism. - Drop some includes which were used for condition variables - Drop the inline functions, and convert them to macros. Also, move these macros into pfil.h - Move pfil list locking macros intp phil.h as well - Rename ph_busy_count to ph_nhooks. This variable will represent the number of IN/OUT hooks registered with the pfil head structure - Define PFIL_HOOKED macro which evaluates to true if there are any hooks to be ran by pfil_run_hooks - In the IP/IP6 stacks, change the ph_busy_count comparison to use the new PFIL_HOOKED macro. - Drop optimization in pfil_run_hooks which checks to see if there are any hooks to be ran, and returns if not. This check is already performed by the IP stacks when they call: if (!PFIL_HOOKED(ph)) goto skip_hooks; - Drop in assertion which makes sure that the number of hooks never drops below 0 for good measure. This in theory should never happen, and if it does than there are problems somewhere - Drop special logic around PFIL_WAITOK because rw_wlock(9) does not sleep - Drop variables which support home rolled read/write locking mechanism from the IPFW firewall chain structure. - Swap out the read/write firewall chain lock internal to use the rwlock(9) API instead of our home rolled version - Convert the inlined functions to macros Reviewed by: mlaier, andre, glebius Thanks to: jhb for the new locking API
|
#
411babc6 |
|
25-Jan-2006 |
Hajimu UMEMOTO <ume@FreeBSD.org> |
don't embed scope id before running packet filters. Reported by: YAMAMOTO Takashi <yamt__at__mwd.biglobe.ne.jp> Obtained from: NetBSD MFC after: 1 week
|
#
5b27b045 |
|
19-Oct-2005 |
SUZUKI Shinsuke <suz@FreeBSD.org> |
supported an ndp command suboption to disable IPv6 in the given interface Obtained from: KAME Reviewd by: ume, gnn MFC after: 2 week
|
#
a1f7e5f8 |
|
24-Jul-2005 |
Hajimu UMEMOTO <ume@FreeBSD.org> |
scope cleanup. with this change - most of the kernel code will not care about the actual encoding of scope zone IDs and won't touch "s6_addr16[1]" directly. - similarly, most of the kernel code will not care about link-local scoped addresses as a special case. - scope boundary check will be stricter. For example, the current *BSD code allows a packet with src=::1 and dst=(some global IPv6 address) to be sent outside of the node, if the application do: s = socket(AF_INET6); bind(s, "::1"); sendto(s, some_global_IPv6_addr); This is clearly wrong, since ::1 is only meaningful within a single node, but the current implementation of the *BSD kernel cannot reject this attempt. Submitted by: JINMEI Tatuya <jinmei__at__isl.rdc.toshiba.co.jp> Obtained from: KAME
|
#
18b35df8 |
|
20-Jul-2005 |
Hajimu UMEMOTO <ume@FreeBSD.org> |
update comments: - RFC2292bis -> RFC3542 - typo fixes Submitted by: Keiichi SHIMA <keiichi__at__iijlab.net> Obtained from: KAME
|
#
6c011e4d |
|
15-Mar-2005 |
Sam Leffler <sam@FreeBSD.org> |
correct bounds check Noticed by: Coverity Prevent analysis tool
|
#
caf43b02 |
|
06-Jan-2005 |
Warner Losh <imp@FreeBSD.org> |
/* -> /*- for license, minor formatting changes, separate for KAME
|
#
f45cd79a |
|
19-Oct-2004 |
Andre Oppermann <andre@FreeBSD.org> |
Be more careful to only index valid IP protocols and be more verbose with comments.
|
#
d6a8d588 |
|
28-Sep-2004 |
Max Laier <mlaier@FreeBSD.org> |
Add an additional struct inpcb * argument to pfil(9) in order to enable passing along socket information. This is required to work around a LOR with the socket code which results in an easy reproducible hard lockup with debug.mpsafenet=1. This commit does *not* fix the LOR, but enables us to do so later. The missing piece is to turn the filter locking into a leaf lock and will follow in a seperate (later) commit. This will hopefully be MT5'ed in order to fix the problem for RELENG_5 in forseeable future. Suggested by: rwatson A lot of work by: csjp (he'd be even more helpful w/o mentor-reviews ;) Reviewed by: rwatson, csjp Tested by: -pf, -ipfw, LINT, csjp and myself MFC after: 3 days LOR IDs: 14 - 17 (not fixed yet)
|
#
c21fd232 |
|
27-Aug-2004 |
Andre Oppermann <andre@FreeBSD.org> |
Always compile PFIL_HOOKS into the kernel and remove the associated kernel compile option. All FreeBSD packet filters now use the PFIL_HOOKS API and thus it becomes a standard part of the network stack. If no hooks are connected the entire packet filter hooks section and related activities are jumped over. This removes any performance impact if no hooks are active. Both OpenBSD and DragonFlyBSD have integrated PFIL_HOOKS permanently as well.
|
#
c415679d |
|
22-Aug-2004 |
Robert Watson <rwatson@FreeBSD.org> |
Remove in6_prefix.[ch] and the contained router renumbering capability. The prefix management code currently resides in nd6, leaving only the unused router renumbering capability in the in6_prefix files. Removing it will make it easier for us to provide locking for the remainder of IPv6 by reducing the number of objects requiring synchronized access. This functionality has also been removed from NetBSD and OpenBSD. Submitted by: George Neville-Neil <gnn at neville-neil.com> Discussed with/approved by: suz, keiichi at kame.net, core at kame.net
|
#
1f44b0a1 |
|
14-Aug-2004 |
David Malone <dwmalone@FreeBSD.org> |
Get rid of the RANDOM_IP_ID option and make it a sysctl. NetBSD have already done this, so I have styled the patch on their work: 1) introduce a ip_newid() static inline function that checks the sysctl and then decides if it should return a sequential or random IP ID. 2) named the sysctl net.inet.ip.random_id 3) IPv6 flow IDs and fragment IDs are now always random. Flow IDs and frag IDs are significantly less common in the IPv6 world (ie. rarely generated per-packet), so there should be smaller performance concerns. The sysctl defaults to 0 (sequential IP IDs). Reviewed by: andre, silby, mlaier, ume Based on: NetBSD MFC after: 2 months
|
#
02b199f1 |
|
13-Jun-2004 |
Max Laier <mlaier@FreeBSD.org> |
Link ALTQ to the build and break with ABI for struct ifnet. Please recompile your (network) modules as well as any userland that might make sense of sizeof(struct ifnet). This does not change the queueing yet. These changes will follow in a seperate commit. Same with the driver changes, which need case by case evaluation. __FreeBSD_version bump will follow. Tested-by: (i386)LINT
|
#
3c751c1b |
|
02-Jun-2004 |
Hajimu UMEMOTO <ume@FreeBSD.org> |
do not check super user privilege in ip6_savecontrol. It is meaningless and can even be harmful. Obtained from: KAME MFC after: 3 days
|
#
f36cfd49 |
|
07-Apr-2004 |
Warner Losh <imp@FreeBSD.org> |
Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999 and email from Peter Wemm, Alan Cox and Robert Watson. Approved by: core, peter, alc, rwatson
|
#
43eb694a |
|
02-Mar-2004 |
Max Laier <mlaier@FreeBSD.org> |
Move PFIL_HOOKS and ipfw past the scope checks to allow easy redirection to linklocal. Obtained from: OpenBSD Reviewed by: ume Approved by: bms(mentor)
|
#
48850f29 |
|
02-Mar-2004 |
Hajimu UMEMOTO <ume@FreeBSD.org> |
scope awareness of ff01:: is not merged, yet. So, clear embeded form of scopeid for ff01:: for now. Pointed out by: mlaier
|
#
cfcea119 |
|
01-Mar-2004 |
Hajimu UMEMOTO <ume@FreeBSD.org> |
- reject incoming packets to an interface-local multicast address from the wire. - added a generic scope check, and removed checks for loopback src/dst addresses. Obtained from: KAME
|
#
efddf5c6 |
|
13-Feb-2004 |
Hajimu UMEMOTO <ume@FreeBSD.org> |
supported IPV6_RECVPATHMTU socket option. Obtained from: KAME
|
#
26d02ca7 |
|
20-Nov-2003 |
Andre Oppermann <andre@FreeBSD.org> |
Remove RTF_PRCLONING from routing table and adjust users of it accordingly. The define is left intact for ABI compatibility with userland. This is a pre-step for the introduction of tcp_hostcache. The network stack remains fully useable with this change. Reviewed by: sam (mentor), bms Reviewed by: -net, -current, core@kame.net (IPv6 parts) Approved by: re (scottl)
|
#
7902224c |
|
08-Nov-2003 |
Sam Leffler <sam@FreeBSD.org> |
o add a flags parameter to netisr_register that is used to specify whether or not the isr needs to hold Giant when running; Giant-less operation is also controlled by the setting of debug_mpsafenet o mark all netisr's except NETISR_IP as needing Giant o add a GIANT_REQUIRED assertion to the top of netisr's that need Giant o pickup Giant (when debug_mpsafenet is 1) inside ip_input before calling up with a packet o change netisr handling so swi_net runs w/o Giant; instead we grab Giant before invoking handlers based on whether the handler needs Giant o change netisr handling so that netisr's that are marked MPSAFE may have multiple instances active at a time o add netisr statistics for packets dropped because the isr is inactive Supported by: FreeBSD Foundation
|
#
8d996f28 |
|
31-Oct-2003 |
Hajimu UMEMOTO <ume@FreeBSD.org> |
initialize in6_tmpaddrtimer_ch. Obtained from: KAME
|
#
7fc91b3f |
|
30-Oct-2003 |
Hajimu UMEMOTO <ume@FreeBSD.org> |
add management part of address selection policy described in RFC3484. Obtained from: KAME
|
#
11de19f4 |
|
28-Oct-2003 |
Hajimu UMEMOTO <ume@FreeBSD.org> |
ip6_savecontrol() argument is redundant
|
#
1410779a |
|
28-Oct-2003 |
Hajimu UMEMOTO <ume@FreeBSD.org> |
hide m_tag, again. Requested by: sam
|
#
b2667576 |
|
28-Oct-2003 |
Hajimu UMEMOTO <ume@FreeBSD.org> |
make sure to accept only IPv6 packet. Obtained from: KAME
|
#
2a5aafce |
|
28-Oct-2003 |
Hajimu UMEMOTO <ume@FreeBSD.org> |
cleanup use of m_tag. Obtained from: KAME
|
#
f95d4633 |
|
24-Oct-2003 |
Hajimu UMEMOTO <ume@FreeBSD.org> |
Switch Advanced Sockets API for IPv6 from RFC2292 to RFC3542 (aka RFC2292bis). Though I believe this commit doesn't break backward compatibility againt existing binaries, it breaks backward compatibility of API. Now, the applications which use Advanced Sockets API such as telnet, ping6, mld6query and traceroute6 use RFC3542 API. Obtained from: KAME
|
#
9a4f9608 |
|
21-Oct-2003 |
Hajimu UMEMOTO <ume@FreeBSD.org> |
- change scope to zone. - change node-local to interface-local. - better error handling of address-to-scope mapping. - use in6_clearscope(). Obtained from: KAME
|
#
31b1bfe1 |
|
17-Oct-2003 |
Hajimu UMEMOTO <ume@FreeBSD.org> |
- add dom_if{attach,detach} framework. - transition to use ifp->if_afdata. Obtained from: KAME
|
#
e3124327 |
|
16-Oct-2003 |
Sam Leffler <sam@FreeBSD.org> |
fix horribly botched MFp4 merge
|
#
3c92002f |
|
16-Oct-2003 |
Sam Leffler <sam@FreeBSD.org> |
pfil hooks can modify packet contents so check if the destination address has been changed when PFIL_HOOKS is enabled and, if it has, arrange for the proper action by ip*_forward. Submitted by: Pyun YongHyeon Supported by: FreeBSD Foundation
|
#
020a816f |
|
10-Oct-2003 |
Hajimu UMEMOTO <ume@FreeBSD.org> |
fixed an endian bug on fragment header scanning Obtained from: KAME
|
#
953ad2fb |
|
10-Oct-2003 |
Hajimu UMEMOTO <ume@FreeBSD.org> |
nuke SCOPEDROUTING. Though it was there for a long time, it was never enabled.
|
#
7efe5d92 |
|
08-Oct-2003 |
Hajimu UMEMOTO <ume@FreeBSD.org> |
- fix typo in comments. - style. - NULL is not 0. - some variables were renamed. - nuke unused logic. (there is no functional change.) Obtained from: KAME
|
#
40e39bbb |
|
06-Oct-2003 |
Hajimu UMEMOTO <ume@FreeBSD.org> |
return(code) -> return (code) (reduce diffs against KAME)
|
#
b79274ba |
|
01-Oct-2003 |
Hajimu UMEMOTO <ume@FreeBSD.org> |
randomize IPv6 flowlabel when RANDOM_IP_ID is defined. Obtained from: KAME
|
#
18193b6f |
|
01-Oct-2003 |
Hajimu UMEMOTO <ume@FreeBSD.org> |
use arc4random()
|
#
134ea224 |
|
23-Sep-2003 |
Sam Leffler <sam@FreeBSD.org> |
o update PFIL_HOOKS support to current API used by netbsd o revamp IPv4+IPv6+bridge usage to match API changes o remove pfil_head instances from protosw entries (no longer used) o add locking o bump FreeBSD version for 3rd party modules Heavy lifting by: "Max Laier" <max@love2party.net> Supported by: FreeBSD Foundation Obtained from: NetBSD (bits of pfil.h and pfil.c)
|
#
9adc8e4d |
|
11-Mar-2003 |
Sam Leffler <sam@FreeBSD.org> |
correct malloc flag argument Reported by: Kris Kennaway <kris@obsecurity.org>
|
#
1cafed39 |
|
04-Mar-2003 |
Jonathan Lemon <jlemon@FreeBSD.org> |
Update netisr handling; Each SWI now registers its queue, and all queue drain routines are done by swi_net, which allows for better queue control at some future point. Packets may also be directly dispatched to a netisr instead of queued, this may be of interest at some installations, but currently defaults to off. Reviewed by: hsu, silby, jayanth, sam Sponsored by: DARPA, NAI Labs
|
#
a163d034 |
|
18-Feb-2003 |
Warner Losh <imp@FreeBSD.org> |
Back out M_* changes, per decision of the TRB. Approved by: trb
|
#
44956c98 |
|
21-Jan-2003 |
Alfred Perlstein <alfred@FreeBSD.org> |
Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0. Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
|
#
53d96a08 |
|
06-Jan-2003 |
Sam Leffler <sam@FreeBSD.org> |
don't reference a pkthdr after M_MOVE_PKTHDR has "remove it"; instead reference the pkthdr now in the destination of the move Sponsored by: Vernier Networks
|
#
9967cafc |
|
30-Dec-2002 |
Sam Leffler <sam@FreeBSD.org> |
Correct mbuf packet header propagation. Previously, packet headers were sometimes propagated using M_COPY_PKTHDR which actually did something between a "move" and a "copy" operation. This is replaced by M_MOVE_PKTHDR (which copies the pkthdr contents and "removes" it from the source mbuf) and m_dup_pkthdr which copies the packet header contents including any m_tag chain. This corrects numerous problems whereby mbuf tags could be lost during packet manipulations. These changes also introduce arguments to m_tag_copy and m_tag_copy_chain to specify if the tag copy work should potentially block. This introduces an incompatibility with openbsd which we may want to revisit. Note that move/dup of packet headers does not handle target mbufs that have a cluster bound to them. We may want to support this; for now we watch for it with an assert. Finally, M_COPYFLAGS was updated to include M_FIRSTFRAG|M_LASTFRAG. Supported by: Vernier Networks Reviewed by: Robert Watson <rwatson@FreeBSD.org>
|
#
86fea6be |
|
19-Dec-2002 |
Bosko Milekic <bmilekic@FreeBSD.org> |
o Untangle the confusion with the malloc flags {M_WAITOK, M_NOWAIT} and the mbuf allocator flags {M_TRYWAIT, M_DONTWAIT}. o Fix a bpf_compat issue where malloc() was defined to just call bpf_alloc() and pass the 'canwait' flag(s) along. It's been changed to call bpf_alloc() but pass the corresponding M_TRYWAIT or M_DONTWAIT flag (and only one of those two). Submitted by: Hiten Pandya <hiten@unixdaemons.com> (hiten->commit_count++)
|
#
b9234faf |
|
15-Oct-2002 |
Sam Leffler <sam@FreeBSD.org> |
Tie new "Fast IPsec" code into the build. This involves the usual configuration stuff as well as conditional code in the IPv4 and IPv6 areas. Everything is conditional on FAST_IPSEC which is mutually exclusive with IPSEC (KAME IPsec implmentation). As noted previously, don't use FAST_IPSEC with INET6 at the moment. Reviewed by: KAME, rwatson Approved by: silence Supported by: Vernier Networks
|
#
5d846453 |
|
15-Oct-2002 |
Sam Leffler <sam@FreeBSD.org> |
Replace aux mbufs with packet tags: o instead of a list of mbufs use a list of m_tag structures a la openbsd o for netgraph et. al. extend the stock openbsd m_tag to include a 32-bit ABI/module number cookie o for openbsd compatibility define a well-known cookie MTAG_ABI_COMPAT and use this in defining openbsd-compatible m_tag_find and m_tag_get routines o rewrite KAME use of aux mbufs in terms of packet tags o eliminate the most heavily used aux mbufs by adding an additional struct inpcb parameter to ip_output and ip6_output to allow the IPsec code to locate the security policy to apply to outbound packets o bump __FreeBSD_version so code can be conditionalized o fixup ipfilter's call to ip_output based on __FreeBSD_version Reviewed by: julian, luigi (silent), -arch, -net, darren Approved by: julian, silence from everyone else Obtained from: openbsd (mostly) MFC after: 1 month
|
#
0776834a |
|
31-May-2002 |
Hajimu UMEMOTO <ume@FreeBSD.org> |
__FreeBSD__ is not a compiler constant. We must use __FreeBSD_version here. Submitted by: rwatson
|
#
4cc20ab1 |
|
31-May-2002 |
Seigo Tanimura <tanimura@FreeBSD.org> |
Back out my lats commit of locking down a socket, it conflicts with hsu's work. Requested by: hsu
|
#
243917fe |
|
19-May-2002 |
Seigo Tanimura <tanimura@FreeBSD.org> |
Lock down a socket, milestone 1. o Add a mutex (sb_mtx) to struct sockbuf. This protects the data in a socket buffer. The mutex in the receive buffer also protects the data in struct socket. o Determine the lock strategy for each members in struct socket. o Lock down the following members: - so_count - so_options - so_linger - so_state o Remove *_locked() socket APIs. Make the following socket APIs touching the members above now require a locked socket: - sodisconnect() - soisconnected() - soisconnecting() - soisdisconnected() - soisdisconnecting() - sofree() - soref() - sorele() - sorwakeup() - sotryfree() - sowakeup() - sowwakeup() Reviewed by: alfred
|
#
88ff5695 |
|
18-Apr-2002 |
SUZUKI Shinsuke <suz@FreeBSD.org> |
just merged cosmetic changes from KAME to ease sync between KAME and FreeBSD. (based on freebsd4-snap-20020128) Reviewed by: ume MFC after: 1 week
|
#
6008862b |
|
04-Apr-2002 |
John Baldwin <jhb@FreeBSD.org> |
Change callers of mtx_init() to pass in an appropriate lock type name. In most cases NULL is passed, but in some cases such as network driver locks (which use the MTX_NETWORK_LOCK macro) and UMA zone locks, a name is used. Tested on: i386, alpha, sparc64
|
#
44731cab |
|
01-Apr-2002 |
John Baldwin <jhb@FreeBSD.org> |
Change the suser() API to take advantage of td_ucred as well as do a general cleanup of the API. The entire API now consists of two functions similar to the pre-KSE API. The suser() function takes a thread pointer as its only argument. The td_ucred member of this thread must be valid so the only valid thread pointers are curthread and a few kernel threads such as thread0. The suser_cred() function takes a pointer to a struct ucred as its first argument and an integer flag as its second argument. The flag is currently only used for the PRISON_ROOT flag. Discussed on: smp@
|
#
72b1d826 |
|
19-Mar-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
Remove duplicate extern declarations to silence warnings.
|
#
bedbd47e |
|
08-Jan-2002 |
Mike Smith <msmith@FreeBSD.org> |
Initialise the intrq_present fields at runtime, not link time. This allows us to load protocols at runtime, and avoids the use of common variables. Also fix the ip6_intrq assignment so that it works at all.
|
#
9494d596 |
|
25-Sep-2001 |
Brooks Davis <brooks@FreeBSD.org> |
Make faith loadable, unloadable, and clonable.
|
#
b40ce416 |
|
12-Sep-2001 |
Julian Elischer <julian@FreeBSD.org> |
KSE Milestone 2 Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to teh previousl -current except that there is a thread associated with each process. Sorry john! (your next MFC will be a doosie!) Reviewed by: peter@freebsd.org, dillon@freebsd.org X-MFC after: ha ha ha ha
|
#
53dab5fe |
|
02-Jul-2001 |
Brooks Davis <brooks@FreeBSD.org> |
gif(4) and stf(4) modernization: - Remove gif dependencies from stf. - Make gif and stf into modules - Make gif cloneable. PR: kern/27983 Reviewed by: ru, ume Obtained from: NetBSD MFC after: 1 week
|
#
33841545 |
|
10-Jun-2001 |
Hajimu UMEMOTO <ume@FreeBSD.org> |
Sync with recent KAME. This work was based on kame-20010528-freebsd43-snap.tgz and some critical problem after the snap was out were fixed. There are many many changes since last KAME merge. TODO: - The definitions of SADB_* in sys/net/pfkeyv2.h are still different from RFC2407/IANA assignment because of binary compatibility issue. It should be fixed under 5-CURRENT. - ip6po_m member of struct ip6_pktopts is no longer used. But, it is still there because of binary compatibility issue. It should be removed under 5-CURRENT. Reviewed by: itojun Obtained from: KAME MFC after: 3 weeks
|
#
8d672524 |
|
22-May-2001 |
Hajimu UMEMOTO <ume@FreeBSD.org> |
M_COPY_PKTHDR has to be done before MCLGET. Obtained from: KAME
|
#
d23d3055 |
|
20-Apr-2001 |
Hajimu UMEMOTO <ume@FreeBSD.org> |
Fix typo in previous commit. Submitted by: JINMEI Tatuya <jinmei@isl.rdc.toshiba.co.jp>
|
#
8d642984 |
|
19-Apr-2001 |
Hajimu UMEMOTO <ume@FreeBSD.org> |
- Fix to receive icmp6 echo reply within the host itself to ff02::1. - Fix to receive icmp6 echo reply to link-local of itself. Reported by: Eriya Akasaka <eakasaka@rodfbs.org> Submitted by: JINMEI Tatuya <jinmei@isl.rdc.toshiba.co.jp>
|
#
7b35f61a |
|
05-Apr-2001 |
Hajimu UMEMOTO <ume@FreeBSD.org> |
- correct logic of per-address input packet counts for lo0 - reject packets to fe80::xxxx%lo0 (xxxx != 1) Submitted by: JINMEI Tatuya <jinmei@isl.rdc.toshiba.co.jp>
|
#
9ec54137 |
|
28-Mar-2001 |
Hajimu UMEMOTO <ume@FreeBSD.org> |
Make per-address input packet counts for lo0 work. Reported by: bmah Submitted by: Noriyasu KATO <noriyasu.kato@toshiba.co.jp> (via itojun)
|
#
f13bb832 |
|
16-Mar-2001 |
Jun Kuriyama <kuriyama@FreeBSD.org> |
Merge from kame (1.175 -> 1.176): cope with freebsd4 bridge code.
|
#
0634b4a7 |
|
29-Jan-2001 |
Peter Wemm <peter@FreeBSD.org> |
Yikes, these files bogusly #include "loop.h" but didn't use the value. My searching for NLOOP missed them. :-(
|
#
df5e1987 |
|
25-Nov-2000 |
Jonathan Lemon <jlemon@FreeBSD.org> |
Lock down the network interface queues. The queue mutex must be obtained before adding/removing packets from the queue. Also, the if_obytes and if_omcasts fields should only be manipulated under protection of the mutex. IF_ENQUEUE, IF_PREPEND, and IF_DEQUEUE perform all necessary locking on the queue. An IF_LOCK macro is provided, as well as the old (mutex-less) versions of the macros in the form _IF_ENQUEUE, _IF_QFULL, for code which needs them, but their use is discouraged. Two new macros are introduced: IF_DRAIN() to drain a queue, and IF_HANDOFF, which takes care of locking/enqueue, and also statistics updating/start if necessary.
|
#
cf9fa8e7 |
|
29-Oct-2000 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Move suser() and suser_xxx() prototypes and a related #define from <sys/proc.h> to <sys/systm.h>. Correctly document the #includes needed in the manpage. Add one now needed #include of <sys/systm.h>. Remove the consequent 48 unused #includes of <sys/proc.h>.
|
#
5da9f8fa |
|
19-Oct-2000 |
Josef Karthauser <joe@FreeBSD.org> |
Augment the 'ifaddr' structure with a 'struct if_data' to keep statistics on a per network address basis. Teach the IPv4 and IPv6 input/output routines to log packets/bytes against the network address connected to the flow. Teach netstat to display the per-address stats for IP protocols when 'netstat -i' is evoked, instead of displaying the per-interface stats.
|
#
20cb9f9e |
|
23-Sep-2000 |
Hajimu UMEMOTO <ume@FreeBSD.org> |
Make ip6fw as loadable module.
|
#
1469c434 |
|
12-Aug-2000 |
Hajimu UMEMOTO <ume@FreeBSD.org> |
Make compilable with -DIPFILTER. Because I don't use ipfilter at all, this is not tested. I don't know if ipfilter is work for IPv6. Submitted by: yoshiaki@kt.rim.or.jp
|
#
c4ac87ea |
|
31-Jul-2000 |
Darren Reed <darrenr@FreeBSD.org> |
activate pfil_hooks and covert ipfilter to use it
|
#
686cdd19 |
|
04-Jul-2000 |
Jun-ichiro itojun Hagino <itojun@FreeBSD.org> |
sync with kame tree as of july00. tons of bug fixes/improvements. API changes: - additional IPv6 ioctls - IPsec PF_KEY API was changed, it is mandatory to upgrade setkey(8). (also syntax change)
|
#
3389ae93 |
|
19-Apr-2000 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Remove ~25 unneeded #include <sys/conf.h> Remove ~60 unneeded #include <sys/malloc.h>
|
#
242c5536 |
|
12-Feb-2000 |
Peter Wemm <peter@FreeBSD.org> |
Clean up some loose ends in the network code, including the X.25 and ISO #ifdefs. Clean out unused netisr's and leftover netisr linker set gunk. Tested on x86 and alpha, including world. Approved by: jkh
|
#
210d0432 |
|
29-Jan-2000 |
Yoshinobu Inoue <shin@FreeBSD.org> |
Add ip6fw. Yes it is almost code freeze, but as the result of many thought, now I think this should be added before 4.0... make world check, kernel build check is done. Reviewed by: green Obtained from: KAME project
|
#
67e394e0 |
|
28-Jan-2000 |
Yoshinobu Inoue <shin@FreeBSD.org> |
Backout diffs which should not be included.
|
#
577a30ee |
|
27-Jan-2000 |
Yoshinobu Inoue <shin@FreeBSD.org> |
#This is a null commit to give correct description for the previous change. #Please forget the strange log message of the previous commit . IPv6 multicast routing. kernel IPv6 multicast routing support. pim6 dense mode daemon pim6 sparse mode daemon netstat support of IPv6 multicast routing statistics Merging to the current and testing with other existing multicast routers is done by Tatsuya Jinmei <jinmei@kame.net>, who writes and maintainances the base code in KAME distribution. Make world check and kernel build check was also successful. Obtained from: KAME project
|
#
91ec0a1e |
|
27-Jan-2000 |
Yoshinobu Inoue <shin@FreeBSD.org> |
Sorry I didn't commit these files at the commit just a few minutes before. (IPv6 multicast routing) I think I mistakenly touched TAB and the last arg sys/netinet6 to the cvs commit changed to sys/netinet6/in6_proto.c.
|
#
367d34f8 |
|
24-Jan-2000 |
Brian Somers <brian@FreeBSD.org> |
Move the *intrq variables into net/intrq.c and unconditionally include this in all kernels. Declare some const *intrq_present variables that can be checked by a module prior to using *intrq to queue data. Make the if_tun module capable of processing atm, ip, ip6, ipx, natm and netatalk packets when TUNSIFHEAD is ioctl()d on. Review not required by: freebsd-hackers
|
#
23ecf4b5 |
|
12-Jan-2000 |
Yoshinobu Inoue <shin@FreeBSD.org> |
removed an ours case which think a packet destined to loopback interface with IPv6 loopback addr for its dest or src addr as ours.
|
#
6a800098 |
|
22-Dec-1999 |
Yoshinobu Inoue <shin@FreeBSD.org> |
IPSEC support in the kernel. pr_input() routines prototype is also changed to support IPSEC and IPV6 chained protocol headers. Reviewed by: freebsd-arch, cvs-committers Obtained from: KAME project
|
#
cfa1ca9d |
|
07-Dec-1999 |
Yoshinobu Inoue <shin@FreeBSD.org> |
udp IPv6 support, IPv6/IPv4 tunneling support in kernel, packet divert at kernel for IPv6/IPv4 translater daemon This includes queue related patch submitted by jburkhol@home.com. Submitted by: queue related patch from jburkhol@home.com Reviewed by: freebsd-arch, cvs-committers Obtained from: KAME project
|
#
e1da8747 |
|
22-Nov-1999 |
Yoshinobu Inoue <shin@FreeBSD.org> |
Removed IPSEC and IPV6FIREWALL because they are not ready yet.
|
#
82cd038d |
|
21-Nov-1999 |
Yoshinobu Inoue <shin@FreeBSD.org> |
KAME netinet6 basic part(no IPsec,no V6 Multicast Forwarding, no UDP/TCP for IPv6 yet) With this patch, you can assigne IPv6 addr automatically, and can reply to IPv6 ping. Reviewed by: freebsd-arch, cvs-committers Obtained from: KAME project
|