History log of /freebsd-10.2-release/sys/netinet6/in6.c
Revision Date Author Comments
(<<< Hide modified files)
(Show modified files >>>)
# 285830 23-Jul-2015 gjb

- Copy stable/10@285827 to releng/10.2 in preparation for 10.2-RC1
builds.
- Update newvers.sh to reflect RC1.
- Update __FreeBSD_version to reflect 10.2.
- Update default pkg(8) configuration to use the quarterly branch.[1]

Discussed with: re, portmgr [1]
Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation

# 285822 23-Jul-2015 hrs

MFC r273992:

Fix a bug which prevented ND6_IFF_IFDISABLED flag from clearing when
the newly-added IPv6 address was /128.

Approved by: re (gjb)


# 284016 05-Jun-2015 ae

Rework r281868 to not skip RTM announces for tunneling interfaces.
This is direct commit to stable/10.

Tested by: tuexen@


# 282622 08-May-2015 hiren

MFC r261708, r261847, r268525, r274316, r274347, r275593,
r276844, r276847, r279531, r279559, r279564, r279676

A bunch of IPv6 fixes by melifaro, hrs and ae

Major changes:
Simplify nd6_output_lle()
Add refcounting to DAD and fix races and other errors
Implement Enhanced DAD algorithm for IPv6

Suggested by: ae
Tested by: Jason Wolfe <j at nitrology.com>
Sponsored by: Limelight Networks


# 281868 22-Apr-2015 ae

MFC r274988 (with modification):
Skip L2 addresses lookups for tunneling interfaces.

PR: 197286


# 278801 15-Feb-2015 rrs

MFC of r278472
This fixes a bug in the way that the LLE timers for nd6
and arp were being used. They basically would pass in the
mutex to the callout_init. Because they used this method
to the callout system, it was possible to "stop" the callout.
When flushing the table and you stopped the running callout, the
callout_stop code would return 1 indicating that it was going
to stop the callout (that was about to run on the callout_wheel blocked
by the function calling the stop). Now when 1 was returned, it would
lower the reference count one extra time for the stopped timer, then
a few lines later delete the memory. Of course the callout_wheel was
stuck in the lock code and would then crash since it was accessing
freed memory. By using callout_init(c, 1) we always get a 0 back
and the reference counting bug does not rear its head. We do have
to make a few adjustments to the callouts themselves though to make
sure it does the proper thing if rescheduled as well as gets the lock.

Sponsored by: Netflix Inc.


# 278620 12-Feb-2015 ae

MFC r278268:
Print IPv6 address in log message instead of address of pointer.


# 271185 06-Sep-2014 markj

MFC r270348:
Add some missing checks for unsupported interfaces (e.g. pflog(4)) when
handling ioctls. While here, remove duplicated checks for a NULL ifp in
in6_control(): this check is already done near the beginning of the
function.

MFC r270349:
Suppress warnings when retrieving protocol stats from interfaces that
don't support IPv6 (e.g. pflog(4)).

PR: 189117
Approved by: re (gjb)


# 260504 10-Jan-2014 ae

MFC r260151 (by adrian):
Use an RLOCK here instead of an RWLOCK - matching all the other calls
to lla_lookup().

This drastically reduces the very high lock contention when doing parallel
TCP throughput tests (> 1024 sockets) with IPv6.

MFC r260187:
lla_lookup() does modification only when LLE_CREATE is specified.
Thus we can use IF_AFDATA_RLOCK() instead of IF_AFDATA_LOCK() when doing
lla_lookup() without LLE_CREATE flag.

MFC r260217:
Add IF_AFDATA_WLOCK_ASSERT() in case lla_lookup() is called with
LLE_CREATE flag.


# 256281 10-Oct-2013 gjb

Copy head (r256279) to stable/10 as part of the 10.0-RELEASE cycle.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation


# 255442 10-Sep-2013 des

Fix the length calculation for the final block of a sendfile(2)
transmission which could be tricked into rounding up to the nearest
page size, leaking up to a page of kernel memory. [13:11]

In IPv6 and NetATM, stop SIOCSIFADDR, SIOCSIFBRDADDR, SIOCSIFDSTADDR
and SIOCSIFNETMASK at the socket layer rather than pass them on to the
link layer without validation or credential checks. [SA-13:12]

Prevent cross-mount hardlinks between different nullfs mounts of the
same underlying filesystem. [SA-13:13]

Security: CVE-2013-5666
Security: FreeBSD-SA-13:11.sendfile
Security: CVE-2013-5691
Security: FreeBSD-SA-13:12.ifioctl
Security: CVE-2013-5710
Security: FreeBSD-SA-13:13.nullfs
Approved by: re


# 253970 05-Aug-2013 hrs

- Use time_uptime instead of time_second in data structures for
PF_INET6 in kernel. This fixes various malfunction when the wall time
clock is changed. Bump __FreeBSD_version to 1000041.

- Use clock_gettime(CLOCK_MONOTONIC_FAST) in userland utilities.

MFC after: 1 month


# 253841 31-Jul-2013 hrs

Allocate in6_ifextra (ifp->if_afdata[AF_INET6]) only for IPv6-capable
interfaces. This eliminates unnecessary IPv6 processing for non-IPv6
interfaces.

MFC after: 3 days


# 253101 09-Jul-2013 ae

Correct the size of allocated memory to store array of counters.


# 253086 09-Jul-2013 ae

Migrate structs in6_ifstat and icmp6_ifstat to PCPU counters.


# 252511 02-Jul-2013 hrs

- Allow ND6_IFF_AUTO_LINKLOCAL for IFT_BRIDGE. An interface with IFT_BRIDGE
is initialized with !ND6_IFF_AUTO_LINKLOCAL && !ND6_IFF_ACCEPT_RTADV
regardless of net.inet6.ip6.accept_rtadv and net.inet6.ip6.auto_linklocal.
To configure an autoconfigured link-local address (RFC 4862), the
following rc.conf(5) configuration can be used:

ifconfig_bridge0_ipv6="inet6 auto_linklocal"

- if_bridge(4) now removes IPv6 addresses on a member interface to be
added when the parent interface or one of the existing member
interfaces has an IPv6 address. if_bridge(4) merges each link-local
scope zone which the member interfaces form respectively, so it causes
address scope violation. Removal of the IPv6 addresses prevents it.

- if_lagg(4) now removes IPv6 addresses on a member interfaces
unconditionally.

- Set reasonable flags to non-IPv6-capable interfaces. [*]

Submitted by: rpaulo [*]
MFC after: 1 week


# 250815 19-May-2013 melifaro

Really fix netmask address family this time.

MFC with: r250813


# 250813 19-May-2013 melifaro

Finish r85740 : Make IPv6 netmask has address family set.
This pleases routing daemons like bird.

MFC after: 2 weeks


# 250251 04-May-2013 hrs

Use FF02:0:0:0:0:2:FF00::/104 prefix for IPv6 Node Information Group
Address. Although KAME implementation used FF02:0:0:0:0:2::/96 based on
older versions of draft-ietf-ipngwg-icmp-name-lookup, it has been changed
in RFC 4620.

The kernel always joins the /104-prefixed address, and additionally does
/96-prefixed one only when net.inet6.icmp6.nodeinfo_oldmcprefix=1.
The default value of the sysctl is 1.

ping6(8) -N flag now uses /104-prefixed one. When this flag is specified
twice, it uses /96-prefixed one instead.

Reviewed by: ume
Based on work by: Thomas Scheffler
PR: conf/174957
MFC after: 2 weeks


# 249742 21-Apr-2013 oleg

Plug static llentry leak (ipv4 & ipv6 were affected).

PR: kern/172985
MFC after: 1 month


# 244989 03-Jan-2013 peter

Temporarily revert rev 244678. This is causing loopback problems with
the lo (loopback) interfaces.


# 244678 25-Dec-2012 glebius

The SIOCSIFFLAGS ioctl handler runs if_up()/if_down() that notify
all interested parties in case if interface flag IFF_UP has changed.

However, not only SIOCSIFFLAGS can raise the flag, but SIOCAIFADDR
and SIOCAIFADDR_IN6 can, too. The actual |= is done not in the protocol
code, but in code of interface drivers. To fix this historical layering
violation, we will check whether ifp->if_ioctl(SIOCSIFADDR) raised the
IFF_UP flag, and if it did, run the if_up() handler.

This fixes configuring an address under CARP control on an interface
that was initially !IFF_UP.

P.S. I intentionally omitted handling the IFF_SMART flag. This flag was
never ever used in any driver since it was introduced, and since it
means another layering violation, it should be garbage collected instead
of pretended to be supported.


# 244272 15-Dec-2012 ae

In additional to the tailq of IPv6 addresses add the hash table.
For now use 256 buckets and fnv_hash function. Use xor'ed 32-bit
s6_addr32 parts of in6_addr structure as a hash key. Update
in6_localip and in6_is_addr_deprecated to use hash table for fastest
lookup.

Sponsored by: Yandex LLC
Discussed with: dwmalone, glebius, bz


# 243903 05-Dec-2012 hrs

- Move definition of V_deembed_scopeid to scope6_var.h.
- Deembed scope id in L3 address in in6_lltable_dump().
- Simplify scope id recovery in rtsock routines.
- Remove embedded scope id handling in ndp(8) and route(8) completely.


# 241916 22-Oct-2012 delphij

Remove __P.

Submitted by: kevlo
Reviewed by: md5(1)
MFC after: 2 months


# 241686 18-Oct-2012 andre

Mechanically remove the last stray remains of spl* calls from net*/*.
They have been Noop's for a long time now.


# 238990 02-Aug-2012 glebius

Fix races between in_lltable_prefix_free(), lla_lookup(),
llentry_free() and arptimer():

o Use callout_init_rw() for lle timeout, this allows us safely
disestablish them.
- This allows us to simplify the arptimer() and make it
race safe.
o Consistently use ifp->if_afdata_lock to lock access to
linked lists in the lle hashes.
o Introduce new lle flag LLE_LINKED, which marks an entry that
is attached to the hash.
- Use LLE_LINKED to avoid double unlinking via consequent
calls to llentry_free().
- Mark lle with LLE_DELETED via |= operation istead of =,
so that other flags won't be lost.
o Make LLE_ADDREF(), LLE_REMREF() and LLE_FREE_LOCKED() more
consistent and provide more informative KASSERTs.

The patch is a collaborative work of all submitters and myself.

PR: kern/165863
Submitted by: Andrey Zonov <andrey zonov.org>
Submitted by: Ryan Stone <rysto32 gmail.com>
Submitted by: Eric van Gyzen <eric_van_gyzen dell.com>


# 238967 01-Aug-2012 glebius

Some more whitespace cleanup.


# 238945 31-Jul-2012 glebius

Some style(9) and whitespace changes.

Together with: Andrey Zonov <andrey zonov.org>


# 238222 08-Jul-2012 bz

As mentioned in the commit message of r237571 (copied from a prototype
patch of mine) also check if the 2nd in6_setscope() failed and return
the error in that case.

MFC after: 5 days


# 237571 25-Jun-2012 delphij

Fix a LOR acquiring the if_afdata lock while holding an rtentry lock.
Possibly do some entra work in case we would not get into the
ifa0 != NULL paths later as we already do for the mltaddr before.

XXX We should possibly error in case in6_setscope fails.

Reference: http://lists.freebsd.org/pipermail/freebsd-net/2011-September/029829.html

Submitted by: bz
MFC after: 1 week


# 236615 05-Jun-2012 bz

Plug two interface address refcount leaks in early error return cases
in the ioctl path.

Reported by: rpaulo
Reviewed by: emax
MFC after: 3 days


# 236327 30-May-2012 emax

When we return deprecated addresses, we need to reference them.

Reviewed by: bz, scottl
MFC after: 3 days


# 232054 23-Feb-2012 kmacy

When using flowtable llentrys can outlive the interface with which they're associated
at which the lle_tbl pointer points to freed memory and the llt_free pointer is no longer
valid.

Move the free pointer in to the llentry itself and update the initalization sites.

MFC after: 2 weeks


# 231852 17-Feb-2012 bz

Merge multi-FIB IPv6 support from projects/multi-fibv6/head/:

Extend the so far IPv4-only support for multiple routing tables (FIBs)
introduced in r178888 to IPv6 providing feature parity.

This includes an extended rtalloc(9) KPI for IPv6, the necessary
adjustments to the network stack, and user land support as in netstat.

Sponsored by: Cisco Systems, Inc.
Reviewed by: melifaro (basically)
MFC after: 10 days


# 230506 24-Jan-2012 bz

Plug a possible ifa_ref leak in case of premature return from in6_purgeaddr().

Reviewed by: rwatson
MFC after: 3 days


# 230496 24-Jan-2012 pluknet

Remove the stale XXX rt_newaddrmsg comment.
A routing socket message is generated since r192282.

Reviewed by: bz
MFC after: 3 days


# 230494 24-Jan-2012 bz

Remove unnecessary line break.

MFC after: 3 days


# 229621 05-Jan-2012 jhb

Convert all users of IF_ADDR_LOCK to use new locking macros that specify
either a read lock or write lock.

Reviewed by: bz
MFC after: 2 weeks


# 229465 04-Jan-2012 glebius

Use correct locking when traversing interface address list.

Reviewed by: bz


# 229414 03-Jan-2012 jhb

Grab a reference on the matching interface address (ifa) in the handling
of the SIOC[DG]LIFADDR icotls before dropping the IF_ADDR_LOCK() and
release the reference after using it. This prevents the address from
being potentially freed out from under the ioctl handler.

Reviewed by: bz
MFC after: 1 week


# 229390 03-Jan-2012 jhb

Use TAILQ_FOREACH() instead of TAILQ_FOREACH_SAFE() for some loops that
do not modify the queues they iterate over.

Submitted by: glebius


# 228966 29-Dec-2011 jhb

Use queue(3) macros instead of home-rolled versions in several places in
the INET6 code. This includes retiring the 'ndpr_next' and 'pfr_next'
macros.

Submitted by: pluknet (earlier version)
Reviewed by: pluknet


# 228768 21-Dec-2011 glebius

Provide ABI compatibility shim to enable configuring of addresses
with ifconfig(8) prior to r228571.

Requested by: brooks


# 228571 16-Dec-2011 glebius

A major overhaul of the CARP implementation. The ip_carp.c was started
from scratch, copying needed functionality from the old implemenation
on demand, with a thorough review of all code. The main change is that
interface layer has been removed from the CARP. Now redundant addresses
are configured exactly on the interfaces, they run on.

The CARP configuration itself is, as before, configured and read via
SIOCSVH/SIOCGVH ioctls. A new prefix created with SIOCAIFADDR or
SIOCAIFADDR_IN6 may now be configured to a particular virtual host id,
which makes the prefix redundant.

ifconfig(8) semantics has been changed too: now one doesn't need
to clone carpXX interface, he/she should directly configure a vhid
on a Ethernet interface.

To supply vhid data from the kernel to an application the getifaddrs(8)
function had been changed to pass ifam_data with each address. [1]

The new implementation definitely closes all PRs related to carp(4)
being an interface, and may close several others. It also allows
to run a single redundant IP per interface.

Big thanks to Bjoern Zeeb for his help with inet6 part of patch, for
idea on using ifam_data and for several rounds of reviewing!

PR: kern/117000, kern/126945, kern/126714, kern/120130, kern/117448
Reviewed by: bz
Submitted by: bz [1]


# 227460 11-Nov-2011 qingli

A default route learned from the RAs could be deleted manually
after its installation. This removal may be accidental and can
prevent the default route from being installed in the future if
the associated default router has the best preference. The cause
is the lack of status update in the default router on the state
of its route installation in the kernel FIB. This patch fixes
the described problem.

Reviewed by: hrs, discussed with hrs
MFC after: 5 days


# 226453 16-Oct-2011 qingli

The code change made in r226040 was incomplete and resulted in
routes such as fe80::1%lo0 no being installed. This patch completes
the original intended fix.

Reviewed by: hrs, bz
MFC after: 3 days


# 226338 13-Oct-2011 glebius

Restore functions in6_ifaddloop() and in6_ifremloop() that were
inlined by Qing Li in his big new-ARP commit. I am going to utilize
them in my newcarp work, and also these functions left declared
in in6_var.h for all the time they were absent.

Reviewed by: bz


# 226040 05-Oct-2011 qingli

The IFA_RTSELF instead of the IFA_ROUTE flag should be checked to
determine if a loopback route should be installed for an interface
IPv6 address. Another condition is the address must not belong to a
looopback interface.

Reviewed by: hrs
MFC after: 3 days


# 225043 20-Aug-2011 bz

Add an in6_localip() helper function as in6_localaddr() is not doing what
people think: returning true for an address in any connected subnet, not
necessarily on the local machine.

Sponsored by: Sandvine Incorporated
MFC after: 2 weeks
Approved by: re (kib)


# 223862 08-Jul-2011 zec

Permit ARP to proceed for IPv4 host routes for which the gateway is the
same as the host address. This already works fine for INET6 and ND6.

While here, remove two function pointers from struct lltable which are
only initialized but never used.

MFC after: 3 days


# 222730 06-Jun-2011 hrs

- Make the code more proactively clear an ND6_IFF_IFDISABLED flag when
an explicit action for INET6 configuration happens. The changes are:

1. When an ND6 flag is changed via SIOCSIFINFO_FLAGS ioctl,
setting ND6_IFF_ACCEPT_RTADV and/or ND6_IFF_AUTO_LINKLOCAL now triggers
an attempt to clear the ND6_IFF_IFDISABLED flag.

2. When an AF_INET6 address is added successfully to an interface and
it is marked as ND6_IFF_IFDISABLED, an attempt to clear the
ND6_IFF_IFDISABLED happens.

This simplifies ND6_IFF_IFDISABLED flag manipulation by users via ifconfig(8);
in most cases manual configuration is no longer needed.

- When ND6_IFF_AUTO_LINKLOCAL is set and no link-local address is assigned to
an interface, SIOCSIFINFO_FLAGS ioctl now calls in6_ifattach() to configure
a link-local address.

This change ensures link-local address configuration when "ifconfig IF inet6"
command is invoked. For example, "ifconfig IF inet6 auto_linklocal" now
always try to configure an LL addr even if ND6_IFF_AUTO_LINKLOCAL is already
set to 1 (i.e. down/up cycle is no longer needed).

Reviewed by: bz


# 222143 20-May-2011 qingli

The statically configured (permanent) ARP entries are removed when an
interface is brought down, even though the interface address is still
valid. This patch maintains the permanent ARP entries as long as the
interface address (having the same prefix as that of the ARP entries)
is valid.

Reviewed by: delphij
MFC after: 5 days


# 219819 21-Mar-2011 jeff

- Merge changes to the base system to support OFED. These include
a wider arg2 for sysctl, updates to vlan code, IFT_INFINIBAND,
and other miscellaneous small features.


# 216022 29-Nov-2010 bz

Plug well observed races on la_hold entries with the callout handler.

Call the handler function with the lock held, return unlocked as we
might free the entry. Rework functions later in the call graph to be
either called with the lock held or, only if needed, unlocked.

Place asserts to document and tighten assumptions on various lle locking,
which were not always true before.

We call nd6_ns_output() unlocked and the assignment of ip6->ip6_src was
decentralized to minimize possible complexity introduced with the formerly
missing locking there. This also resulted in a push down of local
variable scopes into smaller blocks.

Reported by: many
PR: kern/148857
Submitted by: Dmitrij Tejblum (tejblum yandex-team.ru) (original version)
MFC After: 4 days


# 208284 19-May-2010 alfred

Fix our version of IPv6 address representation.

We do not respect rules 3 and 4 in the required list:

1. omit leading zeros

2. "::" used to their maximum extent whenever possible

3. "::" used where shortens address the most

4. "::" used in the former part in case of a tie breaker

5. do not shorten one 16 bit 0 field

6. use lower case

http://tools.ietf.org/html/draft-ietf-6man-text-addr-representation-04.html

Submitted by: Kalluru Abhiram @ Juniper Networks
Obtained from: Juniper Networks
Reviewed by: hrs, dougb


# 207268 27-Apr-2010 kib

Provide 32bit compat for SIOCGDEFIFACE_IN6.

Based on submission by: pluknet gmail com
Reviewed by: emaste
MFC after: 2 weeks


# 206481 11-Apr-2010 bz

Plug reference leaks in the link-layer code ("new-arp") that previously
prevented the link-layer entry from being freed.

In both in.c and in6.c (though that code path seems to be basically dead)
plug a reference leak in case of a pending callout being drained.

In if_ether.c consistently add a reference before resetting the callout
and in case we canceled a pending one remove the reference for that.
In the final case in arptimer, before freeing the expired entry, remove
the reference again and explicitly call callout_stop() to clear the active
flag.

In nd6.c:nd6_free() we are only ever called from the callout function and
thus need to remove the reference there as well before calling into
llentry_free().

In if_llatbl.c when freeing entire tables make sure that in case we cancel
a pending callout to remove the reference as well.

Reviewed by: qingli (earlier version)
MFC after: 10 days
Problem observed, patch tested by: simon on ipv6gw.f.o,
Christian Kratzer (ck cksoft.de),
Evgenii Davidov (dado korolev-net.ru)
PR: kern/144564
Configurations still affected: with options FLOWTABLE


# 201282 30-Dec-2009 qingli

The proxy arp entries could not be added into the system over the
IFF_POINTOPOINT link types. The reason was due to the routing
entry returned from the kernel covering the remote end is of an
interface type that does not support ARP. This patch fixes this
problem by providing a hint to the kernel routing code, which
indicates the prefix route instead of the PPP host route should
be returned to the caller. Since a host route to the local end
point is also added into the routing table, and there could be
multiple such instantiations due to multiple PPP links can be
created with the same local end IP address, this patch also fixes
the loopback route installation failure problem observed prior to
this patch. The reference count of loopback route to local end would
be either incremented or decremented. The first instantiation would
create the entry and the last removal would delete the route entry.

MFC after: 5 days


# 198418 23-Oct-2009 qingli

Use the correct option name in the preprocessor command to enable
or disable diagnostic messages.

Reviewed by: ru
MFC after: 3 days


# 197227 15-Sep-2009 qingli

Self pointing routes are installed for configured interface addresses
and address aliases. After an interface is brought down and brought
back up again, those self pointing routes disappeared. This patch
ensures after an interface is brought back up, the loopback routes
are reinstalled properly.

Reviewed by: bz
MFC after: immediately


# 197138 12-Sep-2009 hrs

Improve flexibility of receiving Router Advertisement and
automatic link-local address configuration:

- Convert a sysctl net.inet6.ip6.accept_rtadv to one for the
default value of a per-IF flag ND6_IFF_ACCEPT_RTADV, not a
global knob. The default value of the sysctl is 0.

- Add a new per-IF flag ND6_IFF_AUTO_LINKLOCAL and convert a
sysctl net.inet6.ip6.auto_linklocal to one for its default
value. The default value of the sysctl is 1.

- Make ND6_IFF_IFDISABLED more robust. It can be used to disable
IPv6 functionality of an interface now.

- Receiving RA is allowed if ip6_forwarding==0 *and*
ND6_IFF_ACCEPT_RTADV is set on that interface. The former
condition will be revisited later to support a "host + router" box
like IPv6 CPE router. The current behavior is compatible with
the older releases of FreeBSD.

- The ifconfig(8) now supports these ND6 flags as well as "nud",
"prefer_source", and "disabled" in ndp(8). The ndp(8) now
supports "auto_linklocal".

Discussed with: bz and jinmei
Reviewed by: bz
MFC after: 3 days


# 196871 05-Sep-2009 qingli

The addresses that are assigned to the loopback interface
should be part of the kernel routing table.

Reviewed by: bz
MFC after: immediately


# 196864 05-Sep-2009 qingli

This patch fixes the following issues:
- Interface link-local address is not reachable within the
node that owns the interface, this is due to the mismatch
in address scope as the result of the installed interface
address loopback route. Therefore for each interface
address loopback route, the rt_gateway field (of AF_LINK
type) will be used to track which interface a given
address belongs to. This will aid the address source to
use the proper interface for address scope/zone validation.
- The loopback address is not reachable. The root cause is
the same as the above.
- Empty nd6 entries are created for the IPv6 loopback addresses
only for validation reason. Doing so will eliminate as much
of the special case (loopback addresses) handling code
as possible, however, these empty nd6 entries should not
be returned to the userland applications such as the
"ndp" command.
Since both of the above issues contain common files, these
files are committed together.

Reviewed by: bz
MFC after: immediately


# 196535 25-Aug-2009 rwatson

Use locks specific to the lltable code, rather than borrow the ifnet
list/index locks, to protect link layer address tables. This avoids
lock order issues during interface teardown, but maintains the bug that
sysctl copy routines may be called while a non-sleepable lock is held.

Reviewed by: bz, kmacy
MFC after: 3 days


# 196481 23-Aug-2009 rwatson

Rework global locks for interface list and index management, correcting
several critical bugs, including race conditions and lock order issues:

Replace the single rwlock, ifnet_lock, with two locks, an rwlock and an
sxlock. Either can be held to stablize the lists and indexes, but both
are required to write. This allows the list to be held stable in both
network interrupt contexts and sleepable user threads across sleeping
memory allocations or device driver interactions. As before, writes to
the interface list must occur from sleepable contexts.

Reviewed by: bz, julian
MFC after: 3 days


# 196152 12-Aug-2009 qingli

A piece of code was added to install a host route when an IPv6 interface
address is configured with a /128 prefix. This is no longer necessary due
to r192011. In fact that code conflicts with r192011. This patch removes
the host route installation when detecting the /128 prefix, and instead
let the code added by r192011 to install the loopback route for that IPv6
interface address.

Reviewed by: bz
Approved by: re


# 196019 01-Aug-2009 rwatson

Merge the remainder of kern_vimage.c and vimage.h into vnet.c and
vnet.h, we now use jails (rather than vimages) as the abstraction
for virtualization management, and what remained was specific to
virtual network stacks. Minor cleanups are done in the process,
and comments updated to reflect these changes.

Reviewed by: bz
Approved by: re (vimage blanket)


# 195914 27-Jul-2009 qingli

This patch does the following:

- Allow loopback route to be installed for address assigned to
interface of IFF_POINTOPOINT type.
- Install loopback route for an IPv4 interface addreess when the
"useloopback" sysctl variable is enabled. Similarly, install
loopback route for an IPv6 interface address when the sysctl variable
"nd6_useloopback" is enabled. Deleting loopback routes for interface
addresses is unconditional in case these sysctl variables were
disabled after an interface address has been assigned.

Reviewed by: bz
Approved by: re


# 195699 14-Jul-2009 rwatson

Build on Jeff Roberson's linker-set based dynamic per-CPU allocator
(DPCPU), as suggested by Peter Wemm, and implement a new per-virtual
network stack memory allocator. Modify vnet to use the allocator
instead of monolithic global container structures (vinet, ...). This
change solves many binary compatibility problems associated with
VIMAGE, and restores ELF symbols for virtualized global variables.

Each virtualized global variable exists as a "reference copy", and also
once per virtual network stack. Virtualized global variables are
tagged at compile-time, placing the in a special linker set, which is
loaded into a contiguous region of kernel memory. Virtualized global
variables in the base kernel are linked as normal, but those in modules
are copied and relocated to a reserved portion of the kernel's vnet
region with the help of a the kernel linker.

Virtualized global variables exist in per-vnet memory set up when the
network stack instance is created, and are initialized statically from
the reference copy. Run-time access occurs via an accessor macro, which
converts from the current vnet and requested symbol to a per-vnet
address. When "options VIMAGE" is not compiled into the kernel, normal
global ELF symbols will be used instead and indirection is avoided.

This change restores static initialization for network stack global
variables, restores support for non-global symbols and types, eliminates
the need for many subsystem constructors, eliminates large per-subsystem
structures that caused many binary compatibility issues both for
monitoring applications (netstat) and kernel modules, removes the
per-function INIT_VNET_*() macros throughout the stack, eliminates the
need for vnet_symmap ksym(2) munging, and eliminates duplicate
definitions of virtualized globals under VIMAGE_GLOBALS.

Bump __FreeBSD_version and update UPDATING.

Portions submitted by: bz
Reviewed by: bz, zec
Discussed with: gnn, jamie, jeff, jhb, julian, sam
Suggested by: peter
Approved by: re (kensmith)


# 195643 12-Jul-2009 qingli

This patch adds a host route to an interface address (that is assigned
to a non loopback/ppp link type) through the loopback interface. Prior
to the new L2/L3 rewrite, this host route was explicitly created when
processing the IPv6 address assignment. This loopback host route is
deleted when that IPv6 address is removed from the interface.

Reviewed by: bz, gnn
Approved by: re


# 195102 27-Jun-2009 rwatson

In in6_update_ifa(), jump to 'cleanup' rather than returning directly
in one additional case, avoiding an ifaddr reference leak.

Defer releasing the in6_ifaddr's in6_ifaddrhead reference until the
end of in6_unlink_ifa(), as callers are inconsistent regarding whether
or not they hold a reference across the call. This avoids using the
ifaddr after it may have been freed.

Reported by: tegge
Reviewed by: tegge
Approved by: re (blanket)
MFC after: 6 weeks


# 194971 25-Jun-2009 rwatson

Add address list locking for in6_ifaddrhead/ia_link: as with locking
for in_ifaddrhead, we stick with an rwlock for the time being, which
we will revisit in the future with a possible move to rmlocks.

Some pieces of code require significant further reworking to be
safe from all classes of writer-writer races.

Reviewed by: bz
MFC after: 6 weeks


# 194943 25-Jun-2009 rwatson

Clean up reference management in in6_update_ifa and in6_unlink_ifa, and
in particular, add a reference for in6_ifaddrhead since we do remove a
reference for it when an IPv6 address is removed. This fixes ifconfig
delete of an IPv6 alias.

Reported by: tegge
MFC after: 6 weeks


# 194907 24-Jun-2009 rwatson

Convert netinet6 to using queue(9) rather than hand-crafted linked lists
for the global IPv6 address list (in6_ifaddr -> in6_ifaddrhead). Adopt
the code styles and conventions present in netinet where possible.

Reviewed by: gnn, bz
MFC after: 6 weeks (possibly not MFCable?)


# 194760 23-Jun-2009 rwatson

Modify most routines returning 'struct ifaddr *' to return references
rather than pointers, requiring callers to properly dispose of those
references. The following routines now return references:

ifaddr_byindex
ifa_ifwithaddr
ifa_ifwithbroadaddr
ifa_ifwithdstaddr
ifa_ifwithnet
ifaof_ifpforaddr
ifa_ifwithroute
ifa_ifwithroute_fib
rt_getifa
rt_getifa_fib
IFP_TO_IA
ip_rtaddr
in6_ifawithifp
in6ifa_ifpforlinklocal
in6ifa_ifpwithaddr
in6_ifadd
carp_iamatch6
ip6_getdstifaddr

Remove unused macro which didn't have required referencing:

IFP_TO_IA6

This closes many small races in which changes to interface
or address lists while an ifaddr was in use could lead to use of freed
memory (etc). In a few cases, add missing if_addr_list locking
required to safely acquire references.

Because of a lack of deep copying support, we accept a race in which
an in6_ifaddr pointed to by mbuf tags and extracted with
ip6_getdstifaddr() doesn't hold a reference while in transmit. Once
we have mbuf tag deep copy support, this can be fixed.

Reviewed by: bz
Obtained from: Apple, Inc. (portions)
MFC after: 6 weeks (portions)


# 194602 21-Jun-2009 rwatson

Clean up common ifaddr management:

- Unify reference count and lock initialization in a single function,
ifa_init().
- Move tear-down from a macro (IFAFREE) to a function ifa_free().
- Move reference count bump from a macro (IFAREF) to a function ifa_ref().
- Instead of using a u_int protected by a mutex to refcount(9) for
reference count management.

The ifa_mtx is now used for exactly one ioctl, and possibly should be
removed.

MFC after: 3 weeks


# 193893 10-Jun-2009 cperciva

Prevent integer overflow in direct pipe write code from circumventing
virtual-to-physical page lookups. [09:09]

Add missing permissions check for SIOCSIFINFO_IN6 ioctl. [09:10]

Fix buffer overflow in "autokey" negotiation in ntpd(8). [09:11]

Approved by: so (cperciva)
Approved by: re (not really, but SVN wants this...)
Security: FreeBSD-SA-09:09.pipe
Security: FreeBSD-SA-09:10.ipv6
Security: FreeBSD-SA-09:11.ntpd


# 193744 08-Jun-2009 bz

After r193232 rt_tables in vnet.h are no longer indirectly dependent on
the ROUTETABLES kernel option thus there is no need to include opt_route.h
anymore in all consumers of vnet.h and no longer depend on it for module
builds.

Remove the hidden include in flowtable.h as well and leave the two
explicit #includes in ip_input.c and ip_output.c.


# 192895 27-May-2009 jamie

Add hierarchical jails. A jail may further virtualize its environment
by creating a child jail, which is visible to that jail and to any
parent jails. Child jails may be restricted more than their parents,
but never less. Jail names reflect this hierarchy, being MIB-style
dot-separated strings.

Every thread now points to a jail, the default being prison0, which
contains information about the physical system. Prison0's root
directory is the same as rootvnode; its hostname is the same as the
global hostname, and its securelevel replaces the global securelevel.
Note that the variable "securelevel" has actually gone away, which
should not cause any problems for code that properly uses
securelevel_gt() and securelevel_ge().

Some jail-related permissions that were kept in global variables and
set via sysctls are now per-jail settings. The sysctls still exist for
backward compatibility, used only by the now-deprecated jail(2) system
call.

Approved by: bz (mentor)


# 192476 20-May-2009 qingli

When an interface address is removed and the last prefix
route is also being deleted, the link-layer address table
(arp or nd6) will flush those L2 llinfo entries that match
the removed prefix.

Reviewed by: kmacy


# 192282 18-May-2009 qingli

This patch resolves the following issues:

-- A routing socket message is not generated when an IPv6 address is
either inserted or deleted from an interface. The missing routing
message problem was discovered by Randall Stewart and Michael Tuxen
during SCTP testing.

-- Previously when an IPv6 address is configured on an interface, if the
prefix length is /128, then a host route is instaleld in the kernel
for this address. But this host route is not deleted when that IPv6
address is removed from the interface.

-- Routes to the link-local all-nodes multicast address and the
interface-local all-nodes multicast address are not removed when
the last IPv6 address is removed from an interface.

Reviewed by: bz, gnn


# 191672 29-Apr-2009 bms

Bite the bullet, and make the IPv6 SSM and MLDv2 mega-commit:
import from p4 bms_netdev. Summary of changes:

* Connect netinet6/in6_mcast.c to build.
The legacy KAME KPIs are mostly preserved.
* Eliminate now dead code from ip6_output.c.
Don't do mbuf bingo, we are not going to do RFC 2292 style
CMSG tricks for multicast options as they are not required
by any current IPv6 normative reference.
* Refactor transports (UDP, raw_ip6) to do own mcast filtering.
SCTP, TCP unaffected by this change.
* Add ip6_msource, in6_msource structs to in6_var.h.
* Hookup mld_ifinfo state to in6_ifextra, allocate from
domifattach path.
* Eliminate IN6_LOOKUP_MULTI(), it is no longer referenced.
Kernel consumers which need this should use in6m_lookup().
* Refactor IPv6 socket group memberships to use a vector (like IPv4).
* Update ifmcstat(8) for IPv6 SSM.
* Add witness lock order for IN6_MULTI_LOCK.
* Move IN6_MULTI_LOCK out of lower ip6_output()/ip6_input() paths.
* Introduce IP6STAT_ADD/SUB/INC/DEC as per rwatson's IPv4 cleanup.
* Update carp(4) for new IPv6 SSM KPIs.
* Virtualize ip6_mrouter socket.
Changes mostly localized to IPv6 MROUTING.
* Don't do a local group lookup in MROUTING.
* Kill unused KAME prototypes in6_purgemkludge(), in6_restoremkludge().
* Preserve KAME DAD timer jitter behaviour in MLDv1 compatibility mode.
* Bump __FreeBSD_version to 800084.
* Update UPDATING.

NOTE WELL:
* This code hasn't been tested against real MLDv2 queriers
(yet), although the on-wire protocol has been verified in Wireshark.
* There are a few unresolved issues in the socket layer APIs to
do with scope ID propagation.
* There is a LOR present in ip6_output()'s use of
in6_setscope() which needs to be resolved. See comments in mld6.c.
This is believed to be benign and can't be avoided for the moment
without re-introducing an indirect netisr.

This work was mostly derived from the IGMPv3 implementation, and
has been sponsored by a third party.


# 191340 20-Apr-2009 rwatson

Prefer structure fields (ifa_link) to macro aliases for them
(ifa_list).

MFC after: 2 weeks


# 191337 20-Apr-2009 rwatson

Acquire interface address list lock around access to if_addrhead,
closing several writer-writer races, and some read-write races.

MFC after: 2 weeks


# 191336 20-Apr-2009 rwatson

Use TAILQ_FOREACH() and TAILQ_FOREACH_SAFE() rather than manually
accessing queue(9) structure fields for if_addrhead.

Prefer FreeBSD field name if_addrhead to compatibility macro
if_addrlist.

MFC after: 2 weeks


# 191323 20-Apr-2009 rwatson

Close some but not all writer-writer races when maintaining IPv6
interface address lists by locking the interface address list lock.

MFC after: 2 weeks


# 189851 15-Mar-2009 rwatson

Remove IFF_NEEDSGIANT, a compatibility infrastructure introduced
in FreeBSD 5.x to allow network device drivers to run with Giant
despite the network stack being Giant-free. This significantly
simplifies calls into ioctl() on network interfaces, especially
in the multicast code, as well as eliminates deferred invocation
of interface if_start routines.

Disable the build on device drivers still depending on
IFF_NEEDSGIANT as they no longer compile. They will be removed
in a few weeks if they haven't been made MPSAFE in that time.
Disabled drivers:

if_ar
if_axe
if_aue
if_cdce
if_cue
if_kue
if_ray
if_rue
if_rum
if_sr
if_udav
if_ural
if_zyd

Drivers that were already disabled because of tty changes:

if_ppp
if_sl

Discussed on: arch@


# 189106 27-Feb-2009 bz

For all files including net/vnet.h directly include opt_route.h and
net/route.h.

Remove the hidden include of opt_route.h and net/route.h from net/vnet.h.

We need to make sure that both opt_route.h and net/route.h are included
before net/vnet.h because of the way MRT figures out the number of FIBs
from the kernel option. If we do not, we end up with the default number
of 1 when including net/vnet.h and array sizes are wrong.

This does not change the list of files which depend on opt_route.h
but we can identify them now more easily.


# 188144 05-Feb-2009 jamie

Standardize the various prison_foo_ip[46] functions and prison_if to
return zero on success and an error code otherwise. The possible errors
are EADDRNOTAVAIL if an address being checked for doesn't match the
prison, and EAFNOSUPPORT if the prison doesn't have any addresses in
that address family. For most callers of these functions, use the
returned error code instead of e.g. a hard-coded EADDRNOTAVAIL or
EINVAL.

Always include a jailed() check in these functions, where a non-jailed
cred always returns success (and makes no changes). Remove the explicit
jailed() checks that preceded many of the function calls.

Approved by: bz (mentor)


# 187946 31-Jan-2009 bz

Like with r185713 make sure to not leak a lock as rtalloc1(9) returns
a locked route. Thus we have to use RTFREE_LOCKED(9) to get it unlocked
and rtfree(9)d rather than just rtfree(9)d.

Since the PR was filed, new places with the same problem were added
with new code. Also check that the rt is valid before freeing it
either way there.

PR: kern/129793
Submitted by: Dheeraj Reddy <dheeraj@ece.gatech.edu>
MFC after: 2 weeks
Committed from: Bugathon #6


# 187094 12-Jan-2009 qingli

Revive the RTF_LLINFO flag in route.h. The kernel code is guarded
by the new kernel option COMPAT_ROUTE_FLAGS for binary backward
compatibility. The RTF_LLDATA flag maps to the same value as RTF_LLINFO.
RTF_LLDATA is used by the arp and ndp utilities. The RTF_LLDATA flag is
always returned to the userland regardless whether the COMPAT_ROUTE_FLAGS
is defined.


# 186980 09-Jan-2009 bz

Restrict arp, ndp and theoretically the FIB listing (if not
read with libkvm) to the addresses of a prison, when inside a
jail. [1]
As the patch from the PR was pre-'new-arp', add checks to the
llt_dump handlers as well.

While touching RTM_GET in route_output(), consistently use
curthread credentials rather than the creds from the socket
there. [2]

PR: kern/68189
Submitted by: Mark Delany <sxcg2-fuwxj@qmda.emu.st> [1]
Discussed with: rwatson [2]
Reviewed by: rwatson
MFC after: 4 weeks


# 186948 09-Jan-2009 bz

Make SIOCGIFADDR and related, as well as SIOCGIFADDR_IN6 and related
jail-aware. Up to now we returned the first address of the interface
for SIOCGIFADDR w/o an ifr_addr in the query. This caused problems for
programs querying for an address but running inside a jail, as the
address returned usually did not belong to the jail.
Like for v6, if there was an ifr_addr given on v4, you could probe
for more addresses on the interfaces that you were not allowed to see
from inside a jail. Return an error (EADDRNOTAVAIL) in that case
now unless the address is on the given interface and valid for the
jail.

PR: kern/114325
Reviewed by: rwatson
MFC after: 4 weeks


# 186708 03-Jan-2009 qingli

Some modules such as SCTP supplies a valid route entry as an input argument
to ip_output(). The destionation is represented in a sockaddr{} object
that may contain other pieces of information, e.g., port number. This
same destination sockaddr{} object may be passed into L2 code, which
could be used to create a L2 entry. Since there exists a L2 table per
address family, the L2 lookup function can make address family specific
comparison instead of the generic bcmp() operation over the entire
sockaddr{} structure.

Note in the IPv6 case the sin6_scope_id is not compared because the
address is currently stored in the embedded form inside the kernel.
The in6_lltable_lookup() has to account for the scope-id if this
storage format were to change in the future.


# 186500 26-Dec-2008 qingli

This checkin addresses a couple of issues:
1. The "route" command allows route insertion through the interface-direct
option "-iface". During if_attach(), an sockaddr_dl{} entry is created
for the interface and is part of the interface address list. This
sockaddr_dl{} entry describes the interface in detail. The "route"
command selects this entry as the "gateway" object when the "-iface"
option is present. The "arp" and "ndp" commands also interact with the
kernel through the routing socket when adding and removing static L2
entries. The static L2 information is also provided through the
"gateway" object with an AF_LINK family type, similar to what is
provided by the "route" command. In order to differentiate between
these two types of operations, a RTF_LLDATA flag is introduced. This
flag is set by the "arp" and "ndp" commands when issuing the add and
delete commands. This flag is also set in each L2 entry returned by the
kernel. The "arp" and "ndp" command follows a convention where a RTM_GET
is issued first followed by a RTM_ADD/DELETE. This RTM_GET request fills
in the fields for a "rtm" object, which is reinjected into the kernel by
a subsequent RTM_ADD/DELETE command. The entry returend from RTM_GET
is a prefix route, so the RTF_LLDATA flag must be specified when issuing
the RTM_ADD/DELETE messages.

2. Enforce the convention that NET_RT_FLAGS with a 0 w_arg is the
specification for retrieving L2 information. Also optimized the
code logic.

Reviewed by: julian


# 186392 22-Dec-2008 qingli

Similar to the INET case, do not destroy the nd6 entries for
interface addresses until those addresses are removed. I already
made the patch in INET but forgot to bring the code over for
INET6.


# 186216 17-Dec-2008 qingli

A couple of files were not meant to be committed.


# 186215 17-Dec-2008 qingli

in6_clsroute() was applied to prefix routes causing some
of them to expire. in6_clsroute() was only applied to
cloned routes that are no longer applicable after the
arp-v2 commit.


# 186158 16-Dec-2008 kmacy

check return from lla_lookup against NULL not zero


# 186150 16-Dec-2008 kmacy

unlock and destroy an llentry's lock before freeing

Found by: sam


# 186119 15-Dec-2008 qingli

This main goals of this project are:
1. separating L2 tables (ARP, NDP) from the L3 routing tables
2. removing as much locking dependencies among these layers as
possible to allow for some parallelism in the search operations
3. simplify the logic in the routing code,

The most notable end result is the obsolescent of the route
cloning (RTF_CLONING) concept, which translated into code reduction
in both IPv4 ARP and IPv6 NDP related modules, and size reduction in
struct rtentry{}. The change in design obsoletes the semantics of
RTF_CLONING, RTF_WASCLONE and RTF_LLINFO routing flags. The userland
applications such as "arp" and "ndp" have been modified to reflect
those changes. The output from "netstat -r" shows only the routing
entries.

Quite a few developers have contributed to this project in the
past: Glebius Smirnoff, Luigi Rizzo, Alessandro Cerri, and
Andre Oppermann. And most recently:

- Kip Macy revised the locking code completely, thus completing
the last piece of the puzzle, Kip has also been conducting
active functional testing
- Sam Leffler has helped me improving/refactoring the code, and
provided valuable reviews
- Julian Elischer setup the perforce tree for me and has helped
me maintaining that branch before the svn conversion


# 185571 02-Dec-2008 bz

Rather than using hidden includes (with cicular dependencies),
directly include only the header files needed. This reduces the
unneeded spamming of various headers into lots of files.

For now, this leaves us with very few modules including vnet.h
and thus needing to depend on opt_route.h.

Reviewed by: brooks, gnn, des, zec, imp
Sponsored by: The FreeBSD Foundation


# 184205 23-Oct-2008 des

Retire the MALLOC and FREE macros. They are an abomination unto style(9).

MFC after: 3 months


# 183550 02-Oct-2008 zec

Step 1.5 of importing the network stack virtualization infrastructure
from the vimage project, as per plan established at devsummit 08/08:
http://wiki.freebsd.org/Image/Notes200808DevSummit

Introduce INIT_VNET_*() initializer macros, VNET_FOREACH() iterator
macros, and CURVNET_SET() context setting macros, all currently
resolving to NOPs.

Prepare for virtualization of selected SYSCTL objects by introducing a
family of SYSCTL_V_*() macros, currently resolving to their global
counterparts, i.e. SYSCTL_V_INT() == SYSCTL_INT().

Move selected #defines from sys/sys/vimage.h to newly introduced header
files specific to virtualized subsystems (sys/net/vnet.h,
sys/netinet/vinet.h etc.).

All the changes are verified to have zero functional impact at this
point in time by doing MD5 comparision between pre- and post-change
object files(*).

(*) netipsec/keysock.c did not validate depending on compile time options.

Implemented by: julian, bz, brooks, zec
Reviewed by: julian, bz, brooks, kris, rwatson, ...
Approved by: julian (mentor)
Obtained from: //depot/projects/vimage-commit2/...
X-MFC after: never
Sponsored by: NLnet Foundation, The FreeBSD Foundation


# 181888 20-Aug-2008 julian

Fix some of the formatting fixes.. It's amazing how some thing stand out
in a commit message.


# 181887 20-Aug-2008 julian

A bunch of formatting fixes brough to light by, or created by the Vimage commit
a few days ago.


# 181803 17-Aug-2008 bz

Commit step 1 of the vimage project, (network stack)
virtualization work done by Marko Zec (zec@).

This is the first in a series of commits over the course
of the next few weeks.

Mark all uses of global variables to be virtualized
with a V_ prefix.
Use macros to map them back to their global names for
now, so this is a NOP change only.

We hope to have caught at least 85-90% of what is needed
so we do not invalidate a lot of outstanding patches again.

Obtained from: //depot/projects/vimage-commit2/...
Reviewed by: brooks, des, ed, mav, julian,
jamie, kris, rwatson, zec, ...
(various people I forgot, different versions)
md5 (with a bit of help)
Sponsored by: NLnet Foundation, The FreeBSD Foundation
X-MFC after: never
V_Commit_Message_Reviewed_By: more people than the patch


# 180291 05-Jul-2008 rwatson

Introduce a new lock, hostname_mtx, and use it to synchronize access
to global hostname and domainname variables. Where necessary, copy
to or from a stack-local buffer before performing copyin() or
copyout(). A few uses, such as in cd9660 and daemon_saver, remain
under-synchronized and will require further updates.

Correct a bug in which a failed copyin() of domainname would leave
domainname potentially corrupted.

MFC after: 3 weeks


# 178888 09-May-2008 julian

Add code to allow the system to handle multiple routing tables.
This particular implementation is designed to be fully backwards compatible
and to be MFC-able to 7.x (and 6.x)

Currently the only protocol that can make use of the multiple tables is IPv4
Similar functionality exists in OpenBSD and Linux.

From my notes:

-----

One thing where FreeBSD has been falling behind, and which by chance I
have some time to work on is "policy based routing", which allows
different
packet streams to be routed by more than just the destination address.

Constraints:
------------

I want to make some form of this available in the 6.x tree
(and by extension 7.x) , but FreeBSD in general needs it so I might as
well do it in -current and back port the portions I need.

One of the ways that this can be done is to have the ability to
instantiate multiple kernel routing tables (which I will now
refer to as "Forwarding Information Bases" or "FIBs" for political
correctness reasons). Which FIB a particular packet uses to make
the next hop decision can be decided by a number of mechanisms.
The policies these mechanisms implement are the "Policies" referred
to in "Policy based routing".

One of the constraints I have if I try to back port this work to
6.x is that it must be implemented as a EXTENSION to the existing
ABIs in 6.x so that third party applications do not need to be
recompiled in timespan of the branch.

This first version will not have some of the bells and whistles that
will come with later versions. It will, for example, be limited to 16
tables in the first commit.
Implementation method, Compatible version. (part 1)
-------------------------------
For this reason I have implemented a "sufficient subset" of a
multiple routing table solution in Perforce, and back-ported it
to 6.x. (also in Perforce though not always caught up with what I
have done in -current/P4). The subset allows a number of FIBs
to be defined at compile time (8 is sufficient for my purposes in 6.x)
and implements the changes needed to allow IPV4 to use them. I have not
done the changes for ipv6 simply because I do not need it, and I do not
have enough knowledge of ipv6 (e.g. neighbor discovery) needed to do it.

Other protocol families are left untouched and should there be
users with proprietary protocol families, they should continue to work
and be oblivious to the existence of the extra FIBs.

To understand how this is done, one must know that the current FIB
code starts everything off with a single dimensional array of
pointers to FIB head structures (One per protocol family), each of
which in turn points to the trie of routes available to that family.

The basic change in the ABI compatible version of the change is to
extent that array to be a 2 dimensional array, so that
instead of protocol family X looking at rt_tables[X] for the
table it needs, it looks at rt_tables[Y][X] when for all
protocol families except ipv4 Y is always 0.
Code that is unaware of the change always just sees the first row
of the table, which of course looks just like the one dimensional
array that existed before.

The entry points rtrequest(), rtalloc(), rtalloc1(), rtalloc_ign()
are all maintained, but refer only to the first row of the array,
so that existing callers in proprietary protocols can continue to
do the "right thing".
Some new entry points are added, for the exclusive use of ipv4 code
called in_rtrequest(), in_rtalloc(), in_rtalloc1() and in_rtalloc_ign(),
which have an extra argument which refers the code to the correct row.

In addition, there are some new entry points (currently called
rtalloc_fib() and friends) that check the Address family being
looked up and call either rtalloc() (and friends) if the protocol
is not IPv4 forcing the action to row 0 or to the appropriate row
if it IS IPv4 (and that info is available). These are for calling
from code that is not specific to any particular protocol. The way
these are implemented would change in the non ABI preserving code
to be added later.

One feature of the first version of the code is that for ipv4,
the interface routes show up automatically on all the FIBs, so
that no matter what FIB you select you always have the basic
direct attached hosts available to you. (rtinit() does this
automatically).

You CAN delete an interface route from one FIB should you want
to but by default it's there. ARP information is also available
in each FIB. It's assumed that the same machine would have the
same MAC address, regardless of which FIB you are using to get
to it.

This brings us as to how the correct FIB is selected for an outgoing
IPV4 packet.

Firstly, all packets have a FIB associated with them. if nothing
has been done to change it, it will be FIB 0. The FIB is changed
in the following ways.

Packets fall into one of a number of classes.

1/ locally generated packets, coming from a socket/PCB.
Such packets select a FIB from a number associated with the
socket/PCB. This in turn is inherited from the process,
but can be changed by a socket option. The process in turn
inherits it on fork. I have written a utility call setfib
that acts a bit like nice..

setfib -3 ping target.example.com # will use fib 3 for ping.

It is an obvious extension to make it a property of a jail
but I have not done so. It can be achieved by combining the setfib and
jail commands.

2/ packets received on an interface for forwarding.
By default these packets would use table 0,
(or possibly a number settable in a sysctl(not yet)).
but prior to routing the firewall can inspect them (see below).
(possibly in the future you may be able to associate a FIB
with packets received on an interface.. An ifconfig arg, but not yet.)

3/ packets inspected by a packet classifier, which can arbitrarily
associate a fib with it on a packet by packet basis.
A fib assigned to a packet by a packet classifier
(such as ipfw) would over-ride a fib associated by
a more default source. (such as cases 1 or 2).

4/ a tcp listen socket associated with a fib will generate
accept sockets that are associated with that same fib.

5/ Packets generated in response to some other packet (e.g. reset
or icmp packets). These should use the FIB associated with the
packet being reponded to.

6/ Packets generated during encapsulation.
gif, tun and other tunnel interfaces will encapsulate using the FIB
that was in effect withthe proces that set up the tunnel.
thus setfib 1 ifconfig gif0 [tunnel instructions]
will set the fib for the tunnel to use to be fib 1.

Routing messages would be associated with their
process, and thus select one FIB or another.
messages from the kernel would be associated with the fib they
refer to and would only be received by a routing socket associated
with that fib. (not yet implemented)

In addition Netstat has been edited to be able to cope with the
fact that the array is now 2 dimensional. (It looks in system
memory using libkvm (!)). Old versions of netstat see only the first FIB.

In addition two sysctls are added to give:
a) the number of FIBs compiled in (active)
b) the default FIB of the calling process.

Early testing experience:
-------------------------

Basically our (IronPort's) appliance does this functionality already
using ipfw fwd but that method has some drawbacks.

For example,
It can't fully simulate a routing table because it can't influence the
socket's choice of local address when a connect() is done.

Testing during the generating of these changes has been
remarkably smooth so far. Multiple tables have co-existed
with no notable side effects, and packets have been routes
accordingly.

ipfw has grown 2 new keywords:

setfib N ip from anay to any
count ip from any to any fib N

In pf there seems to be a requirement to be able to give symbolic names to the
fibs but I do not have that capacity. I am not sure if it is required.

SCTP has interestingly enough built in support for this, called VRFs
in Cisco parlance. it will be interesting to see how that handles it
when it suddenly actually does something.

Where to next:
--------------------

After committing the ABI compatible version and MFCing it, I'd
like to proceed in a forward direction in -current. this will
result in some roto-tilling in the routing code.

Firstly: the current code's idea of having a separate tree per
protocol family, all of the same format, and pointed to by the
1 dimensional array is a bit silly. Especially when one considers that
there is code that makes assumptions about every protocol having the
same internal structures there. Some protocols don't WANT that
sort of structure. (for example the whole idea of a netmask is foreign
to appletalk). This needs to be made opaque to the external code.

My suggested first change is to add routing method pointers to the
'domain' structure, along with information pointing the data.
instead of having an array of pointers to uniform structures,
there would be an array pointing to the 'domain' structures
for each protocol address domain (protocol family),
and the methods this reached would be called. The methods would have
an argument that gives FIB number, but the protocol would be free
to ignore it.

When the ABI can be changed it raises the possibilty of the
addition of a fib entry into the "struct route". Currently,
the structure contains the sockaddr of the desination, and the resulting
fib entry. To make this work fully, one could add a fib number
so that given an address and a fib, one can find the third element, the
fib entry.

Interaction with the ARP layer/ LL layer would need to be
revisited as well. Qing Li has been working on this already.

This work was sponsored by Ironport Systems/Cisco

Reviewed by: several including rwatson, bz and mlair (parts each)
Obtained from: Ironport systems/Cisco


# 175630 24-Jan-2008 bz

Replace the last susers calls in netinet6/ with privilege checks.

Introduce a new privilege allowing to set certain IP header options
(hop-by-hop, routing headers).

Leave a few comments to be addressed later.

Reviewed by: rwatson (older version, before addressing his comments)


# 175162 08-Jan-2008 obrien

un-__P()


# 174510 10-Dec-2007 obrien

Clean up VCS Ids.


# 174376 06-Dec-2007 julian

Remove more dup'd code
MFC After: 1 week


# 174375 06-Dec-2007 julian

remove duped code

Reviewed By: gnn
MRC after: 1 week


# 171260 05-Jul-2007 delphij

Space cleanup

Approved by: re (rwatson)


# 171259 05-Jul-2007 delphij

ANSIfy[1] plus some style cleanup nearby.

Discussed with: gnn, rwatson
Submitted by: Karl Sj?dahl - dunceor <dunceor gmail com> [1]
Approved by: re (rwatson)


# 170202 02-Jun-2007 jinmei

fixed memory leak for IPv6 multicast membership information associated
with interface addresses.

Approved by: gnn (mentor)
MFC after: 1 week


# 170200 02-Jun-2007 jinmei

simplified the fix in rev. 1.69 by replacing RT_REMREF+RT_UNLOCK with
RTFREE_LOCKED.

Approved by: gnn (mentor)


# 169975 25-May-2007 jinmei

do not directly call rtfree() to meet an assumption in the callee.
(this fix suppresses a warning message appearing in the boot time on
IPv6-enabled systems)

Approved by: gnn (mentor)


# 166938 24-Feb-2007 bms

Make IPv6 multicast forwarding dynamically loadable from a GENERIC kernel.
It is built in the same module as IPv4 multicast forwarding, i.e. ip_mroute.ko,
if and only if IPv6 support is enabled for loadable modules.
Export IPv6 forwarding structs to userland netstat(1) via sysctl(9).


# 165287 16-Dec-2006 bz

In ip6_sprintf print the addresses in a more common/readable
format eliminating leading zeros like in :0001 -> :1.

Reviewed by: mlaier


# 165118 12-Dec-2006 bz

MFp4: 92972, 98913 + one more change

In ip6_sprintf no longer use and return one of eight static buffers
for printing/logging ipv6 addresses.
The caller now has to hand in a sufficiently large buffer as first
argument.


# 164033 06-Nov-2006 rwatson

Sweep kernel replacing suser(9) calls with priv(9) calls, assigning
specific privilege names to a broad range of privileges. These may
require some future tweaking.

Sponsored by: nCircle Network Security, Inc.
Obtained from: TrustedBSD Project
Discussed on: arch@
Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri,
Alex Lyashkov <umka at sevcity dot net>,
Skip Ford <skip dot ford at verizon dot net>,
Antoine Brodin <antoine dot brodin at laposte dot net>


# 162540 22-Sep-2006 suz

fixed a bug that IPv6 packets arriving to stf are not accepted.
(a degrade introduced in in6.c Rev 1.61)

PR: kern/103415
Submitted by: JINMEI Tatuya
MFC after: 1 week


# 160981 04-Aug-2006 brooks

With exception of the if_name() macro, all definitions in net_osdep.h
were unused or already in if_var.h so add if_name() to if_var.h and
remove net_osdep.h along with all references to it.

Longer term we may want to kill off if_name() entierly since all modern
BSDs have if_xname variables rendering it unnecessicary.


# 160038 29-Jun-2006 yar

There is a consensus that ifaddr.ifa_addr should never be NULL,
except in places dealing with ifaddr creation or destruction; and
in such special places incomplete ifaddrs should never be linked
to system-wide data structures. Therefore we can eliminate all the
superfluous checks for "ifa->ifa_addr != NULL" and get ready
to the system crashing honestly instead of masking possible bugs.

Suggested by: glebius, jhb, ru


# 159390 08-Jun-2006 gnn

Fix spurious warnings from neighbor discovery when working with IPv6 over
point to point tunnels (gif).

PR: 93220
Submitted by: Jinmei Tatuya
MFC after: 1 week


# 155454 08-Feb-2006 gnn

Fix for an inappropriate bzero of the ICMPv6 stats. The code was zero'ing the wrong structure member but setting the correct one.

Submitted by: James dot Juran at baesystems dot com
Reviewed by: gnn
MFC after: 1 week


# 151915 31-Oct-2005 suz

statically configured IPv6 address is properly added/deleted now

Obtained from: KAME
Reported in: freebsd-net@freebsd
MFC after: 1 day


# 151546 22-Oct-2005 suz

fixed a compilation failure on amd64/sparc64/ia64

Submitted by: max
MFC after: 2 month


# 151539 21-Oct-2005 suz

sync with KAME regarding NDP

- introduced fine-grain-timer to manage ND-caches and IPv6 Multicast-Listeners
- supports Router-Preference <draft-ietf-ipv6-router-selection-07.txt>
- better prefix lifetime management
- more spec-comformant DAD advertisement
- updated RFC/internet-draft revisions

Obtained from: KAME
Reviewed by: ume, gnn
MFC after: 2 month


# 151468 19-Oct-2005 suz

added an ioctl option in kernel so that ndp/rtadvd can change some NDP-related kernel variables based on their configurations (RFC2461 p.43 6.2.1 mandates this for IPv6 routers)

Obtained from: KAME
Reviewd by: ume, gnn
MFC after: 2 weeks


# 151465 19-Oct-2005 suz

sync with KAME in the following points:
- fixed typos
- improved some comment descriptions
- use NULL, instead of 0, to denote a NULL pointer
- avoid embedding a magic number in the code
- use nd6log() instead of log() to record NDP-specific logs
- nuked an unnecessay white space

Obtained from: KAME
MFC after: 1 day


# 149849 07-Sep-2005 obrien

IPv6 was improperly defining its malloc type the same as IPv4 (M_IPMADDR,
M_IPMOPTS, M_MRTABLE). Thus we had conflicting instantiations.
Create an IPv6-specific type to overcome this.


# 148887 09-Aug-2005 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days


# 148385 25-Jul-2005 ume

scope cleanup. with this change
- most of the kernel code will not care about the actual encoding of
scope zone IDs and won't touch "s6_addr16[1]" directly.
- similarly, most of the kernel code will not care about link-local
scoped addresses as a special case.
- scope boundary check will be stricter. For example, the current
*BSD code allows a packet with src=::1 and dst=(some global IPv6
address) to be sent outside of the node, if the application do:
s = socket(AF_INET6);
bind(s, "::1");
sendto(s, some_global_IPv6_addr);
This is clearly wrong, since ::1 is only meaningful within a single
node, but the current implementation of the *BSD kernel cannot
reject this attempt.

Submitted by: JINMEI Tatuya <jinmei__at__isl.rdc.toshiba.co.jp>
Obtained from: KAME


# 146883 02-Jun-2005 iedowse

Use IFF_LOCKGIANT/IFF_UNLOCKGIANT around calls to the interface
if_ioctl routine. This should fix a number of code paths through
soo_ioctl() that could call into Giant-locked network drivers without
first acquiring Giant.


# 142215 22-Feb-2005 glebius

Add CARP (Common Address Redundancy Protocol), which allows multiple
hosts to share an IP address, providing high availability and load
balancing.

Original work on CARP done by Michael Shalayeff, with many
additions by Marco Pfatschbacher and Ryan McBride.

FreeBSD port done solely by Max Laier.

Patch by: mlaier
Obtained from: OpenBSD (mickey, mcbride)


# 139826 07-Jan-2005 imp

/* -> /*- for license, minor formatting changes, separate for KAME


# 134188 23-Aug-2004 rwatson

Remove in6_prefix.[ch] and the contained router renumbering capability.
The prefix management code currently resides in nd6, leaving only the
unused router renumbering capability in the in6_prefix files. Removing
it will make it easier for us to provide locking for the remainder of
IPv6 by reducing the number of objects requiring synchronized access.

This functionality has also been removed from NetBSD and OpenBSD.

Submitted by: George Neville-Neil <gnn at neville-neil.com>
Discussed with/approved by: suz, keiichi at kame.net, core at kame.net


# 128019 07-Apr-2004 imp

Remove advertising clause from University of California Regent's
license, per letter dated July 22, 1999 and email from Peter Wemm,
Alan Cox and Robert Watson.

Approved by: core, peter, alc, rwatson


# 126603 04-Mar-2004 ume

move in6_addmulti()/in6_delmulti() into mld6.c

Obtained from: KAME


# 126595 04-Mar-2004 ume

missing splx().

Obtained from: KAME
MFC after: 3 days


# 126552 03-Mar-2004 ume

- stlye and comments
- variable name change (scopeid -> zoneid)
- u_short -> u_int16_t, u_char -> u_int8_t

Obtained from: KAME


# 126264 26-Feb-2004 mlaier

Bring eventhandler callbacks for pf.
This enables pf to track dynamic address changes on interfaces (dailup) with
the "on (<ifname>)"-syntax. This also brings hooks in anticipation of
tracking cloned interfaces, which will be in future versions of pf.

Approved by: bms(mentor)


# 126184 24-Feb-2004 cperciva

Fix array overflow: If len=128, don't access [16] of a 16-byte IPv6
address, even if we subsequently ignore its value by applying a >>8
to it.

Reported by: "Ted Unangst" <tedu@coverity.com>
Approved by: rwatson (mentor), {ume, suz} (KAME)


# 124337 10-Jan-2004 ume

try rtinit() only when the route is not installed.
this allows, e.g., duplicated attempts of 'ifconfig lo0 ::1'
like for IPv4.

Obtained from: KAME
MFC after: 1 week


# 122334 08-Nov-2003 sam

replace explicit changes to rt_refcnt by RT_ADDREF and RT_REMREF
macros that expand to include assertions when the system is built
with INVARIANTS

Supported by: FreeBSD Foundation


# 122128 05-Nov-2003 ume

byebye in6_ifawithscope(). it was a function for old source
address selection.

Obtained from: KAME


# 122059 04-Nov-2003 ume

use nd6log().

Obtained from: KAME


# 121742 30-Oct-2003 ume

add management part of address selection policy described in
RFC3484.

Obtained from: KAME


# 121716 29-Oct-2003 sam

correct LOR by using a local variable to hold result
instead of holding a lock while calling out of view

Supported by: FreeBSD Foundation


# 121315 21-Oct-2003 ume

- change scope to zone.
- change node-local to interface-local.
- better error handling of address-to-scope mapping.
- use in6_clearscope().

Obtained from: KAME


# 121283 20-Oct-2003 ume

correct linkmtu handling.

Obtained from: KAME


# 121168 17-Oct-2003 ume

nuke duplicate function and unused function.

Obtained from: KAME


# 121167 17-Oct-2003 ume

revert wrongly dropped null check by previous commit.


# 121161 17-Oct-2003 ume

- add dom_if{attach,detach} framework.
- transition to use ifp->if_afdata.

Obtained from: KAME


# 120971 10-Oct-2003 ume

nuke SCOPEDROUTING. Though it was there for a long time,
it was never enabled.


# 120891 07-Oct-2003 ume

- fix typo in comment.
- style.

Obtained from: KAME


# 120856 06-Oct-2003 ume

return(code) -> return (code)
(reduce diffs against KAME)


# 120727 04-Oct-2003 sam

Locking for updates to routing table entries. Each rtentry gets a mutex
that covers updates to the contents. Note this is separate from holding
a reference and/or locking the routing table itself.

Other/related changes:

o rtredirect loses the final parameter by which an rtentry reference
may be returned; this was never used and added unwarranted complexity
for locking.
o minor style cleanups to routing code (e.g. ansi-fy function decls)
o remove the logic to bump the refcnt on the parent of cloned routes,
we assume the parent will remain as long as the clone; doing this avoids
a circularity in locking during delete
o convert some timeouts to MPSAFE callouts

Notes:

1. rt_mtx in struct rtentry is guarded by #ifdef _KERNEL as user-level
applications cannot/do-no know about mutex's. Doing this requires
that the mutex be the last element in the structure. A better solution
is to introduce an externalized version of struct rtentry but this is
a major task because of the intertwining of rtentry and other data
structures that are visible to user applications.
2. There are known LOR's that are expected to go away with forthcoming
work to eliminate many held references. If not these will be resolved
prior to release.
3. ATM changes are untested.

Sponsored by: FreeBSD Foundation
Obtained from: BSD/OS (partly)


# 111119 19-Feb-2003 imp

Back out M_* changes, per decision of the TRB.

Approved by: trb


# 109623 21-Jan-2003 alfred

Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.


# 108269 25-Dec-2002 ru

If the caller of rtrequest*(RTM_DELETE, ...) asked for a copy of
the entry being removed (ret_nrt != NULL), increment the entry's
rt_refcnt like we do it for RTM_ADD and RTM_RESOLVE, rather than
messing around with 1->0 transitions for rtfree() all over.


# 108172 22-Dec-2002 hsu

SMP locking for ifnet list.


# 108033 18-Dec-2002 hsu

Lock up ifaddr reference counts.


# 95023 19-Apr-2002 suz

just merged cosmetic changes from KAME to ease sync between KAME and FreeBSD.
(based on freebsd4-snap-20020128)

Reviewed by: ume
MFC after: 1 week


# 93593 01-Apr-2002 jhb

Change the suser() API to take advantage of td_ucred as well as do a
general cleanup of the API. The entire API now consists of two functions
similar to the pre-KSE API. The suser() function takes a thread pointer
as its only argument. The td_ucred member of this thread must be valid
so the only valid thread pointers are curthread and a few kernel threads
such as thread0. The suser_cred() function takes a pointer to a struct
ucred as its first argument and an integer flag as its second argument.
The flag is currently only used for the PRISON_ROOT flag.

Discussed on: smp@


# 91346 27-Feb-2002 alfred

Fix warnings caused by discarding const.

Hairy Eyeball At: peter


# 83934 25-Sep-2001 brooks

Make faith loadable, unloadable, and clonable.


# 83366 12-Sep-2001 julian

KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after: ha ha ha ha


# 83130 06-Sep-2001 jlemon

Wrap array accesses in macros, which also happen to be lvalues:

ifnet_addrs[i - 1] -> ifaddr_byindex(i)
ifindex2ifnet[i] -> ifnet_byindex(i)

This is intended to ease the conversion to SMPng.


# 81115 03-Aug-2001 ume

When global anycast address was assigned to lo0, wrong source
address was selected.

Reported by: Shingo WATANABE <nabe@nabechan.org>
Submitted by: JINMEI Tatuya <jinmei@isl.rdc.toshiba.co.jp>
MFC after: 3 days


# 79763 15-Jul-2001 ume

do not M_WAITOK in in6_update_ifa(), since this function can be called
under splnet(). (some comment was added by KAME)

PR: 28927
MFC after: 1 week


# 79106 02-Jul-2001 brooks

gif(4) and stf(4) modernization:

- Remove gif dependencies from stf.
- Make gif and stf into modules
- Make gif cloneable.

PR: kern/27983
Reviewed by: ru, ume
Obtained from: NetBSD
MFC after: 1 week


# 78064 11-Jun-2001 ume

Sync with recent KAME.
This work was based on kame-20010528-freebsd43-snap.tgz and some
critical problem after the snap was out were fixed.
There are many many changes since last KAME merge.

TODO:
- The definitions of SADB_* in sys/net/pfkeyv2.h are still different
from RFC2407/IANA assignment because of binary compatibility
issue. It should be fixed under 5-CURRENT.
- ip6po_m member of struct ip6_pktopts is no longer used. But, it
is still there because of binary compatibility issue. It should
be removed under 5-CURRENT.

Reviewed by: itojun
Obtained from: KAME
MFC after: 3 weeks


# 71207 18-Jan-2001 itojun

workaround; be sure to initialize nd6 interface information when IPv6
interface address gets added. this will avoid presenting EMSGSIZE when
outgoing interface is down (and never brought up).

sync with kame.


# 63001 12-Jul-2000 itojun

correct rtentry reference count in in6_ifloop_request().
if you reconfigure inet6 too much, the reference count can go
into negative by mistake. KAME in6.c 1.98 -> 1.99.


# 62744 07-Jul-2000 grog

Suppress a warning message about trigraphs.

Approved-by: itojun


# 62587 04-Jul-2000 itojun

sync with kame tree as of july00. tons of bug fixes/improvements.

API changes:
- additional IPv6 ioctls
- IPsec PF_KEY API was changed, it is mandatory to upgrade setkey(8).
(also syntax change)


# 57017 07-Feb-2000 shin

Permit site local addr in IPv6 source address selection rule.

KAME source addr selection rule had a problem to treat IPv6 site
local addr.
The rule is completely rewritten recently and the above problem
is also fixed, but rewriting same code part in freebsd4.0 is too
dangerous in this stage, so just add workaround to avoid
the problem. Just add code for IPv6 site local addresses into IPv6
source addr selection algorythm part.


# 56669 27-Jan-2000 shin

Added ip6_forwarding check when prefix related ioctl is called.
(prefix related ioctl should only be called on router,
because host use dynamic address and prefix configuration mechanism,
and those prefix are managed separately with ones whih are assined
manually.)


# 55917 13-Jan-2000 shin

Change struct sockaddr_storage member name, because following change
is very likely to become consensus as recent ietf/ipng mailing list
discussion. Also recent KAME repository and other KAME patched BSDs
also applied it.

s/__ss_family/ss_family/
s/__ss_len/ss_len/

Makeworld is confirmed, and no application should be affected by this change
yet.


# 55344 03-Jan-2000 shin

prevent kernel panic at suspend/resume.

confirmed by: sanpei, joe

PR: kern/15742


# 54263 07-Dec-1999 shin

udp IPv6 support, IPv6/IPv4 tunneling support in kernel,
packet divert at kernel for IPv6/IPv4 translater daemon

This includes queue related patch submitted by jburkhol@home.com.

Submitted by: queue related patch from jburkhol@home.com
Reviewed by: freebsd-arch, cvs-committers
Obtained from: KAME project


# 53957 30-Nov-1999 shin

Just to avoid warning message about trigraph.

Commented by: green


# 53541 22-Nov-1999 shin

KAME netinet6 basic part(no IPsec,no V6 Multicast Forwarding, no UDP/TCP
for IPv6 yet)

With this patch, you can assigne IPv6 addr automatically, and can reply to
IPv6 ping.

Reviewed by: freebsd-arch, cvs-committers
Obtained from: KAME project