History log of /netbsd-current/sys/netinet6/in6_gif.c
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# 1.96 07-Dec-2022 knakahara

gif(4), ipsec(4) and l2tp(4) use encap_attach_addr().


Revision tags: bouyer-sunxi-drm-base thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3 ad-namecache-base2 ad-namecache-base1 ad-namecache-base phil-wifi-20191119
# 1.95 30-Oct-2019 knakahara

Add sysctl nodes to control fragmentation with IPv[46] over IPv6 gif(4).

New sysctl node "net.inet6.ip6.gifpmtu" means
- 0 (default)
Fragment by IPV6_MMTU. All packets reach the destination certainly,
however the long packet performance is poor.
This is same behavior as before.
- 1
Fragment by outer interface's MTU. The long packet performance would
be good, however the packets may be dropped in some network paths
whose path MTU less than the interface's MTU.
- others
undefined yet

New sysctl node "net.interfaces.gif*.pmtu" means
- -1 (default)
Use system default value (net.inet6.ip6.gifpmtu).
- 0
Fragment by IPV6_MMTU for this gif(4) tunnel.
- 1
Fragment by outer interface's MTU for this gif(4) tunnel.
- others
undefined yet

See RFC4459 for more information and other solutions.


# 1.94 19-Sep-2019 knakahara

Avoid having a rtcache directly in a percpu storage for tunnel protocols.

percpu(9) has a certain memory storage for each CPU and provides it by the piece
to users. If the storages went short, percpu(9) enlarges them by allocating new
larger memory areas, replacing old ones with them and destroying the old ones.
A percpu storage referenced by a pointer gotten via percpu_getref can be
destroyed by the mechanism after a running thread sleeps even if percpu_putref
has not been called.

Using rtcache, i.e., packet processing, typically involves sleepable operations
such as rwlock so we must avoid dereferencing a rtcache that is directly stored
in a percpu storage during packet processing. Address this situation by having
just a pointer to a rtcache in a percpu storage instead.

Reviewed by ozaki-r@ and yamaguchi@


Revision tags: netbsd-9-base phil-wifi-20190609 isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502
# 1.93 01-May-2018 maxv

branches: 1.93.2; 1.93.6;
Remove now unused net_osdep.h includes, the other BSDs did the same.


# 1.92 27-Apr-2018 knakahara

Fix LOCKDEBUG kernel panic when many(about 200) tunnel interfaces is created.

The tunnel interfaces are gif(4), l2tp(4), and ipsecif(4). They use mutex
itself in percpu area. When percpu_cpu_enlarge() run, the address of the
mutex in percpu area becomes different from the address which lockdebug
saved. That can cause "already initialized" false detection.


Revision tags: pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315
# 1.91 14-Mar-2018 knakahara

Fix error checking in in6_gif_ctlinput().

if_gif.c:r1.133 introduces gif_update_variant() which ensure ifp->if_flags
is set IFF_RUNNING when gif_softc->gif_var->gv_{psrc,pdst} are not null.
So, in6_gif_ctlinput() is not required IFF_RUNNING checking. In contrast,
it is required gv_{psrc,pdst} NULL checking.


Revision tags: pgoyette-compat-base
# 1.90 10-Jan-2018 knakahara

branches: 1.90.2;
apply in{,6}_tunnel_validate() to gif(4).


Revision tags: tls-maxphys-base-20171202
# 1.89 27-Nov-2017 knakahara

IFF_RUNNING checking in Rx and Tx processing is unnecessary now.

Because the configs of gif (members of gif_var) are protected by psref(9).


# 1.88 27-Nov-2017 knakahara

preserve gif(4) configs by psref(9) like vlan(4) and l2tp(4).

After Tx side does not use softint, gif(4) can use psref(9) for config
preservation like vlan(4) and l2tp(4).

update locking notes later.


# 1.87 15-Nov-2017 knakahara

Add argument to encapsw->pr_input() instead of m_tag.


# 1.86 21-Sep-2017 knakahara

add lock for percpu route like l2tp(4).


Revision tags: nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.85 16-Jan-2017 christos

branches: 1.85.6;
ip6_sprintf -> IN6_PRINT so that we pass the size.


# 1.84 16-Jan-2017 ryo

Make ip6_sprintf(), in_fmtaddr(), lla_snprintf() and icmp6_redirect_diag() mpsafe.

Reviewed by ozaki-r@


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.83 06-Jan-2017 knakahara

branches: 1.83.2;
remove unnecessary conversion.

gif_softc->gif_pdst is already valid sockaddr.


# 1.82 14-Dec-2016 knakahara

fix race of gif_softc->gif_ro when we send multiple flows over gif on NET_MPSAFE enabled kernel.

make gif_softc->gif_ro percpu as well as ipforward_rt to resolve this race.
and add future TODO comment for etherip(4).


# 1.81 12-Dec-2016 ozaki-r

Make the routing table and rtcaches MP-safe

See the following descriptions for details.

Proposed on tech-kern and tech-net


Overview


# 1.80 08-Dec-2016 ozaki-r

Add rtcache_unref to release points of rtentry stemming from rtcache

In the MP-safe world, a rtentry stemming from a rtcache can be freed at any
points. So we need to protect rtentries somehow say by reference couting or
passive references. Regardless of the method, we need to call some release
function of a rtentry after using it.

The change adds a new function rtcache_unref to release a rtentry. At this
point, this function does nothing because for now we don't add a reference
to a rtentry when we get one from a rtcache. We will add something useful
in a further commit.

This change is a part of changes for MP-safe routing table. It is separated
to avoid one big change that makes difficult to debug by bisecting.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726
# 1.79 15-Jul-2016 ozaki-r

Use sin6tosa and sin6tocsa macros

No functional change.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907
# 1.78 06-Jul-2016 ozaki-r

branches: 1.78.2;
Apply m_get_rcvif_psref (kill m_get_rcvif_NOMPSAFE)


# 1.77 04-Jul-2016 knakahara

fix: gif(4) receive side race

A panic cause in rn_match() called by encap[46]_lookup(). The reason is that
gif(4) does not suspend receive packet processing in spite of suspending
transmit packet processing while anyone is doing gif(4) ioctl.


# 1.76 04-Jul-2016 knakahara

let gif(4) promise softint(9) contract (1/2) : gif(4) side

To prevent calling softint_schedule() after called softint_disestablish(),
the following modifications are added
+ ioctl (writing configuration) side
- off IFF_RUNNING flag before changing configuration
- wait softint handler completion before changing configuration
+ packet processing (reading configuraiotn) side
- if IFF_RUNNING flag is on, do nothing
+ in whole
- add gif_list_lock_{enter,exit} to prevent the same configuration is
set to other gif(4) interfaces


# 1.75 28-Jun-2016 ozaki-r

Add missing NULL checks for m_get_rcvif_psref


# 1.74 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


Revision tags: nick-nhusb-base-20160529 nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.73 29-Feb-2016 knakahara

remove unnecessary declarations and fix KNF

Thanks to riastradh@


# 1.72 26-Feb-2016 knakahara

To eliminate gif_softc_list linear search, add extra argument to encapsw.pr_ctlinput().


# 1.71 26-Jan-2016 knakahara

implement encapsw instead of protosw and uniform prototype.

suggested and advised by riastradh@n.o, thanks.

BTW, It seems in_stf_input() had bugs...


# 1.70 23-Jan-2016 riastradh

Those were local changes not meant to be part of the revert. SORRY!


# 1.69 23-Jan-2016 christos

make this compile again


# 1.68 22-Jan-2016 riastradh

Back out previous change to introduce struct encapsw.

This change was intended, but Nakahara-san had already made a better
one locally! So I'll let him commit that one, and I'll try not to
step on anyone's toes again.


# 1.67 22-Jan-2016 riastradh

Don't abuse struct protosw for ip_encap -- introduce struct encapsw.

Mostly mechanical change to replace it, culling some now-needless
boilerplate around all the users.

This does not substantively change the ip_encap API or eliminate
abuse of sketchy pointer casts -- that will come later, and will be
easier now that it is not tangled up with struct protosw.


# 1.66 20-Jan-2016 riastradh

Eliminate struct protosw::pr_output.

You can't use this unless you know what it is a priori: the formal
prototype is variadic, and the different instances (e.g., ip_output,
route_output) have different real prototypes.

Convert the only user of it, raw_send in net/raw_cb.c, to take an
explicit callback argument. Convert the only instances of it,
route_output and key_output, to such explicit callbacks for raw_send.
Use assertions to make sure the conversion to explicit callbacks is
warranted.

Discussed on tech-net with no objections:
https://mail-index.netbsd.org/tech-net/2016/01/16/msg005484.html


# 1.65 18-Jan-2016 knakahara

Refactor protosw codes in gif(4). No functional change.

- remove unnecessary include
- reduce scopes


Revision tags: nick-nhusb-base-20151226
# 1.64 25-Dec-2015 knakahara

use satosin{,6} macros instead of casts.


# 1.63 11-Dec-2015 knakahara

PR kern/50522: gif(4) ioctl causes panic while someone is using the gif(4) interface.

It is required to wait other CPU's softint completion before disestablishing
the softint handler.


Revision tags: nick-nhusb-base-20150921
# 1.62 24-Aug-2015 pooka

sprinkle _KERNEL_OPT


Revision tags: nick-nhusb-base-20150606
# 1.61 24-Apr-2015 ozaki-r

Add missing rtcache_free

It's the same as other similar code paths in in_gif and ip6_etherip.


Revision tags: netbsd-7-2-RELEASE netbsd-7-1-2-RELEASE netbsd-7-1-1-RELEASE netbsd-7-1-RELEASE netbsd-7-1-RC2 netbsd-7-nhusb-base-20170116 netbsd-7-1-RC1 netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 nick-nhusb-base-20150406 nick-nhusb-base netbsd-7-base tls-earlyentropy-base rmind-smpnet-nbase rmind-smpnet-base tls-maxphys-base
# 1.60 18-May-2014 rmind

branches: 1.60.4;
Add struct pr_usrreqs with a pr_generic function and prepare for the
dismantling of pr_usrreq in the protocols; no functional change intended.
PRU_ATTACH/PRU_DETACH changes will follow soon.

Bump for struct protosw. Welcome to 6.99.62!


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base agc-symver-base
# 1.59 01-Mar-2013 joerg

branches: 1.59.6; 1.59.10;
Retire OSI network stack. OK core@


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE yamt-pagecache-tag8 netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 netbsd-6-1-RC2 netbsd-6-1-RC1 yamt-pagecache-base8 netbsd-6-0-1-RELEASE yamt-pagecache-base7 matt-nb6-plus-nbase yamt-pagecache-base6 netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base rmind-uvmplock-nbase cherry-xenmp-base bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 rmind-uvmplock-base yamt-nfs-mp-base9 uebayasi-xip-base matt-premerge-20091211 yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-nbase yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.58 14-Mar-2009 dsl

branches: 1.58.12; 1.58.22;
Remove all the __P() from sys (excluding sys/dist)
Diff checked with grep and MK1 eyeball.
i386 and amd64 GENERIC and sys still build.


Revision tags: nick-hppapmap-base2 haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.57 07-Nov-2008 dyoung

branches: 1.57.4;
*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-2-3-RELEASE netbsd-5-1-5-RELEASE netbsd-5-2-2-RELEASE netbsd-5-1-4-RELEASE netbsd-5-2-1-RELEASE netbsd-5-1-3-RELEASE netbsd-5-2-RELEASE netbsd-5-2-RC1 netbsd-5-1-2-RELEASE netbsd-5-1-1-RELEASE matt-nb5-mips64-premerge-20101231 matt-nb5-pq3-base netbsd-5-1-RELEASE netbsd-5-1-RC4 matt-nb5-mips64-k15 netbsd-5-1-RC3 netbsd-5-1-RC2 netbsd-5-1-RC1 netbsd-5-0-2-RELEASE matt-nb5-mips64-premerge-20091211 matt-nb5-mips64-u2-k2-k4-k7-k8-k9 matt-nb4-mips64-k7-u2a-k9b matt-nb5-mips64-u1-k1-k5 netbsd-5-0-1-RELEASE netbsd-5-0-RELEASE netbsd-5-0-RC4 netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2 wrstuden-revivesa-base yamt-nfs-mp-base
# 1.56 24-Apr-2008 ad

branches: 1.56.2; 1.56.8; 1.56.10;
Merge the socket locking patch:

- Socket layer becomes MP safe.
- Unix protocols become MP safe.
- Allows protocol processing interrupts to safely block on locks.
- Fixes a number of race conditions.

With much feedback from matt@ and plunky@.


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.55 15-Apr-2008 thorpej

branches: 1.55.2;
Make ip6 and icmp6 stats per-cpu.


# 1.54 08-Apr-2008 thorpej

Change IPv6 stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ip6stat structure; old netstat
binaries will continue to work properly.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base vmlocking2-base3 bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-nbase mjf-devfs-base matt-armv6-base hpcarm-cleanup-base
# 1.53 20-Dec-2007 dyoung

branches: 1.53.6;
Poison struct route->ro_rt uses in the kernel by changing the name
to _ro_rt. Use rtcache_getrt() to access a route cache's struct
rtentry *.

Introduce struct ifnet->if_dl that always points at the interface
identifier/link-layer address. Make code that treated the first
ifaddr on struct ifnet->if_addrlist as the interface address use
if_dl, instead.

Remove stale debugging code from net/route.c. Move the rtflush()
code into rtcache_clear() and delete rtflush(). Delete rtalloc(),
because nothing uses it any more.

Make ND6_HINT an inline, lowercase subroutine, nd6_hint.

I've done my best to convert IP Filter, the ISO stack, and the
AppleTalk stack to rtcache_getrt(). They compile, but I have not
tested them. I have given the changes to PF, GRE, IPv4 and IPv6
stacks a lot of exercise.


Revision tags: nick-csl-alignment-base5 matt-armv6-prevmlocking yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base matt-mips64-base jmcneill-pm-base nick-csl-alignment-base reinoud-bufcleanup-base mjf-ufs-trans-base vmlocking-base
# 1.52 23-May-2007 christos

branches: 1.52.8; 1.52.16; 1.52.20;
Ansify + add a few comments, from Karl Sj��dahl


Revision tags: yamt-idlelwp-base8
# 1.51 02-May-2007 dyoung

Eliminate address family-specific route caches (struct route, struct
route_in6, struct route_iso), replacing all caches with a struct
route.

The principle benefit of this change is that all of the protocol
families can benefit from route cache-invalidation, which is
necessary for correct routing. Route-cache invalidation fixes an
ancient PR, kern/3508, at long last; it fixes various other PRs,
also.

Discussions with and ideas from Joerg Sonnenberger influenced this
work tremendously. Of course, all design oversights and bugs are
mine.

DETAILS

1 I added to each address family a pool of sockaddrs. I have
introduced routines for allocating, copying, and duplicating,
and freeing sockaddrs:

struct sockaddr *sockaddr_alloc(sa_family_t af, int flags);
struct sockaddr *sockaddr_copy(struct sockaddr *dst,
const struct sockaddr *src);
struct sockaddr *sockaddr_dup(const struct sockaddr *src, int flags);
void sockaddr_free(struct sockaddr *sa);

sockaddr_alloc() returns either a sockaddr from the pool belonging
to the specified family, or NULL if the pool is exhausted. The
returned sockaddr has the right size for that family; sa_family
and sa_len fields are initialized to the family and sockaddr
length---e.g., sa_family = AF_INET and sa_len = sizeof(struct
sockaddr_in). sockaddr_free() puts the given sockaddr back into
its family's pool.

sockaddr_dup() and sockaddr_copy() work analogously to strdup()
and strcpy(), respectively. sockaddr_copy() KASSERTs that the
family of the destination and source sockaddrs are alike.

The 'flags' argumet for sockaddr_alloc() and sockaddr_dup() is
passed directly to pool_get(9).

2 I added routines for initializing sockaddrs in each address
family, sockaddr_in_init(), sockaddr_in6_init(), sockaddr_iso_init(),
etc. They are fairly self-explanatory.

3 structs route_in6 and route_iso are no more. All protocol families
use struct route. I have changed the route cache, 'struct route',
so that it does not contain storage space for a sockaddr. Instead,
struct route points to a sockaddr coming from the pool the sockaddr
belongs to. I added a new method to struct route, rtcache_setdst(),
for setting the cache destination:

int rtcache_setdst(struct route *, const struct sockaddr *);

rtcache_setdst() returns 0 on success, or ENOMEM if no memory is
available to create the sockaddr storage.

It is now possible for rtcache_getdst() to return NULL if, say,
rtcache_setdst() failed. I check the return value for NULL
everywhere in the kernel.

4 Each routing domain (struct domain) has a list of live route
caches, dom_rtcache. rtflushall(sa_family_t af) looks up the
domain indicated by 'af', walks the domain's list of route caches
and invalidates each one.


Revision tags: thorpej-atomic-base
# 1.50 04-Mar-2007 christos

branches: 1.50.2; 1.50.4;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.49 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


# 1.48 17-Feb-2007 dyoung

branches: 1.48.2;
Don't open-code LIST_FOREACH().


Revision tags: post-newlock2-merge newlock2-nbase yamt-splraiseipl-base5 yamt-splraiseipl-base4 newlock2-base
# 1.47 15-Dec-2006 joerg

Introduce new helper functions to abstract the route caching.
rtcache_init and rtcache_init_noclone lookup ro_dst and store
the result in ro_rt, taking care of the reference counting and
calling the domain specific route cache.
rtcache_free checks if a route was cashed and frees the reference.
rtcache_copy copies ro_dst of the given struct route, checking that
enough space is available and incrementing the reference count of the
cached rtentry if necessary.
rtcache_check validates that the cached route is still up. If it isn't,
it tries to look it up again. Afterwards ro_rt is either a valid again
or NULL.
rtcache_copy is used internally.

Adjust to callers of rtalloc/rtflush in the tree to check the sanity of
ro_dst first (if necessary). If it doesn't fit the expectations, free
the cache, otherwise check if the cached route is still valid. After
that combination, a single check for ro_rt == NULL is enough to decide
whether a new lookup needs to be done with a different ro_dst.
Make the route checking in gre stricter by repeating the loop check
after revalidation.
Remove some unused RADIX_MPATH code in in6_src.c. The logic is slightly
changed here to first validate the route and check RTF_GATEWAY
afterwards. This is sementically equivalent though.
etherip doesn't need sc_route_expire similiar to the gif changes from
dyoung@ earlier.

Based on the earlier patch from dyoung@, reviewed and discussed with
him.


Revision tags: yamt-splraiseipl-base3
# 1.46 09-Dec-2006 dyoung

Here are various changes designed to protect against bad IPv4
routing caused by stale route caches (struct route). Route caches
are sprinkled throughout PCBs, the IP fast-forwarding table, and
IP tunnel interfaces (gre, gif, stf).

Stale IPv6 and ISO route caches will be treated by separate patches.

Thank you to Christoph Badura for suggesting the general approach
to invalidating route caches that I take here.

Here are the details:

Add hooks to struct domain for tracking and for invalidating each
domain's route caches: dom_rtcache, dom_rtflush, and dom_rtflushall.

Introduce helper subroutines, rtflush(ro) for invalidating a route
cache, rtflushall(family) for invalidating all route caches in a
routing domain, and rtcache(ro) for notifying the domain of a new
cached route.

Chain together all IPv4 route caches where ro_rt != NULL. Provide
in_rtcache() for adding a route to the chain. Provide in_rtflush()
and in_rtflushall() for invalidating IPv4 route caches. In
in_rtflush(), set ro_rt to NULL, and remove the route from the
chain. In in_rtflushall(), walk the chain and remove every route
cache.

In rtrequest1(), call rtflushall() to invalidate route caches when
a route is added.

In gif(4), discard the workaround for stale caches that involves
expiring them every so often.

Replace the pattern 'RTFREE(ro->ro_rt); ro->ro_rt = NULL;' with a
call to rtflush(ro).

Update ipflow_fastforward() and all other users of route caches so
that they expect a cached route, ro->ro_rt, to turn to NULL.

Take care when moving a 'struct route' to rtflush() the source and
to rtcache() the destination.

In domain initializers, use .dom_xxx tags.

KNF here and there.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base abandoned-netbsd-4-base yamt-splraiseipl-base2 yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 netbsd-4-base yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base chap-midi-base rpaulo-netinet-merge-pcb-base
# 1.45 07-Jun-2006 kardel

branches: 1.45.6; 1.45.8;
merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5 simonb-timecounters-base
# 1.44 11-Dec-2005 christos

branches: 1.44.4; 1.44.6; 1.44.8; 1.44.14;
merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.43 26-Jun-2005 mlelstv

branches: 1.43.2;
expire cached route. Fixes PR 22792.


# 1.42 02-Jun-2005 tron

Change the first argument of the encapsulation check function from
"const struct mbuf *" to "struct mbuf *". Without this change the
actual implementation cannot even use m_copydata() on the mbuf chain
which is broken.


# 1.41 02-Jun-2005 tron

Remove type casts and lint directives which are now longer necessary
because the first argument of m_copydata() is "const struct mbuf *" now.


# 1.40 29-May-2005 christos

- avoid shadowed variables
- sprinkle const.


Revision tags: netbsd-3-0-3-RELEASE netbsd-3-0-2-RELEASE netbsd-3-0-1-RELEASE netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.39 26-Feb-2005 perry

branches: 1.39.2;
nuke trailing whitespace


Revision tags: yamt-km-base2 yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.38 22-Apr-2004 matt

branches: 1.38.4; 1.38.6;
Constify protosw arrays. This can reduce the kernel .data section by
over 4K (if all the network protocols) are loaded.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.37 30-Oct-2003 simonb

branches: 1.37.4;
Remove some assigned-to but otherwise unused variables.


# 1.36 05-Sep-2003 itojun

u_short -> u_int16_t. sync w/ kame.
don't set ip6_plen where unneeded (i.e. before calling ip6_output)


# 1.35 22-Aug-2003 itojun

change the additional arg to be passed to ip{,6}_output to struct socket *.

this fixes KAME policy lookup which was broken by the previous commit.


# 1.34 22-Aug-2003 jonathan

Replace the set_socket() method of passing an extra struct socket*
argument to ip6_output() with a new explicit struct in6pcb* argument.
(The underlying socket can be obtained via in6pcb->inp6_socket.)

In preparation for fast-ipsec. Reviewed by itojun.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.33 25-Nov-2002 thorpej

branches: 1.33.6;
Avoid strict-alias warnings.


# 1.32 11-Nov-2002 itojun

make USE_ENCAPCHECK (in netinet*/*gif.c) to global option, GIF_ENCAPCHECK.
#ifdef out unneeded code when possible.
From: Krister Walfridsson <cato@df.lth.se>


# 1.31 05-Nov-2002 itojun

improve gif lookup performance, when there are many of those,
by using radix tree for lookups. tested by yshimizu@iij.


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.30 11-Sep-2002 itojun

KNF - return is not a function. sync w/kame.


Revision tags: gehenna-devsw-base
# 1.29 09-Jun-2002 itojun

whitespace cleanup


# 1.28 08-Jun-2002 itojun

whitespace cleanup


Revision tags: netbsd-1-6-PATCH002-RELEASE netbsd-1-6-PATCH002 netbsd-1-6-PATCH002-RC4 netbsd-1-6-PATCH002-RC3 netbsd-1-6-PATCH002-RC2 netbsd-1-6-PATCH002-RC1 netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base ifpoll-base
# 1.27 21-Dec-2001 itojun

branches: 1.27.8;
use radix table for inbound tunnel lookup (would increase performance
for machines with a lot of tunnels).
update route cache for IPvX-over-IPv6 tunnel on path MTU discovery.
snyc with kame


# 1.26 21-Dec-2001 itojun

move in6_gif_hlim decl to in6_gif.c. sync with kame


# 1.25 21-Dec-2001 itojun

move protosw fragment for gif/stf to their own source code.
reduce #ifdef in stf code. sync with kame


# 1.24 20-Dec-2001 itojun

centralize multicast group management (in6_join/leavegroup).
have a flag for ip6_output() to fragment to minimum MTU.
sync with kame


# 1.23 13-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.22 16-Aug-2001 itojun

gif interface now uses generic software interrupt
(on archs that support it). also, make gif ALTQ-capable on outgoing.
sync with kame, comments from thorpej.


# 1.21 29-Jul-2001 itojun

sync gif interface code with latest kame.
IFF_RUNNING is clearified. attach/detach logic is more clearner.
the old code mistakenly set IFF_UP by itself, now the behavior is gone.


# 1.20 14-May-2001 itojun

branches: 1.20.2;
drop multi destination mode (IFF_LINK0).


# 1.19 10-May-2001 itojun

correct ecn consideration on tunnel encap/decap. sync with kame.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.18 20-Feb-2001 itojun

branches: 1.18.2;
add AF_ISO case to output. from chopps.


# 1.17 20-Feb-2001 itojun

ISO over IPv4/v6 by EON encapsulation. from chopps, sync with kame.


# 1.16 11-Feb-2001 itojun

remove #ifdef __FreeBSD__.


# 1.15 22-Jan-2001 itojun

make it possible to turn off ingress filter on gif/stf tunnel egress,
by using IFF_LINK2. (part of) PR 11163 from Ken Raeburn.


Revision tags: netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base minoura-xpg4dl-base
# 1.14 19-Apr-2000 itojun

branches: 1.14.4;
introduce sys/netinet/ip_encap.c, to dispatch inbound packets
to protocol handlers, based on src/dst (for ip proto #4/41).
see comment in ip_encap.c for details of the problem we have.
there are too many protocol specs for ip proto #4/41.
backward compatibility with MROUTING case is now provided in ip_encap.c.

fix ipip to work with gif (using ip_encap.c). sorry for breakage.

gif now uses ip_encap.c.

introduce stf pseudo interface (implements 6to4, another IPv6-over-IPv4 code
with ip proto #41).


# 1.13 01-Mar-2000 itojun

introduce m->m_pkthdr.aux to hold random data which needs to be passed
between protocol handlers.

ipsec socket pointers, ipsec decryption/auth information, tunnel
decapsulation information are in my mind - there can be several other usage.
at this moment, we use this for ipsec socket pointer passing. this will
avoid reuse of m->m_pkthdr.rcvif in ipsec code.

due to the change, MHLEN will be decreased by sizeof(void *) - for example,
for i386, MHLEN was 100 bytes, but is now 96 bytes.
we may want to increase MSIZE from 128 to 256 for some of our architectures.

take caution if you use it for keeping some data item for long period
of time - use extra caution on M_PREPEND() or m_adj(), as they may result
in loss of m->m_pkthdr.aux pointer (and mbuf leak).

this will bump kernel version.

(as discussed in tech-net, tested in kame tree)


Revision tags: chs-ubc2-newbase
# 1.12 07-Feb-2000 itojun

s/DIAGNOSTIC/DEBUG/


# 1.11 06-Feb-2000 itojun

fix include pathname for better rfc2292 compliance.


# 1.10 06-Jan-2000 itojun

remove extra portability #ifdef (like #ifdef __FreeBSD__) in KAME IPv6/IPsec
code, from netbsd-current repository.
#ifdef'ed version is always available from ftp.kame.net.

XXX please do not make too many diff-unfriendly changes, we'll need to take
bunch of diffs on upgrade...


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.9 15-Dec-1999 itojun

do not overwrite traffic class field when we write IPv6 version field.


# 1.8 13-Dec-1999 itojun

sync IPv6 part with latest KAME tree. IPsec part is left unmodified
due to massive changes in KAME side.
- IPv6 output goes through nd6_output
- faith can capture IPv4 packets as well - you can run IPv4-to-IPv6 translator
using heavily modified DNS servers
- per-interface statistics (required for IPv6 MIB)
- interface autoconfig is revisited
- udp input handling has a big change for mapped address support.
- introduce in4_cksum() for non-overwriting checksumming
- introduce m_pulldown()
- neighbor discovery cleanups/improvements
- netinet/in.h strictly conforms to RFC2553 (no extra defs visible to userland)
- IFA_STATS is fixed a bit (not tested)
- and more more more.

TODO:
- cleanup os-independency #ifdef
- avoid rcvif dual use (for IPsec) to help ifdetach

(sorry for jumbo commit, I can't separate this any more...)


Revision tags: comdex-fall-1999-base fvdl-softdep-base
# 1.7 20-Aug-1999 itojun

branches: 1.7.2; 1.7.8;
do not capture packets by gif, when gif interface is down.


Revision tags: chs-ubc2-base
# 1.6 31-Jul-1999 itojun

sync with recent KAME.
- loosen ipsec restriction on packet diredtion.
- revise icmp6 redirect handling on IsRouter bit.
- tcp/udp notification processing (link-local address case)
- cosmetic fixes (better code share across *BSD).


# 1.5 30-Jul-1999 itojun

remove reference to in6_systm.h (file itself will be removed afterwords)


# 1.4 09-Jul-1999 thorpej

defopt IPSEC and IPSEC_ESP (both into opt_ipsec.h).


# 1.3 03-Jul-1999 thorpej

RCS ID police.


# 1.2 01-Jul-1999 itojun

branches: 1.2.2;
IPv6 kernel code, based on KAME/NetBSD 1.4, SNAP kit 19990628.
(Sorry for a big commit, I can't separate this into several pieces...)
Pls check sys/netinet6/TODO and sys/netinet6/IMPLEMENTATION for details.

- sys/kern: do not assume single mbuf, accept chained mbuf on passing
data from userland to kernel (or other way round).
- "midway" ATM card: ATM PVC pseudo device support, like those done in ALTQ
package (ftp://ftp.csl.sony.co.jp/pub/kjc/).
- sys/netinet/tcp*: IPv4/v6 dual stack tcp support.
- sys/netinet/{ip6,icmp6}.h, sys/net/pfkeyv2.h: IETF document assumes those
file to be there so we patch it up.
- sys/netinet: IPsec additions are here and there.
- sys/netinet6/*: most of IPv6 code sits here.
- sys/netkey: IPsec key management code
- dev/pci/pcidevs: regen

In my understanding no code here is subject to export control so it
should be safe.


# 1.1 28-Jun-1999 itojun

branches: 1.1.2;
file in6_gif.c was initially added on branch kame.


# 1.95 30-Oct-2019 knakahara

Add sysctl nodes to control fragmentation with IPv[46] over IPv6 gif(4).

New sysctl node "net.inet6.ip6.gifpmtu" means
- 0 (default)
Fragment by IPV6_MMTU. All packets reach the destination certainly,
however the long packet performance is poor.
This is same behavior as before.
- 1
Fragment by outer interface's MTU. The long packet performance would
be good, however the packets may be dropped in some network paths
whose path MTU less than the interface's MTU.
- others
undefined yet

New sysctl node "net.interfaces.gif*.pmtu" means
- -1 (default)
Use system default value (net.inet6.ip6.gifpmtu).
- 0
Fragment by IPV6_MMTU for this gif(4) tunnel.
- 1
Fragment by outer interface's MTU for this gif(4) tunnel.
- others
undefined yet

See RFC4459 for more information and other solutions.


# 1.94 19-Sep-2019 knakahara

Avoid having a rtcache directly in a percpu storage for tunnel protocols.

percpu(9) has a certain memory storage for each CPU and provides it by the piece
to users. If the storages went short, percpu(9) enlarges them by allocating new
larger memory areas, replacing old ones with them and destroying the old ones.
A percpu storage referenced by a pointer gotten via percpu_getref can be
destroyed by the mechanism after a running thread sleeps even if percpu_putref
has not been called.

Using rtcache, i.e., packet processing, typically involves sleepable operations
such as rwlock so we must avoid dereferencing a rtcache that is directly stored
in a percpu storage during packet processing. Address this situation by having
just a pointer to a rtcache in a percpu storage instead.

Reviewed by ozaki-r@ and yamaguchi@


Revision tags: netbsd-9-base phil-wifi-20190609 isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502
# 1.93 01-May-2018 maxv

branches: 1.93.6;
Remove now unused net_osdep.h includes, the other BSDs did the same.


# 1.92 27-Apr-2018 knakahara

Fix LOCKDEBUG kernel panic when many(about 200) tunnel interfaces is created.

The tunnel interfaces are gif(4), l2tp(4), and ipsecif(4). They use mutex
itself in percpu area. When percpu_cpu_enlarge() run, the address of the
mutex in percpu area becomes different from the address which lockdebug
saved. That can cause "already initialized" false detection.


Revision tags: pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315
# 1.91 14-Mar-2018 knakahara

Fix error checking in in6_gif_ctlinput().

if_gif.c:r1.133 introduces gif_update_variant() which ensure ifp->if_flags
is set IFF_RUNNING when gif_softc->gif_var->gv_{psrc,pdst} are not null.
So, in6_gif_ctlinput() is not required IFF_RUNNING checking. In contrast,
it is required gv_{psrc,pdst} NULL checking.


Revision tags: pgoyette-compat-base
# 1.90 10-Jan-2018 knakahara

branches: 1.90.2;
apply in{,6}_tunnel_validate() to gif(4).


Revision tags: tls-maxphys-base-20171202
# 1.89 27-Nov-2017 knakahara

IFF_RUNNING checking in Rx and Tx processing is unnecessary now.

Because the configs of gif (members of gif_var) are protected by psref(9).


# 1.88 27-Nov-2017 knakahara

preserve gif(4) configs by psref(9) like vlan(4) and l2tp(4).

After Tx side does not use softint, gif(4) can use psref(9) for config
preservation like vlan(4) and l2tp(4).

update locking notes later.


# 1.87 15-Nov-2017 knakahara

Add argument to encapsw->pr_input() instead of m_tag.


# 1.86 21-Sep-2017 knakahara

add lock for percpu route like l2tp(4).


Revision tags: nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.85 16-Jan-2017 christos

branches: 1.85.6;
ip6_sprintf -> IN6_PRINT so that we pass the size.


# 1.84 16-Jan-2017 ryo

Make ip6_sprintf(), in_fmtaddr(), lla_snprintf() and icmp6_redirect_diag() mpsafe.

Reviewed by ozaki-r@


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.83 06-Jan-2017 knakahara

branches: 1.83.2;
remove unnecessary conversion.

gif_softc->gif_pdst is already valid sockaddr.


# 1.82 14-Dec-2016 knakahara

fix race of gif_softc->gif_ro when we send multiple flows over gif on NET_MPSAFE enabled kernel.

make gif_softc->gif_ro percpu as well as ipforward_rt to resolve this race.
and add future TODO comment for etherip(4).


# 1.81 12-Dec-2016 ozaki-r

Make the routing table and rtcaches MP-safe

See the following descriptions for details.

Proposed on tech-kern and tech-net


Overview


# 1.80 08-Dec-2016 ozaki-r

Add rtcache_unref to release points of rtentry stemming from rtcache

In the MP-safe world, a rtentry stemming from a rtcache can be freed at any
points. So we need to protect rtentries somehow say by reference couting or
passive references. Regardless of the method, we need to call some release
function of a rtentry after using it.

The change adds a new function rtcache_unref to release a rtentry. At this
point, this function does nothing because for now we don't add a reference
to a rtentry when we get one from a rtcache. We will add something useful
in a further commit.

This change is a part of changes for MP-safe routing table. It is separated
to avoid one big change that makes difficult to debug by bisecting.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726
# 1.79 15-Jul-2016 ozaki-r

Use sin6tosa and sin6tocsa macros

No functional change.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907
# 1.78 06-Jul-2016 ozaki-r

branches: 1.78.2;
Apply m_get_rcvif_psref (kill m_get_rcvif_NOMPSAFE)


# 1.77 04-Jul-2016 knakahara

fix: gif(4) receive side race

A panic cause in rn_match() called by encap[46]_lookup(). The reason is that
gif(4) does not suspend receive packet processing in spite of suspending
transmit packet processing while anyone is doing gif(4) ioctl.


# 1.76 04-Jul-2016 knakahara

let gif(4) promise softint(9) contract (1/2) : gif(4) side

To prevent calling softint_schedule() after called softint_disestablish(),
the following modifications are added
+ ioctl (writing configuration) side
- off IFF_RUNNING flag before changing configuration
- wait softint handler completion before changing configuration
+ packet processing (reading configuraiotn) side
- if IFF_RUNNING flag is on, do nothing
+ in whole
- add gif_list_lock_{enter,exit} to prevent the same configuration is
set to other gif(4) interfaces


# 1.75 28-Jun-2016 ozaki-r

Add missing NULL checks for m_get_rcvif_psref


# 1.74 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


Revision tags: nick-nhusb-base-20160529 nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.73 29-Feb-2016 knakahara

remove unnecessary declarations and fix KNF

Thanks to riastradh@


# 1.72 26-Feb-2016 knakahara

To eliminate gif_softc_list linear search, add extra argument to encapsw.pr_ctlinput().


# 1.71 26-Jan-2016 knakahara

implement encapsw instead of protosw and uniform prototype.

suggested and advised by riastradh@n.o, thanks.

BTW, It seems in_stf_input() had bugs...


# 1.70 23-Jan-2016 riastradh

Those were local changes not meant to be part of the revert. SORRY!


# 1.69 23-Jan-2016 christos

make this compile again


# 1.68 22-Jan-2016 riastradh

Back out previous change to introduce struct encapsw.

This change was intended, but Nakahara-san had already made a better
one locally! So I'll let him commit that one, and I'll try not to
step on anyone's toes again.


# 1.67 22-Jan-2016 riastradh

Don't abuse struct protosw for ip_encap -- introduce struct encapsw.

Mostly mechanical change to replace it, culling some now-needless
boilerplate around all the users.

This does not substantively change the ip_encap API or eliminate
abuse of sketchy pointer casts -- that will come later, and will be
easier now that it is not tangled up with struct protosw.


# 1.66 20-Jan-2016 riastradh

Eliminate struct protosw::pr_output.

You can't use this unless you know what it is a priori: the formal
prototype is variadic, and the different instances (e.g., ip_output,
route_output) have different real prototypes.

Convert the only user of it, raw_send in net/raw_cb.c, to take an
explicit callback argument. Convert the only instances of it,
route_output and key_output, to such explicit callbacks for raw_send.
Use assertions to make sure the conversion to explicit callbacks is
warranted.

Discussed on tech-net with no objections:
https://mail-index.netbsd.org/tech-net/2016/01/16/msg005484.html


# 1.65 18-Jan-2016 knakahara

Refactor protosw codes in gif(4). No functional change.

- remove unnecessary include
- reduce scopes


Revision tags: nick-nhusb-base-20151226
# 1.64 25-Dec-2015 knakahara

use satosin{,6} macros instead of casts.


# 1.63 11-Dec-2015 knakahara

PR kern/50522: gif(4) ioctl causes panic while someone is using the gif(4) interface.

It is required to wait other CPU's softint completion before disestablishing
the softint handler.


Revision tags: nick-nhusb-base-20150921
# 1.62 24-Aug-2015 pooka

sprinkle _KERNEL_OPT


Revision tags: nick-nhusb-base-20150606
# 1.61 24-Apr-2015 ozaki-r

Add missing rtcache_free

It's the same as other similar code paths in in_gif and ip6_etherip.


Revision tags: netbsd-7-2-RELEASE netbsd-7-1-2-RELEASE netbsd-7-1-1-RELEASE netbsd-7-1-RELEASE netbsd-7-1-RC2 netbsd-7-nhusb-base-20170116 netbsd-7-1-RC1 netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 nick-nhusb-base-20150406 nick-nhusb-base netbsd-7-base tls-earlyentropy-base rmind-smpnet-nbase rmind-smpnet-base tls-maxphys-base
# 1.60 18-May-2014 rmind

branches: 1.60.4;
Add struct pr_usrreqs with a pr_generic function and prepare for the
dismantling of pr_usrreq in the protocols; no functional change intended.
PRU_ATTACH/PRU_DETACH changes will follow soon.

Bump for struct protosw. Welcome to 6.99.62!


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base agc-symver-base
# 1.59 01-Mar-2013 joerg

branches: 1.59.6; 1.59.10;
Retire OSI network stack. OK core@


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE yamt-pagecache-tag8 netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 netbsd-6-1-RC2 netbsd-6-1-RC1 yamt-pagecache-base8 netbsd-6-0-1-RELEASE yamt-pagecache-base7 matt-nb6-plus-nbase yamt-pagecache-base6 netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base rmind-uvmplock-nbase cherry-xenmp-base bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 rmind-uvmplock-base yamt-nfs-mp-base9 uebayasi-xip-base matt-premerge-20091211 yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-nbase yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.58 14-Mar-2009 dsl

branches: 1.58.12; 1.58.22;
Remove all the __P() from sys (excluding sys/dist)
Diff checked with grep and MK1 eyeball.
i386 and amd64 GENERIC and sys still build.


Revision tags: nick-hppapmap-base2 haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.57 07-Nov-2008 dyoung

branches: 1.57.4;
*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-2-3-RELEASE netbsd-5-1-5-RELEASE netbsd-5-2-2-RELEASE netbsd-5-1-4-RELEASE netbsd-5-2-1-RELEASE netbsd-5-1-3-RELEASE netbsd-5-2-RELEASE netbsd-5-2-RC1 netbsd-5-1-2-RELEASE netbsd-5-1-1-RELEASE matt-nb5-mips64-premerge-20101231 matt-nb5-pq3-base netbsd-5-1-RELEASE netbsd-5-1-RC4 matt-nb5-mips64-k15 netbsd-5-1-RC3 netbsd-5-1-RC2 netbsd-5-1-RC1 netbsd-5-0-2-RELEASE matt-nb5-mips64-premerge-20091211 matt-nb5-mips64-u2-k2-k4-k7-k8-k9 matt-nb4-mips64-k7-u2a-k9b matt-nb5-mips64-u1-k1-k5 netbsd-5-0-1-RELEASE netbsd-5-0-RELEASE netbsd-5-0-RC4 netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2 wrstuden-revivesa-base yamt-nfs-mp-base
# 1.56 24-Apr-2008 ad

branches: 1.56.2; 1.56.8; 1.56.10;
Merge the socket locking patch:

- Socket layer becomes MP safe.
- Unix protocols become MP safe.
- Allows protocol processing interrupts to safely block on locks.
- Fixes a number of race conditions.

With much feedback from matt@ and plunky@.


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.55 15-Apr-2008 thorpej

branches: 1.55.2;
Make ip6 and icmp6 stats per-cpu.


# 1.54 08-Apr-2008 thorpej

Change IPv6 stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ip6stat structure; old netstat
binaries will continue to work properly.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base vmlocking2-base3 bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-nbase mjf-devfs-base matt-armv6-base hpcarm-cleanup-base
# 1.53 20-Dec-2007 dyoung

branches: 1.53.6;
Poison struct route->ro_rt uses in the kernel by changing the name
to _ro_rt. Use rtcache_getrt() to access a route cache's struct
rtentry *.

Introduce struct ifnet->if_dl that always points at the interface
identifier/link-layer address. Make code that treated the first
ifaddr on struct ifnet->if_addrlist as the interface address use
if_dl, instead.

Remove stale debugging code from net/route.c. Move the rtflush()
code into rtcache_clear() and delete rtflush(). Delete rtalloc(),
because nothing uses it any more.

Make ND6_HINT an inline, lowercase subroutine, nd6_hint.

I've done my best to convert IP Filter, the ISO stack, and the
AppleTalk stack to rtcache_getrt(). They compile, but I have not
tested them. I have given the changes to PF, GRE, IPv4 and IPv6
stacks a lot of exercise.


Revision tags: nick-csl-alignment-base5 matt-armv6-prevmlocking yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base matt-mips64-base jmcneill-pm-base nick-csl-alignment-base reinoud-bufcleanup-base mjf-ufs-trans-base vmlocking-base
# 1.52 23-May-2007 christos

branches: 1.52.8; 1.52.16; 1.52.20;
Ansify + add a few comments, from Karl Sj��dahl


Revision tags: yamt-idlelwp-base8
# 1.51 02-May-2007 dyoung

Eliminate address family-specific route caches (struct route, struct
route_in6, struct route_iso), replacing all caches with a struct
route.

The principle benefit of this change is that all of the protocol
families can benefit from route cache-invalidation, which is
necessary for correct routing. Route-cache invalidation fixes an
ancient PR, kern/3508, at long last; it fixes various other PRs,
also.

Discussions with and ideas from Joerg Sonnenberger influenced this
work tremendously. Of course, all design oversights and bugs are
mine.

DETAILS

1 I added to each address family a pool of sockaddrs. I have
introduced routines for allocating, copying, and duplicating,
and freeing sockaddrs:

struct sockaddr *sockaddr_alloc(sa_family_t af, int flags);
struct sockaddr *sockaddr_copy(struct sockaddr *dst,
const struct sockaddr *src);
struct sockaddr *sockaddr_dup(const struct sockaddr *src, int flags);
void sockaddr_free(struct sockaddr *sa);

sockaddr_alloc() returns either a sockaddr from the pool belonging
to the specified family, or NULL if the pool is exhausted. The
returned sockaddr has the right size for that family; sa_family
and sa_len fields are initialized to the family and sockaddr
length---e.g., sa_family = AF_INET and sa_len = sizeof(struct
sockaddr_in). sockaddr_free() puts the given sockaddr back into
its family's pool.

sockaddr_dup() and sockaddr_copy() work analogously to strdup()
and strcpy(), respectively. sockaddr_copy() KASSERTs that the
family of the destination and source sockaddrs are alike.

The 'flags' argumet for sockaddr_alloc() and sockaddr_dup() is
passed directly to pool_get(9).

2 I added routines for initializing sockaddrs in each address
family, sockaddr_in_init(), sockaddr_in6_init(), sockaddr_iso_init(),
etc. They are fairly self-explanatory.

3 structs route_in6 and route_iso are no more. All protocol families
use struct route. I have changed the route cache, 'struct route',
so that it does not contain storage space for a sockaddr. Instead,
struct route points to a sockaddr coming from the pool the sockaddr
belongs to. I added a new method to struct route, rtcache_setdst(),
for setting the cache destination:

int rtcache_setdst(struct route *, const struct sockaddr *);

rtcache_setdst() returns 0 on success, or ENOMEM if no memory is
available to create the sockaddr storage.

It is now possible for rtcache_getdst() to return NULL if, say,
rtcache_setdst() failed. I check the return value for NULL
everywhere in the kernel.

4 Each routing domain (struct domain) has a list of live route
caches, dom_rtcache. rtflushall(sa_family_t af) looks up the
domain indicated by 'af', walks the domain's list of route caches
and invalidates each one.


Revision tags: thorpej-atomic-base
# 1.50 04-Mar-2007 christos

branches: 1.50.2; 1.50.4;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.49 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


# 1.48 17-Feb-2007 dyoung

branches: 1.48.2;
Don't open-code LIST_FOREACH().


Revision tags: post-newlock2-merge newlock2-nbase yamt-splraiseipl-base5 yamt-splraiseipl-base4 newlock2-base
# 1.47 15-Dec-2006 joerg

Introduce new helper functions to abstract the route caching.
rtcache_init and rtcache_init_noclone lookup ro_dst and store
the result in ro_rt, taking care of the reference counting and
calling the domain specific route cache.
rtcache_free checks if a route was cashed and frees the reference.
rtcache_copy copies ro_dst of the given struct route, checking that
enough space is available and incrementing the reference count of the
cached rtentry if necessary.
rtcache_check validates that the cached route is still up. If it isn't,
it tries to look it up again. Afterwards ro_rt is either a valid again
or NULL.
rtcache_copy is used internally.

Adjust to callers of rtalloc/rtflush in the tree to check the sanity of
ro_dst first (if necessary). If it doesn't fit the expectations, free
the cache, otherwise check if the cached route is still valid. After
that combination, a single check for ro_rt == NULL is enough to decide
whether a new lookup needs to be done with a different ro_dst.
Make the route checking in gre stricter by repeating the loop check
after revalidation.
Remove some unused RADIX_MPATH code in in6_src.c. The logic is slightly
changed here to first validate the route and check RTF_GATEWAY
afterwards. This is sementically equivalent though.
etherip doesn't need sc_route_expire similiar to the gif changes from
dyoung@ earlier.

Based on the earlier patch from dyoung@, reviewed and discussed with
him.


Revision tags: yamt-splraiseipl-base3
# 1.46 09-Dec-2006 dyoung

Here are various changes designed to protect against bad IPv4
routing caused by stale route caches (struct route). Route caches
are sprinkled throughout PCBs, the IP fast-forwarding table, and
IP tunnel interfaces (gre, gif, stf).

Stale IPv6 and ISO route caches will be treated by separate patches.

Thank you to Christoph Badura for suggesting the general approach
to invalidating route caches that I take here.

Here are the details:

Add hooks to struct domain for tracking and for invalidating each
domain's route caches: dom_rtcache, dom_rtflush, and dom_rtflushall.

Introduce helper subroutines, rtflush(ro) for invalidating a route
cache, rtflushall(family) for invalidating all route caches in a
routing domain, and rtcache(ro) for notifying the domain of a new
cached route.

Chain together all IPv4 route caches where ro_rt != NULL. Provide
in_rtcache() for adding a route to the chain. Provide in_rtflush()
and in_rtflushall() for invalidating IPv4 route caches. In
in_rtflush(), set ro_rt to NULL, and remove the route from the
chain. In in_rtflushall(), walk the chain and remove every route
cache.

In rtrequest1(), call rtflushall() to invalidate route caches when
a route is added.

In gif(4), discard the workaround for stale caches that involves
expiring them every so often.

Replace the pattern 'RTFREE(ro->ro_rt); ro->ro_rt = NULL;' with a
call to rtflush(ro).

Update ipflow_fastforward() and all other users of route caches so
that they expect a cached route, ro->ro_rt, to turn to NULL.

Take care when moving a 'struct route' to rtflush() the source and
to rtcache() the destination.

In domain initializers, use .dom_xxx tags.

KNF here and there.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base abandoned-netbsd-4-base yamt-splraiseipl-base2 yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 netbsd-4-base yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base chap-midi-base rpaulo-netinet-merge-pcb-base
# 1.45 07-Jun-2006 kardel

branches: 1.45.6; 1.45.8;
merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5 simonb-timecounters-base
# 1.44 11-Dec-2005 christos

branches: 1.44.4; 1.44.6; 1.44.8; 1.44.14;
merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.43 26-Jun-2005 mlelstv

branches: 1.43.2;
expire cached route. Fixes PR 22792.


# 1.42 02-Jun-2005 tron

Change the first argument of the encapsulation check function from
"const struct mbuf *" to "struct mbuf *". Without this change the
actual implementation cannot even use m_copydata() on the mbuf chain
which is broken.


# 1.41 02-Jun-2005 tron

Remove type casts and lint directives which are now longer necessary
because the first argument of m_copydata() is "const struct mbuf *" now.


# 1.40 29-May-2005 christos

- avoid shadowed variables
- sprinkle const.


Revision tags: netbsd-3-0-3-RELEASE netbsd-3-0-2-RELEASE netbsd-3-0-1-RELEASE netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.39 26-Feb-2005 perry

branches: 1.39.2;
nuke trailing whitespace


Revision tags: yamt-km-base2 yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.38 22-Apr-2004 matt

branches: 1.38.4; 1.38.6;
Constify protosw arrays. This can reduce the kernel .data section by
over 4K (if all the network protocols) are loaded.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.37 30-Oct-2003 simonb

branches: 1.37.4;
Remove some assigned-to but otherwise unused variables.


# 1.36 05-Sep-2003 itojun

u_short -> u_int16_t. sync w/ kame.
don't set ip6_plen where unneeded (i.e. before calling ip6_output)


# 1.35 22-Aug-2003 itojun

change the additional arg to be passed to ip{,6}_output to struct socket *.

this fixes KAME policy lookup which was broken by the previous commit.


# 1.34 22-Aug-2003 jonathan

Replace the set_socket() method of passing an extra struct socket*
argument to ip6_output() with a new explicit struct in6pcb* argument.
(The underlying socket can be obtained via in6pcb->inp6_socket.)

In preparation for fast-ipsec. Reviewed by itojun.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.33 25-Nov-2002 thorpej

branches: 1.33.6;
Avoid strict-alias warnings.


# 1.32 11-Nov-2002 itojun

make USE_ENCAPCHECK (in netinet*/*gif.c) to global option, GIF_ENCAPCHECK.
#ifdef out unneeded code when possible.
From: Krister Walfridsson <cato@df.lth.se>


# 1.31 05-Nov-2002 itojun

improve gif lookup performance, when there are many of those,
by using radix tree for lookups. tested by yshimizu@iij.


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.30 11-Sep-2002 itojun

KNF - return is not a function. sync w/kame.


Revision tags: gehenna-devsw-base
# 1.29 09-Jun-2002 itojun

whitespace cleanup


# 1.28 08-Jun-2002 itojun

whitespace cleanup


Revision tags: netbsd-1-6-PATCH002-RELEASE netbsd-1-6-PATCH002 netbsd-1-6-PATCH002-RC4 netbsd-1-6-PATCH002-RC3 netbsd-1-6-PATCH002-RC2 netbsd-1-6-PATCH002-RC1 netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base ifpoll-base
# 1.27 21-Dec-2001 itojun

branches: 1.27.8;
use radix table for inbound tunnel lookup (would increase performance
for machines with a lot of tunnels).
update route cache for IPvX-over-IPv6 tunnel on path MTU discovery.
snyc with kame


# 1.26 21-Dec-2001 itojun

move in6_gif_hlim decl to in6_gif.c. sync with kame


# 1.25 21-Dec-2001 itojun

move protosw fragment for gif/stf to their own source code.
reduce #ifdef in stf code. sync with kame


# 1.24 20-Dec-2001 itojun

centralize multicast group management (in6_join/leavegroup).
have a flag for ip6_output() to fragment to minimum MTU.
sync with kame


# 1.23 13-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.22 16-Aug-2001 itojun

gif interface now uses generic software interrupt
(on archs that support it). also, make gif ALTQ-capable on outgoing.
sync with kame, comments from thorpej.


# 1.21 29-Jul-2001 itojun

sync gif interface code with latest kame.
IFF_RUNNING is clearified. attach/detach logic is more clearner.
the old code mistakenly set IFF_UP by itself, now the behavior is gone.


# 1.20 14-May-2001 itojun

branches: 1.20.2;
drop multi destination mode (IFF_LINK0).


# 1.19 10-May-2001 itojun

correct ecn consideration on tunnel encap/decap. sync with kame.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.18 20-Feb-2001 itojun

branches: 1.18.2;
add AF_ISO case to output. from chopps.


# 1.17 20-Feb-2001 itojun

ISO over IPv4/v6 by EON encapsulation. from chopps, sync with kame.


# 1.16 11-Feb-2001 itojun

remove #ifdef __FreeBSD__.


# 1.15 22-Jan-2001 itojun

make it possible to turn off ingress filter on gif/stf tunnel egress,
by using IFF_LINK2. (part of) PR 11163 from Ken Raeburn.


Revision tags: netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base minoura-xpg4dl-base
# 1.14 19-Apr-2000 itojun

branches: 1.14.4;
introduce sys/netinet/ip_encap.c, to dispatch inbound packets
to protocol handlers, based on src/dst (for ip proto #4/41).
see comment in ip_encap.c for details of the problem we have.
there are too many protocol specs for ip proto #4/41.
backward compatibility with MROUTING case is now provided in ip_encap.c.

fix ipip to work with gif (using ip_encap.c). sorry for breakage.

gif now uses ip_encap.c.

introduce stf pseudo interface (implements 6to4, another IPv6-over-IPv4 code
with ip proto #41).


# 1.13 01-Mar-2000 itojun

introduce m->m_pkthdr.aux to hold random data which needs to be passed
between protocol handlers.

ipsec socket pointers, ipsec decryption/auth information, tunnel
decapsulation information are in my mind - there can be several other usage.
at this moment, we use this for ipsec socket pointer passing. this will
avoid reuse of m->m_pkthdr.rcvif in ipsec code.

due to the change, MHLEN will be decreased by sizeof(void *) - for example,
for i386, MHLEN was 100 bytes, but is now 96 bytes.
we may want to increase MSIZE from 128 to 256 for some of our architectures.

take caution if you use it for keeping some data item for long period
of time - use extra caution on M_PREPEND() or m_adj(), as they may result
in loss of m->m_pkthdr.aux pointer (and mbuf leak).

this will bump kernel version.

(as discussed in tech-net, tested in kame tree)


Revision tags: chs-ubc2-newbase
# 1.12 07-Feb-2000 itojun

s/DIAGNOSTIC/DEBUG/


# 1.11 06-Feb-2000 itojun

fix include pathname for better rfc2292 compliance.


# 1.10 06-Jan-2000 itojun

remove extra portability #ifdef (like #ifdef __FreeBSD__) in KAME IPv6/IPsec
code, from netbsd-current repository.
#ifdef'ed version is always available from ftp.kame.net.

XXX please do not make too many diff-unfriendly changes, we'll need to take
bunch of diffs on upgrade...


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.9 15-Dec-1999 itojun

do not overwrite traffic class field when we write IPv6 version field.


# 1.8 13-Dec-1999 itojun

sync IPv6 part with latest KAME tree. IPsec part is left unmodified
due to massive changes in KAME side.
- IPv6 output goes through nd6_output
- faith can capture IPv4 packets as well - you can run IPv4-to-IPv6 translator
using heavily modified DNS servers
- per-interface statistics (required for IPv6 MIB)
- interface autoconfig is revisited
- udp input handling has a big change for mapped address support.
- introduce in4_cksum() for non-overwriting checksumming
- introduce m_pulldown()
- neighbor discovery cleanups/improvements
- netinet/in.h strictly conforms to RFC2553 (no extra defs visible to userland)
- IFA_STATS is fixed a bit (not tested)
- and more more more.

TODO:
- cleanup os-independency #ifdef
- avoid rcvif dual use (for IPsec) to help ifdetach

(sorry for jumbo commit, I can't separate this any more...)


Revision tags: comdex-fall-1999-base fvdl-softdep-base
# 1.7 20-Aug-1999 itojun

branches: 1.7.2; 1.7.8;
do not capture packets by gif, when gif interface is down.


Revision tags: chs-ubc2-base
# 1.6 31-Jul-1999 itojun

sync with recent KAME.
- loosen ipsec restriction on packet diredtion.
- revise icmp6 redirect handling on IsRouter bit.
- tcp/udp notification processing (link-local address case)
- cosmetic fixes (better code share across *BSD).


# 1.5 30-Jul-1999 itojun

remove reference to in6_systm.h (file itself will be removed afterwords)


# 1.4 09-Jul-1999 thorpej

defopt IPSEC and IPSEC_ESP (both into opt_ipsec.h).


# 1.3 03-Jul-1999 thorpej

RCS ID police.


# 1.2 01-Jul-1999 itojun

branches: 1.2.2;
IPv6 kernel code, based on KAME/NetBSD 1.4, SNAP kit 19990628.
(Sorry for a big commit, I can't separate this into several pieces...)
Pls check sys/netinet6/TODO and sys/netinet6/IMPLEMENTATION for details.

- sys/kern: do not assume single mbuf, accept chained mbuf on passing
data from userland to kernel (or other way round).
- "midway" ATM card: ATM PVC pseudo device support, like those done in ALTQ
package (ftp://ftp.csl.sony.co.jp/pub/kjc/).
- sys/netinet/tcp*: IPv4/v6 dual stack tcp support.
- sys/netinet/{ip6,icmp6}.h, sys/net/pfkeyv2.h: IETF document assumes those
file to be there so we patch it up.
- sys/netinet: IPsec additions are here and there.
- sys/netinet6/*: most of IPv6 code sits here.
- sys/netkey: IPsec key management code
- dev/pci/pcidevs: regen

In my understanding no code here is subject to export control so it
should be safe.


# 1.1 28-Jun-1999 itojun

branches: 1.1.2;
file in6_gif.c was initially added on branch kame.


# 1.94 19-Sep-2019 knakahara

Avoid having a rtcache directly in a percpu storage for tunnel protocols.

percpu(9) has a certain memory storage for each CPU and provides it by the piece
to users. If the storages went short, percpu(9) enlarges them by allocating new
larger memory areas, replacing old ones with them and destroying the old ones.
A percpu storage referenced by a pointer gotten via percpu_getref can be
destroyed by the mechanism after a running thread sleeps even if percpu_putref
has not been called.

Using rtcache, i.e., packet processing, typically involves sleepable operations
such as rwlock so we must avoid dereferencing a rtcache that is directly stored
in a percpu storage during packet processing. Address this situation by having
just a pointer to a rtcache in a percpu storage instead.

Reviewed by ozaki-r@ and yamaguchi@


Revision tags: netbsd-9-base phil-wifi-20190609 isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502
# 1.93 01-May-2018 maxv

Remove now unused net_osdep.h includes, the other BSDs did the same.


# 1.92 27-Apr-2018 knakahara

Fix LOCKDEBUG kernel panic when many(about 200) tunnel interfaces is created.

The tunnel interfaces are gif(4), l2tp(4), and ipsecif(4). They use mutex
itself in percpu area. When percpu_cpu_enlarge() run, the address of the
mutex in percpu area becomes different from the address which lockdebug
saved. That can cause "already initialized" false detection.


Revision tags: pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315
# 1.91 14-Mar-2018 knakahara

Fix error checking in in6_gif_ctlinput().

if_gif.c:r1.133 introduces gif_update_variant() which ensure ifp->if_flags
is set IFF_RUNNING when gif_softc->gif_var->gv_{psrc,pdst} are not null.
So, in6_gif_ctlinput() is not required IFF_RUNNING checking. In contrast,
it is required gv_{psrc,pdst} NULL checking.


Revision tags: pgoyette-compat-base
# 1.90 10-Jan-2018 knakahara

branches: 1.90.2;
apply in{,6}_tunnel_validate() to gif(4).


Revision tags: tls-maxphys-base-20171202
# 1.89 27-Nov-2017 knakahara

IFF_RUNNING checking in Rx and Tx processing is unnecessary now.

Because the configs of gif (members of gif_var) are protected by psref(9).


# 1.88 27-Nov-2017 knakahara

preserve gif(4) configs by psref(9) like vlan(4) and l2tp(4).

After Tx side does not use softint, gif(4) can use psref(9) for config
preservation like vlan(4) and l2tp(4).

update locking notes later.


# 1.87 15-Nov-2017 knakahara

Add argument to encapsw->pr_input() instead of m_tag.


# 1.86 21-Sep-2017 knakahara

add lock for percpu route like l2tp(4).


Revision tags: nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.85 16-Jan-2017 christos

branches: 1.85.6;
ip6_sprintf -> IN6_PRINT so that we pass the size.


# 1.84 16-Jan-2017 ryo

Make ip6_sprintf(), in_fmtaddr(), lla_snprintf() and icmp6_redirect_diag() mpsafe.

Reviewed by ozaki-r@


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.83 06-Jan-2017 knakahara

branches: 1.83.2;
remove unnecessary conversion.

gif_softc->gif_pdst is already valid sockaddr.


# 1.82 14-Dec-2016 knakahara

fix race of gif_softc->gif_ro when we send multiple flows over gif on NET_MPSAFE enabled kernel.

make gif_softc->gif_ro percpu as well as ipforward_rt to resolve this race.
and add future TODO comment for etherip(4).


# 1.81 12-Dec-2016 ozaki-r

Make the routing table and rtcaches MP-safe

See the following descriptions for details.

Proposed on tech-kern and tech-net


Overview


# 1.80 08-Dec-2016 ozaki-r

Add rtcache_unref to release points of rtentry stemming from rtcache

In the MP-safe world, a rtentry stemming from a rtcache can be freed at any
points. So we need to protect rtentries somehow say by reference couting or
passive references. Regardless of the method, we need to call some release
function of a rtentry after using it.

The change adds a new function rtcache_unref to release a rtentry. At this
point, this function does nothing because for now we don't add a reference
to a rtentry when we get one from a rtcache. We will add something useful
in a further commit.

This change is a part of changes for MP-safe routing table. It is separated
to avoid one big change that makes difficult to debug by bisecting.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726
# 1.79 15-Jul-2016 ozaki-r

Use sin6tosa and sin6tocsa macros

No functional change.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907
# 1.78 06-Jul-2016 ozaki-r

branches: 1.78.2;
Apply m_get_rcvif_psref (kill m_get_rcvif_NOMPSAFE)


# 1.77 04-Jul-2016 knakahara

fix: gif(4) receive side race

A panic cause in rn_match() called by encap[46]_lookup(). The reason is that
gif(4) does not suspend receive packet processing in spite of suspending
transmit packet processing while anyone is doing gif(4) ioctl.


# 1.76 04-Jul-2016 knakahara

let gif(4) promise softint(9) contract (1/2) : gif(4) side

To prevent calling softint_schedule() after called softint_disestablish(),
the following modifications are added
+ ioctl (writing configuration) side
- off IFF_RUNNING flag before changing configuration
- wait softint handler completion before changing configuration
+ packet processing (reading configuraiotn) side
- if IFF_RUNNING flag is on, do nothing
+ in whole
- add gif_list_lock_{enter,exit} to prevent the same configuration is
set to other gif(4) interfaces


# 1.75 28-Jun-2016 ozaki-r

Add missing NULL checks for m_get_rcvif_psref


# 1.74 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


Revision tags: nick-nhusb-base-20160529 nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.73 29-Feb-2016 knakahara

remove unnecessary declarations and fix KNF

Thanks to riastradh@


# 1.72 26-Feb-2016 knakahara

To eliminate gif_softc_list linear search, add extra argument to encapsw.pr_ctlinput().


# 1.71 26-Jan-2016 knakahara

implement encapsw instead of protosw and uniform prototype.

suggested and advised by riastradh@n.o, thanks.

BTW, It seems in_stf_input() had bugs...


# 1.70 23-Jan-2016 riastradh

Those were local changes not meant to be part of the revert. SORRY!


# 1.69 23-Jan-2016 christos

make this compile again


# 1.68 22-Jan-2016 riastradh

Back out previous change to introduce struct encapsw.

This change was intended, but Nakahara-san had already made a better
one locally! So I'll let him commit that one, and I'll try not to
step on anyone's toes again.


# 1.67 22-Jan-2016 riastradh

Don't abuse struct protosw for ip_encap -- introduce struct encapsw.

Mostly mechanical change to replace it, culling some now-needless
boilerplate around all the users.

This does not substantively change the ip_encap API or eliminate
abuse of sketchy pointer casts -- that will come later, and will be
easier now that it is not tangled up with struct protosw.


# 1.66 20-Jan-2016 riastradh

Eliminate struct protosw::pr_output.

You can't use this unless you know what it is a priori: the formal
prototype is variadic, and the different instances (e.g., ip_output,
route_output) have different real prototypes.

Convert the only user of it, raw_send in net/raw_cb.c, to take an
explicit callback argument. Convert the only instances of it,
route_output and key_output, to such explicit callbacks for raw_send.
Use assertions to make sure the conversion to explicit callbacks is
warranted.

Discussed on tech-net with no objections:
https://mail-index.netbsd.org/tech-net/2016/01/16/msg005484.html


# 1.65 18-Jan-2016 knakahara

Refactor protosw codes in gif(4). No functional change.

- remove unnecessary include
- reduce scopes


Revision tags: nick-nhusb-base-20151226
# 1.64 25-Dec-2015 knakahara

use satosin{,6} macros instead of casts.


# 1.63 11-Dec-2015 knakahara

PR kern/50522: gif(4) ioctl causes panic while someone is using the gif(4) interface.

It is required to wait other CPU's softint completion before disestablishing
the softint handler.


Revision tags: nick-nhusb-base-20150921
# 1.62 24-Aug-2015 pooka

sprinkle _KERNEL_OPT


Revision tags: nick-nhusb-base-20150606
# 1.61 24-Apr-2015 ozaki-r

Add missing rtcache_free

It's the same as other similar code paths in in_gif and ip6_etherip.


Revision tags: netbsd-7-2-RELEASE netbsd-7-1-2-RELEASE netbsd-7-1-1-RELEASE netbsd-7-1-RELEASE netbsd-7-1-RC2 netbsd-7-nhusb-base-20170116 netbsd-7-1-RC1 netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 nick-nhusb-base-20150406 nick-nhusb-base netbsd-7-base tls-earlyentropy-base rmind-smpnet-nbase rmind-smpnet-base tls-maxphys-base
# 1.60 18-May-2014 rmind

branches: 1.60.4;
Add struct pr_usrreqs with a pr_generic function and prepare for the
dismantling of pr_usrreq in the protocols; no functional change intended.
PRU_ATTACH/PRU_DETACH changes will follow soon.

Bump for struct protosw. Welcome to 6.99.62!


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base agc-symver-base
# 1.59 01-Mar-2013 joerg

branches: 1.59.6; 1.59.10;
Retire OSI network stack. OK core@


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE yamt-pagecache-tag8 netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 netbsd-6-1-RC2 netbsd-6-1-RC1 yamt-pagecache-base8 netbsd-6-0-1-RELEASE yamt-pagecache-base7 matt-nb6-plus-nbase yamt-pagecache-base6 netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base rmind-uvmplock-nbase cherry-xenmp-base bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 rmind-uvmplock-base yamt-nfs-mp-base9 uebayasi-xip-base matt-premerge-20091211 yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-nbase yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.58 14-Mar-2009 dsl

branches: 1.58.12; 1.58.22;
Remove all the __P() from sys (excluding sys/dist)
Diff checked with grep and MK1 eyeball.
i386 and amd64 GENERIC and sys still build.


Revision tags: nick-hppapmap-base2 haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.57 07-Nov-2008 dyoung

branches: 1.57.4;
*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-2-3-RELEASE netbsd-5-1-5-RELEASE netbsd-5-2-2-RELEASE netbsd-5-1-4-RELEASE netbsd-5-2-1-RELEASE netbsd-5-1-3-RELEASE netbsd-5-2-RELEASE netbsd-5-2-RC1 netbsd-5-1-2-RELEASE netbsd-5-1-1-RELEASE matt-nb5-mips64-premerge-20101231 matt-nb5-pq3-base netbsd-5-1-RELEASE netbsd-5-1-RC4 matt-nb5-mips64-k15 netbsd-5-1-RC3 netbsd-5-1-RC2 netbsd-5-1-RC1 netbsd-5-0-2-RELEASE matt-nb5-mips64-premerge-20091211 matt-nb5-mips64-u2-k2-k4-k7-k8-k9 matt-nb4-mips64-k7-u2a-k9b matt-nb5-mips64-u1-k1-k5 netbsd-5-0-1-RELEASE netbsd-5-0-RELEASE netbsd-5-0-RC4 netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2 wrstuden-revivesa-base yamt-nfs-mp-base
# 1.56 24-Apr-2008 ad

branches: 1.56.2; 1.56.8; 1.56.10;
Merge the socket locking patch:

- Socket layer becomes MP safe.
- Unix protocols become MP safe.
- Allows protocol processing interrupts to safely block on locks.
- Fixes a number of race conditions.

With much feedback from matt@ and plunky@.


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.55 15-Apr-2008 thorpej

branches: 1.55.2;
Make ip6 and icmp6 stats per-cpu.


# 1.54 08-Apr-2008 thorpej

Change IPv6 stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ip6stat structure; old netstat
binaries will continue to work properly.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base vmlocking2-base3 bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-nbase mjf-devfs-base matt-armv6-base hpcarm-cleanup-base
# 1.53 20-Dec-2007 dyoung

branches: 1.53.6;
Poison struct route->ro_rt uses in the kernel by changing the name
to _ro_rt. Use rtcache_getrt() to access a route cache's struct
rtentry *.

Introduce struct ifnet->if_dl that always points at the interface
identifier/link-layer address. Make code that treated the first
ifaddr on struct ifnet->if_addrlist as the interface address use
if_dl, instead.

Remove stale debugging code from net/route.c. Move the rtflush()
code into rtcache_clear() and delete rtflush(). Delete rtalloc(),
because nothing uses it any more.

Make ND6_HINT an inline, lowercase subroutine, nd6_hint.

I've done my best to convert IP Filter, the ISO stack, and the
AppleTalk stack to rtcache_getrt(). They compile, but I have not
tested them. I have given the changes to PF, GRE, IPv4 and IPv6
stacks a lot of exercise.


Revision tags: nick-csl-alignment-base5 matt-armv6-prevmlocking yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base matt-mips64-base jmcneill-pm-base nick-csl-alignment-base reinoud-bufcleanup-base mjf-ufs-trans-base vmlocking-base
# 1.52 23-May-2007 christos

branches: 1.52.8; 1.52.16; 1.52.20;
Ansify + add a few comments, from Karl Sj��dahl


Revision tags: yamt-idlelwp-base8
# 1.51 02-May-2007 dyoung

Eliminate address family-specific route caches (struct route, struct
route_in6, struct route_iso), replacing all caches with a struct
route.

The principle benefit of this change is that all of the protocol
families can benefit from route cache-invalidation, which is
necessary for correct routing. Route-cache invalidation fixes an
ancient PR, kern/3508, at long last; it fixes various other PRs,
also.

Discussions with and ideas from Joerg Sonnenberger influenced this
work tremendously. Of course, all design oversights and bugs are
mine.

DETAILS

1 I added to each address family a pool of sockaddrs. I have
introduced routines for allocating, copying, and duplicating,
and freeing sockaddrs:

struct sockaddr *sockaddr_alloc(sa_family_t af, int flags);
struct sockaddr *sockaddr_copy(struct sockaddr *dst,
const struct sockaddr *src);
struct sockaddr *sockaddr_dup(const struct sockaddr *src, int flags);
void sockaddr_free(struct sockaddr *sa);

sockaddr_alloc() returns either a sockaddr from the pool belonging
to the specified family, or NULL if the pool is exhausted. The
returned sockaddr has the right size for that family; sa_family
and sa_len fields are initialized to the family and sockaddr
length---e.g., sa_family = AF_INET and sa_len = sizeof(struct
sockaddr_in). sockaddr_free() puts the given sockaddr back into
its family's pool.

sockaddr_dup() and sockaddr_copy() work analogously to strdup()
and strcpy(), respectively. sockaddr_copy() KASSERTs that the
family of the destination and source sockaddrs are alike.

The 'flags' argumet for sockaddr_alloc() and sockaddr_dup() is
passed directly to pool_get(9).

2 I added routines for initializing sockaddrs in each address
family, sockaddr_in_init(), sockaddr_in6_init(), sockaddr_iso_init(),
etc. They are fairly self-explanatory.

3 structs route_in6 and route_iso are no more. All protocol families
use struct route. I have changed the route cache, 'struct route',
so that it does not contain storage space for a sockaddr. Instead,
struct route points to a sockaddr coming from the pool the sockaddr
belongs to. I added a new method to struct route, rtcache_setdst(),
for setting the cache destination:

int rtcache_setdst(struct route *, const struct sockaddr *);

rtcache_setdst() returns 0 on success, or ENOMEM if no memory is
available to create the sockaddr storage.

It is now possible for rtcache_getdst() to return NULL if, say,
rtcache_setdst() failed. I check the return value for NULL
everywhere in the kernel.

4 Each routing domain (struct domain) has a list of live route
caches, dom_rtcache. rtflushall(sa_family_t af) looks up the
domain indicated by 'af', walks the domain's list of route caches
and invalidates each one.


Revision tags: thorpej-atomic-base
# 1.50 04-Mar-2007 christos

branches: 1.50.2; 1.50.4;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.49 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


# 1.48 17-Feb-2007 dyoung

branches: 1.48.2;
Don't open-code LIST_FOREACH().


Revision tags: post-newlock2-merge newlock2-nbase yamt-splraiseipl-base5 yamt-splraiseipl-base4 newlock2-base
# 1.47 15-Dec-2006 joerg

Introduce new helper functions to abstract the route caching.
rtcache_init and rtcache_init_noclone lookup ro_dst and store
the result in ro_rt, taking care of the reference counting and
calling the domain specific route cache.
rtcache_free checks if a route was cashed and frees the reference.
rtcache_copy copies ro_dst of the given struct route, checking that
enough space is available and incrementing the reference count of the
cached rtentry if necessary.
rtcache_check validates that the cached route is still up. If it isn't,
it tries to look it up again. Afterwards ro_rt is either a valid again
or NULL.
rtcache_copy is used internally.

Adjust to callers of rtalloc/rtflush in the tree to check the sanity of
ro_dst first (if necessary). If it doesn't fit the expectations, free
the cache, otherwise check if the cached route is still valid. After
that combination, a single check for ro_rt == NULL is enough to decide
whether a new lookup needs to be done with a different ro_dst.
Make the route checking in gre stricter by repeating the loop check
after revalidation.
Remove some unused RADIX_MPATH code in in6_src.c. The logic is slightly
changed here to first validate the route and check RTF_GATEWAY
afterwards. This is sementically equivalent though.
etherip doesn't need sc_route_expire similiar to the gif changes from
dyoung@ earlier.

Based on the earlier patch from dyoung@, reviewed and discussed with
him.


Revision tags: yamt-splraiseipl-base3
# 1.46 09-Dec-2006 dyoung

Here are various changes designed to protect against bad IPv4
routing caused by stale route caches (struct route). Route caches
are sprinkled throughout PCBs, the IP fast-forwarding table, and
IP tunnel interfaces (gre, gif, stf).

Stale IPv6 and ISO route caches will be treated by separate patches.

Thank you to Christoph Badura for suggesting the general approach
to invalidating route caches that I take here.

Here are the details:

Add hooks to struct domain for tracking and for invalidating each
domain's route caches: dom_rtcache, dom_rtflush, and dom_rtflushall.

Introduce helper subroutines, rtflush(ro) for invalidating a route
cache, rtflushall(family) for invalidating all route caches in a
routing domain, and rtcache(ro) for notifying the domain of a new
cached route.

Chain together all IPv4 route caches where ro_rt != NULL. Provide
in_rtcache() for adding a route to the chain. Provide in_rtflush()
and in_rtflushall() for invalidating IPv4 route caches. In
in_rtflush(), set ro_rt to NULL, and remove the route from the
chain. In in_rtflushall(), walk the chain and remove every route
cache.

In rtrequest1(), call rtflushall() to invalidate route caches when
a route is added.

In gif(4), discard the workaround for stale caches that involves
expiring them every so often.

Replace the pattern 'RTFREE(ro->ro_rt); ro->ro_rt = NULL;' with a
call to rtflush(ro).

Update ipflow_fastforward() and all other users of route caches so
that they expect a cached route, ro->ro_rt, to turn to NULL.

Take care when moving a 'struct route' to rtflush() the source and
to rtcache() the destination.

In domain initializers, use .dom_xxx tags.

KNF here and there.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base abandoned-netbsd-4-base yamt-splraiseipl-base2 yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 netbsd-4-base yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base chap-midi-base rpaulo-netinet-merge-pcb-base
# 1.45 07-Jun-2006 kardel

branches: 1.45.6; 1.45.8;
merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5 simonb-timecounters-base
# 1.44 11-Dec-2005 christos

branches: 1.44.4; 1.44.6; 1.44.8; 1.44.14;
merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.43 26-Jun-2005 mlelstv

branches: 1.43.2;
expire cached route. Fixes PR 22792.


# 1.42 02-Jun-2005 tron

Change the first argument of the encapsulation check function from
"const struct mbuf *" to "struct mbuf *". Without this change the
actual implementation cannot even use m_copydata() on the mbuf chain
which is broken.


# 1.41 02-Jun-2005 tron

Remove type casts and lint directives which are now longer necessary
because the first argument of m_copydata() is "const struct mbuf *" now.


# 1.40 29-May-2005 christos

- avoid shadowed variables
- sprinkle const.


Revision tags: netbsd-3-0-3-RELEASE netbsd-3-0-2-RELEASE netbsd-3-0-1-RELEASE netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.39 26-Feb-2005 perry

branches: 1.39.2;
nuke trailing whitespace


Revision tags: yamt-km-base2 yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.38 22-Apr-2004 matt

branches: 1.38.4; 1.38.6;
Constify protosw arrays. This can reduce the kernel .data section by
over 4K (if all the network protocols) are loaded.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.37 30-Oct-2003 simonb

branches: 1.37.4;
Remove some assigned-to but otherwise unused variables.


# 1.36 05-Sep-2003 itojun

u_short -> u_int16_t. sync w/ kame.
don't set ip6_plen where unneeded (i.e. before calling ip6_output)


# 1.35 22-Aug-2003 itojun

change the additional arg to be passed to ip{,6}_output to struct socket *.

this fixes KAME policy lookup which was broken by the previous commit.


# 1.34 22-Aug-2003 jonathan

Replace the set_socket() method of passing an extra struct socket*
argument to ip6_output() with a new explicit struct in6pcb* argument.
(The underlying socket can be obtained via in6pcb->inp6_socket.)

In preparation for fast-ipsec. Reviewed by itojun.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.33 25-Nov-2002 thorpej

branches: 1.33.6;
Avoid strict-alias warnings.


# 1.32 11-Nov-2002 itojun

make USE_ENCAPCHECK (in netinet*/*gif.c) to global option, GIF_ENCAPCHECK.
#ifdef out unneeded code when possible.
From: Krister Walfridsson <cato@df.lth.se>


# 1.31 05-Nov-2002 itojun

improve gif lookup performance, when there are many of those,
by using radix tree for lookups. tested by yshimizu@iij.


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.30 11-Sep-2002 itojun

KNF - return is not a function. sync w/kame.


Revision tags: gehenna-devsw-base
# 1.29 09-Jun-2002 itojun

whitespace cleanup


# 1.28 08-Jun-2002 itojun

whitespace cleanup


Revision tags: netbsd-1-6-PATCH002-RELEASE netbsd-1-6-PATCH002 netbsd-1-6-PATCH002-RC4 netbsd-1-6-PATCH002-RC3 netbsd-1-6-PATCH002-RC2 netbsd-1-6-PATCH002-RC1 netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base ifpoll-base
# 1.27 21-Dec-2001 itojun

branches: 1.27.8;
use radix table for inbound tunnel lookup (would increase performance
for machines with a lot of tunnels).
update route cache for IPvX-over-IPv6 tunnel on path MTU discovery.
snyc with kame


# 1.26 21-Dec-2001 itojun

move in6_gif_hlim decl to in6_gif.c. sync with kame


# 1.25 21-Dec-2001 itojun

move protosw fragment for gif/stf to their own source code.
reduce #ifdef in stf code. sync with kame


# 1.24 20-Dec-2001 itojun

centralize multicast group management (in6_join/leavegroup).
have a flag for ip6_output() to fragment to minimum MTU.
sync with kame


# 1.23 13-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.22 16-Aug-2001 itojun

gif interface now uses generic software interrupt
(on archs that support it). also, make gif ALTQ-capable on outgoing.
sync with kame, comments from thorpej.


# 1.21 29-Jul-2001 itojun

sync gif interface code with latest kame.
IFF_RUNNING is clearified. attach/detach logic is more clearner.
the old code mistakenly set IFF_UP by itself, now the behavior is gone.


# 1.20 14-May-2001 itojun

branches: 1.20.2;
drop multi destination mode (IFF_LINK0).


# 1.19 10-May-2001 itojun

correct ecn consideration on tunnel encap/decap. sync with kame.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.18 20-Feb-2001 itojun

branches: 1.18.2;
add AF_ISO case to output. from chopps.


# 1.17 20-Feb-2001 itojun

ISO over IPv4/v6 by EON encapsulation. from chopps, sync with kame.


# 1.16 11-Feb-2001 itojun

remove #ifdef __FreeBSD__.


# 1.15 22-Jan-2001 itojun

make it possible to turn off ingress filter on gif/stf tunnel egress,
by using IFF_LINK2. (part of) PR 11163 from Ken Raeburn.


Revision tags: netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base minoura-xpg4dl-base
# 1.14 19-Apr-2000 itojun

branches: 1.14.4;
introduce sys/netinet/ip_encap.c, to dispatch inbound packets
to protocol handlers, based on src/dst (for ip proto #4/41).
see comment in ip_encap.c for details of the problem we have.
there are too many protocol specs for ip proto #4/41.
backward compatibility with MROUTING case is now provided in ip_encap.c.

fix ipip to work with gif (using ip_encap.c). sorry for breakage.

gif now uses ip_encap.c.

introduce stf pseudo interface (implements 6to4, another IPv6-over-IPv4 code
with ip proto #41).


# 1.13 01-Mar-2000 itojun

introduce m->m_pkthdr.aux to hold random data which needs to be passed
between protocol handlers.

ipsec socket pointers, ipsec decryption/auth information, tunnel
decapsulation information are in my mind - there can be several other usage.
at this moment, we use this for ipsec socket pointer passing. this will
avoid reuse of m->m_pkthdr.rcvif in ipsec code.

due to the change, MHLEN will be decreased by sizeof(void *) - for example,
for i386, MHLEN was 100 bytes, but is now 96 bytes.
we may want to increase MSIZE from 128 to 256 for some of our architectures.

take caution if you use it for keeping some data item for long period
of time - use extra caution on M_PREPEND() or m_adj(), as they may result
in loss of m->m_pkthdr.aux pointer (and mbuf leak).

this will bump kernel version.

(as discussed in tech-net, tested in kame tree)


Revision tags: chs-ubc2-newbase
# 1.12 07-Feb-2000 itojun

s/DIAGNOSTIC/DEBUG/


# 1.11 06-Feb-2000 itojun

fix include pathname for better rfc2292 compliance.


# 1.10 06-Jan-2000 itojun

remove extra portability #ifdef (like #ifdef __FreeBSD__) in KAME IPv6/IPsec
code, from netbsd-current repository.
#ifdef'ed version is always available from ftp.kame.net.

XXX please do not make too many diff-unfriendly changes, we'll need to take
bunch of diffs on upgrade...


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.9 15-Dec-1999 itojun

do not overwrite traffic class field when we write IPv6 version field.


# 1.8 13-Dec-1999 itojun

sync IPv6 part with latest KAME tree. IPsec part is left unmodified
due to massive changes in KAME side.
- IPv6 output goes through nd6_output
- faith can capture IPv4 packets as well - you can run IPv4-to-IPv6 translator
using heavily modified DNS servers
- per-interface statistics (required for IPv6 MIB)
- interface autoconfig is revisited
- udp input handling has a big change for mapped address support.
- introduce in4_cksum() for non-overwriting checksumming
- introduce m_pulldown()
- neighbor discovery cleanups/improvements
- netinet/in.h strictly conforms to RFC2553 (no extra defs visible to userland)
- IFA_STATS is fixed a bit (not tested)
- and more more more.

TODO:
- cleanup os-independency #ifdef
- avoid rcvif dual use (for IPsec) to help ifdetach

(sorry for jumbo commit, I can't separate this any more...)


Revision tags: comdex-fall-1999-base fvdl-softdep-base
# 1.7 20-Aug-1999 itojun

branches: 1.7.2; 1.7.8;
do not capture packets by gif, when gif interface is down.


Revision tags: chs-ubc2-base
# 1.6 31-Jul-1999 itojun

sync with recent KAME.
- loosen ipsec restriction on packet diredtion.
- revise icmp6 redirect handling on IsRouter bit.
- tcp/udp notification processing (link-local address case)
- cosmetic fixes (better code share across *BSD).


# 1.5 30-Jul-1999 itojun

remove reference to in6_systm.h (file itself will be removed afterwords)


# 1.4 09-Jul-1999 thorpej

defopt IPSEC and IPSEC_ESP (both into opt_ipsec.h).


# 1.3 03-Jul-1999 thorpej

RCS ID police.


# 1.2 01-Jul-1999 itojun

branches: 1.2.2;
IPv6 kernel code, based on KAME/NetBSD 1.4, SNAP kit 19990628.
(Sorry for a big commit, I can't separate this into several pieces...)
Pls check sys/netinet6/TODO and sys/netinet6/IMPLEMENTATION for details.

- sys/kern: do not assume single mbuf, accept chained mbuf on passing
data from userland to kernel (or other way round).
- "midway" ATM card: ATM PVC pseudo device support, like those done in ALTQ
package (ftp://ftp.csl.sony.co.jp/pub/kjc/).
- sys/netinet/tcp*: IPv4/v6 dual stack tcp support.
- sys/netinet/{ip6,icmp6}.h, sys/net/pfkeyv2.h: IETF document assumes those
file to be there so we patch it up.
- sys/netinet: IPsec additions are here and there.
- sys/netinet6/*: most of IPv6 code sits here.
- sys/netkey: IPsec key management code
- dev/pci/pcidevs: regen

In my understanding no code here is subject to export control so it
should be safe.


# 1.1 28-Jun-1999 itojun

branches: 1.1.2;
file in6_gif.c was initially added on branch kame.


Revision tags: isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502
# 1.93 01-May-2018 maxv

Remove now unused net_osdep.h includes, the other BSDs did the same.


# 1.92 27-Apr-2018 knakahara

Fix LOCKDEBUG kernel panic when many(about 200) tunnel interfaces is created.

The tunnel interfaces are gif(4), l2tp(4), and ipsecif(4). They use mutex
itself in percpu area. When percpu_cpu_enlarge() run, the address of the
mutex in percpu area becomes different from the address which lockdebug
saved. That can cause "already initialized" false detection.


Revision tags: pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315
# 1.91 14-Mar-2018 knakahara

Fix error checking in in6_gif_ctlinput().

if_gif.c:r1.133 introduces gif_update_variant() which ensure ifp->if_flags
is set IFF_RUNNING when gif_softc->gif_var->gv_{psrc,pdst} are not null.
So, in6_gif_ctlinput() is not required IFF_RUNNING checking. In contrast,
it is required gv_{psrc,pdst} NULL checking.


Revision tags: pgoyette-compat-base
# 1.90 10-Jan-2018 knakahara

branches: 1.90.2;
apply in{,6}_tunnel_validate() to gif(4).


Revision tags: tls-maxphys-base-20171202
# 1.89 27-Nov-2017 knakahara

IFF_RUNNING checking in Rx and Tx processing is unnecessary now.

Because the configs of gif (members of gif_var) are protected by psref(9).


# 1.88 27-Nov-2017 knakahara

preserve gif(4) configs by psref(9) like vlan(4) and l2tp(4).

After Tx side does not use softint, gif(4) can use psref(9) for config
preservation like vlan(4) and l2tp(4).

update locking notes later.


# 1.87 15-Nov-2017 knakahara

Add argument to encapsw->pr_input() instead of m_tag.


# 1.86 21-Sep-2017 knakahara

add lock for percpu route like l2tp(4).


Revision tags: nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.85 16-Jan-2017 christos

branches: 1.85.6;
ip6_sprintf -> IN6_PRINT so that we pass the size.


# 1.84 16-Jan-2017 ryo

Make ip6_sprintf(), in_fmtaddr(), lla_snprintf() and icmp6_redirect_diag() mpsafe.

Reviewed by ozaki-r@


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.83 06-Jan-2017 knakahara

branches: 1.83.2;
remove unnecessary conversion.

gif_softc->gif_pdst is already valid sockaddr.


# 1.82 14-Dec-2016 knakahara

fix race of gif_softc->gif_ro when we send multiple flows over gif on NET_MPSAFE enabled kernel.

make gif_softc->gif_ro percpu as well as ipforward_rt to resolve this race.
and add future TODO comment for etherip(4).


# 1.81 12-Dec-2016 ozaki-r

Make the routing table and rtcaches MP-safe

See the following descriptions for details.

Proposed on tech-kern and tech-net


Overview


# 1.80 08-Dec-2016 ozaki-r

Add rtcache_unref to release points of rtentry stemming from rtcache

In the MP-safe world, a rtentry stemming from a rtcache can be freed at any
points. So we need to protect rtentries somehow say by reference couting or
passive references. Regardless of the method, we need to call some release
function of a rtentry after using it.

The change adds a new function rtcache_unref to release a rtentry. At this
point, this function does nothing because for now we don't add a reference
to a rtentry when we get one from a rtcache. We will add something useful
in a further commit.

This change is a part of changes for MP-safe routing table. It is separated
to avoid one big change that makes difficult to debug by bisecting.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726
# 1.79 15-Jul-2016 ozaki-r

Use sin6tosa and sin6tocsa macros

No functional change.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907
# 1.78 06-Jul-2016 ozaki-r

branches: 1.78.2;
Apply m_get_rcvif_psref (kill m_get_rcvif_NOMPSAFE)


# 1.77 04-Jul-2016 knakahara

fix: gif(4) receive side race

A panic cause in rn_match() called by encap[46]_lookup(). The reason is that
gif(4) does not suspend receive packet processing in spite of suspending
transmit packet processing while anyone is doing gif(4) ioctl.


# 1.76 04-Jul-2016 knakahara

let gif(4) promise softint(9) contract (1/2) : gif(4) side

To prevent calling softint_schedule() after called softint_disestablish(),
the following modifications are added
+ ioctl (writing configuration) side
- off IFF_RUNNING flag before changing configuration
- wait softint handler completion before changing configuration
+ packet processing (reading configuraiotn) side
- if IFF_RUNNING flag is on, do nothing
+ in whole
- add gif_list_lock_{enter,exit} to prevent the same configuration is
set to other gif(4) interfaces


# 1.75 28-Jun-2016 ozaki-r

Add missing NULL checks for m_get_rcvif_psref


# 1.74 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


Revision tags: nick-nhusb-base-20160529 nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.73 29-Feb-2016 knakahara

remove unnecessary declarations and fix KNF

Thanks to riastradh@


# 1.72 26-Feb-2016 knakahara

To eliminate gif_softc_list linear search, add extra argument to encapsw.pr_ctlinput().


# 1.71 26-Jan-2016 knakahara

implement encapsw instead of protosw and uniform prototype.

suggested and advised by riastradh@n.o, thanks.

BTW, It seems in_stf_input() had bugs...


# 1.70 23-Jan-2016 riastradh

Those were local changes not meant to be part of the revert. SORRY!


# 1.69 23-Jan-2016 christos

make this compile again


# 1.68 22-Jan-2016 riastradh

Back out previous change to introduce struct encapsw.

This change was intended, but Nakahara-san had already made a better
one locally! So I'll let him commit that one, and I'll try not to
step on anyone's toes again.


# 1.67 22-Jan-2016 riastradh

Don't abuse struct protosw for ip_encap -- introduce struct encapsw.

Mostly mechanical change to replace it, culling some now-needless
boilerplate around all the users.

This does not substantively change the ip_encap API or eliminate
abuse of sketchy pointer casts -- that will come later, and will be
easier now that it is not tangled up with struct protosw.


# 1.66 20-Jan-2016 riastradh

Eliminate struct protosw::pr_output.

You can't use this unless you know what it is a priori: the formal
prototype is variadic, and the different instances (e.g., ip_output,
route_output) have different real prototypes.

Convert the only user of it, raw_send in net/raw_cb.c, to take an
explicit callback argument. Convert the only instances of it,
route_output and key_output, to such explicit callbacks for raw_send.
Use assertions to make sure the conversion to explicit callbacks is
warranted.

Discussed on tech-net with no objections:
https://mail-index.netbsd.org/tech-net/2016/01/16/msg005484.html


# 1.65 18-Jan-2016 knakahara

Refactor protosw codes in gif(4). No functional change.

- remove unnecessary include
- reduce scopes


Revision tags: nick-nhusb-base-20151226
# 1.64 25-Dec-2015 knakahara

use satosin{,6} macros instead of casts.


# 1.63 11-Dec-2015 knakahara

PR kern/50522: gif(4) ioctl causes panic while someone is using the gif(4) interface.

It is required to wait other CPU's softint completion before disestablishing
the softint handler.


Revision tags: nick-nhusb-base-20150921
# 1.62 24-Aug-2015 pooka

sprinkle _KERNEL_OPT


Revision tags: nick-nhusb-base-20150606
# 1.61 24-Apr-2015 ozaki-r

Add missing rtcache_free

It's the same as other similar code paths in in_gif and ip6_etherip.


Revision tags: netbsd-7-2-RELEASE netbsd-7-1-2-RELEASE netbsd-7-1-1-RELEASE netbsd-7-1-RELEASE netbsd-7-1-RC2 netbsd-7-nhusb-base-20170116 netbsd-7-1-RC1 netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 nick-nhusb-base-20150406 nick-nhusb-base netbsd-7-base tls-earlyentropy-base rmind-smpnet-nbase rmind-smpnet-base tls-maxphys-base
# 1.60 18-May-2014 rmind

branches: 1.60.4;
Add struct pr_usrreqs with a pr_generic function and prepare for the
dismantling of pr_usrreq in the protocols; no functional change intended.
PRU_ATTACH/PRU_DETACH changes will follow soon.

Bump for struct protosw. Welcome to 6.99.62!


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base agc-symver-base
# 1.59 01-Mar-2013 joerg

branches: 1.59.6; 1.59.10;
Retire OSI network stack. OK core@


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE yamt-pagecache-tag8 netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 netbsd-6-1-RC2 netbsd-6-1-RC1 yamt-pagecache-base8 netbsd-6-0-1-RELEASE yamt-pagecache-base7 matt-nb6-plus-nbase yamt-pagecache-base6 netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base rmind-uvmplock-nbase cherry-xenmp-base bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 rmind-uvmplock-base yamt-nfs-mp-base9 uebayasi-xip-base matt-premerge-20091211 yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-nbase yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.58 14-Mar-2009 dsl

branches: 1.58.12; 1.58.22;
Remove all the __P() from sys (excluding sys/dist)
Diff checked with grep and MK1 eyeball.
i386 and amd64 GENERIC and sys still build.


Revision tags: nick-hppapmap-base2 haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.57 07-Nov-2008 dyoung

branches: 1.57.4;
*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-2-3-RELEASE netbsd-5-1-5-RELEASE netbsd-5-2-2-RELEASE netbsd-5-1-4-RELEASE netbsd-5-2-1-RELEASE netbsd-5-1-3-RELEASE netbsd-5-2-RELEASE netbsd-5-2-RC1 netbsd-5-1-2-RELEASE netbsd-5-1-1-RELEASE matt-nb5-mips64-premerge-20101231 matt-nb5-pq3-base netbsd-5-1-RELEASE netbsd-5-1-RC4 matt-nb5-mips64-k15 netbsd-5-1-RC3 netbsd-5-1-RC2 netbsd-5-1-RC1 netbsd-5-0-2-RELEASE matt-nb5-mips64-premerge-20091211 matt-nb5-mips64-u2-k2-k4-k7-k8-k9 matt-nb4-mips64-k7-u2a-k9b matt-nb5-mips64-u1-k1-k5 netbsd-5-0-1-RELEASE netbsd-5-0-RELEASE netbsd-5-0-RC4 netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2 wrstuden-revivesa-base yamt-nfs-mp-base
# 1.56 24-Apr-2008 ad

branches: 1.56.2; 1.56.8; 1.56.10;
Merge the socket locking patch:

- Socket layer becomes MP safe.
- Unix protocols become MP safe.
- Allows protocol processing interrupts to safely block on locks.
- Fixes a number of race conditions.

With much feedback from matt@ and plunky@.


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.55 15-Apr-2008 thorpej

branches: 1.55.2;
Make ip6 and icmp6 stats per-cpu.


# 1.54 08-Apr-2008 thorpej

Change IPv6 stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ip6stat structure; old netstat
binaries will continue to work properly.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base vmlocking2-base3 bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-nbase mjf-devfs-base matt-armv6-base hpcarm-cleanup-base
# 1.53 20-Dec-2007 dyoung

branches: 1.53.6;
Poison struct route->ro_rt uses in the kernel by changing the name
to _ro_rt. Use rtcache_getrt() to access a route cache's struct
rtentry *.

Introduce struct ifnet->if_dl that always points at the interface
identifier/link-layer address. Make code that treated the first
ifaddr on struct ifnet->if_addrlist as the interface address use
if_dl, instead.

Remove stale debugging code from net/route.c. Move the rtflush()
code into rtcache_clear() and delete rtflush(). Delete rtalloc(),
because nothing uses it any more.

Make ND6_HINT an inline, lowercase subroutine, nd6_hint.

I've done my best to convert IP Filter, the ISO stack, and the
AppleTalk stack to rtcache_getrt(). They compile, but I have not
tested them. I have given the changes to PF, GRE, IPv4 and IPv6
stacks a lot of exercise.


Revision tags: nick-csl-alignment-base5 matt-armv6-prevmlocking yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base matt-mips64-base jmcneill-pm-base nick-csl-alignment-base reinoud-bufcleanup-base mjf-ufs-trans-base vmlocking-base
# 1.52 23-May-2007 christos

branches: 1.52.8; 1.52.16; 1.52.20;
Ansify + add a few comments, from Karl Sj��dahl


Revision tags: yamt-idlelwp-base8
# 1.51 02-May-2007 dyoung

Eliminate address family-specific route caches (struct route, struct
route_in6, struct route_iso), replacing all caches with a struct
route.

The principle benefit of this change is that all of the protocol
families can benefit from route cache-invalidation, which is
necessary for correct routing. Route-cache invalidation fixes an
ancient PR, kern/3508, at long last; it fixes various other PRs,
also.

Discussions with and ideas from Joerg Sonnenberger influenced this
work tremendously. Of course, all design oversights and bugs are
mine.

DETAILS

1 I added to each address family a pool of sockaddrs. I have
introduced routines for allocating, copying, and duplicating,
and freeing sockaddrs:

struct sockaddr *sockaddr_alloc(sa_family_t af, int flags);
struct sockaddr *sockaddr_copy(struct sockaddr *dst,
const struct sockaddr *src);
struct sockaddr *sockaddr_dup(const struct sockaddr *src, int flags);
void sockaddr_free(struct sockaddr *sa);

sockaddr_alloc() returns either a sockaddr from the pool belonging
to the specified family, or NULL if the pool is exhausted. The
returned sockaddr has the right size for that family; sa_family
and sa_len fields are initialized to the family and sockaddr
length---e.g., sa_family = AF_INET and sa_len = sizeof(struct
sockaddr_in). sockaddr_free() puts the given sockaddr back into
its family's pool.

sockaddr_dup() and sockaddr_copy() work analogously to strdup()
and strcpy(), respectively. sockaddr_copy() KASSERTs that the
family of the destination and source sockaddrs are alike.

The 'flags' argumet for sockaddr_alloc() and sockaddr_dup() is
passed directly to pool_get(9).

2 I added routines for initializing sockaddrs in each address
family, sockaddr_in_init(), sockaddr_in6_init(), sockaddr_iso_init(),
etc. They are fairly self-explanatory.

3 structs route_in6 and route_iso are no more. All protocol families
use struct route. I have changed the route cache, 'struct route',
so that it does not contain storage space for a sockaddr. Instead,
struct route points to a sockaddr coming from the pool the sockaddr
belongs to. I added a new method to struct route, rtcache_setdst(),
for setting the cache destination:

int rtcache_setdst(struct route *, const struct sockaddr *);

rtcache_setdst() returns 0 on success, or ENOMEM if no memory is
available to create the sockaddr storage.

It is now possible for rtcache_getdst() to return NULL if, say,
rtcache_setdst() failed. I check the return value for NULL
everywhere in the kernel.

4 Each routing domain (struct domain) has a list of live route
caches, dom_rtcache. rtflushall(sa_family_t af) looks up the
domain indicated by 'af', walks the domain's list of route caches
and invalidates each one.


Revision tags: thorpej-atomic-base
# 1.50 04-Mar-2007 christos

branches: 1.50.2; 1.50.4;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.49 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


# 1.48 17-Feb-2007 dyoung

branches: 1.48.2;
Don't open-code LIST_FOREACH().


Revision tags: post-newlock2-merge newlock2-nbase yamt-splraiseipl-base5 yamt-splraiseipl-base4 newlock2-base
# 1.47 15-Dec-2006 joerg

Introduce new helper functions to abstract the route caching.
rtcache_init and rtcache_init_noclone lookup ro_dst and store
the result in ro_rt, taking care of the reference counting and
calling the domain specific route cache.
rtcache_free checks if a route was cashed and frees the reference.
rtcache_copy copies ro_dst of the given struct route, checking that
enough space is available and incrementing the reference count of the
cached rtentry if necessary.
rtcache_check validates that the cached route is still up. If it isn't,
it tries to look it up again. Afterwards ro_rt is either a valid again
or NULL.
rtcache_copy is used internally.

Adjust to callers of rtalloc/rtflush in the tree to check the sanity of
ro_dst first (if necessary). If it doesn't fit the expectations, free
the cache, otherwise check if the cached route is still valid. After
that combination, a single check for ro_rt == NULL is enough to decide
whether a new lookup needs to be done with a different ro_dst.
Make the route checking in gre stricter by repeating the loop check
after revalidation.
Remove some unused RADIX_MPATH code in in6_src.c. The logic is slightly
changed here to first validate the route and check RTF_GATEWAY
afterwards. This is sementically equivalent though.
etherip doesn't need sc_route_expire similiar to the gif changes from
dyoung@ earlier.

Based on the earlier patch from dyoung@, reviewed and discussed with
him.


Revision tags: yamt-splraiseipl-base3
# 1.46 09-Dec-2006 dyoung

Here are various changes designed to protect against bad IPv4
routing caused by stale route caches (struct route). Route caches
are sprinkled throughout PCBs, the IP fast-forwarding table, and
IP tunnel interfaces (gre, gif, stf).

Stale IPv6 and ISO route caches will be treated by separate patches.

Thank you to Christoph Badura for suggesting the general approach
to invalidating route caches that I take here.

Here are the details:

Add hooks to struct domain for tracking and for invalidating each
domain's route caches: dom_rtcache, dom_rtflush, and dom_rtflushall.

Introduce helper subroutines, rtflush(ro) for invalidating a route
cache, rtflushall(family) for invalidating all route caches in a
routing domain, and rtcache(ro) for notifying the domain of a new
cached route.

Chain together all IPv4 route caches where ro_rt != NULL. Provide
in_rtcache() for adding a route to the chain. Provide in_rtflush()
and in_rtflushall() for invalidating IPv4 route caches. In
in_rtflush(), set ro_rt to NULL, and remove the route from the
chain. In in_rtflushall(), walk the chain and remove every route
cache.

In rtrequest1(), call rtflushall() to invalidate route caches when
a route is added.

In gif(4), discard the workaround for stale caches that involves
expiring them every so often.

Replace the pattern 'RTFREE(ro->ro_rt); ro->ro_rt = NULL;' with a
call to rtflush(ro).

Update ipflow_fastforward() and all other users of route caches so
that they expect a cached route, ro->ro_rt, to turn to NULL.

Take care when moving a 'struct route' to rtflush() the source and
to rtcache() the destination.

In domain initializers, use .dom_xxx tags.

KNF here and there.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base abandoned-netbsd-4-base yamt-splraiseipl-base2 yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 netbsd-4-base yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base chap-midi-base rpaulo-netinet-merge-pcb-base
# 1.45 07-Jun-2006 kardel

branches: 1.45.6; 1.45.8;
merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5 simonb-timecounters-base
# 1.44 11-Dec-2005 christos

branches: 1.44.4; 1.44.6; 1.44.8; 1.44.14;
merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.43 26-Jun-2005 mlelstv

branches: 1.43.2;
expire cached route. Fixes PR 22792.


# 1.42 02-Jun-2005 tron

Change the first argument of the encapsulation check function from
"const struct mbuf *" to "struct mbuf *". Without this change the
actual implementation cannot even use m_copydata() on the mbuf chain
which is broken.


# 1.41 02-Jun-2005 tron

Remove type casts and lint directives which are now longer necessary
because the first argument of m_copydata() is "const struct mbuf *" now.


# 1.40 29-May-2005 christos

- avoid shadowed variables
- sprinkle const.


Revision tags: netbsd-3-0-3-RELEASE netbsd-3-0-2-RELEASE netbsd-3-0-1-RELEASE netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.39 26-Feb-2005 perry

branches: 1.39.2;
nuke trailing whitespace


Revision tags: yamt-km-base2 yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.38 22-Apr-2004 matt

branches: 1.38.4; 1.38.6;
Constify protosw arrays. This can reduce the kernel .data section by
over 4K (if all the network protocols) are loaded.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.37 30-Oct-2003 simonb

branches: 1.37.4;
Remove some assigned-to but otherwise unused variables.


# 1.36 05-Sep-2003 itojun

u_short -> u_int16_t. sync w/ kame.
don't set ip6_plen where unneeded (i.e. before calling ip6_output)


# 1.35 22-Aug-2003 itojun

change the additional arg to be passed to ip{,6}_output to struct socket *.

this fixes KAME policy lookup which was broken by the previous commit.


# 1.34 22-Aug-2003 jonathan

Replace the set_socket() method of passing an extra struct socket*
argument to ip6_output() with a new explicit struct in6pcb* argument.
(The underlying socket can be obtained via in6pcb->inp6_socket.)

In preparation for fast-ipsec. Reviewed by itojun.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.33 25-Nov-2002 thorpej

branches: 1.33.6;
Avoid strict-alias warnings.


# 1.32 11-Nov-2002 itojun

make USE_ENCAPCHECK (in netinet*/*gif.c) to global option, GIF_ENCAPCHECK.
#ifdef out unneeded code when possible.
From: Krister Walfridsson <cato@df.lth.se>


# 1.31 05-Nov-2002 itojun

improve gif lookup performance, when there are many of those,
by using radix tree for lookups. tested by yshimizu@iij.


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.30 11-Sep-2002 itojun

KNF - return is not a function. sync w/kame.


Revision tags: gehenna-devsw-base
# 1.29 09-Jun-2002 itojun

whitespace cleanup


# 1.28 08-Jun-2002 itojun

whitespace cleanup


Revision tags: netbsd-1-6-PATCH002-RELEASE netbsd-1-6-PATCH002 netbsd-1-6-PATCH002-RC4 netbsd-1-6-PATCH002-RC3 netbsd-1-6-PATCH002-RC2 netbsd-1-6-PATCH002-RC1 netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base ifpoll-base
# 1.27 21-Dec-2001 itojun

branches: 1.27.8;
use radix table for inbound tunnel lookup (would increase performance
for machines with a lot of tunnels).
update route cache for IPvX-over-IPv6 tunnel on path MTU discovery.
snyc with kame


# 1.26 21-Dec-2001 itojun

move in6_gif_hlim decl to in6_gif.c. sync with kame


# 1.25 21-Dec-2001 itojun

move protosw fragment for gif/stf to their own source code.
reduce #ifdef in stf code. sync with kame


# 1.24 20-Dec-2001 itojun

centralize multicast group management (in6_join/leavegroup).
have a flag for ip6_output() to fragment to minimum MTU.
sync with kame


# 1.23 13-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.22 16-Aug-2001 itojun

gif interface now uses generic software interrupt
(on archs that support it). also, make gif ALTQ-capable on outgoing.
sync with kame, comments from thorpej.


# 1.21 29-Jul-2001 itojun

sync gif interface code with latest kame.
IFF_RUNNING is clearified. attach/detach logic is more clearner.
the old code mistakenly set IFF_UP by itself, now the behavior is gone.


# 1.20 14-May-2001 itojun

branches: 1.20.2;
drop multi destination mode (IFF_LINK0).


# 1.19 10-May-2001 itojun

correct ecn consideration on tunnel encap/decap. sync with kame.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.18 20-Feb-2001 itojun

branches: 1.18.2;
add AF_ISO case to output. from chopps.


# 1.17 20-Feb-2001 itojun

ISO over IPv4/v6 by EON encapsulation. from chopps, sync with kame.


# 1.16 11-Feb-2001 itojun

remove #ifdef __FreeBSD__.


# 1.15 22-Jan-2001 itojun

make it possible to turn off ingress filter on gif/stf tunnel egress,
by using IFF_LINK2. (part of) PR 11163 from Ken Raeburn.


Revision tags: netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base minoura-xpg4dl-base
# 1.14 19-Apr-2000 itojun

branches: 1.14.4;
introduce sys/netinet/ip_encap.c, to dispatch inbound packets
to protocol handlers, based on src/dst (for ip proto #4/41).
see comment in ip_encap.c for details of the problem we have.
there are too many protocol specs for ip proto #4/41.
backward compatibility with MROUTING case is now provided in ip_encap.c.

fix ipip to work with gif (using ip_encap.c). sorry for breakage.

gif now uses ip_encap.c.

introduce stf pseudo interface (implements 6to4, another IPv6-over-IPv4 code
with ip proto #41).


# 1.13 01-Mar-2000 itojun

introduce m->m_pkthdr.aux to hold random data which needs to be passed
between protocol handlers.

ipsec socket pointers, ipsec decryption/auth information, tunnel
decapsulation information are in my mind - there can be several other usage.
at this moment, we use this for ipsec socket pointer passing. this will
avoid reuse of m->m_pkthdr.rcvif in ipsec code.

due to the change, MHLEN will be decreased by sizeof(void *) - for example,
for i386, MHLEN was 100 bytes, but is now 96 bytes.
we may want to increase MSIZE from 128 to 256 for some of our architectures.

take caution if you use it for keeping some data item for long period
of time - use extra caution on M_PREPEND() or m_adj(), as they may result
in loss of m->m_pkthdr.aux pointer (and mbuf leak).

this will bump kernel version.

(as discussed in tech-net, tested in kame tree)


Revision tags: chs-ubc2-newbase
# 1.12 07-Feb-2000 itojun

s/DIAGNOSTIC/DEBUG/


# 1.11 06-Feb-2000 itojun

fix include pathname for better rfc2292 compliance.


# 1.10 06-Jan-2000 itojun

remove extra portability #ifdef (like #ifdef __FreeBSD__) in KAME IPv6/IPsec
code, from netbsd-current repository.
#ifdef'ed version is always available from ftp.kame.net.

XXX please do not make too many diff-unfriendly changes, we'll need to take
bunch of diffs on upgrade...


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.9 15-Dec-1999 itojun

do not overwrite traffic class field when we write IPv6 version field.


# 1.8 13-Dec-1999 itojun

sync IPv6 part with latest KAME tree. IPsec part is left unmodified
due to massive changes in KAME side.
- IPv6 output goes through nd6_output
- faith can capture IPv4 packets as well - you can run IPv4-to-IPv6 translator
using heavily modified DNS servers
- per-interface statistics (required for IPv6 MIB)
- interface autoconfig is revisited
- udp input handling has a big change for mapped address support.
- introduce in4_cksum() for non-overwriting checksumming
- introduce m_pulldown()
- neighbor discovery cleanups/improvements
- netinet/in.h strictly conforms to RFC2553 (no extra defs visible to userland)
- IFA_STATS is fixed a bit (not tested)
- and more more more.

TODO:
- cleanup os-independency #ifdef
- avoid rcvif dual use (for IPsec) to help ifdetach

(sorry for jumbo commit, I can't separate this any more...)


Revision tags: comdex-fall-1999-base fvdl-softdep-base
# 1.7 20-Aug-1999 itojun

branches: 1.7.2; 1.7.8;
do not capture packets by gif, when gif interface is down.


Revision tags: chs-ubc2-base
# 1.6 31-Jul-1999 itojun

sync with recent KAME.
- loosen ipsec restriction on packet diredtion.
- revise icmp6 redirect handling on IsRouter bit.
- tcp/udp notification processing (link-local address case)
- cosmetic fixes (better code share across *BSD).


# 1.5 30-Jul-1999 itojun

remove reference to in6_systm.h (file itself will be removed afterwords)


# 1.4 09-Jul-1999 thorpej

defopt IPSEC and IPSEC_ESP (both into opt_ipsec.h).


# 1.3 03-Jul-1999 thorpej

RCS ID police.


# 1.2 01-Jul-1999 itojun

branches: 1.2.2;
IPv6 kernel code, based on KAME/NetBSD 1.4, SNAP kit 19990628.
(Sorry for a big commit, I can't separate this into several pieces...)
Pls check sys/netinet6/TODO and sys/netinet6/IMPLEMENTATION for details.

- sys/kern: do not assume single mbuf, accept chained mbuf on passing
data from userland to kernel (or other way round).
- "midway" ATM card: ATM PVC pseudo device support, like those done in ALTQ
package (ftp://ftp.csl.sony.co.jp/pub/kjc/).
- sys/netinet/tcp*: IPv4/v6 dual stack tcp support.
- sys/netinet/{ip6,icmp6}.h, sys/net/pfkeyv2.h: IETF document assumes those
file to be there so we patch it up.
- sys/netinet: IPsec additions are here and there.
- sys/netinet6/*: most of IPv6 code sits here.
- sys/netkey: IPsec key management code
- dev/pci/pcidevs: regen

In my understanding no code here is subject to export control so it
should be safe.


# 1.1 28-Jun-1999 itojun

branches: 1.1.2;
file in6_gif.c was initially added on branch kame.


# 1.90 10-Jan-2018 knakahara

apply in{,6}_tunnel_validate() to gif(4).


Revision tags: tls-maxphys-base-20171202
# 1.89 27-Nov-2017 knakahara

IFF_RUNNING checking in Rx and Tx processing is unnecessary now.

Because the configs of gif (members of gif_var) are protected by psref(9).


# 1.88 27-Nov-2017 knakahara

preserve gif(4) configs by psref(9) like vlan(4) and l2tp(4).

After Tx side does not use softint, gif(4) can use psref(9) for config
preservation like vlan(4) and l2tp(4).

update locking notes later.


# 1.87 15-Nov-2017 knakahara

Add argument to encapsw->pr_input() instead of m_tag.


# 1.86 21-Sep-2017 knakahara

add lock for percpu route like l2tp(4).


Revision tags: nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.85 16-Jan-2017 christos

branches: 1.85.6;
ip6_sprintf -> IN6_PRINT so that we pass the size.


# 1.84 16-Jan-2017 ryo

Make ip6_sprintf(), in_fmtaddr(), lla_snprintf() and icmp6_redirect_diag() mpsafe.

Reviewed by ozaki-r@


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.83 06-Jan-2017 knakahara

branches: 1.83.2;
remove unnecessary conversion.

gif_softc->gif_pdst is already valid sockaddr.


# 1.82 14-Dec-2016 knakahara

fix race of gif_softc->gif_ro when we send multiple flows over gif on NET_MPSAFE enabled kernel.

make gif_softc->gif_ro percpu as well as ipforward_rt to resolve this race.
and add future TODO comment for etherip(4).


# 1.81 12-Dec-2016 ozaki-r

Make the routing table and rtcaches MP-safe

See the following descriptions for details.

Proposed on tech-kern and tech-net


Overview


# 1.80 08-Dec-2016 ozaki-r

Add rtcache_unref to release points of rtentry stemming from rtcache

In the MP-safe world, a rtentry stemming from a rtcache can be freed at any
points. So we need to protect rtentries somehow say by reference couting or
passive references. Regardless of the method, we need to call some release
function of a rtentry after using it.

The change adds a new function rtcache_unref to release a rtentry. At this
point, this function does nothing because for now we don't add a reference
to a rtentry when we get one from a rtcache. We will add something useful
in a further commit.

This change is a part of changes for MP-safe routing table. It is separated
to avoid one big change that makes difficult to debug by bisecting.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726
# 1.79 15-Jul-2016 ozaki-r

Use sin6tosa and sin6tocsa macros

No functional change.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907
# 1.78 06-Jul-2016 ozaki-r

branches: 1.78.2;
Apply m_get_rcvif_psref (kill m_get_rcvif_NOMPSAFE)


# 1.77 04-Jul-2016 knakahara

fix: gif(4) receive side race

A panic cause in rn_match() called by encap[46]_lookup(). The reason is that
gif(4) does not suspend receive packet processing in spite of suspending
transmit packet processing while anyone is doing gif(4) ioctl.


# 1.76 04-Jul-2016 knakahara

let gif(4) promise softint(9) contract (1/2) : gif(4) side

To prevent calling softint_schedule() after called softint_disestablish(),
the following modifications are added
+ ioctl (writing configuration) side
- off IFF_RUNNING flag before changing configuration
- wait softint handler completion before changing configuration
+ packet processing (reading configuraiotn) side
- if IFF_RUNNING flag is on, do nothing
+ in whole
- add gif_list_lock_{enter,exit} to prevent the same configuration is
set to other gif(4) interfaces


# 1.75 28-Jun-2016 ozaki-r

Add missing NULL checks for m_get_rcvif_psref


# 1.74 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


Revision tags: nick-nhusb-base-20160529 nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.73 29-Feb-2016 knakahara

remove unnecessary declarations and fix KNF

Thanks to riastradh@


# 1.72 26-Feb-2016 knakahara

To eliminate gif_softc_list linear search, add extra argument to encapsw.pr_ctlinput().


# 1.71 26-Jan-2016 knakahara

implement encapsw instead of protosw and uniform prototype.

suggested and advised by riastradh@n.o, thanks.

BTW, It seems in_stf_input() had bugs...


# 1.70 23-Jan-2016 riastradh

Those were local changes not meant to be part of the revert. SORRY!


# 1.69 23-Jan-2016 christos

make this compile again


# 1.68 22-Jan-2016 riastradh

Back out previous change to introduce struct encapsw.

This change was intended, but Nakahara-san had already made a better
one locally! So I'll let him commit that one, and I'll try not to
step on anyone's toes again.


# 1.67 22-Jan-2016 riastradh

Don't abuse struct protosw for ip_encap -- introduce struct encapsw.

Mostly mechanical change to replace it, culling some now-needless
boilerplate around all the users.

This does not substantively change the ip_encap API or eliminate
abuse of sketchy pointer casts -- that will come later, and will be
easier now that it is not tangled up with struct protosw.


# 1.66 20-Jan-2016 riastradh

Eliminate struct protosw::pr_output.

You can't use this unless you know what it is a priori: the formal
prototype is variadic, and the different instances (e.g., ip_output,
route_output) have different real prototypes.

Convert the only user of it, raw_send in net/raw_cb.c, to take an
explicit callback argument. Convert the only instances of it,
route_output and key_output, to such explicit callbacks for raw_send.
Use assertions to make sure the conversion to explicit callbacks is
warranted.

Discussed on tech-net with no objections:
https://mail-index.netbsd.org/tech-net/2016/01/16/msg005484.html


# 1.65 18-Jan-2016 knakahara

Refactor protosw codes in gif(4). No functional change.

- remove unnecessary include
- reduce scopes


Revision tags: nick-nhusb-base-20151226
# 1.64 25-Dec-2015 knakahara

use satosin{,6} macros instead of casts.


# 1.63 11-Dec-2015 knakahara

PR kern/50522: gif(4) ioctl causes panic while someone is using the gif(4) interface.

It is required to wait other CPU's softint completion before disestablishing
the softint handler.


Revision tags: nick-nhusb-base-20150921
# 1.62 24-Aug-2015 pooka

sprinkle _KERNEL_OPT


Revision tags: nick-nhusb-base-20150606
# 1.61 24-Apr-2015 ozaki-r

Add missing rtcache_free

It's the same as other similar code paths in in_gif and ip6_etherip.


Revision tags: netbsd-7-1-1-RELEASE netbsd-7-1-RELEASE netbsd-7-1-RC2 netbsd-7-nhusb-base-20170116 netbsd-7-1-RC1 netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 nick-nhusb-base-20150406 nick-nhusb-base netbsd-7-base tls-earlyentropy-base rmind-smpnet-nbase rmind-smpnet-base tls-maxphys-base
# 1.60 18-May-2014 rmind

branches: 1.60.4;
Add struct pr_usrreqs with a pr_generic function and prepare for the
dismantling of pr_usrreq in the protocols; no functional change intended.
PRU_ATTACH/PRU_DETACH changes will follow soon.

Bump for struct protosw. Welcome to 6.99.62!


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base agc-symver-base
# 1.59 01-Mar-2013 joerg

branches: 1.59.6; 1.59.10;
Retire OSI network stack. OK core@


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE yamt-pagecache-tag8 netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 netbsd-6-1-RC2 netbsd-6-1-RC1 yamt-pagecache-base8 netbsd-6-0-1-RELEASE yamt-pagecache-base7 matt-nb6-plus-nbase yamt-pagecache-base6 netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base rmind-uvmplock-nbase cherry-xenmp-base bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 rmind-uvmplock-base yamt-nfs-mp-base9 uebayasi-xip-base matt-premerge-20091211 yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-nbase yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.58 14-Mar-2009 dsl

branches: 1.58.12; 1.58.22;
Remove all the __P() from sys (excluding sys/dist)
Diff checked with grep and MK1 eyeball.
i386 and amd64 GENERIC and sys still build.


Revision tags: nick-hppapmap-base2 haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.57 07-Nov-2008 dyoung

branches: 1.57.4;
*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-2-3-RELEASE netbsd-5-1-5-RELEASE netbsd-5-2-2-RELEASE netbsd-5-1-4-RELEASE netbsd-5-2-1-RELEASE netbsd-5-1-3-RELEASE netbsd-5-2-RELEASE netbsd-5-2-RC1 netbsd-5-1-2-RELEASE netbsd-5-1-1-RELEASE matt-nb5-mips64-premerge-20101231 matt-nb5-pq3-base netbsd-5-1-RELEASE netbsd-5-1-RC4 matt-nb5-mips64-k15 netbsd-5-1-RC3 netbsd-5-1-RC2 netbsd-5-1-RC1 netbsd-5-0-2-RELEASE matt-nb5-mips64-premerge-20091211 matt-nb5-mips64-u2-k2-k4-k7-k8-k9 matt-nb4-mips64-k7-u2a-k9b matt-nb5-mips64-u1-k1-k5 netbsd-5-0-1-RELEASE netbsd-5-0-RELEASE netbsd-5-0-RC4 netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2 wrstuden-revivesa-base yamt-nfs-mp-base
# 1.56 24-Apr-2008 ad

branches: 1.56.2; 1.56.8; 1.56.10;
Merge the socket locking patch:

- Socket layer becomes MP safe.
- Unix protocols become MP safe.
- Allows protocol processing interrupts to safely block on locks.
- Fixes a number of race conditions.

With much feedback from matt@ and plunky@.


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.55 15-Apr-2008 thorpej

branches: 1.55.2;
Make ip6 and icmp6 stats per-cpu.


# 1.54 08-Apr-2008 thorpej

Change IPv6 stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ip6stat structure; old netstat
binaries will continue to work properly.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base vmlocking2-base3 bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-nbase mjf-devfs-base matt-armv6-base hpcarm-cleanup-base
# 1.53 20-Dec-2007 dyoung

branches: 1.53.6;
Poison struct route->ro_rt uses in the kernel by changing the name
to _ro_rt. Use rtcache_getrt() to access a route cache's struct
rtentry *.

Introduce struct ifnet->if_dl that always points at the interface
identifier/link-layer address. Make code that treated the first
ifaddr on struct ifnet->if_addrlist as the interface address use
if_dl, instead.

Remove stale debugging code from net/route.c. Move the rtflush()
code into rtcache_clear() and delete rtflush(). Delete rtalloc(),
because nothing uses it any more.

Make ND6_HINT an inline, lowercase subroutine, nd6_hint.

I've done my best to convert IP Filter, the ISO stack, and the
AppleTalk stack to rtcache_getrt(). They compile, but I have not
tested them. I have given the changes to PF, GRE, IPv4 and IPv6
stacks a lot of exercise.


Revision tags: nick-csl-alignment-base5 matt-armv6-prevmlocking yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base matt-mips64-base jmcneill-pm-base nick-csl-alignment-base reinoud-bufcleanup-base mjf-ufs-trans-base vmlocking-base
# 1.52 23-May-2007 christos

branches: 1.52.8; 1.52.16; 1.52.20;
Ansify + add a few comments, from Karl Sj��dahl


Revision tags: yamt-idlelwp-base8
# 1.51 02-May-2007 dyoung

Eliminate address family-specific route caches (struct route, struct
route_in6, struct route_iso), replacing all caches with a struct
route.

The principle benefit of this change is that all of the protocol
families can benefit from route cache-invalidation, which is
necessary for correct routing. Route-cache invalidation fixes an
ancient PR, kern/3508, at long last; it fixes various other PRs,
also.

Discussions with and ideas from Joerg Sonnenberger influenced this
work tremendously. Of course, all design oversights and bugs are
mine.

DETAILS

1 I added to each address family a pool of sockaddrs. I have
introduced routines for allocating, copying, and duplicating,
and freeing sockaddrs:

struct sockaddr *sockaddr_alloc(sa_family_t af, int flags);
struct sockaddr *sockaddr_copy(struct sockaddr *dst,
const struct sockaddr *src);
struct sockaddr *sockaddr_dup(const struct sockaddr *src, int flags);
void sockaddr_free(struct sockaddr *sa);

sockaddr_alloc() returns either a sockaddr from the pool belonging
to the specified family, or NULL if the pool is exhausted. The
returned sockaddr has the right size for that family; sa_family
and sa_len fields are initialized to the family and sockaddr
length---e.g., sa_family = AF_INET and sa_len = sizeof(struct
sockaddr_in). sockaddr_free() puts the given sockaddr back into
its family's pool.

sockaddr_dup() and sockaddr_copy() work analogously to strdup()
and strcpy(), respectively. sockaddr_copy() KASSERTs that the
family of the destination and source sockaddrs are alike.

The 'flags' argumet for sockaddr_alloc() and sockaddr_dup() is
passed directly to pool_get(9).

2 I added routines for initializing sockaddrs in each address
family, sockaddr_in_init(), sockaddr_in6_init(), sockaddr_iso_init(),
etc. They are fairly self-explanatory.

3 structs route_in6 and route_iso are no more. All protocol families
use struct route. I have changed the route cache, 'struct route',
so that it does not contain storage space for a sockaddr. Instead,
struct route points to a sockaddr coming from the pool the sockaddr
belongs to. I added a new method to struct route, rtcache_setdst(),
for setting the cache destination:

int rtcache_setdst(struct route *, const struct sockaddr *);

rtcache_setdst() returns 0 on success, or ENOMEM if no memory is
available to create the sockaddr storage.

It is now possible for rtcache_getdst() to return NULL if, say,
rtcache_setdst() failed. I check the return value for NULL
everywhere in the kernel.

4 Each routing domain (struct domain) has a list of live route
caches, dom_rtcache. rtflushall(sa_family_t af) looks up the
domain indicated by 'af', walks the domain's list of route caches
and invalidates each one.


Revision tags: thorpej-atomic-base
# 1.50 04-Mar-2007 christos

branches: 1.50.2; 1.50.4;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.49 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


# 1.48 17-Feb-2007 dyoung

branches: 1.48.2;
Don't open-code LIST_FOREACH().


Revision tags: post-newlock2-merge newlock2-nbase yamt-splraiseipl-base5 yamt-splraiseipl-base4 newlock2-base
# 1.47 15-Dec-2006 joerg

Introduce new helper functions to abstract the route caching.
rtcache_init and rtcache_init_noclone lookup ro_dst and store
the result in ro_rt, taking care of the reference counting and
calling the domain specific route cache.
rtcache_free checks if a route was cashed and frees the reference.
rtcache_copy copies ro_dst of the given struct route, checking that
enough space is available and incrementing the reference count of the
cached rtentry if necessary.
rtcache_check validates that the cached route is still up. If it isn't,
it tries to look it up again. Afterwards ro_rt is either a valid again
or NULL.
rtcache_copy is used internally.

Adjust to callers of rtalloc/rtflush in the tree to check the sanity of
ro_dst first (if necessary). If it doesn't fit the expectations, free
the cache, otherwise check if the cached route is still valid. After
that combination, a single check for ro_rt == NULL is enough to decide
whether a new lookup needs to be done with a different ro_dst.
Make the route checking in gre stricter by repeating the loop check
after revalidation.
Remove some unused RADIX_MPATH code in in6_src.c. The logic is slightly
changed here to first validate the route and check RTF_GATEWAY
afterwards. This is sementically equivalent though.
etherip doesn't need sc_route_expire similiar to the gif changes from
dyoung@ earlier.

Based on the earlier patch from dyoung@, reviewed and discussed with
him.


Revision tags: yamt-splraiseipl-base3
# 1.46 09-Dec-2006 dyoung

Here are various changes designed to protect against bad IPv4
routing caused by stale route caches (struct route). Route caches
are sprinkled throughout PCBs, the IP fast-forwarding table, and
IP tunnel interfaces (gre, gif, stf).

Stale IPv6 and ISO route caches will be treated by separate patches.

Thank you to Christoph Badura for suggesting the general approach
to invalidating route caches that I take here.

Here are the details:

Add hooks to struct domain for tracking and for invalidating each
domain's route caches: dom_rtcache, dom_rtflush, and dom_rtflushall.

Introduce helper subroutines, rtflush(ro) for invalidating a route
cache, rtflushall(family) for invalidating all route caches in a
routing domain, and rtcache(ro) for notifying the domain of a new
cached route.

Chain together all IPv4 route caches where ro_rt != NULL. Provide
in_rtcache() for adding a route to the chain. Provide in_rtflush()
and in_rtflushall() for invalidating IPv4 route caches. In
in_rtflush(), set ro_rt to NULL, and remove the route from the
chain. In in_rtflushall(), walk the chain and remove every route
cache.

In rtrequest1(), call rtflushall() to invalidate route caches when
a route is added.

In gif(4), discard the workaround for stale caches that involves
expiring them every so often.

Replace the pattern 'RTFREE(ro->ro_rt); ro->ro_rt = NULL;' with a
call to rtflush(ro).

Update ipflow_fastforward() and all other users of route caches so
that they expect a cached route, ro->ro_rt, to turn to NULL.

Take care when moving a 'struct route' to rtflush() the source and
to rtcache() the destination.

In domain initializers, use .dom_xxx tags.

KNF here and there.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base abandoned-netbsd-4-base yamt-splraiseipl-base2 yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 netbsd-4-base yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base chap-midi-base rpaulo-netinet-merge-pcb-base
# 1.45 07-Jun-2006 kardel

branches: 1.45.6; 1.45.8;
merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5 simonb-timecounters-base
# 1.44 11-Dec-2005 christos

branches: 1.44.4; 1.44.6; 1.44.8; 1.44.14;
merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.43 26-Jun-2005 mlelstv

branches: 1.43.2;
expire cached route. Fixes PR 22792.


# 1.42 02-Jun-2005 tron

Change the first argument of the encapsulation check function from
"const struct mbuf *" to "struct mbuf *". Without this change the
actual implementation cannot even use m_copydata() on the mbuf chain
which is broken.


# 1.41 02-Jun-2005 tron

Remove type casts and lint directives which are now longer necessary
because the first argument of m_copydata() is "const struct mbuf *" now.


# 1.40 29-May-2005 christos

- avoid shadowed variables
- sprinkle const.


Revision tags: netbsd-3-0-3-RELEASE netbsd-3-0-2-RELEASE netbsd-3-0-1-RELEASE netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.39 26-Feb-2005 perry

branches: 1.39.2;
nuke trailing whitespace


Revision tags: yamt-km-base2 yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.38 22-Apr-2004 matt

branches: 1.38.4; 1.38.6;
Constify protosw arrays. This can reduce the kernel .data section by
over 4K (if all the network protocols) are loaded.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.37 30-Oct-2003 simonb

branches: 1.37.4;
Remove some assigned-to but otherwise unused variables.


# 1.36 05-Sep-2003 itojun

u_short -> u_int16_t. sync w/ kame.
don't set ip6_plen where unneeded (i.e. before calling ip6_output)


# 1.35 22-Aug-2003 itojun

change the additional arg to be passed to ip{,6}_output to struct socket *.

this fixes KAME policy lookup which was broken by the previous commit.


# 1.34 22-Aug-2003 jonathan

Replace the set_socket() method of passing an extra struct socket*
argument to ip6_output() with a new explicit struct in6pcb* argument.
(The underlying socket can be obtained via in6pcb->inp6_socket.)

In preparation for fast-ipsec. Reviewed by itojun.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.33 25-Nov-2002 thorpej

branches: 1.33.6;
Avoid strict-alias warnings.


# 1.32 11-Nov-2002 itojun

make USE_ENCAPCHECK (in netinet*/*gif.c) to global option, GIF_ENCAPCHECK.
#ifdef out unneeded code when possible.
From: Krister Walfridsson <cato@df.lth.se>


# 1.31 05-Nov-2002 itojun

improve gif lookup performance, when there are many of those,
by using radix tree for lookups. tested by yshimizu@iij.


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.30 11-Sep-2002 itojun

KNF - return is not a function. sync w/kame.


Revision tags: gehenna-devsw-base
# 1.29 09-Jun-2002 itojun

whitespace cleanup


# 1.28 08-Jun-2002 itojun

whitespace cleanup


Revision tags: netbsd-1-6-PATCH002-RELEASE netbsd-1-6-PATCH002 netbsd-1-6-PATCH002-RC4 netbsd-1-6-PATCH002-RC3 netbsd-1-6-PATCH002-RC2 netbsd-1-6-PATCH002-RC1 netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base ifpoll-base
# 1.27 21-Dec-2001 itojun

branches: 1.27.8;
use radix table for inbound tunnel lookup (would increase performance
for machines with a lot of tunnels).
update route cache for IPvX-over-IPv6 tunnel on path MTU discovery.
snyc with kame


# 1.26 21-Dec-2001 itojun

move in6_gif_hlim decl to in6_gif.c. sync with kame


# 1.25 21-Dec-2001 itojun

move protosw fragment for gif/stf to their own source code.
reduce #ifdef in stf code. sync with kame


# 1.24 20-Dec-2001 itojun

centralize multicast group management (in6_join/leavegroup).
have a flag for ip6_output() to fragment to minimum MTU.
sync with kame


# 1.23 13-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.22 16-Aug-2001 itojun

gif interface now uses generic software interrupt
(on archs that support it). also, make gif ALTQ-capable on outgoing.
sync with kame, comments from thorpej.


# 1.21 29-Jul-2001 itojun

sync gif interface code with latest kame.
IFF_RUNNING is clearified. attach/detach logic is more clearner.
the old code mistakenly set IFF_UP by itself, now the behavior is gone.


# 1.20 14-May-2001 itojun

branches: 1.20.2;
drop multi destination mode (IFF_LINK0).


# 1.19 10-May-2001 itojun

correct ecn consideration on tunnel encap/decap. sync with kame.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.18 20-Feb-2001 itojun

branches: 1.18.2;
add AF_ISO case to output. from chopps.


# 1.17 20-Feb-2001 itojun

ISO over IPv4/v6 by EON encapsulation. from chopps, sync with kame.


# 1.16 11-Feb-2001 itojun

remove #ifdef __FreeBSD__.


# 1.15 22-Jan-2001 itojun

make it possible to turn off ingress filter on gif/stf tunnel egress,
by using IFF_LINK2. (part of) PR 11163 from Ken Raeburn.


Revision tags: netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base minoura-xpg4dl-base
# 1.14 19-Apr-2000 itojun

branches: 1.14.4;
introduce sys/netinet/ip_encap.c, to dispatch inbound packets
to protocol handlers, based on src/dst (for ip proto #4/41).
see comment in ip_encap.c for details of the problem we have.
there are too many protocol specs for ip proto #4/41.
backward compatibility with MROUTING case is now provided in ip_encap.c.

fix ipip to work with gif (using ip_encap.c). sorry for breakage.

gif now uses ip_encap.c.

introduce stf pseudo interface (implements 6to4, another IPv6-over-IPv4 code
with ip proto #41).


# 1.13 01-Mar-2000 itojun

introduce m->m_pkthdr.aux to hold random data which needs to be passed
between protocol handlers.

ipsec socket pointers, ipsec decryption/auth information, tunnel
decapsulation information are in my mind - there can be several other usage.
at this moment, we use this for ipsec socket pointer passing. this will
avoid reuse of m->m_pkthdr.rcvif in ipsec code.

due to the change, MHLEN will be decreased by sizeof(void *) - for example,
for i386, MHLEN was 100 bytes, but is now 96 bytes.
we may want to increase MSIZE from 128 to 256 for some of our architectures.

take caution if you use it for keeping some data item for long period
of time - use extra caution on M_PREPEND() or m_adj(), as they may result
in loss of m->m_pkthdr.aux pointer (and mbuf leak).

this will bump kernel version.

(as discussed in tech-net, tested in kame tree)


Revision tags: chs-ubc2-newbase
# 1.12 07-Feb-2000 itojun

s/DIAGNOSTIC/DEBUG/


# 1.11 06-Feb-2000 itojun

fix include pathname for better rfc2292 compliance.


# 1.10 06-Jan-2000 itojun

remove extra portability #ifdef (like #ifdef __FreeBSD__) in KAME IPv6/IPsec
code, from netbsd-current repository.
#ifdef'ed version is always available from ftp.kame.net.

XXX please do not make too many diff-unfriendly changes, we'll need to take
bunch of diffs on upgrade...


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.9 15-Dec-1999 itojun

do not overwrite traffic class field when we write IPv6 version field.


# 1.8 13-Dec-1999 itojun

sync IPv6 part with latest KAME tree. IPsec part is left unmodified
due to massive changes in KAME side.
- IPv6 output goes through nd6_output
- faith can capture IPv4 packets as well - you can run IPv4-to-IPv6 translator
using heavily modified DNS servers
- per-interface statistics (required for IPv6 MIB)
- interface autoconfig is revisited
- udp input handling has a big change for mapped address support.
- introduce in4_cksum() for non-overwriting checksumming
- introduce m_pulldown()
- neighbor discovery cleanups/improvements
- netinet/in.h strictly conforms to RFC2553 (no extra defs visible to userland)
- IFA_STATS is fixed a bit (not tested)
- and more more more.

TODO:
- cleanup os-independency #ifdef
- avoid rcvif dual use (for IPsec) to help ifdetach

(sorry for jumbo commit, I can't separate this any more...)


Revision tags: comdex-fall-1999-base fvdl-softdep-base
# 1.7 20-Aug-1999 itojun

branches: 1.7.2; 1.7.8;
do not capture packets by gif, when gif interface is down.


Revision tags: chs-ubc2-base
# 1.6 31-Jul-1999 itojun

sync with recent KAME.
- loosen ipsec restriction on packet diredtion.
- revise icmp6 redirect handling on IsRouter bit.
- tcp/udp notification processing (link-local address case)
- cosmetic fixes (better code share across *BSD).


# 1.5 30-Jul-1999 itojun

remove reference to in6_systm.h (file itself will be removed afterwords)


# 1.4 09-Jul-1999 thorpej

defopt IPSEC and IPSEC_ESP (both into opt_ipsec.h).


# 1.3 03-Jul-1999 thorpej

RCS ID police.


# 1.2 01-Jul-1999 itojun

branches: 1.2.2;
IPv6 kernel code, based on KAME/NetBSD 1.4, SNAP kit 19990628.
(Sorry for a big commit, I can't separate this into several pieces...)
Pls check sys/netinet6/TODO and sys/netinet6/IMPLEMENTATION for details.

- sys/kern: do not assume single mbuf, accept chained mbuf on passing
data from userland to kernel (or other way round).
- "midway" ATM card: ATM PVC pseudo device support, like those done in ALTQ
package (ftp://ftp.csl.sony.co.jp/pub/kjc/).
- sys/netinet/tcp*: IPv4/v6 dual stack tcp support.
- sys/netinet/{ip6,icmp6}.h, sys/net/pfkeyv2.h: IETF document assumes those
file to be there so we patch it up.
- sys/netinet: IPsec additions are here and there.
- sys/netinet6/*: most of IPv6 code sits here.
- sys/netkey: IPsec key management code
- dev/pci/pcidevs: regen

In my understanding no code here is subject to export control so it
should be safe.


# 1.1 28-Jun-1999 itojun

branches: 1.1.2;
file in6_gif.c was initially added on branch kame.


# 1.86 21-Sep-2017 knakahara

add lock for percpu route like l2tp(4).


Revision tags: nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.85 16-Jan-2017 christos

ip6_sprintf -> IN6_PRINT so that we pass the size.


# 1.84 16-Jan-2017 ryo

Make ip6_sprintf(), in_fmtaddr(), lla_snprintf() and icmp6_redirect_diag() mpsafe.

Reviewed by ozaki-r@


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.83 06-Jan-2017 knakahara

branches: 1.83.2;
remove unnecessary conversion.

gif_softc->gif_pdst is already valid sockaddr.


# 1.82 14-Dec-2016 knakahara

fix race of gif_softc->gif_ro when we send multiple flows over gif on NET_MPSAFE enabled kernel.

make gif_softc->gif_ro percpu as well as ipforward_rt to resolve this race.
and add future TODO comment for etherip(4).


# 1.81 12-Dec-2016 ozaki-r

Make the routing table and rtcaches MP-safe

See the following descriptions for details.

Proposed on tech-kern and tech-net


Overview


# 1.80 08-Dec-2016 ozaki-r

Add rtcache_unref to release points of rtentry stemming from rtcache

In the MP-safe world, a rtentry stemming from a rtcache can be freed at any
points. So we need to protect rtentries somehow say by reference couting or
passive references. Regardless of the method, we need to call some release
function of a rtentry after using it.

The change adds a new function rtcache_unref to release a rtentry. At this
point, this function does nothing because for now we don't add a reference
to a rtentry when we get one from a rtcache. We will add something useful
in a further commit.

This change is a part of changes for MP-safe routing table. It is separated
to avoid one big change that makes difficult to debug by bisecting.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726
# 1.79 15-Jul-2016 ozaki-r

Use sin6tosa and sin6tocsa macros

No functional change.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907
# 1.78 06-Jul-2016 ozaki-r

branches: 1.78.2;
Apply m_get_rcvif_psref (kill m_get_rcvif_NOMPSAFE)


# 1.77 04-Jul-2016 knakahara

fix: gif(4) receive side race

A panic cause in rn_match() called by encap[46]_lookup(). The reason is that
gif(4) does not suspend receive packet processing in spite of suspending
transmit packet processing while anyone is doing gif(4) ioctl.


# 1.76 04-Jul-2016 knakahara

let gif(4) promise softint(9) contract (1/2) : gif(4) side

To prevent calling softint_schedule() after called softint_disestablish(),
the following modifications are added
+ ioctl (writing configuration) side
- off IFF_RUNNING flag before changing configuration
- wait softint handler completion before changing configuration
+ packet processing (reading configuraiotn) side
- if IFF_RUNNING flag is on, do nothing
+ in whole
- add gif_list_lock_{enter,exit} to prevent the same configuration is
set to other gif(4) interfaces


# 1.75 28-Jun-2016 ozaki-r

Add missing NULL checks for m_get_rcvif_psref


# 1.74 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


Revision tags: nick-nhusb-base-20160529 nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.73 29-Feb-2016 knakahara

remove unnecessary declarations and fix KNF

Thanks to riastradh@


# 1.72 26-Feb-2016 knakahara

To eliminate gif_softc_list linear search, add extra argument to encapsw.pr_ctlinput().


# 1.71 26-Jan-2016 knakahara

implement encapsw instead of protosw and uniform prototype.

suggested and advised by riastradh@n.o, thanks.

BTW, It seems in_stf_input() had bugs...


# 1.70 23-Jan-2016 riastradh

Those were local changes not meant to be part of the revert. SORRY!


# 1.69 23-Jan-2016 christos

make this compile again


# 1.68 22-Jan-2016 riastradh

Back out previous change to introduce struct encapsw.

This change was intended, but Nakahara-san had already made a better
one locally! So I'll let him commit that one, and I'll try not to
step on anyone's toes again.


# 1.67 22-Jan-2016 riastradh

Don't abuse struct protosw for ip_encap -- introduce struct encapsw.

Mostly mechanical change to replace it, culling some now-needless
boilerplate around all the users.

This does not substantively change the ip_encap API or eliminate
abuse of sketchy pointer casts -- that will come later, and will be
easier now that it is not tangled up with struct protosw.


# 1.66 20-Jan-2016 riastradh

Eliminate struct protosw::pr_output.

You can't use this unless you know what it is a priori: the formal
prototype is variadic, and the different instances (e.g., ip_output,
route_output) have different real prototypes.

Convert the only user of it, raw_send in net/raw_cb.c, to take an
explicit callback argument. Convert the only instances of it,
route_output and key_output, to such explicit callbacks for raw_send.
Use assertions to make sure the conversion to explicit callbacks is
warranted.

Discussed on tech-net with no objections:
https://mail-index.netbsd.org/tech-net/2016/01/16/msg005484.html


# 1.65 18-Jan-2016 knakahara

Refactor protosw codes in gif(4). No functional change.

- remove unnecessary include
- reduce scopes


Revision tags: nick-nhusb-base-20151226
# 1.64 25-Dec-2015 knakahara

use satosin{,6} macros instead of casts.


# 1.63 11-Dec-2015 knakahara

PR kern/50522: gif(4) ioctl causes panic while someone is using the gif(4) interface.

It is required to wait other CPU's softint completion before disestablishing
the softint handler.


Revision tags: nick-nhusb-base-20150921
# 1.62 24-Aug-2015 pooka

sprinkle _KERNEL_OPT


Revision tags: nick-nhusb-base-20150606
# 1.61 24-Apr-2015 ozaki-r

Add missing rtcache_free

It's the same as other similar code paths in in_gif and ip6_etherip.


Revision tags: netbsd-7-1-RELEASE netbsd-7-1-RC2 netbsd-7-nhusb-base-20170116 netbsd-7-1-RC1 netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 nick-nhusb-base-20150406 nick-nhusb-base netbsd-7-base tls-earlyentropy-base rmind-smpnet-nbase rmind-smpnet-base tls-maxphys-base
# 1.60 18-May-2014 rmind

branches: 1.60.4;
Add struct pr_usrreqs with a pr_generic function and prepare for the
dismantling of pr_usrreq in the protocols; no functional change intended.
PRU_ATTACH/PRU_DETACH changes will follow soon.

Bump for struct protosw. Welcome to 6.99.62!


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base agc-symver-base
# 1.59 01-Mar-2013 joerg

branches: 1.59.6; 1.59.10;
Retire OSI network stack. OK core@


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE yamt-pagecache-tag8 netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 netbsd-6-1-RC2 netbsd-6-1-RC1 yamt-pagecache-base8 netbsd-6-0-1-RELEASE yamt-pagecache-base7 matt-nb6-plus-nbase yamt-pagecache-base6 netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base rmind-uvmplock-nbase cherry-xenmp-base bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 rmind-uvmplock-base yamt-nfs-mp-base9 uebayasi-xip-base matt-premerge-20091211 yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-nbase yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.58 14-Mar-2009 dsl

branches: 1.58.12; 1.58.22;
Remove all the __P() from sys (excluding sys/dist)
Diff checked with grep and MK1 eyeball.
i386 and amd64 GENERIC and sys still build.


Revision tags: nick-hppapmap-base2 haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.57 07-Nov-2008 dyoung

branches: 1.57.4;
*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-2-3-RELEASE netbsd-5-1-5-RELEASE netbsd-5-2-2-RELEASE netbsd-5-1-4-RELEASE netbsd-5-2-1-RELEASE netbsd-5-1-3-RELEASE netbsd-5-2-RELEASE netbsd-5-2-RC1 netbsd-5-1-2-RELEASE netbsd-5-1-1-RELEASE matt-nb5-mips64-premerge-20101231 matt-nb5-pq3-base netbsd-5-1-RELEASE netbsd-5-1-RC4 matt-nb5-mips64-k15 netbsd-5-1-RC3 netbsd-5-1-RC2 netbsd-5-1-RC1 netbsd-5-0-2-RELEASE matt-nb5-mips64-premerge-20091211 matt-nb5-mips64-u2-k2-k4-k7-k8-k9 matt-nb4-mips64-k7-u2a-k9b matt-nb5-mips64-u1-k1-k5 netbsd-5-0-1-RELEASE netbsd-5-0-RELEASE netbsd-5-0-RC4 netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2 wrstuden-revivesa-base yamt-nfs-mp-base
# 1.56 24-Apr-2008 ad

branches: 1.56.2; 1.56.8; 1.56.10;
Merge the socket locking patch:

- Socket layer becomes MP safe.
- Unix protocols become MP safe.
- Allows protocol processing interrupts to safely block on locks.
- Fixes a number of race conditions.

With much feedback from matt@ and plunky@.


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.55 15-Apr-2008 thorpej

branches: 1.55.2;
Make ip6 and icmp6 stats per-cpu.


# 1.54 08-Apr-2008 thorpej

Change IPv6 stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ip6stat structure; old netstat
binaries will continue to work properly.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base vmlocking2-base3 bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-nbase mjf-devfs-base matt-armv6-base hpcarm-cleanup-base
# 1.53 20-Dec-2007 dyoung

branches: 1.53.6;
Poison struct route->ro_rt uses in the kernel by changing the name
to _ro_rt. Use rtcache_getrt() to access a route cache's struct
rtentry *.

Introduce struct ifnet->if_dl that always points at the interface
identifier/link-layer address. Make code that treated the first
ifaddr on struct ifnet->if_addrlist as the interface address use
if_dl, instead.

Remove stale debugging code from net/route.c. Move the rtflush()
code into rtcache_clear() and delete rtflush(). Delete rtalloc(),
because nothing uses it any more.

Make ND6_HINT an inline, lowercase subroutine, nd6_hint.

I've done my best to convert IP Filter, the ISO stack, and the
AppleTalk stack to rtcache_getrt(). They compile, but I have not
tested them. I have given the changes to PF, GRE, IPv4 and IPv6
stacks a lot of exercise.


Revision tags: nick-csl-alignment-base5 matt-armv6-prevmlocking yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base matt-mips64-base jmcneill-pm-base nick-csl-alignment-base reinoud-bufcleanup-base mjf-ufs-trans-base vmlocking-base
# 1.52 23-May-2007 christos

branches: 1.52.8; 1.52.16; 1.52.20;
Ansify + add a few comments, from Karl Sj��dahl


Revision tags: yamt-idlelwp-base8
# 1.51 02-May-2007 dyoung

Eliminate address family-specific route caches (struct route, struct
route_in6, struct route_iso), replacing all caches with a struct
route.

The principle benefit of this change is that all of the protocol
families can benefit from route cache-invalidation, which is
necessary for correct routing. Route-cache invalidation fixes an
ancient PR, kern/3508, at long last; it fixes various other PRs,
also.

Discussions with and ideas from Joerg Sonnenberger influenced this
work tremendously. Of course, all design oversights and bugs are
mine.

DETAILS

1 I added to each address family a pool of sockaddrs. I have
introduced routines for allocating, copying, and duplicating,
and freeing sockaddrs:

struct sockaddr *sockaddr_alloc(sa_family_t af, int flags);
struct sockaddr *sockaddr_copy(struct sockaddr *dst,
const struct sockaddr *src);
struct sockaddr *sockaddr_dup(const struct sockaddr *src, int flags);
void sockaddr_free(struct sockaddr *sa);

sockaddr_alloc() returns either a sockaddr from the pool belonging
to the specified family, or NULL if the pool is exhausted. The
returned sockaddr has the right size for that family; sa_family
and sa_len fields are initialized to the family and sockaddr
length---e.g., sa_family = AF_INET and sa_len = sizeof(struct
sockaddr_in). sockaddr_free() puts the given sockaddr back into
its family's pool.

sockaddr_dup() and sockaddr_copy() work analogously to strdup()
and strcpy(), respectively. sockaddr_copy() KASSERTs that the
family of the destination and source sockaddrs are alike.

The 'flags' argumet for sockaddr_alloc() and sockaddr_dup() is
passed directly to pool_get(9).

2 I added routines for initializing sockaddrs in each address
family, sockaddr_in_init(), sockaddr_in6_init(), sockaddr_iso_init(),
etc. They are fairly self-explanatory.

3 structs route_in6 and route_iso are no more. All protocol families
use struct route. I have changed the route cache, 'struct route',
so that it does not contain storage space for a sockaddr. Instead,
struct route points to a sockaddr coming from the pool the sockaddr
belongs to. I added a new method to struct route, rtcache_setdst(),
for setting the cache destination:

int rtcache_setdst(struct route *, const struct sockaddr *);

rtcache_setdst() returns 0 on success, or ENOMEM if no memory is
available to create the sockaddr storage.

It is now possible for rtcache_getdst() to return NULL if, say,
rtcache_setdst() failed. I check the return value for NULL
everywhere in the kernel.

4 Each routing domain (struct domain) has a list of live route
caches, dom_rtcache. rtflushall(sa_family_t af) looks up the
domain indicated by 'af', walks the domain's list of route caches
and invalidates each one.


Revision tags: thorpej-atomic-base
# 1.50 04-Mar-2007 christos

branches: 1.50.2; 1.50.4;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.49 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


# 1.48 17-Feb-2007 dyoung

branches: 1.48.2;
Don't open-code LIST_FOREACH().


Revision tags: post-newlock2-merge newlock2-nbase yamt-splraiseipl-base5 yamt-splraiseipl-base4 newlock2-base
# 1.47 15-Dec-2006 joerg

Introduce new helper functions to abstract the route caching.
rtcache_init and rtcache_init_noclone lookup ro_dst and store
the result in ro_rt, taking care of the reference counting and
calling the domain specific route cache.
rtcache_free checks if a route was cashed and frees the reference.
rtcache_copy copies ro_dst of the given struct route, checking that
enough space is available and incrementing the reference count of the
cached rtentry if necessary.
rtcache_check validates that the cached route is still up. If it isn't,
it tries to look it up again. Afterwards ro_rt is either a valid again
or NULL.
rtcache_copy is used internally.

Adjust to callers of rtalloc/rtflush in the tree to check the sanity of
ro_dst first (if necessary). If it doesn't fit the expectations, free
the cache, otherwise check if the cached route is still valid. After
that combination, a single check for ro_rt == NULL is enough to decide
whether a new lookup needs to be done with a different ro_dst.
Make the route checking in gre stricter by repeating the loop check
after revalidation.
Remove some unused RADIX_MPATH code in in6_src.c. The logic is slightly
changed here to first validate the route and check RTF_GATEWAY
afterwards. This is sementically equivalent though.
etherip doesn't need sc_route_expire similiar to the gif changes from
dyoung@ earlier.

Based on the earlier patch from dyoung@, reviewed and discussed with
him.


Revision tags: yamt-splraiseipl-base3
# 1.46 09-Dec-2006 dyoung

Here are various changes designed to protect against bad IPv4
routing caused by stale route caches (struct route). Route caches
are sprinkled throughout PCBs, the IP fast-forwarding table, and
IP tunnel interfaces (gre, gif, stf).

Stale IPv6 and ISO route caches will be treated by separate patches.

Thank you to Christoph Badura for suggesting the general approach
to invalidating route caches that I take here.

Here are the details:

Add hooks to struct domain for tracking and for invalidating each
domain's route caches: dom_rtcache, dom_rtflush, and dom_rtflushall.

Introduce helper subroutines, rtflush(ro) for invalidating a route
cache, rtflushall(family) for invalidating all route caches in a
routing domain, and rtcache(ro) for notifying the domain of a new
cached route.

Chain together all IPv4 route caches where ro_rt != NULL. Provide
in_rtcache() for adding a route to the chain. Provide in_rtflush()
and in_rtflushall() for invalidating IPv4 route caches. In
in_rtflush(), set ro_rt to NULL, and remove the route from the
chain. In in_rtflushall(), walk the chain and remove every route
cache.

In rtrequest1(), call rtflushall() to invalidate route caches when
a route is added.

In gif(4), discard the workaround for stale caches that involves
expiring them every so often.

Replace the pattern 'RTFREE(ro->ro_rt); ro->ro_rt = NULL;' with a
call to rtflush(ro).

Update ipflow_fastforward() and all other users of route caches so
that they expect a cached route, ro->ro_rt, to turn to NULL.

Take care when moving a 'struct route' to rtflush() the source and
to rtcache() the destination.

In domain initializers, use .dom_xxx tags.

KNF here and there.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base abandoned-netbsd-4-base yamt-splraiseipl-base2 yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 netbsd-4-base yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base chap-midi-base rpaulo-netinet-merge-pcb-base
# 1.45 07-Jun-2006 kardel

branches: 1.45.6; 1.45.8;
merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5 simonb-timecounters-base
# 1.44 11-Dec-2005 christos

branches: 1.44.4; 1.44.6; 1.44.8; 1.44.14;
merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.43 26-Jun-2005 mlelstv

branches: 1.43.2;
expire cached route. Fixes PR 22792.


# 1.42 02-Jun-2005 tron

Change the first argument of the encapsulation check function from
"const struct mbuf *" to "struct mbuf *". Without this change the
actual implementation cannot even use m_copydata() on the mbuf chain
which is broken.


# 1.41 02-Jun-2005 tron

Remove type casts and lint directives which are now longer necessary
because the first argument of m_copydata() is "const struct mbuf *" now.


# 1.40 29-May-2005 christos

- avoid shadowed variables
- sprinkle const.


Revision tags: netbsd-3-0-3-RELEASE netbsd-3-0-2-RELEASE netbsd-3-0-1-RELEASE netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.39 26-Feb-2005 perry

branches: 1.39.2;
nuke trailing whitespace


Revision tags: yamt-km-base2 yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.38 22-Apr-2004 matt

branches: 1.38.4; 1.38.6;
Constify protosw arrays. This can reduce the kernel .data section by
over 4K (if all the network protocols) are loaded.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.37 30-Oct-2003 simonb

branches: 1.37.4;
Remove some assigned-to but otherwise unused variables.


# 1.36 05-Sep-2003 itojun

u_short -> u_int16_t. sync w/ kame.
don't set ip6_plen where unneeded (i.e. before calling ip6_output)


# 1.35 22-Aug-2003 itojun

change the additional arg to be passed to ip{,6}_output to struct socket *.

this fixes KAME policy lookup which was broken by the previous commit.


# 1.34 22-Aug-2003 jonathan

Replace the set_socket() method of passing an extra struct socket*
argument to ip6_output() with a new explicit struct in6pcb* argument.
(The underlying socket can be obtained via in6pcb->inp6_socket.)

In preparation for fast-ipsec. Reviewed by itojun.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.33 25-Nov-2002 thorpej

branches: 1.33.6;
Avoid strict-alias warnings.


# 1.32 11-Nov-2002 itojun

make USE_ENCAPCHECK (in netinet*/*gif.c) to global option, GIF_ENCAPCHECK.
#ifdef out unneeded code when possible.
From: Krister Walfridsson <cato@df.lth.se>


# 1.31 05-Nov-2002 itojun

improve gif lookup performance, when there are many of those,
by using radix tree for lookups. tested by yshimizu@iij.


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.30 11-Sep-2002 itojun

KNF - return is not a function. sync w/kame.


Revision tags: gehenna-devsw-base
# 1.29 09-Jun-2002 itojun

whitespace cleanup


# 1.28 08-Jun-2002 itojun

whitespace cleanup


Revision tags: netbsd-1-6-PATCH002-RELEASE netbsd-1-6-PATCH002 netbsd-1-6-PATCH002-RC4 netbsd-1-6-PATCH002-RC3 netbsd-1-6-PATCH002-RC2 netbsd-1-6-PATCH002-RC1 netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base ifpoll-base
# 1.27 21-Dec-2001 itojun

branches: 1.27.8;
use radix table for inbound tunnel lookup (would increase performance
for machines with a lot of tunnels).
update route cache for IPvX-over-IPv6 tunnel on path MTU discovery.
snyc with kame


# 1.26 21-Dec-2001 itojun

move in6_gif_hlim decl to in6_gif.c. sync with kame


# 1.25 21-Dec-2001 itojun

move protosw fragment for gif/stf to their own source code.
reduce #ifdef in stf code. sync with kame


# 1.24 20-Dec-2001 itojun

centralize multicast group management (in6_join/leavegroup).
have a flag for ip6_output() to fragment to minimum MTU.
sync with kame


# 1.23 13-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.22 16-Aug-2001 itojun

gif interface now uses generic software interrupt
(on archs that support it). also, make gif ALTQ-capable on outgoing.
sync with kame, comments from thorpej.


# 1.21 29-Jul-2001 itojun

sync gif interface code with latest kame.
IFF_RUNNING is clearified. attach/detach logic is more clearner.
the old code mistakenly set IFF_UP by itself, now the behavior is gone.


# 1.20 14-May-2001 itojun

branches: 1.20.2;
drop multi destination mode (IFF_LINK0).


# 1.19 10-May-2001 itojun

correct ecn consideration on tunnel encap/decap. sync with kame.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.18 20-Feb-2001 itojun

branches: 1.18.2;
add AF_ISO case to output. from chopps.


# 1.17 20-Feb-2001 itojun

ISO over IPv4/v6 by EON encapsulation. from chopps, sync with kame.


# 1.16 11-Feb-2001 itojun

remove #ifdef __FreeBSD__.


# 1.15 22-Jan-2001 itojun

make it possible to turn off ingress filter on gif/stf tunnel egress,
by using IFF_LINK2. (part of) PR 11163 from Ken Raeburn.


Revision tags: netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base minoura-xpg4dl-base
# 1.14 19-Apr-2000 itojun

branches: 1.14.4;
introduce sys/netinet/ip_encap.c, to dispatch inbound packets
to protocol handlers, based on src/dst (for ip proto #4/41).
see comment in ip_encap.c for details of the problem we have.
there are too many protocol specs for ip proto #4/41.
backward compatibility with MROUTING case is now provided in ip_encap.c.

fix ipip to work with gif (using ip_encap.c). sorry for breakage.

gif now uses ip_encap.c.

introduce stf pseudo interface (implements 6to4, another IPv6-over-IPv4 code
with ip proto #41).


# 1.13 01-Mar-2000 itojun

introduce m->m_pkthdr.aux to hold random data which needs to be passed
between protocol handlers.

ipsec socket pointers, ipsec decryption/auth information, tunnel
decapsulation information are in my mind - there can be several other usage.
at this moment, we use this for ipsec socket pointer passing. this will
avoid reuse of m->m_pkthdr.rcvif in ipsec code.

due to the change, MHLEN will be decreased by sizeof(void *) - for example,
for i386, MHLEN was 100 bytes, but is now 96 bytes.
we may want to increase MSIZE from 128 to 256 for some of our architectures.

take caution if you use it for keeping some data item for long period
of time - use extra caution on M_PREPEND() or m_adj(), as they may result
in loss of m->m_pkthdr.aux pointer (and mbuf leak).

this will bump kernel version.

(as discussed in tech-net, tested in kame tree)


Revision tags: chs-ubc2-newbase
# 1.12 07-Feb-2000 itojun

s/DIAGNOSTIC/DEBUG/


# 1.11 06-Feb-2000 itojun

fix include pathname for better rfc2292 compliance.


# 1.10 06-Jan-2000 itojun

remove extra portability #ifdef (like #ifdef __FreeBSD__) in KAME IPv6/IPsec
code, from netbsd-current repository.
#ifdef'ed version is always available from ftp.kame.net.

XXX please do not make too many diff-unfriendly changes, we'll need to take
bunch of diffs on upgrade...


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.9 15-Dec-1999 itojun

do not overwrite traffic class field when we write IPv6 version field.


# 1.8 13-Dec-1999 itojun

sync IPv6 part with latest KAME tree. IPsec part is left unmodified
due to massive changes in KAME side.
- IPv6 output goes through nd6_output
- faith can capture IPv4 packets as well - you can run IPv4-to-IPv6 translator
using heavily modified DNS servers
- per-interface statistics (required for IPv6 MIB)
- interface autoconfig is revisited
- udp input handling has a big change for mapped address support.
- introduce in4_cksum() for non-overwriting checksumming
- introduce m_pulldown()
- neighbor discovery cleanups/improvements
- netinet/in.h strictly conforms to RFC2553 (no extra defs visible to userland)
- IFA_STATS is fixed a bit (not tested)
- and more more more.

TODO:
- cleanup os-independency #ifdef
- avoid rcvif dual use (for IPsec) to help ifdetach

(sorry for jumbo commit, I can't separate this any more...)


Revision tags: comdex-fall-1999-base fvdl-softdep-base
# 1.7 20-Aug-1999 itojun

branches: 1.7.2; 1.7.8;
do not capture packets by gif, when gif interface is down.


Revision tags: chs-ubc2-base
# 1.6 31-Jul-1999 itojun

sync with recent KAME.
- loosen ipsec restriction on packet diredtion.
- revise icmp6 redirect handling on IsRouter bit.
- tcp/udp notification processing (link-local address case)
- cosmetic fixes (better code share across *BSD).


# 1.5 30-Jul-1999 itojun

remove reference to in6_systm.h (file itself will be removed afterwords)


# 1.4 09-Jul-1999 thorpej

defopt IPSEC and IPSEC_ESP (both into opt_ipsec.h).


# 1.3 03-Jul-1999 thorpej

RCS ID police.


# 1.2 01-Jul-1999 itojun

branches: 1.2.2;
IPv6 kernel code, based on KAME/NetBSD 1.4, SNAP kit 19990628.
(Sorry for a big commit, I can't separate this into several pieces...)
Pls check sys/netinet6/TODO and sys/netinet6/IMPLEMENTATION for details.

- sys/kern: do not assume single mbuf, accept chained mbuf on passing
data from userland to kernel (or other way round).
- "midway" ATM card: ATM PVC pseudo device support, like those done in ALTQ
package (ftp://ftp.csl.sony.co.jp/pub/kjc/).
- sys/netinet/tcp*: IPv4/v6 dual stack tcp support.
- sys/netinet/{ip6,icmp6}.h, sys/net/pfkeyv2.h: IETF document assumes those
file to be there so we patch it up.
- sys/netinet: IPsec additions are here and there.
- sys/netinet6/*: most of IPv6 code sits here.
- sys/netkey: IPsec key management code
- dev/pci/pcidevs: regen

In my understanding no code here is subject to export control so it
should be safe.


# 1.1 28-Jun-1999 itojun

branches: 1.1.2;
file in6_gif.c was initially added on branch kame.


# 1.85 16-Jan-2017 christos

ip6_sprintf -> IN6_PRINT so that we pass the size.


# 1.84 16-Jan-2017 ryo

Make ip6_sprintf(), in_fmtaddr(), lla_snprintf() and icmp6_redirect_diag() mpsafe.

Reviewed by ozaki-r@


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.83 06-Jan-2017 knakahara

remove unnecessary conversion.

gif_softc->gif_pdst is already valid sockaddr.


# 1.82 14-Dec-2016 knakahara

fix race of gif_softc->gif_ro when we send multiple flows over gif on NET_MPSAFE enabled kernel.

make gif_softc->gif_ro percpu as well as ipforward_rt to resolve this race.
and add future TODO comment for etherip(4).


# 1.81 12-Dec-2016 ozaki-r

Make the routing table and rtcaches MP-safe

See the following descriptions for details.

Proposed on tech-kern and tech-net


Overview


# 1.80 08-Dec-2016 ozaki-r

Add rtcache_unref to release points of rtentry stemming from rtcache

In the MP-safe world, a rtentry stemming from a rtcache can be freed at any
points. So we need to protect rtentries somehow say by reference couting or
passive references. Regardless of the method, we need to call some release
function of a rtentry after using it.

The change adds a new function rtcache_unref to release a rtentry. At this
point, this function does nothing because for now we don't add a reference
to a rtentry when we get one from a rtcache. We will add something useful
in a further commit.

This change is a part of changes for MP-safe routing table. It is separated
to avoid one big change that makes difficult to debug by bisecting.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726
# 1.79 15-Jul-2016 ozaki-r

Use sin6tosa and sin6tocsa macros

No functional change.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907
# 1.78 06-Jul-2016 ozaki-r

branches: 1.78.2;
Apply m_get_rcvif_psref (kill m_get_rcvif_NOMPSAFE)


# 1.77 04-Jul-2016 knakahara

fix: gif(4) receive side race

A panic cause in rn_match() called by encap[46]_lookup(). The reason is that
gif(4) does not suspend receive packet processing in spite of suspending
transmit packet processing while anyone is doing gif(4) ioctl.


# 1.76 04-Jul-2016 knakahara

let gif(4) promise softint(9) contract (1/2) : gif(4) side

To prevent calling softint_schedule() after called softint_disestablish(),
the following modifications are added
+ ioctl (writing configuration) side
- off IFF_RUNNING flag before changing configuration
- wait softint handler completion before changing configuration
+ packet processing (reading configuraiotn) side
- if IFF_RUNNING flag is on, do nothing
+ in whole
- add gif_list_lock_{enter,exit} to prevent the same configuration is
set to other gif(4) interfaces


# 1.75 28-Jun-2016 ozaki-r

Add missing NULL checks for m_get_rcvif_psref


# 1.74 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


Revision tags: nick-nhusb-base-20160529 nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.73 29-Feb-2016 knakahara

remove unnecessary declarations and fix KNF

Thanks to riastradh@


# 1.72 26-Feb-2016 knakahara

To eliminate gif_softc_list linear search, add extra argument to encapsw.pr_ctlinput().


# 1.71 26-Jan-2016 knakahara

implement encapsw instead of protosw and uniform prototype.

suggested and advised by riastradh@n.o, thanks.

BTW, It seems in_stf_input() had bugs...


# 1.70 23-Jan-2016 riastradh

Those were local changes not meant to be part of the revert. SORRY!


# 1.69 23-Jan-2016 christos

make this compile again


# 1.68 22-Jan-2016 riastradh

Back out previous change to introduce struct encapsw.

This change was intended, but Nakahara-san had already made a better
one locally! So I'll let him commit that one, and I'll try not to
step on anyone's toes again.


# 1.67 22-Jan-2016 riastradh

Don't abuse struct protosw for ip_encap -- introduce struct encapsw.

Mostly mechanical change to replace it, culling some now-needless
boilerplate around all the users.

This does not substantively change the ip_encap API or eliminate
abuse of sketchy pointer casts -- that will come later, and will be
easier now that it is not tangled up with struct protosw.


# 1.66 20-Jan-2016 riastradh

Eliminate struct protosw::pr_output.

You can't use this unless you know what it is a priori: the formal
prototype is variadic, and the different instances (e.g., ip_output,
route_output) have different real prototypes.

Convert the only user of it, raw_send in net/raw_cb.c, to take an
explicit callback argument. Convert the only instances of it,
route_output and key_output, to such explicit callbacks for raw_send.
Use assertions to make sure the conversion to explicit callbacks is
warranted.

Discussed on tech-net with no objections:
https://mail-index.netbsd.org/tech-net/2016/01/16/msg005484.html


# 1.65 18-Jan-2016 knakahara

Refactor protosw codes in gif(4). No functional change.

- remove unnecessary include
- reduce scopes


Revision tags: nick-nhusb-base-20151226
# 1.64 25-Dec-2015 knakahara

use satosin{,6} macros instead of casts.


# 1.63 11-Dec-2015 knakahara

PR kern/50522: gif(4) ioctl causes panic while someone is using the gif(4) interface.

It is required to wait other CPU's softint completion before disestablishing
the softint handler.


Revision tags: nick-nhusb-base-20150921
# 1.62 24-Aug-2015 pooka

sprinkle _KERNEL_OPT


Revision tags: nick-nhusb-base-20150606
# 1.61 24-Apr-2015 ozaki-r

Add missing rtcache_free

It's the same as other similar code paths in in_gif and ip6_etherip.


Revision tags: netbsd-7-nhusb-base-20170116 netbsd-7-1-RC1 netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 nick-nhusb-base-20150406 nick-nhusb-base netbsd-7-base tls-earlyentropy-base rmind-smpnet-nbase rmind-smpnet-base tls-maxphys-base
# 1.60 18-May-2014 rmind

branches: 1.60.4;
Add struct pr_usrreqs with a pr_generic function and prepare for the
dismantling of pr_usrreq in the protocols; no functional change intended.
PRU_ATTACH/PRU_DETACH changes will follow soon.

Bump for struct protosw. Welcome to 6.99.62!


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base agc-symver-base
# 1.59 01-Mar-2013 joerg

branches: 1.59.6; 1.59.10;
Retire OSI network stack. OK core@


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE yamt-pagecache-tag8 netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 netbsd-6-1-RC2 netbsd-6-1-RC1 yamt-pagecache-base8 netbsd-6-0-1-RELEASE yamt-pagecache-base7 matt-nb6-plus-nbase yamt-pagecache-base6 netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base rmind-uvmplock-nbase cherry-xenmp-base bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 rmind-uvmplock-base yamt-nfs-mp-base9 uebayasi-xip-base matt-premerge-20091211 yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-nbase yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.58 14-Mar-2009 dsl

branches: 1.58.12; 1.58.22;
Remove all the __P() from sys (excluding sys/dist)
Diff checked with grep and MK1 eyeball.
i386 and amd64 GENERIC and sys still build.


Revision tags: nick-hppapmap-base2 haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.57 07-Nov-2008 dyoung

branches: 1.57.4;
*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-2-3-RELEASE netbsd-5-1-5-RELEASE netbsd-5-2-2-RELEASE netbsd-5-1-4-RELEASE netbsd-5-2-1-RELEASE netbsd-5-1-3-RELEASE netbsd-5-2-RELEASE netbsd-5-2-RC1 netbsd-5-1-2-RELEASE netbsd-5-1-1-RELEASE matt-nb5-mips64-premerge-20101231 matt-nb5-pq3-base netbsd-5-1-RELEASE netbsd-5-1-RC4 matt-nb5-mips64-k15 netbsd-5-1-RC3 netbsd-5-1-RC2 netbsd-5-1-RC1 netbsd-5-0-2-RELEASE matt-nb5-mips64-premerge-20091211 matt-nb5-mips64-u2-k2-k4-k7-k8-k9 matt-nb4-mips64-k7-u2a-k9b matt-nb5-mips64-u1-k1-k5 netbsd-5-0-1-RELEASE netbsd-5-0-RELEASE netbsd-5-0-RC4 netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2 wrstuden-revivesa-base yamt-nfs-mp-base
# 1.56 24-Apr-2008 ad

branches: 1.56.2; 1.56.8; 1.56.10;
Merge the socket locking patch:

- Socket layer becomes MP safe.
- Unix protocols become MP safe.
- Allows protocol processing interrupts to safely block on locks.
- Fixes a number of race conditions.

With much feedback from matt@ and plunky@.


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.55 15-Apr-2008 thorpej

branches: 1.55.2;
Make ip6 and icmp6 stats per-cpu.


# 1.54 08-Apr-2008 thorpej

Change IPv6 stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ip6stat structure; old netstat
binaries will continue to work properly.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base vmlocking2-base3 bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-nbase mjf-devfs-base matt-armv6-base hpcarm-cleanup-base
# 1.53 20-Dec-2007 dyoung

branches: 1.53.6;
Poison struct route->ro_rt uses in the kernel by changing the name
to _ro_rt. Use rtcache_getrt() to access a route cache's struct
rtentry *.

Introduce struct ifnet->if_dl that always points at the interface
identifier/link-layer address. Make code that treated the first
ifaddr on struct ifnet->if_addrlist as the interface address use
if_dl, instead.

Remove stale debugging code from net/route.c. Move the rtflush()
code into rtcache_clear() and delete rtflush(). Delete rtalloc(),
because nothing uses it any more.

Make ND6_HINT an inline, lowercase subroutine, nd6_hint.

I've done my best to convert IP Filter, the ISO stack, and the
AppleTalk stack to rtcache_getrt(). They compile, but I have not
tested them. I have given the changes to PF, GRE, IPv4 and IPv6
stacks a lot of exercise.


Revision tags: nick-csl-alignment-base5 matt-armv6-prevmlocking yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base matt-mips64-base jmcneill-pm-base nick-csl-alignment-base reinoud-bufcleanup-base mjf-ufs-trans-base vmlocking-base
# 1.52 23-May-2007 christos

branches: 1.52.8; 1.52.16; 1.52.20;
Ansify + add a few comments, from Karl Sj��dahl


Revision tags: yamt-idlelwp-base8
# 1.51 02-May-2007 dyoung

Eliminate address family-specific route caches (struct route, struct
route_in6, struct route_iso), replacing all caches with a struct
route.

The principle benefit of this change is that all of the protocol
families can benefit from route cache-invalidation, which is
necessary for correct routing. Route-cache invalidation fixes an
ancient PR, kern/3508, at long last; it fixes various other PRs,
also.

Discussions with and ideas from Joerg Sonnenberger influenced this
work tremendously. Of course, all design oversights and bugs are
mine.

DETAILS

1 I added to each address family a pool of sockaddrs. I have
introduced routines for allocating, copying, and duplicating,
and freeing sockaddrs:

struct sockaddr *sockaddr_alloc(sa_family_t af, int flags);
struct sockaddr *sockaddr_copy(struct sockaddr *dst,
const struct sockaddr *src);
struct sockaddr *sockaddr_dup(const struct sockaddr *src, int flags);
void sockaddr_free(struct sockaddr *sa);

sockaddr_alloc() returns either a sockaddr from the pool belonging
to the specified family, or NULL if the pool is exhausted. The
returned sockaddr has the right size for that family; sa_family
and sa_len fields are initialized to the family and sockaddr
length---e.g., sa_family = AF_INET and sa_len = sizeof(struct
sockaddr_in). sockaddr_free() puts the given sockaddr back into
its family's pool.

sockaddr_dup() and sockaddr_copy() work analogously to strdup()
and strcpy(), respectively. sockaddr_copy() KASSERTs that the
family of the destination and source sockaddrs are alike.

The 'flags' argumet for sockaddr_alloc() and sockaddr_dup() is
passed directly to pool_get(9).

2 I added routines for initializing sockaddrs in each address
family, sockaddr_in_init(), sockaddr_in6_init(), sockaddr_iso_init(),
etc. They are fairly self-explanatory.

3 structs route_in6 and route_iso are no more. All protocol families
use struct route. I have changed the route cache, 'struct route',
so that it does not contain storage space for a sockaddr. Instead,
struct route points to a sockaddr coming from the pool the sockaddr
belongs to. I added a new method to struct route, rtcache_setdst(),
for setting the cache destination:

int rtcache_setdst(struct route *, const struct sockaddr *);

rtcache_setdst() returns 0 on success, or ENOMEM if no memory is
available to create the sockaddr storage.

It is now possible for rtcache_getdst() to return NULL if, say,
rtcache_setdst() failed. I check the return value for NULL
everywhere in the kernel.

4 Each routing domain (struct domain) has a list of live route
caches, dom_rtcache. rtflushall(sa_family_t af) looks up the
domain indicated by 'af', walks the domain's list of route caches
and invalidates each one.


Revision tags: thorpej-atomic-base
# 1.50 04-Mar-2007 christos

branches: 1.50.2; 1.50.4;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.49 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


# 1.48 17-Feb-2007 dyoung

branches: 1.48.2;
Don't open-code LIST_FOREACH().


Revision tags: post-newlock2-merge newlock2-nbase yamt-splraiseipl-base5 yamt-splraiseipl-base4 newlock2-base
# 1.47 15-Dec-2006 joerg

Introduce new helper functions to abstract the route caching.
rtcache_init and rtcache_init_noclone lookup ro_dst and store
the result in ro_rt, taking care of the reference counting and
calling the domain specific route cache.
rtcache_free checks if a route was cashed and frees the reference.
rtcache_copy copies ro_dst of the given struct route, checking that
enough space is available and incrementing the reference count of the
cached rtentry if necessary.
rtcache_check validates that the cached route is still up. If it isn't,
it tries to look it up again. Afterwards ro_rt is either a valid again
or NULL.
rtcache_copy is used internally.

Adjust to callers of rtalloc/rtflush in the tree to check the sanity of
ro_dst first (if necessary). If it doesn't fit the expectations, free
the cache, otherwise check if the cached route is still valid. After
that combination, a single check for ro_rt == NULL is enough to decide
whether a new lookup needs to be done with a different ro_dst.
Make the route checking in gre stricter by repeating the loop check
after revalidation.
Remove some unused RADIX_MPATH code in in6_src.c. The logic is slightly
changed here to first validate the route and check RTF_GATEWAY
afterwards. This is sementically equivalent though.
etherip doesn't need sc_route_expire similiar to the gif changes from
dyoung@ earlier.

Based on the earlier patch from dyoung@, reviewed and discussed with
him.


Revision tags: yamt-splraiseipl-base3
# 1.46 09-Dec-2006 dyoung

Here are various changes designed to protect against bad IPv4
routing caused by stale route caches (struct route). Route caches
are sprinkled throughout PCBs, the IP fast-forwarding table, and
IP tunnel interfaces (gre, gif, stf).

Stale IPv6 and ISO route caches will be treated by separate patches.

Thank you to Christoph Badura for suggesting the general approach
to invalidating route caches that I take here.

Here are the details:

Add hooks to struct domain for tracking and for invalidating each
domain's route caches: dom_rtcache, dom_rtflush, and dom_rtflushall.

Introduce helper subroutines, rtflush(ro) for invalidating a route
cache, rtflushall(family) for invalidating all route caches in a
routing domain, and rtcache(ro) for notifying the domain of a new
cached route.

Chain together all IPv4 route caches where ro_rt != NULL. Provide
in_rtcache() for adding a route to the chain. Provide in_rtflush()
and in_rtflushall() for invalidating IPv4 route caches. In
in_rtflush(), set ro_rt to NULL, and remove the route from the
chain. In in_rtflushall(), walk the chain and remove every route
cache.

In rtrequest1(), call rtflushall() to invalidate route caches when
a route is added.

In gif(4), discard the workaround for stale caches that involves
expiring them every so often.

Replace the pattern 'RTFREE(ro->ro_rt); ro->ro_rt = NULL;' with a
call to rtflush(ro).

Update ipflow_fastforward() and all other users of route caches so
that they expect a cached route, ro->ro_rt, to turn to NULL.

Take care when moving a 'struct route' to rtflush() the source and
to rtcache() the destination.

In domain initializers, use .dom_xxx tags.

KNF here and there.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base abandoned-netbsd-4-base yamt-splraiseipl-base2 yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 netbsd-4-base yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base chap-midi-base rpaulo-netinet-merge-pcb-base
# 1.45 07-Jun-2006 kardel

branches: 1.45.6; 1.45.8;
merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5 simonb-timecounters-base
# 1.44 11-Dec-2005 christos

branches: 1.44.4; 1.44.6; 1.44.8; 1.44.14;
merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.43 26-Jun-2005 mlelstv

branches: 1.43.2;
expire cached route. Fixes PR 22792.


# 1.42 02-Jun-2005 tron

Change the first argument of the encapsulation check function from
"const struct mbuf *" to "struct mbuf *". Without this change the
actual implementation cannot even use m_copydata() on the mbuf chain
which is broken.


# 1.41 02-Jun-2005 tron

Remove type casts and lint directives which are now longer necessary
because the first argument of m_copydata() is "const struct mbuf *" now.


# 1.40 29-May-2005 christos

- avoid shadowed variables
- sprinkle const.


Revision tags: netbsd-3-0-3-RELEASE netbsd-3-0-2-RELEASE netbsd-3-0-1-RELEASE netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.39 26-Feb-2005 perry

branches: 1.39.2;
nuke trailing whitespace


Revision tags: yamt-km-base2 yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.38 22-Apr-2004 matt

branches: 1.38.4; 1.38.6;
Constify protosw arrays. This can reduce the kernel .data section by
over 4K (if all the network protocols) are loaded.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.37 30-Oct-2003 simonb

branches: 1.37.4;
Remove some assigned-to but otherwise unused variables.


# 1.36 05-Sep-2003 itojun

u_short -> u_int16_t. sync w/ kame.
don't set ip6_plen where unneeded (i.e. before calling ip6_output)


# 1.35 22-Aug-2003 itojun

change the additional arg to be passed to ip{,6}_output to struct socket *.

this fixes KAME policy lookup which was broken by the previous commit.


# 1.34 22-Aug-2003 jonathan

Replace the set_socket() method of passing an extra struct socket*
argument to ip6_output() with a new explicit struct in6pcb* argument.
(The underlying socket can be obtained via in6pcb->inp6_socket.)

In preparation for fast-ipsec. Reviewed by itojun.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.33 25-Nov-2002 thorpej

branches: 1.33.6;
Avoid strict-alias warnings.


# 1.32 11-Nov-2002 itojun

make USE_ENCAPCHECK (in netinet*/*gif.c) to global option, GIF_ENCAPCHECK.
#ifdef out unneeded code when possible.
From: Krister Walfridsson <cato@df.lth.se>


# 1.31 05-Nov-2002 itojun

improve gif lookup performance, when there are many of those,
by using radix tree for lookups. tested by yshimizu@iij.


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.30 11-Sep-2002 itojun

KNF - return is not a function. sync w/kame.


Revision tags: gehenna-devsw-base
# 1.29 09-Jun-2002 itojun

whitespace cleanup


# 1.28 08-Jun-2002 itojun

whitespace cleanup


Revision tags: netbsd-1-6-PATCH002-RELEASE netbsd-1-6-PATCH002 netbsd-1-6-PATCH002-RC4 netbsd-1-6-PATCH002-RC3 netbsd-1-6-PATCH002-RC2 netbsd-1-6-PATCH002-RC1 netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base ifpoll-base
# 1.27 21-Dec-2001 itojun

branches: 1.27.8;
use radix table for inbound tunnel lookup (would increase performance
for machines with a lot of tunnels).
update route cache for IPvX-over-IPv6 tunnel on path MTU discovery.
snyc with kame


# 1.26 21-Dec-2001 itojun

move in6_gif_hlim decl to in6_gif.c. sync with kame


# 1.25 21-Dec-2001 itojun

move protosw fragment for gif/stf to their own source code.
reduce #ifdef in stf code. sync with kame


# 1.24 20-Dec-2001 itojun

centralize multicast group management (in6_join/leavegroup).
have a flag for ip6_output() to fragment to minimum MTU.
sync with kame


# 1.23 13-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.22 16-Aug-2001 itojun

gif interface now uses generic software interrupt
(on archs that support it). also, make gif ALTQ-capable on outgoing.
sync with kame, comments from thorpej.


# 1.21 29-Jul-2001 itojun

sync gif interface code with latest kame.
IFF_RUNNING is clearified. attach/detach logic is more clearner.
the old code mistakenly set IFF_UP by itself, now the behavior is gone.


# 1.20 14-May-2001 itojun

branches: 1.20.2;
drop multi destination mode (IFF_LINK0).


# 1.19 10-May-2001 itojun

correct ecn consideration on tunnel encap/decap. sync with kame.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.18 20-Feb-2001 itojun

branches: 1.18.2;
add AF_ISO case to output. from chopps.


# 1.17 20-Feb-2001 itojun

ISO over IPv4/v6 by EON encapsulation. from chopps, sync with kame.


# 1.16 11-Feb-2001 itojun

remove #ifdef __FreeBSD__.


# 1.15 22-Jan-2001 itojun

make it possible to turn off ingress filter on gif/stf tunnel egress,
by using IFF_LINK2. (part of) PR 11163 from Ken Raeburn.


Revision tags: netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base minoura-xpg4dl-base
# 1.14 19-Apr-2000 itojun

branches: 1.14.4;
introduce sys/netinet/ip_encap.c, to dispatch inbound packets
to protocol handlers, based on src/dst (for ip proto #4/41).
see comment in ip_encap.c for details of the problem we have.
there are too many protocol specs for ip proto #4/41.
backward compatibility with MROUTING case is now provided in ip_encap.c.

fix ipip to work with gif (using ip_encap.c). sorry for breakage.

gif now uses ip_encap.c.

introduce stf pseudo interface (implements 6to4, another IPv6-over-IPv4 code
with ip proto #41).


# 1.13 01-Mar-2000 itojun

introduce m->m_pkthdr.aux to hold random data which needs to be passed
between protocol handlers.

ipsec socket pointers, ipsec decryption/auth information, tunnel
decapsulation information are in my mind - there can be several other usage.
at this moment, we use this for ipsec socket pointer passing. this will
avoid reuse of m->m_pkthdr.rcvif in ipsec code.

due to the change, MHLEN will be decreased by sizeof(void *) - for example,
for i386, MHLEN was 100 bytes, but is now 96 bytes.
we may want to increase MSIZE from 128 to 256 for some of our architectures.

take caution if you use it for keeping some data item for long period
of time - use extra caution on M_PREPEND() or m_adj(), as they may result
in loss of m->m_pkthdr.aux pointer (and mbuf leak).

this will bump kernel version.

(as discussed in tech-net, tested in kame tree)


Revision tags: chs-ubc2-newbase
# 1.12 07-Feb-2000 itojun

s/DIAGNOSTIC/DEBUG/


# 1.11 06-Feb-2000 itojun

fix include pathname for better rfc2292 compliance.


# 1.10 06-Jan-2000 itojun

remove extra portability #ifdef (like #ifdef __FreeBSD__) in KAME IPv6/IPsec
code, from netbsd-current repository.
#ifdef'ed version is always available from ftp.kame.net.

XXX please do not make too many diff-unfriendly changes, we'll need to take
bunch of diffs on upgrade...


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.9 15-Dec-1999 itojun

do not overwrite traffic class field when we write IPv6 version field.


# 1.8 13-Dec-1999 itojun

sync IPv6 part with latest KAME tree. IPsec part is left unmodified
due to massive changes in KAME side.
- IPv6 output goes through nd6_output
- faith can capture IPv4 packets as well - you can run IPv4-to-IPv6 translator
using heavily modified DNS servers
- per-interface statistics (required for IPv6 MIB)
- interface autoconfig is revisited
- udp input handling has a big change for mapped address support.
- introduce in4_cksum() for non-overwriting checksumming
- introduce m_pulldown()
- neighbor discovery cleanups/improvements
- netinet/in.h strictly conforms to RFC2553 (no extra defs visible to userland)
- IFA_STATS is fixed a bit (not tested)
- and more more more.

TODO:
- cleanup os-independency #ifdef
- avoid rcvif dual use (for IPsec) to help ifdetach

(sorry for jumbo commit, I can't separate this any more...)


Revision tags: comdex-fall-1999-base fvdl-softdep-base
# 1.7 20-Aug-1999 itojun

branches: 1.7.2; 1.7.8;
do not capture packets by gif, when gif interface is down.


Revision tags: chs-ubc2-base
# 1.6 31-Jul-1999 itojun

sync with recent KAME.
- loosen ipsec restriction on packet diredtion.
- revise icmp6 redirect handling on IsRouter bit.
- tcp/udp notification processing (link-local address case)
- cosmetic fixes (better code share across *BSD).


# 1.5 30-Jul-1999 itojun

remove reference to in6_systm.h (file itself will be removed afterwords)


# 1.4 09-Jul-1999 thorpej

defopt IPSEC and IPSEC_ESP (both into opt_ipsec.h).


# 1.3 03-Jul-1999 thorpej

RCS ID police.


# 1.2 01-Jul-1999 itojun

branches: 1.2.2;
IPv6 kernel code, based on KAME/NetBSD 1.4, SNAP kit 19990628.
(Sorry for a big commit, I can't separate this into several pieces...)
Pls check sys/netinet6/TODO and sys/netinet6/IMPLEMENTATION for details.

- sys/kern: do not assume single mbuf, accept chained mbuf on passing
data from userland to kernel (or other way round).
- "midway" ATM card: ATM PVC pseudo device support, like those done in ALTQ
package (ftp://ftp.csl.sony.co.jp/pub/kjc/).
- sys/netinet/tcp*: IPv4/v6 dual stack tcp support.
- sys/netinet/{ip6,icmp6}.h, sys/net/pfkeyv2.h: IETF document assumes those
file to be there so we patch it up.
- sys/netinet: IPsec additions are here and there.
- sys/netinet6/*: most of IPv6 code sits here.
- sys/netkey: IPsec key management code
- dev/pci/pcidevs: regen

In my understanding no code here is subject to export control so it
should be safe.


# 1.1 28-Jun-1999 itojun

branches: 1.1.2;
file in6_gif.c was initially added on branch kame.


Revision tags: pgoyette-localcount-20170107
# 1.83 06-Jan-2017 knakahara

remove unnecessary conversion.

gif_softc->gif_pdst is already valid sockaddr.


# 1.82 14-Dec-2016 knakahara

fix race of gif_softc->gif_ro when we send multiple flows over gif on NET_MPSAFE enabled kernel.

make gif_softc->gif_ro percpu as well as ipforward_rt to resolve this race.
and add future TODO comment for etherip(4).


# 1.81 12-Dec-2016 ozaki-r

Make the routing table and rtcaches MP-safe

See the following descriptions for details.

Proposed on tech-kern and tech-net


Overview


# 1.80 08-Dec-2016 ozaki-r

Add rtcache_unref to release points of rtentry stemming from rtcache

In the MP-safe world, a rtentry stemming from a rtcache can be freed at any
points. So we need to protect rtentries somehow say by reference couting or
passive references. Regardless of the method, we need to call some release
function of a rtentry after using it.

The change adds a new function rtcache_unref to release a rtentry. At this
point, this function does nothing because for now we don't add a reference
to a rtentry when we get one from a rtcache. We will add something useful
in a further commit.

This change is a part of changes for MP-safe routing table. It is separated
to avoid one big change that makes difficult to debug by bisecting.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726
# 1.79 15-Jul-2016 ozaki-r

Use sin6tosa and sin6tocsa macros

No functional change.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907
# 1.78 06-Jul-2016 ozaki-r

branches: 1.78.2;
Apply m_get_rcvif_psref (kill m_get_rcvif_NOMPSAFE)


# 1.77 04-Jul-2016 knakahara

fix: gif(4) receive side race

A panic cause in rn_match() called by encap[46]_lookup(). The reason is that
gif(4) does not suspend receive packet processing in spite of suspending
transmit packet processing while anyone is doing gif(4) ioctl.


# 1.76 04-Jul-2016 knakahara

let gif(4) promise softint(9) contract (1/2) : gif(4) side

To prevent calling softint_schedule() after called softint_disestablish(),
the following modifications are added
+ ioctl (writing configuration) side
- off IFF_RUNNING flag before changing configuration
- wait softint handler completion before changing configuration
+ packet processing (reading configuraiotn) side
- if IFF_RUNNING flag is on, do nothing
+ in whole
- add gif_list_lock_{enter,exit} to prevent the same configuration is
set to other gif(4) interfaces


# 1.75 28-Jun-2016 ozaki-r

Add missing NULL checks for m_get_rcvif_psref


# 1.74 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


Revision tags: nick-nhusb-base-20160529 nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.73 29-Feb-2016 knakahara

remove unnecessary declarations and fix KNF

Thanks to riastradh@


# 1.72 26-Feb-2016 knakahara

To eliminate gif_softc_list linear search, add extra argument to encapsw.pr_ctlinput().


# 1.71 26-Jan-2016 knakahara

implement encapsw instead of protosw and uniform prototype.

suggested and advised by riastradh@n.o, thanks.

BTW, It seems in_stf_input() had bugs...


# 1.70 23-Jan-2016 riastradh

Those were local changes not meant to be part of the revert. SORRY!


# 1.69 23-Jan-2016 christos

make this compile again


# 1.68 22-Jan-2016 riastradh

Back out previous change to introduce struct encapsw.

This change was intended, but Nakahara-san had already made a better
one locally! So I'll let him commit that one, and I'll try not to
step on anyone's toes again.


# 1.67 22-Jan-2016 riastradh

Don't abuse struct protosw for ip_encap -- introduce struct encapsw.

Mostly mechanical change to replace it, culling some now-needless
boilerplate around all the users.

This does not substantively change the ip_encap API or eliminate
abuse of sketchy pointer casts -- that will come later, and will be
easier now that it is not tangled up with struct protosw.


# 1.66 20-Jan-2016 riastradh

Eliminate struct protosw::pr_output.

You can't use this unless you know what it is a priori: the formal
prototype is variadic, and the different instances (e.g., ip_output,
route_output) have different real prototypes.

Convert the only user of it, raw_send in net/raw_cb.c, to take an
explicit callback argument. Convert the only instances of it,
route_output and key_output, to such explicit callbacks for raw_send.
Use assertions to make sure the conversion to explicit callbacks is
warranted.

Discussed on tech-net with no objections:
https://mail-index.netbsd.org/tech-net/2016/01/16/msg005484.html


# 1.65 18-Jan-2016 knakahara

Refactor protosw codes in gif(4). No functional change.

- remove unnecessary include
- reduce scopes


Revision tags: nick-nhusb-base-20151226
# 1.64 25-Dec-2015 knakahara

use satosin{,6} macros instead of casts.


# 1.63 11-Dec-2015 knakahara

PR kern/50522: gif(4) ioctl causes panic while someone is using the gif(4) interface.

It is required to wait other CPU's softint completion before disestablishing
the softint handler.


Revision tags: nick-nhusb-base-20150921
# 1.62 24-Aug-2015 pooka

sprinkle _KERNEL_OPT


Revision tags: nick-nhusb-base-20150606
# 1.61 24-Apr-2015 ozaki-r

Add missing rtcache_free

It's the same as other similar code paths in in_gif and ip6_etherip.


Revision tags: netbsd-7-1-RC1 netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 nick-nhusb-base-20150406 nick-nhusb-base netbsd-7-base tls-earlyentropy-base rmind-smpnet-nbase rmind-smpnet-base tls-maxphys-base
# 1.60 18-May-2014 rmind

branches: 1.60.4;
Add struct pr_usrreqs with a pr_generic function and prepare for the
dismantling of pr_usrreq in the protocols; no functional change intended.
PRU_ATTACH/PRU_DETACH changes will follow soon.

Bump for struct protosw. Welcome to 6.99.62!


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base agc-symver-base
# 1.59 01-Mar-2013 joerg

branches: 1.59.6; 1.59.10;
Retire OSI network stack. OK core@


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE yamt-pagecache-tag8 netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 netbsd-6-1-RC2 netbsd-6-1-RC1 yamt-pagecache-base8 netbsd-6-0-1-RELEASE yamt-pagecache-base7 matt-nb6-plus-nbase yamt-pagecache-base6 netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base rmind-uvmplock-nbase cherry-xenmp-base bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 rmind-uvmplock-base yamt-nfs-mp-base9 uebayasi-xip-base matt-premerge-20091211 yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-nbase yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.58 14-Mar-2009 dsl

branches: 1.58.12; 1.58.22;
Remove all the __P() from sys (excluding sys/dist)
Diff checked with grep and MK1 eyeball.
i386 and amd64 GENERIC and sys still build.


Revision tags: nick-hppapmap-base2 haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.57 07-Nov-2008 dyoung

branches: 1.57.4;
*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-2-3-RELEASE netbsd-5-1-5-RELEASE netbsd-5-2-2-RELEASE netbsd-5-1-4-RELEASE netbsd-5-2-1-RELEASE netbsd-5-1-3-RELEASE netbsd-5-2-RELEASE netbsd-5-2-RC1 netbsd-5-1-2-RELEASE netbsd-5-1-1-RELEASE matt-nb5-mips64-premerge-20101231 matt-nb5-pq3-base netbsd-5-1-RELEASE netbsd-5-1-RC4 matt-nb5-mips64-k15 netbsd-5-1-RC3 netbsd-5-1-RC2 netbsd-5-1-RC1 netbsd-5-0-2-RELEASE matt-nb5-mips64-premerge-20091211 matt-nb5-mips64-u2-k2-k4-k7-k8-k9 matt-nb4-mips64-k7-u2a-k9b matt-nb5-mips64-u1-k1-k5 netbsd-5-0-1-RELEASE netbsd-5-0-RELEASE netbsd-5-0-RC4 netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2 wrstuden-revivesa-base yamt-nfs-mp-base
# 1.56 24-Apr-2008 ad

branches: 1.56.2; 1.56.8; 1.56.10;
Merge the socket locking patch:

- Socket layer becomes MP safe.
- Unix protocols become MP safe.
- Allows protocol processing interrupts to safely block on locks.
- Fixes a number of race conditions.

With much feedback from matt@ and plunky@.


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.55 15-Apr-2008 thorpej

branches: 1.55.2;
Make ip6 and icmp6 stats per-cpu.


# 1.54 08-Apr-2008 thorpej

Change IPv6 stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ip6stat structure; old netstat
binaries will continue to work properly.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base vmlocking2-base3 bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-nbase mjf-devfs-base matt-armv6-base hpcarm-cleanup-base
# 1.53 20-Dec-2007 dyoung

branches: 1.53.6;
Poison struct route->ro_rt uses in the kernel by changing the name
to _ro_rt. Use rtcache_getrt() to access a route cache's struct
rtentry *.

Introduce struct ifnet->if_dl that always points at the interface
identifier/link-layer address. Make code that treated the first
ifaddr on struct ifnet->if_addrlist as the interface address use
if_dl, instead.

Remove stale debugging code from net/route.c. Move the rtflush()
code into rtcache_clear() and delete rtflush(). Delete rtalloc(),
because nothing uses it any more.

Make ND6_HINT an inline, lowercase subroutine, nd6_hint.

I've done my best to convert IP Filter, the ISO stack, and the
AppleTalk stack to rtcache_getrt(). They compile, but I have not
tested them. I have given the changes to PF, GRE, IPv4 and IPv6
stacks a lot of exercise.


Revision tags: nick-csl-alignment-base5 matt-armv6-prevmlocking yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base matt-mips64-base jmcneill-pm-base nick-csl-alignment-base reinoud-bufcleanup-base mjf-ufs-trans-base vmlocking-base
# 1.52 23-May-2007 christos

branches: 1.52.8; 1.52.16; 1.52.20;
Ansify + add a few comments, from Karl Sj��dahl


Revision tags: yamt-idlelwp-base8
# 1.51 02-May-2007 dyoung

Eliminate address family-specific route caches (struct route, struct
route_in6, struct route_iso), replacing all caches with a struct
route.

The principle benefit of this change is that all of the protocol
families can benefit from route cache-invalidation, which is
necessary for correct routing. Route-cache invalidation fixes an
ancient PR, kern/3508, at long last; it fixes various other PRs,
also.

Discussions with and ideas from Joerg Sonnenberger influenced this
work tremendously. Of course, all design oversights and bugs are
mine.

DETAILS

1 I added to each address family a pool of sockaddrs. I have
introduced routines for allocating, copying, and duplicating,
and freeing sockaddrs:

struct sockaddr *sockaddr_alloc(sa_family_t af, int flags);
struct sockaddr *sockaddr_copy(struct sockaddr *dst,
const struct sockaddr *src);
struct sockaddr *sockaddr_dup(const struct sockaddr *src, int flags);
void sockaddr_free(struct sockaddr *sa);

sockaddr_alloc() returns either a sockaddr from the pool belonging
to the specified family, or NULL if the pool is exhausted. The
returned sockaddr has the right size for that family; sa_family
and sa_len fields are initialized to the family and sockaddr
length---e.g., sa_family = AF_INET and sa_len = sizeof(struct
sockaddr_in). sockaddr_free() puts the given sockaddr back into
its family's pool.

sockaddr_dup() and sockaddr_copy() work analogously to strdup()
and strcpy(), respectively. sockaddr_copy() KASSERTs that the
family of the destination and source sockaddrs are alike.

The 'flags' argumet for sockaddr_alloc() and sockaddr_dup() is
passed directly to pool_get(9).

2 I added routines for initializing sockaddrs in each address
family, sockaddr_in_init(), sockaddr_in6_init(), sockaddr_iso_init(),
etc. They are fairly self-explanatory.

3 structs route_in6 and route_iso are no more. All protocol families
use struct route. I have changed the route cache, 'struct route',
so that it does not contain storage space for a sockaddr. Instead,
struct route points to a sockaddr coming from the pool the sockaddr
belongs to. I added a new method to struct route, rtcache_setdst(),
for setting the cache destination:

int rtcache_setdst(struct route *, const struct sockaddr *);

rtcache_setdst() returns 0 on success, or ENOMEM if no memory is
available to create the sockaddr storage.

It is now possible for rtcache_getdst() to return NULL if, say,
rtcache_setdst() failed. I check the return value for NULL
everywhere in the kernel.

4 Each routing domain (struct domain) has a list of live route
caches, dom_rtcache. rtflushall(sa_family_t af) looks up the
domain indicated by 'af', walks the domain's list of route caches
and invalidates each one.


Revision tags: thorpej-atomic-base
# 1.50 04-Mar-2007 christos

branches: 1.50.2; 1.50.4;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.49 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


# 1.48 17-Feb-2007 dyoung

branches: 1.48.2;
Don't open-code LIST_FOREACH().


Revision tags: post-newlock2-merge newlock2-nbase yamt-splraiseipl-base5 yamt-splraiseipl-base4 newlock2-base
# 1.47 15-Dec-2006 joerg

Introduce new helper functions to abstract the route caching.
rtcache_init and rtcache_init_noclone lookup ro_dst and store
the result in ro_rt, taking care of the reference counting and
calling the domain specific route cache.
rtcache_free checks if a route was cashed and frees the reference.
rtcache_copy copies ro_dst of the given struct route, checking that
enough space is available and incrementing the reference count of the
cached rtentry if necessary.
rtcache_check validates that the cached route is still up. If it isn't,
it tries to look it up again. Afterwards ro_rt is either a valid again
or NULL.
rtcache_copy is used internally.

Adjust to callers of rtalloc/rtflush in the tree to check the sanity of
ro_dst first (if necessary). If it doesn't fit the expectations, free
the cache, otherwise check if the cached route is still valid. After
that combination, a single check for ro_rt == NULL is enough to decide
whether a new lookup needs to be done with a different ro_dst.
Make the route checking in gre stricter by repeating the loop check
after revalidation.
Remove some unused RADIX_MPATH code in in6_src.c. The logic is slightly
changed here to first validate the route and check RTF_GATEWAY
afterwards. This is sementically equivalent though.
etherip doesn't need sc_route_expire similiar to the gif changes from
dyoung@ earlier.

Based on the earlier patch from dyoung@, reviewed and discussed with
him.


Revision tags: yamt-splraiseipl-base3
# 1.46 09-Dec-2006 dyoung

Here are various changes designed to protect against bad IPv4
routing caused by stale route caches (struct route). Route caches
are sprinkled throughout PCBs, the IP fast-forwarding table, and
IP tunnel interfaces (gre, gif, stf).

Stale IPv6 and ISO route caches will be treated by separate patches.

Thank you to Christoph Badura for suggesting the general approach
to invalidating route caches that I take here.

Here are the details:

Add hooks to struct domain for tracking and for invalidating each
domain's route caches: dom_rtcache, dom_rtflush, and dom_rtflushall.

Introduce helper subroutines, rtflush(ro) for invalidating a route
cache, rtflushall(family) for invalidating all route caches in a
routing domain, and rtcache(ro) for notifying the domain of a new
cached route.

Chain together all IPv4 route caches where ro_rt != NULL. Provide
in_rtcache() for adding a route to the chain. Provide in_rtflush()
and in_rtflushall() for invalidating IPv4 route caches. In
in_rtflush(), set ro_rt to NULL, and remove the route from the
chain. In in_rtflushall(), walk the chain and remove every route
cache.

In rtrequest1(), call rtflushall() to invalidate route caches when
a route is added.

In gif(4), discard the workaround for stale caches that involves
expiring them every so often.

Replace the pattern 'RTFREE(ro->ro_rt); ro->ro_rt = NULL;' with a
call to rtflush(ro).

Update ipflow_fastforward() and all other users of route caches so
that they expect a cached route, ro->ro_rt, to turn to NULL.

Take care when moving a 'struct route' to rtflush() the source and
to rtcache() the destination.

In domain initializers, use .dom_xxx tags.

KNF here and there.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base abandoned-netbsd-4-base yamt-splraiseipl-base2 yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 netbsd-4-base yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base chap-midi-base rpaulo-netinet-merge-pcb-base
# 1.45 07-Jun-2006 kardel

branches: 1.45.6; 1.45.8;
merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5 simonb-timecounters-base
# 1.44 11-Dec-2005 christos

branches: 1.44.4; 1.44.6; 1.44.8; 1.44.14;
merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.43 26-Jun-2005 mlelstv

branches: 1.43.2;
expire cached route. Fixes PR 22792.


# 1.42 02-Jun-2005 tron

Change the first argument of the encapsulation check function from
"const struct mbuf *" to "struct mbuf *". Without this change the
actual implementation cannot even use m_copydata() on the mbuf chain
which is broken.


# 1.41 02-Jun-2005 tron

Remove type casts and lint directives which are now longer necessary
because the first argument of m_copydata() is "const struct mbuf *" now.


# 1.40 29-May-2005 christos

- avoid shadowed variables
- sprinkle const.


Revision tags: netbsd-3-0-3-RELEASE netbsd-3-0-2-RELEASE netbsd-3-0-1-RELEASE netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.39 26-Feb-2005 perry

branches: 1.39.2;
nuke trailing whitespace


Revision tags: yamt-km-base2 yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.38 22-Apr-2004 matt

branches: 1.38.4; 1.38.6;
Constify protosw arrays. This can reduce the kernel .data section by
over 4K (if all the network protocols) are loaded.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.37 30-Oct-2003 simonb

branches: 1.37.4;
Remove some assigned-to but otherwise unused variables.


# 1.36 05-Sep-2003 itojun

u_short -> u_int16_t. sync w/ kame.
don't set ip6_plen where unneeded (i.e. before calling ip6_output)


# 1.35 22-Aug-2003 itojun

change the additional arg to be passed to ip{,6}_output to struct socket *.

this fixes KAME policy lookup which was broken by the previous commit.


# 1.34 22-Aug-2003 jonathan

Replace the set_socket() method of passing an extra struct socket*
argument to ip6_output() with a new explicit struct in6pcb* argument.
(The underlying socket can be obtained via in6pcb->inp6_socket.)

In preparation for fast-ipsec. Reviewed by itojun.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.33 25-Nov-2002 thorpej

branches: 1.33.6;
Avoid strict-alias warnings.


# 1.32 11-Nov-2002 itojun

make USE_ENCAPCHECK (in netinet*/*gif.c) to global option, GIF_ENCAPCHECK.
#ifdef out unneeded code when possible.
From: Krister Walfridsson <cato@df.lth.se>


# 1.31 05-Nov-2002 itojun

improve gif lookup performance, when there are many of those,
by using radix tree for lookups. tested by yshimizu@iij.


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.30 11-Sep-2002 itojun

KNF - return is not a function. sync w/kame.


Revision tags: gehenna-devsw-base
# 1.29 09-Jun-2002 itojun

whitespace cleanup


# 1.28 08-Jun-2002 itojun

whitespace cleanup


Revision tags: netbsd-1-6-PATCH002-RELEASE netbsd-1-6-PATCH002 netbsd-1-6-PATCH002-RC4 netbsd-1-6-PATCH002-RC3 netbsd-1-6-PATCH002-RC2 netbsd-1-6-PATCH002-RC1 netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base ifpoll-base
# 1.27 21-Dec-2001 itojun

branches: 1.27.8;
use radix table for inbound tunnel lookup (would increase performance
for machines with a lot of tunnels).
update route cache for IPvX-over-IPv6 tunnel on path MTU discovery.
snyc with kame


# 1.26 21-Dec-2001 itojun

move in6_gif_hlim decl to in6_gif.c. sync with kame


# 1.25 21-Dec-2001 itojun

move protosw fragment for gif/stf to their own source code.
reduce #ifdef in stf code. sync with kame


# 1.24 20-Dec-2001 itojun

centralize multicast group management (in6_join/leavegroup).
have a flag for ip6_output() to fragment to minimum MTU.
sync with kame


# 1.23 13-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.22 16-Aug-2001 itojun

gif interface now uses generic software interrupt
(on archs that support it). also, make gif ALTQ-capable on outgoing.
sync with kame, comments from thorpej.


# 1.21 29-Jul-2001 itojun

sync gif interface code with latest kame.
IFF_RUNNING is clearified. attach/detach logic is more clearner.
the old code mistakenly set IFF_UP by itself, now the behavior is gone.


# 1.20 14-May-2001 itojun

branches: 1.20.2;
drop multi destination mode (IFF_LINK0).


# 1.19 10-May-2001 itojun

correct ecn consideration on tunnel encap/decap. sync with kame.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.18 20-Feb-2001 itojun

branches: 1.18.2;
add AF_ISO case to output. from chopps.


# 1.17 20-Feb-2001 itojun

ISO over IPv4/v6 by EON encapsulation. from chopps, sync with kame.


# 1.16 11-Feb-2001 itojun

remove #ifdef __FreeBSD__.


# 1.15 22-Jan-2001 itojun

make it possible to turn off ingress filter on gif/stf tunnel egress,
by using IFF_LINK2. (part of) PR 11163 from Ken Raeburn.


Revision tags: netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base minoura-xpg4dl-base
# 1.14 19-Apr-2000 itojun

branches: 1.14.4;
introduce sys/netinet/ip_encap.c, to dispatch inbound packets
to protocol handlers, based on src/dst (for ip proto #4/41).
see comment in ip_encap.c for details of the problem we have.
there are too many protocol specs for ip proto #4/41.
backward compatibility with MROUTING case is now provided in ip_encap.c.

fix ipip to work with gif (using ip_encap.c). sorry for breakage.

gif now uses ip_encap.c.

introduce stf pseudo interface (implements 6to4, another IPv6-over-IPv4 code
with ip proto #41).


# 1.13 01-Mar-2000 itojun

introduce m->m_pkthdr.aux to hold random data which needs to be passed
between protocol handlers.

ipsec socket pointers, ipsec decryption/auth information, tunnel
decapsulation information are in my mind - there can be several other usage.
at this moment, we use this for ipsec socket pointer passing. this will
avoid reuse of m->m_pkthdr.rcvif in ipsec code.

due to the change, MHLEN will be decreased by sizeof(void *) - for example,
for i386, MHLEN was 100 bytes, but is now 96 bytes.
we may want to increase MSIZE from 128 to 256 for some of our architectures.

take caution if you use it for keeping some data item for long period
of time - use extra caution on M_PREPEND() or m_adj(), as they may result
in loss of m->m_pkthdr.aux pointer (and mbuf leak).

this will bump kernel version.

(as discussed in tech-net, tested in kame tree)


Revision tags: chs-ubc2-newbase
# 1.12 07-Feb-2000 itojun

s/DIAGNOSTIC/DEBUG/


# 1.11 06-Feb-2000 itojun

fix include pathname for better rfc2292 compliance.


# 1.10 06-Jan-2000 itojun

remove extra portability #ifdef (like #ifdef __FreeBSD__) in KAME IPv6/IPsec
code, from netbsd-current repository.
#ifdef'ed version is always available from ftp.kame.net.

XXX please do not make too many diff-unfriendly changes, we'll need to take
bunch of diffs on upgrade...


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.9 15-Dec-1999 itojun

do not overwrite traffic class field when we write IPv6 version field.


# 1.8 13-Dec-1999 itojun

sync IPv6 part with latest KAME tree. IPsec part is left unmodified
due to massive changes in KAME side.
- IPv6 output goes through nd6_output
- faith can capture IPv4 packets as well - you can run IPv4-to-IPv6 translator
using heavily modified DNS servers
- per-interface statistics (required for IPv6 MIB)
- interface autoconfig is revisited
- udp input handling has a big change for mapped address support.
- introduce in4_cksum() for non-overwriting checksumming
- introduce m_pulldown()
- neighbor discovery cleanups/improvements
- netinet/in.h strictly conforms to RFC2553 (no extra defs visible to userland)
- IFA_STATS is fixed a bit (not tested)
- and more more more.

TODO:
- cleanup os-independency #ifdef
- avoid rcvif dual use (for IPsec) to help ifdetach

(sorry for jumbo commit, I can't separate this any more...)


Revision tags: comdex-fall-1999-base fvdl-softdep-base
# 1.7 20-Aug-1999 itojun

branches: 1.7.2; 1.7.8;
do not capture packets by gif, when gif interface is down.


Revision tags: chs-ubc2-base
# 1.6 31-Jul-1999 itojun

sync with recent KAME.
- loosen ipsec restriction on packet diredtion.
- revise icmp6 redirect handling on IsRouter bit.
- tcp/udp notification processing (link-local address case)
- cosmetic fixes (better code share across *BSD).


# 1.5 30-Jul-1999 itojun

remove reference to in6_systm.h (file itself will be removed afterwords)


# 1.4 09-Jul-1999 thorpej

defopt IPSEC and IPSEC_ESP (both into opt_ipsec.h).


# 1.3 03-Jul-1999 thorpej

RCS ID police.


# 1.2 01-Jul-1999 itojun

branches: 1.2.2;
IPv6 kernel code, based on KAME/NetBSD 1.4, SNAP kit 19990628.
(Sorry for a big commit, I can't separate this into several pieces...)
Pls check sys/netinet6/TODO and sys/netinet6/IMPLEMENTATION for details.

- sys/kern: do not assume single mbuf, accept chained mbuf on passing
data from userland to kernel (or other way round).
- "midway" ATM card: ATM PVC pseudo device support, like those done in ALTQ
package (ftp://ftp.csl.sony.co.jp/pub/kjc/).
- sys/netinet/tcp*: IPv4/v6 dual stack tcp support.
- sys/netinet/{ip6,icmp6}.h, sys/net/pfkeyv2.h: IETF document assumes those
file to be there so we patch it up.
- sys/netinet: IPsec additions are here and there.
- sys/netinet6/*: most of IPv6 code sits here.
- sys/netkey: IPsec key management code
- dev/pci/pcidevs: regen

In my understanding no code here is subject to export control so it
should be safe.


# 1.1 28-Jun-1999 itojun

branches: 1.1.2;
file in6_gif.c was initially added on branch kame.


# 1.82 14-Dec-2016 knakahara

fix race of gif_softc->gif_ro when we send multiple flows over gif on NET_MPSAFE enabled kernel.

make gif_softc->gif_ro percpu as well as ipforward_rt to resolve this race.
and add future TODO comment for etherip(4).


# 1.81 12-Dec-2016 ozaki-r

Make the routing table and rtcaches MP-safe

See the following descriptions for details.

Proposed on tech-kern and tech-net


Overview


# 1.80 08-Dec-2016 ozaki-r

Add rtcache_unref to release points of rtentry stemming from rtcache

In the MP-safe world, a rtentry stemming from a rtcache can be freed at any
points. So we need to protect rtentries somehow say by reference couting or
passive references. Regardless of the method, we need to call some release
function of a rtentry after using it.

The change adds a new function rtcache_unref to release a rtentry. At this
point, this function does nothing because for now we don't add a reference
to a rtentry when we get one from a rtcache. We will add something useful
in a further commit.

This change is a part of changes for MP-safe routing table. It is separated
to avoid one big change that makes difficult to debug by bisecting.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726
# 1.79 15-Jul-2016 ozaki-r

Use sin6tosa and sin6tocsa macros

No functional change.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907
# 1.78 06-Jul-2016 ozaki-r

branches: 1.78.2;
Apply m_get_rcvif_psref (kill m_get_rcvif_NOMPSAFE)


# 1.77 04-Jul-2016 knakahara

fix: gif(4) receive side race

A panic cause in rn_match() called by encap[46]_lookup(). The reason is that
gif(4) does not suspend receive packet processing in spite of suspending
transmit packet processing while anyone is doing gif(4) ioctl.


# 1.76 04-Jul-2016 knakahara

let gif(4) promise softint(9) contract (1/2) : gif(4) side

To prevent calling softint_schedule() after called softint_disestablish(),
the following modifications are added
+ ioctl (writing configuration) side
- off IFF_RUNNING flag before changing configuration
- wait softint handler completion before changing configuration
+ packet processing (reading configuraiotn) side
- if IFF_RUNNING flag is on, do nothing
+ in whole
- add gif_list_lock_{enter,exit} to prevent the same configuration is
set to other gif(4) interfaces


# 1.75 28-Jun-2016 ozaki-r

Add missing NULL checks for m_get_rcvif_psref


# 1.74 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


Revision tags: nick-nhusb-base-20160529 nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.73 29-Feb-2016 knakahara

remove unnecessary declarations and fix KNF

Thanks to riastradh@


# 1.72 26-Feb-2016 knakahara

To eliminate gif_softc_list linear search, add extra argument to encapsw.pr_ctlinput().


# 1.71 26-Jan-2016 knakahara

implement encapsw instead of protosw and uniform prototype.

suggested and advised by riastradh@n.o, thanks.

BTW, It seems in_stf_input() had bugs...


# 1.70 23-Jan-2016 riastradh

Those were local changes not meant to be part of the revert. SORRY!


# 1.69 23-Jan-2016 christos

make this compile again


# 1.68 22-Jan-2016 riastradh

Back out previous change to introduce struct encapsw.

This change was intended, but Nakahara-san had already made a better
one locally! So I'll let him commit that one, and I'll try not to
step on anyone's toes again.


# 1.67 22-Jan-2016 riastradh

Don't abuse struct protosw for ip_encap -- introduce struct encapsw.

Mostly mechanical change to replace it, culling some now-needless
boilerplate around all the users.

This does not substantively change the ip_encap API or eliminate
abuse of sketchy pointer casts -- that will come later, and will be
easier now that it is not tangled up with struct protosw.


# 1.66 20-Jan-2016 riastradh

Eliminate struct protosw::pr_output.

You can't use this unless you know what it is a priori: the formal
prototype is variadic, and the different instances (e.g., ip_output,
route_output) have different real prototypes.

Convert the only user of it, raw_send in net/raw_cb.c, to take an
explicit callback argument. Convert the only instances of it,
route_output and key_output, to such explicit callbacks for raw_send.
Use assertions to make sure the conversion to explicit callbacks is
warranted.

Discussed on tech-net with no objections:
https://mail-index.netbsd.org/tech-net/2016/01/16/msg005484.html


# 1.65 18-Jan-2016 knakahara

Refactor protosw codes in gif(4). No functional change.

- remove unnecessary include
- reduce scopes


Revision tags: nick-nhusb-base-20151226
# 1.64 25-Dec-2015 knakahara

use satosin{,6} macros instead of casts.


# 1.63 11-Dec-2015 knakahara

PR kern/50522: gif(4) ioctl causes panic while someone is using the gif(4) interface.

It is required to wait other CPU's softint completion before disestablishing
the softint handler.


Revision tags: nick-nhusb-base-20150921
# 1.62 24-Aug-2015 pooka

sprinkle _KERNEL_OPT


Revision tags: nick-nhusb-base-20150606
# 1.61 24-Apr-2015 ozaki-r

Add missing rtcache_free

It's the same as other similar code paths in in_gif and ip6_etherip.


Revision tags: netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 nick-nhusb-base-20150406 nick-nhusb-base netbsd-7-base tls-earlyentropy-base rmind-smpnet-nbase rmind-smpnet-base tls-maxphys-base
# 1.60 18-May-2014 rmind

branches: 1.60.4;
Add struct pr_usrreqs with a pr_generic function and prepare for the
dismantling of pr_usrreq in the protocols; no functional change intended.
PRU_ATTACH/PRU_DETACH changes will follow soon.

Bump for struct protosw. Welcome to 6.99.62!


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base agc-symver-base
# 1.59 01-Mar-2013 joerg

branches: 1.59.6; 1.59.10;
Retire OSI network stack. OK core@


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE yamt-pagecache-tag8 netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 netbsd-6-1-RC2 netbsd-6-1-RC1 yamt-pagecache-base8 netbsd-6-0-1-RELEASE yamt-pagecache-base7 matt-nb6-plus-nbase yamt-pagecache-base6 netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base rmind-uvmplock-nbase cherry-xenmp-base bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 rmind-uvmplock-base yamt-nfs-mp-base9 uebayasi-xip-base matt-premerge-20091211 yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-nbase yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.58 14-Mar-2009 dsl

branches: 1.58.12; 1.58.22;
Remove all the __P() from sys (excluding sys/dist)
Diff checked with grep and MK1 eyeball.
i386 and amd64 GENERIC and sys still build.


Revision tags: nick-hppapmap-base2 haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.57 07-Nov-2008 dyoung

branches: 1.57.4;
*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-2-3-RELEASE netbsd-5-1-5-RELEASE netbsd-5-2-2-RELEASE netbsd-5-1-4-RELEASE netbsd-5-2-1-RELEASE netbsd-5-1-3-RELEASE netbsd-5-2-RELEASE netbsd-5-2-RC1 netbsd-5-1-2-RELEASE netbsd-5-1-1-RELEASE matt-nb5-mips64-premerge-20101231 matt-nb5-pq3-base netbsd-5-1-RELEASE netbsd-5-1-RC4 matt-nb5-mips64-k15 netbsd-5-1-RC3 netbsd-5-1-RC2 netbsd-5-1-RC1 netbsd-5-0-2-RELEASE matt-nb5-mips64-premerge-20091211 matt-nb5-mips64-u2-k2-k4-k7-k8-k9 matt-nb4-mips64-k7-u2a-k9b matt-nb5-mips64-u1-k1-k5 netbsd-5-0-1-RELEASE netbsd-5-0-RELEASE netbsd-5-0-RC4 netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2 wrstuden-revivesa-base yamt-nfs-mp-base
# 1.56 24-Apr-2008 ad

branches: 1.56.2; 1.56.8; 1.56.10;
Merge the socket locking patch:

- Socket layer becomes MP safe.
- Unix protocols become MP safe.
- Allows protocol processing interrupts to safely block on locks.
- Fixes a number of race conditions.

With much feedback from matt@ and plunky@.


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.55 15-Apr-2008 thorpej

branches: 1.55.2;
Make ip6 and icmp6 stats per-cpu.


# 1.54 08-Apr-2008 thorpej

Change IPv6 stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ip6stat structure; old netstat
binaries will continue to work properly.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base vmlocking2-base3 bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-nbase mjf-devfs-base matt-armv6-base hpcarm-cleanup-base
# 1.53 20-Dec-2007 dyoung

branches: 1.53.6;
Poison struct route->ro_rt uses in the kernel by changing the name
to _ro_rt. Use rtcache_getrt() to access a route cache's struct
rtentry *.

Introduce struct ifnet->if_dl that always points at the interface
identifier/link-layer address. Make code that treated the first
ifaddr on struct ifnet->if_addrlist as the interface address use
if_dl, instead.

Remove stale debugging code from net/route.c. Move the rtflush()
code into rtcache_clear() and delete rtflush(). Delete rtalloc(),
because nothing uses it any more.

Make ND6_HINT an inline, lowercase subroutine, nd6_hint.

I've done my best to convert IP Filter, the ISO stack, and the
AppleTalk stack to rtcache_getrt(). They compile, but I have not
tested them. I have given the changes to PF, GRE, IPv4 and IPv6
stacks a lot of exercise.


Revision tags: nick-csl-alignment-base5 matt-armv6-prevmlocking yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base matt-mips64-base jmcneill-pm-base nick-csl-alignment-base reinoud-bufcleanup-base mjf-ufs-trans-base vmlocking-base
# 1.52 23-May-2007 christos

branches: 1.52.8; 1.52.16; 1.52.20;
Ansify + add a few comments, from Karl Sj��dahl


Revision tags: yamt-idlelwp-base8
# 1.51 02-May-2007 dyoung

Eliminate address family-specific route caches (struct route, struct
route_in6, struct route_iso), replacing all caches with a struct
route.

The principle benefit of this change is that all of the protocol
families can benefit from route cache-invalidation, which is
necessary for correct routing. Route-cache invalidation fixes an
ancient PR, kern/3508, at long last; it fixes various other PRs,
also.

Discussions with and ideas from Joerg Sonnenberger influenced this
work tremendously. Of course, all design oversights and bugs are
mine.

DETAILS

1 I added to each address family a pool of sockaddrs. I have
introduced routines for allocating, copying, and duplicating,
and freeing sockaddrs:

struct sockaddr *sockaddr_alloc(sa_family_t af, int flags);
struct sockaddr *sockaddr_copy(struct sockaddr *dst,
const struct sockaddr *src);
struct sockaddr *sockaddr_dup(const struct sockaddr *src, int flags);
void sockaddr_free(struct sockaddr *sa);

sockaddr_alloc() returns either a sockaddr from the pool belonging
to the specified family, or NULL if the pool is exhausted. The
returned sockaddr has the right size for that family; sa_family
and sa_len fields are initialized to the family and sockaddr
length---e.g., sa_family = AF_INET and sa_len = sizeof(struct
sockaddr_in). sockaddr_free() puts the given sockaddr back into
its family's pool.

sockaddr_dup() and sockaddr_copy() work analogously to strdup()
and strcpy(), respectively. sockaddr_copy() KASSERTs that the
family of the destination and source sockaddrs are alike.

The 'flags' argumet for sockaddr_alloc() and sockaddr_dup() is
passed directly to pool_get(9).

2 I added routines for initializing sockaddrs in each address
family, sockaddr_in_init(), sockaddr_in6_init(), sockaddr_iso_init(),
etc. They are fairly self-explanatory.

3 structs route_in6 and route_iso are no more. All protocol families
use struct route. I have changed the route cache, 'struct route',
so that it does not contain storage space for a sockaddr. Instead,
struct route points to a sockaddr coming from the pool the sockaddr
belongs to. I added a new method to struct route, rtcache_setdst(),
for setting the cache destination:

int rtcache_setdst(struct route *, const struct sockaddr *);

rtcache_setdst() returns 0 on success, or ENOMEM if no memory is
available to create the sockaddr storage.

It is now possible for rtcache_getdst() to return NULL if, say,
rtcache_setdst() failed. I check the return value for NULL
everywhere in the kernel.

4 Each routing domain (struct domain) has a list of live route
caches, dom_rtcache. rtflushall(sa_family_t af) looks up the
domain indicated by 'af', walks the domain's list of route caches
and invalidates each one.


Revision tags: thorpej-atomic-base
# 1.50 04-Mar-2007 christos

branches: 1.50.2; 1.50.4;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.49 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


# 1.48 17-Feb-2007 dyoung

branches: 1.48.2;
Don't open-code LIST_FOREACH().


Revision tags: post-newlock2-merge newlock2-nbase yamt-splraiseipl-base5 yamt-splraiseipl-base4 newlock2-base
# 1.47 15-Dec-2006 joerg

Introduce new helper functions to abstract the route caching.
rtcache_init and rtcache_init_noclone lookup ro_dst and store
the result in ro_rt, taking care of the reference counting and
calling the domain specific route cache.
rtcache_free checks if a route was cashed and frees the reference.
rtcache_copy copies ro_dst of the given struct route, checking that
enough space is available and incrementing the reference count of the
cached rtentry if necessary.
rtcache_check validates that the cached route is still up. If it isn't,
it tries to look it up again. Afterwards ro_rt is either a valid again
or NULL.
rtcache_copy is used internally.

Adjust to callers of rtalloc/rtflush in the tree to check the sanity of
ro_dst first (if necessary). If it doesn't fit the expectations, free
the cache, otherwise check if the cached route is still valid. After
that combination, a single check for ro_rt == NULL is enough to decide
whether a new lookup needs to be done with a different ro_dst.
Make the route checking in gre stricter by repeating the loop check
after revalidation.
Remove some unused RADIX_MPATH code in in6_src.c. The logic is slightly
changed here to first validate the route and check RTF_GATEWAY
afterwards. This is sementically equivalent though.
etherip doesn't need sc_route_expire similiar to the gif changes from
dyoung@ earlier.

Based on the earlier patch from dyoung@, reviewed and discussed with
him.


Revision tags: yamt-splraiseipl-base3
# 1.46 09-Dec-2006 dyoung

Here are various changes designed to protect against bad IPv4
routing caused by stale route caches (struct route). Route caches
are sprinkled throughout PCBs, the IP fast-forwarding table, and
IP tunnel interfaces (gre, gif, stf).

Stale IPv6 and ISO route caches will be treated by separate patches.

Thank you to Christoph Badura for suggesting the general approach
to invalidating route caches that I take here.

Here are the details:

Add hooks to struct domain for tracking and for invalidating each
domain's route caches: dom_rtcache, dom_rtflush, and dom_rtflushall.

Introduce helper subroutines, rtflush(ro) for invalidating a route
cache, rtflushall(family) for invalidating all route caches in a
routing domain, and rtcache(ro) for notifying the domain of a new
cached route.

Chain together all IPv4 route caches where ro_rt != NULL. Provide
in_rtcache() for adding a route to the chain. Provide in_rtflush()
and in_rtflushall() for invalidating IPv4 route caches. In
in_rtflush(), set ro_rt to NULL, and remove the route from the
chain. In in_rtflushall(), walk the chain and remove every route
cache.

In rtrequest1(), call rtflushall() to invalidate route caches when
a route is added.

In gif(4), discard the workaround for stale caches that involves
expiring them every so often.

Replace the pattern 'RTFREE(ro->ro_rt); ro->ro_rt = NULL;' with a
call to rtflush(ro).

Update ipflow_fastforward() and all other users of route caches so
that they expect a cached route, ro->ro_rt, to turn to NULL.

Take care when moving a 'struct route' to rtflush() the source and
to rtcache() the destination.

In domain initializers, use .dom_xxx tags.

KNF here and there.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base abandoned-netbsd-4-base yamt-splraiseipl-base2 yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 netbsd-4-base yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base chap-midi-base rpaulo-netinet-merge-pcb-base
# 1.45 07-Jun-2006 kardel

branches: 1.45.6; 1.45.8;
merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5 simonb-timecounters-base
# 1.44 11-Dec-2005 christos

branches: 1.44.4; 1.44.6; 1.44.8; 1.44.14;
merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.43 26-Jun-2005 mlelstv

branches: 1.43.2;
expire cached route. Fixes PR 22792.


# 1.42 02-Jun-2005 tron

Change the first argument of the encapsulation check function from
"const struct mbuf *" to "struct mbuf *". Without this change the
actual implementation cannot even use m_copydata() on the mbuf chain
which is broken.


# 1.41 02-Jun-2005 tron

Remove type casts and lint directives which are now longer necessary
because the first argument of m_copydata() is "const struct mbuf *" now.


# 1.40 29-May-2005 christos

- avoid shadowed variables
- sprinkle const.


Revision tags: netbsd-3-0-3-RELEASE netbsd-3-0-2-RELEASE netbsd-3-0-1-RELEASE netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.39 26-Feb-2005 perry

branches: 1.39.2;
nuke trailing whitespace


Revision tags: yamt-km-base2 yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.38 22-Apr-2004 matt

branches: 1.38.4; 1.38.6;
Constify protosw arrays. This can reduce the kernel .data section by
over 4K (if all the network protocols) are loaded.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.37 30-Oct-2003 simonb

branches: 1.37.4;
Remove some assigned-to but otherwise unused variables.


# 1.36 05-Sep-2003 itojun

u_short -> u_int16_t. sync w/ kame.
don't set ip6_plen where unneeded (i.e. before calling ip6_output)


# 1.35 22-Aug-2003 itojun

change the additional arg to be passed to ip{,6}_output to struct socket *.

this fixes KAME policy lookup which was broken by the previous commit.


# 1.34 22-Aug-2003 jonathan

Replace the set_socket() method of passing an extra struct socket*
argument to ip6_output() with a new explicit struct in6pcb* argument.
(The underlying socket can be obtained via in6pcb->inp6_socket.)

In preparation for fast-ipsec. Reviewed by itojun.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.33 25-Nov-2002 thorpej

branches: 1.33.6;
Avoid strict-alias warnings.


# 1.32 11-Nov-2002 itojun

make USE_ENCAPCHECK (in netinet*/*gif.c) to global option, GIF_ENCAPCHECK.
#ifdef out unneeded code when possible.
From: Krister Walfridsson <cato@df.lth.se>


# 1.31 05-Nov-2002 itojun

improve gif lookup performance, when there are many of those,
by using radix tree for lookups. tested by yshimizu@iij.


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.30 11-Sep-2002 itojun

KNF - return is not a function. sync w/kame.


Revision tags: gehenna-devsw-base
# 1.29 09-Jun-2002 itojun

whitespace cleanup


# 1.28 08-Jun-2002 itojun

whitespace cleanup


Revision tags: netbsd-1-6-PATCH002-RELEASE netbsd-1-6-PATCH002 netbsd-1-6-PATCH002-RC4 netbsd-1-6-PATCH002-RC3 netbsd-1-6-PATCH002-RC2 netbsd-1-6-PATCH002-RC1 netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base ifpoll-base
# 1.27 21-Dec-2001 itojun

branches: 1.27.8;
use radix table for inbound tunnel lookup (would increase performance
for machines with a lot of tunnels).
update route cache for IPvX-over-IPv6 tunnel on path MTU discovery.
snyc with kame


# 1.26 21-Dec-2001 itojun

move in6_gif_hlim decl to in6_gif.c. sync with kame


# 1.25 21-Dec-2001 itojun

move protosw fragment for gif/stf to their own source code.
reduce #ifdef in stf code. sync with kame


# 1.24 20-Dec-2001 itojun

centralize multicast group management (in6_join/leavegroup).
have a flag for ip6_output() to fragment to minimum MTU.
sync with kame


# 1.23 13-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.22 16-Aug-2001 itojun

gif interface now uses generic software interrupt
(on archs that support it). also, make gif ALTQ-capable on outgoing.
sync with kame, comments from thorpej.


# 1.21 29-Jul-2001 itojun

sync gif interface code with latest kame.
IFF_RUNNING is clearified. attach/detach logic is more clearner.
the old code mistakenly set IFF_UP by itself, now the behavior is gone.


# 1.20 14-May-2001 itojun

branches: 1.20.2;
drop multi destination mode (IFF_LINK0).


# 1.19 10-May-2001 itojun

correct ecn consideration on tunnel encap/decap. sync with kame.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.18 20-Feb-2001 itojun

branches: 1.18.2;
add AF_ISO case to output. from chopps.


# 1.17 20-Feb-2001 itojun

ISO over IPv4/v6 by EON encapsulation. from chopps, sync with kame.


# 1.16 11-Feb-2001 itojun

remove #ifdef __FreeBSD__.


# 1.15 22-Jan-2001 itojun

make it possible to turn off ingress filter on gif/stf tunnel egress,
by using IFF_LINK2. (part of) PR 11163 from Ken Raeburn.


Revision tags: netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base minoura-xpg4dl-base
# 1.14 19-Apr-2000 itojun

branches: 1.14.4;
introduce sys/netinet/ip_encap.c, to dispatch inbound packets
to protocol handlers, based on src/dst (for ip proto #4/41).
see comment in ip_encap.c for details of the problem we have.
there are too many protocol specs for ip proto #4/41.
backward compatibility with MROUTING case is now provided in ip_encap.c.

fix ipip to work with gif (using ip_encap.c). sorry for breakage.

gif now uses ip_encap.c.

introduce stf pseudo interface (implements 6to4, another IPv6-over-IPv4 code
with ip proto #41).


# 1.13 01-Mar-2000 itojun

introduce m->m_pkthdr.aux to hold random data which needs to be passed
between protocol handlers.

ipsec socket pointers, ipsec decryption/auth information, tunnel
decapsulation information are in my mind - there can be several other usage.
at this moment, we use this for ipsec socket pointer passing. this will
avoid reuse of m->m_pkthdr.rcvif in ipsec code.

due to the change, MHLEN will be decreased by sizeof(void *) - for example,
for i386, MHLEN was 100 bytes, but is now 96 bytes.
we may want to increase MSIZE from 128 to 256 for some of our architectures.

take caution if you use it for keeping some data item for long period
of time - use extra caution on M_PREPEND() or m_adj(), as they may result
in loss of m->m_pkthdr.aux pointer (and mbuf leak).

this will bump kernel version.

(as discussed in tech-net, tested in kame tree)


Revision tags: chs-ubc2-newbase
# 1.12 07-Feb-2000 itojun

s/DIAGNOSTIC/DEBUG/


# 1.11 06-Feb-2000 itojun

fix include pathname for better rfc2292 compliance.


# 1.10 06-Jan-2000 itojun

remove extra portability #ifdef (like #ifdef __FreeBSD__) in KAME IPv6/IPsec
code, from netbsd-current repository.
#ifdef'ed version is always available from ftp.kame.net.

XXX please do not make too many diff-unfriendly changes, we'll need to take
bunch of diffs on upgrade...


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.9 15-Dec-1999 itojun

do not overwrite traffic class field when we write IPv6 version field.


# 1.8 13-Dec-1999 itojun

sync IPv6 part with latest KAME tree. IPsec part is left unmodified
due to massive changes in KAME side.
- IPv6 output goes through nd6_output
- faith can capture IPv4 packets as well - you can run IPv4-to-IPv6 translator
using heavily modified DNS servers
- per-interface statistics (required for IPv6 MIB)
- interface autoconfig is revisited
- udp input handling has a big change for mapped address support.
- introduce in4_cksum() for non-overwriting checksumming
- introduce m_pulldown()
- neighbor discovery cleanups/improvements
- netinet/in.h strictly conforms to RFC2553 (no extra defs visible to userland)
- IFA_STATS is fixed a bit (not tested)
- and more more more.

TODO:
- cleanup os-independency #ifdef
- avoid rcvif dual use (for IPsec) to help ifdetach

(sorry for jumbo commit, I can't separate this any more...)


Revision tags: comdex-fall-1999-base fvdl-softdep-base
# 1.7 20-Aug-1999 itojun

branches: 1.7.2; 1.7.8;
do not capture packets by gif, when gif interface is down.


Revision tags: chs-ubc2-base
# 1.6 31-Jul-1999 itojun

sync with recent KAME.
- loosen ipsec restriction on packet diredtion.
- revise icmp6 redirect handling on IsRouter bit.
- tcp/udp notification processing (link-local address case)
- cosmetic fixes (better code share across *BSD).


# 1.5 30-Jul-1999 itojun

remove reference to in6_systm.h (file itself will be removed afterwords)


# 1.4 09-Jul-1999 thorpej

defopt IPSEC and IPSEC_ESP (both into opt_ipsec.h).


# 1.3 03-Jul-1999 thorpej

RCS ID police.


# 1.2 01-Jul-1999 itojun

branches: 1.2.2;
IPv6 kernel code, based on KAME/NetBSD 1.4, SNAP kit 19990628.
(Sorry for a big commit, I can't separate this into several pieces...)
Pls check sys/netinet6/TODO and sys/netinet6/IMPLEMENTATION for details.

- sys/kern: do not assume single mbuf, accept chained mbuf on passing
data from userland to kernel (or other way round).
- "midway" ATM card: ATM PVC pseudo device support, like those done in ALTQ
package (ftp://ftp.csl.sony.co.jp/pub/kjc/).
- sys/netinet/tcp*: IPv4/v6 dual stack tcp support.
- sys/netinet/{ip6,icmp6}.h, sys/net/pfkeyv2.h: IETF document assumes those
file to be there so we patch it up.
- sys/netinet: IPsec additions are here and there.
- sys/netinet6/*: most of IPv6 code sits here.
- sys/netkey: IPsec key management code
- dev/pci/pcidevs: regen

In my understanding no code here is subject to export control so it
should be safe.


# 1.1 28-Jun-1999 itojun

branches: 1.1.2;
file in6_gif.c was initially added on branch kame.