History log of /netbsd-current/sys/net/if_stf.c
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# 1.109 03-Sep-2022 thorpej

Garbage-collect the remaining vestiges of netisr.


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base thorpej-i2c-spi-conf-base
# 1.108 16-Jun-2021 riastradh

if_attach and if_initialize cannot fail, don't test return value

These were originally made failable back in 2017 when if_initialize
allocated a softint in every interface for link state changes, so
that it could fail gracefully instead of panicking:

https://mail-index.NetBSD.org/source-changes/2017/10/23/msg089053.html

However, this spawned many seldom- or never-tested error branches,
which are risky to have around. And that softint in every interface
has since been replaced by a single global workqueue, because link
state changes require thread context but not low latency or high
throughput:

https://mail-index.NetBSD.org/source-changes/2020/02/06/msg113759.html

So there is no longer any reason for if_initialize to fail. (The
subroutine if_stats_init can't fail because percpu_alloc can't fail
either.)

There is a snag: the softint_establish in if_percpuq_create could
fail, potentially leading to bad consequences later on trying to use
the softint. This change doesn't introduce any new bugs because of
the snag -- if_percpuq_attach was already broken. However, the snag
can be better addressed without spawning error branches, either by
using a single softint or making softints less scarce.

(Separate commit will change the signatures of if_attach and
if_initialize to return void, scheduled to ride whatever is the next
convenient kernel bump.)

Patch and testing on amd64 and evbmips64-eb by maya@; commit message
soliloquy, and compile-testing on evbppc/i386/earmv7hf, by me.


Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-cfargs-base thorpej-futex-base bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.107 29-Jan-2020 thorpej

branches: 1.107.10;
Adopt <net/if_stats.h>.


Revision tags: netbsd-9-3-RELEASE netbsd-9-2-RELEASE netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 ad-namecache-base2 ad-namecache-base1 ad-namecache-base netbsd-9-0-RC1 phil-wifi-20191119 netbsd-9-base phil-wifi-20190609
# 1.106 26-Apr-2019 pgoyette

branches: 1.106.4;
Some more empty-string --> NULL conversions for module dependencies


Revision tags: isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base
# 1.105 26-Jun-2018 msaitoh

branches: 1.105.2;
Implement the BPF direction filter (BIOC[GS]DIRECTION). It provides backward
compatibility with BIOC[GS]SEESENT ioctl. The userland interface is the same
as FreeBSD.

This change also fixes a bug that the direction is misunderstand on some
environment by passing the direction to bpf_mtap*() instead of checking
m->m_pkthdr.rcvif.


Revision tags: pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502
# 1.104 01-May-2018 maxv

Remove now unused net_osdep.h includes, the other BSDs did the same.


Revision tags: pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base tls-maxphys-base-20171202
# 1.103 15-Nov-2017 knakahara

branches: 1.103.2;
Add argument to encapsw->pr_input() instead of m_tag.


# 1.102 23-Oct-2017 msaitoh

- If if_attach() failed in the attach function, free resources and return.
- KNF


Revision tags: matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107
# 1.101 12-Dec-2016 ozaki-r

branches: 1.101.8;
Make the routing table and rtcaches MP-safe

See the following descriptions for details.

Proposed on tech-kern and tech-net


Overview


# 1.100 08-Dec-2016 ozaki-r

Add rtcache_unref to release points of rtentry stemming from rtcache

In the MP-safe world, a rtentry stemming from a rtcache can be freed at any
points. So we need to protect rtentries somehow say by reference couting or
passive references. Regardless of the method, we need to call some release
function of a rtentry after using it.

The change adds a new function rtcache_unref to release a rtentry. At this
point, this function does nothing because for now we don't add a reference
to a rtentry when we get one from a rtcache. We will add something useful
in a further commit.

This change is a part of changes for MP-safe routing table. It is separated
to avoid one big change that makes difficult to debug by bisecting.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914
# 1.99 18-Aug-2016 knakahara

eliminate stf(4)'s dependency on gif(4).

stf(4) depends on not gif(4) but ip_encap.


# 1.98 07-Aug-2016 christos

modularize some more drivers and merge the module glue


Revision tags: pgoyette-localcount-20160806
# 1.97 01-Aug-2016 ozaki-r

Apply pserialize and psref to struct ifaddr and its variants

This change makes struct ifaddr and its variants (in_ifaddr and in6_ifaddr)
MP-safe by using pserialize and psref. At this moment, pserialize_perform
and psref_target_destroy are disabled because (1) we don't need them
because of softnet_lock (2) they cause a deadlock because of softnet_lock.
So we'll enable them when we remove softnet_lock in the future.


Revision tags: pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.96 08-Jul-2016 ozaki-r

branches: 1.96.2;
Replace macros to get an IP address with proper inline functions

The inline functions are more friendly for applying psz/psref;
they consist of only simple interations.


# 1.95 07-Jul-2016 ozaki-r

Switch the address list of intefaces to pslist(9)

As usual, we leave the old list to avoid breaking kvm(3) users.


# 1.94 06-Jul-2016 ozaki-r

Switch the IPv4 address list to pslist(9)

Note that we leave the old list just in case; it seems there are some
kvm(3) users accessing the list. We can remove it later if we confirmed
nobody does actually.


# 1.93 04-Jul-2016 knakahara

make encap_lock_{enter,exit} interruptable.


# 1.92 04-Jul-2016 knakahara

let gif(4) promise softint(9) contract (2/2) : ip_encap side

The last commit does not care encaptab. This commit fixes encaptab race which
is used not only gif(4).


# 1.91 22-Jun-2016 ozaki-r

Remove unnecessary NULL checks of ifa->ifa_addr

If it's NULL, it should be a bug. There many IFADDR_FOREACH that don't do
NULL check. If it can be NULL, they should fire already.


# 1.90 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


# 1.89 10-Jun-2016 ozaki-r

Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.88 28-Apr-2016 ozaki-r

Constify rtentry of if_output

We no longer need to change rtentry below if_output.

The change makes it clear where rtentries are changed (or not)
and helps forthcoming locking (os psrefing) rtentries.


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.87 28-Jan-2016 knakahara

fix my wrong modification


# 1.86 26-Jan-2016 knakahara

implement encapsw instead of protosw and uniform prototype.

suggested and advised by riastradh@n.o, thanks.

BTW, It seems in_stf_input() had bugs...


# 1.85 22-Jan-2016 riastradh

Back out previous change to introduce struct encapsw.

This change was intended, but Nakahara-san had already made a better
one locally! So I'll let him commit that one, and I'll try not to
step on anyone's toes again.


# 1.84 22-Jan-2016 riastradh

Don't abuse struct protosw for ip_encap -- introduce struct encapsw.

Mostly mechanical change to replace it, culling some now-needless
boilerplate around all the users.

This does not substantively change the ip_encap API or eliminate
abuse of sketchy pointer casts -- that will come later, and will be
easier now that it is not tangled up with struct protosw.


# 1.83 20-Jan-2016 riastradh

Eliminate struct protosw::pr_output.

You can't use this unless you know what it is a priori: the formal
prototype is variadic, and the different instances (e.g., ip_output,
route_output) have different real prototypes.

Convert the only user of it, raw_send in net/raw_cb.c, to take an
explicit callback argument. Convert the only instances of it,
route_output and key_output, to such explicit callbacks for raw_send.
Use assertions to make sure the conversion to explicit callbacks is
warranted.

Discussed on tech-net with no objections:
https://mail-index.netbsd.org/tech-net/2016/01/16/msg005484.html


Revision tags: nick-nhusb-base-20151226 nick-nhusb-base-20150921
# 1.82 24-Aug-2015 pooka

sprinkle _KERNEL_OPT


# 1.81 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


Revision tags: netbsd-7-2-RELEASE netbsd-7-1-2-RELEASE netbsd-7-1-1-RELEASE netbsd-7-1-RELEASE netbsd-7-1-RC2 netbsd-7-nhusb-base-20170116 netbsd-7-1-RC1 netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 nick-nhusb-base-20150606 nick-nhusb-base-20150406 nick-nhusb-base netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.80 12-Jun-2014 christos

branches: 1.80.4;
PR/48901: Fail at compile time when trying to compile stf without inet6,
and print an explanatory message.


# 1.79 05-Jun-2014 rmind

- Implement pktqueue interface for lockless IP input queue.
- Replace ipintrq and ip6intrq with the pktqueue mechanism.
- Eliminate kernel-lock from ipintr() and ip6intr().
- Some preparation work to push softnet_lock out of ipintr().

Discussed on tech-net.


Revision tags: rmind-smpnet-nbase rmind-smpnet-base
# 1.78 18-May-2014 rmind

Add struct pr_usrreqs with a pr_generic function and prepare for the
dismantling of pr_usrreq in the protocols; no functional change intended.
PRU_ATTACH/PRU_DETACH changes will follow soon.

Bump for struct protosw. Welcome to 6.99.62!


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE yamt-pagecache-base9 yamt-pagecache-tag8 netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 agc-symver-base netbsd-6-1-RC2 netbsd-6-1-RC1 yamt-pagecache-base8 netbsd-6-0-1-RELEASE yamt-pagecache-base7 matt-nb6-plus-nbase yamt-pagecache-base6 netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.77 28-Oct-2011 dyoung

branches: 1.77.12; 1.77.16; 1.77.26;
Don't kauth-orize SIOCSIFMTU in pppsioctl() and stf_ioctl(), ifioctl()
has already done that for us.


# 1.76 17-Jul-2011 joerg

Retire varargs.h support. Move machine/stdarg.h logic into MI
sys/stdarg.h and expect compiler to provide proper builtins, defaulting
to the GCC interface. lint still has a special fallback.
Reduce abuse of _BSD_VA_LIST_ by defining __va_list by default and
derive va_list as required by standards.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 rmind-uvmplock-base
# 1.75 05-Apr-2010 joerg

Push the bpf_ops usage back into bpf.h. Push the common ifp->if_bpf
check into the inline functions as well the fourth argument for
bpf_attach.


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.74 19-Jan-2010 pooka

branches: 1.74.2; 1.74.4;
Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


Revision tags: matt-premerge-20091211
# 1.73 08-Nov-2009 christos

PR/42285: PR/41559: Daniel Hagerty: if_stf doesn't count output bytes


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-nbase yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.72 18-Apr-2009 tsutsui

Remove extra whitespace added by a stupid tool.
XXX: more in src/sys/arch


# 1.71 15-Apr-2009 elad

Remove a few KAUTH_GENERIC_ISSUSER in favor of more descriptive
alternatives.

Discussed on tech-kern:

http://mail-index.netbsd.org/tech-kern/2009/04/11/msg004798.html

Input from ad@, christos@, dyoung@, tsutsui@.

Okay ad@.


# 1.70 18-Mar-2009 cegger

bcopy -> memcpy


# 1.69 18-Mar-2009 cegger

bcmp -> memcmp


Revision tags: nick-hppapmap-base2 haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.68 07-Nov-2008 dyoung

branches: 1.68.4;
*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-2-3-RELEASE netbsd-5-1-5-RELEASE netbsd-5-2-2-RELEASE netbsd-5-1-4-RELEASE netbsd-5-2-1-RELEASE netbsd-5-1-3-RELEASE netbsd-5-2-RELEASE netbsd-5-2-RC1 netbsd-5-1-2-RELEASE netbsd-5-1-1-RELEASE matt-nb5-mips64-premerge-20101231 matt-nb5-pq3-base netbsd-5-1-RELEASE netbsd-5-1-RC4 matt-nb5-mips64-k15 netbsd-5-1-RC3 netbsd-5-1-RC2 netbsd-5-1-RC1 netbsd-5-0-2-RELEASE matt-nb5-mips64-premerge-20091211 matt-nb5-mips64-u2-k2-k4-k7-k8-k9 matt-nb4-mips64-k7-u2a-k9b matt-nb5-mips64-u1-k1-k5 netbsd-5-0-1-RELEASE netbsd-5-0-RELEASE netbsd-5-0-RC4 netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2
# 1.67 24-Oct-2008 dyoung

branches: 1.67.2;
Constify the rt_addrinfo argument to the ifa_rtrequest member
function of struct ifaddr.


Revision tags: haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base wrstuden-revivesa-base
# 1.66 15-Jun-2008 christos

branches: 1.66.2;
- add if_alloc (ours just mallocs), and if_initname and use them (from FreeBSD)
- kill memsets where M_ZERO can be used.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase hpcarm-cleanup-base
# 1.65 20-Feb-2008 matt

branches: 1.65.6; 1.65.8; 1.65.10; 1.65.12; 1.65.14;
s/u_\(int[0-9]*_t\)/u\1/g
(change u_int*_t to uint*_t)


Revision tags: mjf-devfs-base
# 1.64 07-Feb-2008 dyoung

Start patching up the kernel so that a network driver always has
the opportunity to handle an ioctl before generic ifioctl handling
occurs. This will ease extending the kernel and sharing of code
between drivers.

First steps: Make the signature of ifioctl_common() match struct
ifinet->if_ioctl. Convert SIOCSIFCAP and SIOCSIFMTU to the new
ifioctl() regime, throughout the kernel.


Revision tags: vmlocking2-base3 bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-base
# 1.63 20-Dec-2007 dyoung

Poison struct route->ro_rt uses in the kernel by changing the name
to _ro_rt. Use rtcache_getrt() to access a route cache's struct
rtentry *.

Introduce struct ifnet->if_dl that always points at the interface
identifier/link-layer address. Make code that treated the first
ifaddr on struct ifnet->if_addrlist as the interface address use
if_dl, instead.

Remove stale debugging code from net/route.c. Move the rtflush()
code into rtcache_clear() and delete rtflush(). Delete rtalloc(),
because nothing uses it any more.

Make ND6_HINT an inline, lowercase subroutine, nd6_hint.

I've done my best to convert IP Filter, the ISO stack, and the
AppleTalk stack to rtcache_getrt(). They compile, but I have not
tested them. I have given the changes to PF, GRE, IPv4 and IPv6
stacks a lot of exercise.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base jmcneill-pm-base reinoud-bufcleanup-base
# 1.62 19-Oct-2007 ad

branches: 1.62.2; 1.62.4; 1.62.8;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base4 yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base vmlocking-base
# 1.61 01-Sep-2007 dyoung

branches: 1.61.4;
Use ifreq_setaddr(), ifreq_getaddr(), sockaddr_in_init(), and
sockaddr_copy(). Constify. Compare pointers with NULL, not 0.
Don't "test truth" of pointers, but compare with NULL.


Revision tags: matt-mips64-base nick-csl-alignment-base yamt-idlelwp-base8 mjf-ufs-trans-base
# 1.60 02-May-2007 dyoung

branches: 1.60.2; 1.60.6; 1.60.8;
Eliminate address family-specific route caches (struct route, struct
route_in6, struct route_iso), replacing all caches with a struct
route.

The principle benefit of this change is that all of the protocol
families can benefit from route cache-invalidation, which is
necessary for correct routing. Route-cache invalidation fixes an
ancient PR, kern/3508, at long last; it fixes various other PRs,
also.

Discussions with and ideas from Joerg Sonnenberger influenced this
work tremendously. Of course, all design oversights and bugs are
mine.

DETAILS

1 I added to each address family a pool of sockaddrs. I have
introduced routines for allocating, copying, and duplicating,
and freeing sockaddrs:

struct sockaddr *sockaddr_alloc(sa_family_t af, int flags);
struct sockaddr *sockaddr_copy(struct sockaddr *dst,
const struct sockaddr *src);
struct sockaddr *sockaddr_dup(const struct sockaddr *src, int flags);
void sockaddr_free(struct sockaddr *sa);

sockaddr_alloc() returns either a sockaddr from the pool belonging
to the specified family, or NULL if the pool is exhausted. The
returned sockaddr has the right size for that family; sa_family
and sa_len fields are initialized to the family and sockaddr
length---e.g., sa_family = AF_INET and sa_len = sizeof(struct
sockaddr_in). sockaddr_free() puts the given sockaddr back into
its family's pool.

sockaddr_dup() and sockaddr_copy() work analogously to strdup()
and strcpy(), respectively. sockaddr_copy() KASSERTs that the
family of the destination and source sockaddrs are alike.

The 'flags' argumet for sockaddr_alloc() and sockaddr_dup() is
passed directly to pool_get(9).

2 I added routines for initializing sockaddrs in each address
family, sockaddr_in_init(), sockaddr_in6_init(), sockaddr_iso_init(),
etc. They are fairly self-explanatory.

3 structs route_in6 and route_iso are no more. All protocol families
use struct route. I have changed the route cache, 'struct route',
so that it does not contain storage space for a sockaddr. Instead,
struct route points to a sockaddr coming from the pool the sockaddr
belongs to. I added a new method to struct route, rtcache_setdst(),
for setting the cache destination:

int rtcache_setdst(struct route *, const struct sockaddr *);

rtcache_setdst() returns 0 on success, or ENOMEM if no memory is
available to create the sockaddr storage.

It is now possible for rtcache_getdst() to return NULL if, say,
rtcache_setdst() failed. I check the return value for NULL
everywhere in the kernel.

4 Each routing domain (struct domain) has a list of live route
caches, dom_rtcache. rtflushall(sa_family_t af) looks up the
domain indicated by 'af', walks the domain's list of route caches
and invalidates each one.


Revision tags: thorpej-atomic-base
# 1.59 29-Mar-2007 ad

lwp::l_acflag is no longer used.


# 1.58 04-Mar-2007 christos

branches: 1.58.2; 1.58.4; 1.58.6;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.57 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


Revision tags: post-newlock2-merge newlock2-nbase yamt-splraiseipl-base5 yamt-splraiseipl-base4 newlock2-base
# 1.56 15-Dec-2006 joerg

branches: 1.56.2;
Introduce new helper functions to abstract the route caching.
rtcache_init and rtcache_init_noclone lookup ro_dst and store
the result in ro_rt, taking care of the reference counting and
calling the domain specific route cache.
rtcache_free checks if a route was cashed and frees the reference.
rtcache_copy copies ro_dst of the given struct route, checking that
enough space is available and incrementing the reference count of the
cached rtentry if necessary.
rtcache_check validates that the cached route is still up. If it isn't,
it tries to look it up again. Afterwards ro_rt is either a valid again
or NULL.
rtcache_copy is used internally.

Adjust to callers of rtalloc/rtflush in the tree to check the sanity of
ro_dst first (if necessary). If it doesn't fit the expectations, free
the cache, otherwise check if the cached route is still valid. After
that combination, a single check for ro_rt == NULL is enough to decide
whether a new lookup needs to be done with a different ro_dst.
Make the route checking in gre stricter by repeating the loop check
after revalidation.
Remove some unused RADIX_MPATH code in in6_src.c. The logic is slightly
changed here to first validate the route and check RTF_GATEWAY
afterwards. This is sementically equivalent though.
etherip doesn't need sc_route_expire similiar to the gif changes from
dyoung@ earlier.

Based on the earlier patch from dyoung@, reviewed and discussed with
him.


Revision tags: yamt-splraiseipl-base3
# 1.55 09-Dec-2006 dyoung

Here are various changes designed to protect against bad IPv4
routing caused by stale route caches (struct route). Route caches
are sprinkled throughout PCBs, the IP fast-forwarding table, and
IP tunnel interfaces (gre, gif, stf).

Stale IPv6 and ISO route caches will be treated by separate patches.

Thank you to Christoph Badura for suggesting the general approach
to invalidating route caches that I take here.

Here are the details:

Add hooks to struct domain for tracking and for invalidating each
domain's route caches: dom_rtcache, dom_rtflush, and dom_rtflushall.

Introduce helper subroutines, rtflush(ro) for invalidating a route
cache, rtflushall(family) for invalidating all route caches in a
routing domain, and rtcache(ro) for notifying the domain of a new
cached route.

Chain together all IPv4 route caches where ro_rt != NULL. Provide
in_rtcache() for adding a route to the chain. Provide in_rtflush()
and in_rtflushall() for invalidating IPv4 route caches. In
in_rtflush(), set ro_rt to NULL, and remove the route from the
chain. In in_rtflushall(), walk the chain and remove every route
cache.

In rtrequest1(), call rtflushall() to invalidate route caches when
a route is added.

In gif(4), discard the workaround for stale caches that involves
expiring them every so often.

Replace the pattern 'RTFREE(ro->ro_rt); ro->ro_rt = NULL;' with a
call to rtflush(ro).

Update ipflow_fastforward() and all other users of route caches so
that they expect a cached route, ro->ro_rt, to turn to NULL.

Take care when moving a 'struct route' to rtflush() the source and
to rtcache() the destination.

In domain initializers, use .dom_xxx tags.

KNF here and there.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.54 16-Nov-2006 christos

__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.53 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


Revision tags: abandoned-netbsd-4-base yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.52 23-Jul-2006 ad

branches: 1.52.4; 1.52.6;
Use the LWP cached credentials where sane.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.51 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.50 11-Dec-2005 thorpej

branches: 1.50.4; 1.50.6; 1.50.8; 1.50.10; 1.50.12;
ANSI function decls and application of static.


# 1.49 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.48 02-Jun-2005 tron

branches: 1.48.2;
Change the first argument of the encapsulation check function from
"const struct mbuf *" to "struct mbuf *". Without this change the
actual implementation cannot even use m_copydata() on the mbuf chain
which is broken.


# 1.47 02-Jun-2005 tron

Remove type casts and lint directives which are now longer necessary
because the first argument of m_copydata() is "const struct mbuf *" now.


Revision tags: netbsd-3-1-1-RELEASE netbsd-3-0-3-RELEASE netbsd-3-1-RELEASE netbsd-3-0-2-RELEASE netbsd-3-1-RC4 netbsd-3-1-RC3 netbsd-3-1-RC2 netbsd-3-1-RC1 netbsd-3-0-1-RELEASE netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.46 11-Mar-2005 tron

Add support for changing the MTU to stf(4).


# 1.45 26-Feb-2005 perry

nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.44 25-Jan-2005 matt

Switch to using ifa for ifaddr's instead of ia (which are traditionally
used for in_ifaddr's) which could lead to confusion.


Revision tags: yamt-km-base
# 1.43 25-Jan-2005 tron

branches: 1.43.2;
Fix cut and paste error in last commit.


# 1.42 24-Jan-2005 matt

Add IFNET_FOREACH and IFADDR_FOREACH macros and start using them.


Revision tags: kent-audio1-beforemerge kent-audio1-base
# 1.41 04-Dec-2004 peter

branches: 1.41.4;
Change ifc_destroy to return an int instead of void, so that it
can pass back errors to ifconfig.


# 1.40 19-Aug-2004 christos

Factor out the hand-crafting of mbufs from the interface files. Reviewed by
gimpy. XXX: I could have used bpf_mtap2 on some of the new functions, but I
chose not to, because I just wanted to do what amounts to a code move.


# 1.39 26-Apr-2004 matt

Remove #else of #if __STDC__


# 1.38 22-Apr-2004 matt

Constify protosw arrays. This can reduce the kernel .data section by
over 4K (if all the network protocols) are loaded.


# 1.37 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.36 12-Nov-2003 cl

branches: 1.36.4;
catch up with in_ifaddr -> in_ifaddrhead rename


# 1.35 22-Aug-2003 itojun

change the additional arg to be passed to ip{,6}_output to struct socket *.

this fixes KAME policy lookup which was broken by the previous commit.


# 1.34 15-Aug-2003 jonathan

(fast-ipsec): Add hooks to pass IPv4 IPsec traffic into fast-ipsec, if
configured with ``options FAST_IPSEC''. Kernels with KAME IPsec or
with no IPsec should work as before.

All calls to ip_output() now always pass an additional compulsory
argument: the inpcb associated with the packet being sent,
or 0 if no inpcb is available.

Fast-ipsec tested with ICMP or UDP over ESP. TCP doesn't work, yet.


# 1.33 01-May-2003 itojun

branches: 1.33.2;
bpf_mtap() does not care about M_PKTHDR at the top. M_COPY_PKTHDR has some
consequences, so avoid it. if we need to attach dummy headers, we should
use M_PREPEND instead.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.32 17-Nov-2002 itojun

more pickier packet validation, based on
draft-savola-v6ops-6to4-security-00.txt. sync w/kame


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.31 17-Sep-2002 itojun

fix comment, sync with kame


# 1.30 17-Sep-2002 itojun

reject SIOCAIFADDR if embedded address is in private address range. sync w/kame


Revision tags: gehenna-devsw-base
# 1.29 14-Aug-2002 itojun

avoid swapping endian of ip_len and ip_off on mbuf, to meet with M_LEADINGSPACE
optimization made last year. should solve PR 17867 and 10195.

IP_HDRINCL behavior of raw ip socket is kept unchanged. we may want to
provide IP_HDRINCL variant that does not swap endian.


# 1.28 06-Aug-2002 itojun

backout previous. i was looking at the wrong RFC.


# 1.27 05-Aug-2002 itojun

based on RFC2529, stf(4) should have 1480 as MTU, not 1280.
tron found it, sync w/kame


# 1.26 23-Jul-2002 tron

Increase interface output error count in case of a failure.


# 1.25 23-Jul-2002 tron

Increase interface output counter for every encapsulated packet sent to IP.


# 1.24 20-Jun-2002 itojun

reject packets with IPv4 private address range. sync w/kame


Revision tags: netbsd-1-6-base eeh-devprop-base newlock-base ifpoll-base
# 1.23 21-Dec-2001 itojun

branches: 1.23.8; 1.23.10;
move protosw fragment for gif/stf to their own source code.
reduce #ifdef in stf code. sync with kame


# 1.22 13-Nov-2001 lukem

remove unnecessary #if NFOO > 0 .... #endif wrappers


# 1.21 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base
# 1.20 06-Nov-2001 itojun

too many curly brace.


# 1.19 06-Nov-2001 matt

Fix pr#14481


# 1.18 05-Nov-2001 matt

Switch to using queue access macros instead of refering to the member
fields explicitly.


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.17 18-Jul-2001 thorpej

branches: 1.17.4;
bzero -> memset


# 1.16 08-Jun-2001 itojun

branches: 1.16.2;
inject outgoing packet to bpf. KAME PR 358.


# 1.15 10-May-2001 itojun

correct ecn consideration on tunnel encap/decap. sync with kame.


# 1.14 29-Apr-2001 itojun

correct outbound outer IPv4 destination address selection.
IFF_LINK0 disables inbound path, removes security worries.
more examples in manpage.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.13 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.12 20-Feb-2001 itojun

branches: 1.12.2;
explicitly use u_int32_t for DLT_NULL encapsulation.

correct gif address family. from chopps, sync with kame.


# 1.11 17-Feb-2001 itojun

update comment to meet 6to4 RFC. sync with kame


# 1.10 22-Jan-2001 itojun

make it possible to turn off ingress filter on gif/stf tunnel egress,
by using IFF_LINK2. (part of) PR 11163 from Ken Raeburn.


# 1.9 17-Jan-2001 itojun

pull post-4.4BSD change to sys/net/route.c from BSD/OS 4.2 (UCB copyrighted).

have sys/net/route.c:rtrequest1(), which takes rt_addrinfo * as the argument.
pass rt_addrinfo all the way down to rtrequest, and ifa->ifa_rtrequest.
3rd arg of ifa->ifa_rtrequest is now rt_addrinfo * instead of sockaddr *
(almost noone is using it anyways).

benefit: the follwoing command now works. previously we need two route(8)
invocations, "add" then "change".
# route add -inet6 default ::1 -ifp gif0

remove unsafe typecast in rtrequest(), from rtentry * to sockaddr *. it was
introduced by 4.3BSD-reno and never corrected.

XXX is eon_rtrequest() change correct regarding to 3rd arg?
eon_rtrequest() and rtrequest() were incorrect since 4.3BSD-reno,
so i do not have correct answer in the source code.
someone with more clue about netiso-over-ip, please help.


# 1.8 17-Jan-2001 thorpej

Fix a rather annoying problem where the sockaddr_dl which holds
the link level name for the interface (ifp->if_sadl) is allocated
before ifp->if_addrlen is initialized, which could lead to allocating
too little space for the link level address.

Do this by splitting allocation of the link level name out of
if_attach() and into if_alloc_sadl(), which is normally called
by functions like ether_ifattach(). Network interfaces which
don't have a link-specific attach routine must call if_alloc_sadl()
themselves (example: gif).

Link level names are freed by if_free_sadl(), which can be called
from e.g. ether_ifdetach(). Drivers never need call if_free_sadl()
themselves as if_detach() will do it if it is not already done.

While here, add the ability to pass an AF_LINK address to
SIOCSIFADDR in ether_ioctl() (this is what caused me to notice
the problem that the above fixes).


# 1.7 18-Dec-2000 thorpej

Fill in if_dlt.


# 1.6 12-Dec-2000 thorpej

Adapt to bpfattach() changes, and further centralize the bpfattach()
and bpfdetach() calls into link-type subroutines where possible.


# 1.5 05-Jul-2000 thorpej

branches: 1.5.2;
stf(4) is now a cloning network interface (although, only one is allowed
to be created).


Revision tags: netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base
# 1.4 10-Jun-2000 itojun

branches: 1.4.2;
update i-d #. (sync with kame)


Revision tags: minoura-xpg4dl-base
# 1.3 14-May-2000 itojun

branches: 1.3.2;
sync IPv4 rogue address filter with RFC1122. (sync with kame)


# 1.2 21-Apr-2000 itojun

update comment (analysis on 04 draft)


# 1.1 19-Apr-2000 itojun

introduce sys/netinet/ip_encap.c, to dispatch inbound packets
to protocol handlers, based on src/dst (for ip proto #4/41).
see comment in ip_encap.c for details of the problem we have.
there are too many protocol specs for ip proto #4/41.
backward compatibility with MROUTING case is now provided in ip_encap.c.

fix ipip to work with gif (using ip_encap.c). sorry for breakage.

gif now uses ip_encap.c.

introduce stf pseudo interface (implements 6to4, another IPv6-over-IPv4 code
with ip proto #41).


# 1.108 16-Jun-2021 riastradh

if_attach and if_initialize cannot fail, don't test return value

These were originally made failable back in 2017 when if_initialize
allocated a softint in every interface for link state changes, so
that it could fail gracefully instead of panicking:

https://mail-index.NetBSD.org/source-changes/2017/10/23/msg089053.html

However, this spawned many seldom- or never-tested error branches,
which are risky to have around. And that softint in every interface
has since been replaced by a single global workqueue, because link
state changes require thread context but not low latency or high
throughput:

https://mail-index.NetBSD.org/source-changes/2020/02/06/msg113759.html

So there is no longer any reason for if_initialize to fail. (The
subroutine if_stats_init can't fail because percpu_alloc can't fail
either.)

There is a snag: the softint_establish in if_percpuq_create could
fail, potentially leading to bad consequences later on trying to use
the softint. This change doesn't introduce any new bugs because of
the snag -- if_percpuq_attach was already broken. However, the snag
can be better addressed without spawning error branches, either by
using a single softint or making softints less scarce.

(Separate commit will change the signatures of if_attach and
if_initialize to return void, scheduled to ride whatever is the next
convenient kernel bump.)

Patch and testing on amd64 and evbmips64-eb by maya@; commit message
soliloquy, and compile-testing on evbppc/i386/earmv7hf, by me.


Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.107 29-Jan-2020 thorpej

Adopt <net/if_stats.h>.


Revision tags: netbsd-9-2-RELEASE netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 ad-namecache-base2 ad-namecache-base1 ad-namecache-base netbsd-9-0-RC1 phil-wifi-20191119 netbsd-9-base phil-wifi-20190609
# 1.106 26-Apr-2019 pgoyette

branches: 1.106.4;
Some more empty-string --> NULL conversions for module dependencies


Revision tags: isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base
# 1.105 26-Jun-2018 msaitoh

branches: 1.105.2;
Implement the BPF direction filter (BIOC[GS]DIRECTION). It provides backward
compatibility with BIOC[GS]SEESENT ioctl. The userland interface is the same
as FreeBSD.

This change also fixes a bug that the direction is misunderstand on some
environment by passing the direction to bpf_mtap*() instead of checking
m->m_pkthdr.rcvif.


Revision tags: pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502
# 1.104 01-May-2018 maxv

Remove now unused net_osdep.h includes, the other BSDs did the same.


Revision tags: pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base tls-maxphys-base-20171202
# 1.103 15-Nov-2017 knakahara

branches: 1.103.2;
Add argument to encapsw->pr_input() instead of m_tag.


# 1.102 23-Oct-2017 msaitoh

- If if_attach() failed in the attach function, free resources and return.
- KNF


Revision tags: matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107
# 1.101 12-Dec-2016 ozaki-r

branches: 1.101.8;
Make the routing table and rtcaches MP-safe

See the following descriptions for details.

Proposed on tech-kern and tech-net


Overview


# 1.100 08-Dec-2016 ozaki-r

Add rtcache_unref to release points of rtentry stemming from rtcache

In the MP-safe world, a rtentry stemming from a rtcache can be freed at any
points. So we need to protect rtentries somehow say by reference couting or
passive references. Regardless of the method, we need to call some release
function of a rtentry after using it.

The change adds a new function rtcache_unref to release a rtentry. At this
point, this function does nothing because for now we don't add a reference
to a rtentry when we get one from a rtcache. We will add something useful
in a further commit.

This change is a part of changes for MP-safe routing table. It is separated
to avoid one big change that makes difficult to debug by bisecting.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914
# 1.99 18-Aug-2016 knakahara

eliminate stf(4)'s dependency on gif(4).

stf(4) depends on not gif(4) but ip_encap.


# 1.98 07-Aug-2016 christos

modularize some more drivers and merge the module glue


Revision tags: pgoyette-localcount-20160806
# 1.97 01-Aug-2016 ozaki-r

Apply pserialize and psref to struct ifaddr and its variants

This change makes struct ifaddr and its variants (in_ifaddr and in6_ifaddr)
MP-safe by using pserialize and psref. At this moment, pserialize_perform
and psref_target_destroy are disabled because (1) we don't need them
because of softnet_lock (2) they cause a deadlock because of softnet_lock.
So we'll enable them when we remove softnet_lock in the future.


Revision tags: pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.96 08-Jul-2016 ozaki-r

branches: 1.96.2;
Replace macros to get an IP address with proper inline functions

The inline functions are more friendly for applying psz/psref;
they consist of only simple interations.


# 1.95 07-Jul-2016 ozaki-r

Switch the address list of intefaces to pslist(9)

As usual, we leave the old list to avoid breaking kvm(3) users.


# 1.94 06-Jul-2016 ozaki-r

Switch the IPv4 address list to pslist(9)

Note that we leave the old list just in case; it seems there are some
kvm(3) users accessing the list. We can remove it later if we confirmed
nobody does actually.


# 1.93 04-Jul-2016 knakahara

make encap_lock_{enter,exit} interruptable.


# 1.92 04-Jul-2016 knakahara

let gif(4) promise softint(9) contract (2/2) : ip_encap side

The last commit does not care encaptab. This commit fixes encaptab race which
is used not only gif(4).


# 1.91 22-Jun-2016 ozaki-r

Remove unnecessary NULL checks of ifa->ifa_addr

If it's NULL, it should be a bug. There many IFADDR_FOREACH that don't do
NULL check. If it can be NULL, they should fire already.


# 1.90 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


# 1.89 10-Jun-2016 ozaki-r

Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.88 28-Apr-2016 ozaki-r

Constify rtentry of if_output

We no longer need to change rtentry below if_output.

The change makes it clear where rtentries are changed (or not)
and helps forthcoming locking (os psrefing) rtentries.


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.87 28-Jan-2016 knakahara

fix my wrong modification


# 1.86 26-Jan-2016 knakahara

implement encapsw instead of protosw and uniform prototype.

suggested and advised by riastradh@n.o, thanks.

BTW, It seems in_stf_input() had bugs...


# 1.85 22-Jan-2016 riastradh

Back out previous change to introduce struct encapsw.

This change was intended, but Nakahara-san had already made a better
one locally! So I'll let him commit that one, and I'll try not to
step on anyone's toes again.


# 1.84 22-Jan-2016 riastradh

Don't abuse struct protosw for ip_encap -- introduce struct encapsw.

Mostly mechanical change to replace it, culling some now-needless
boilerplate around all the users.

This does not substantively change the ip_encap API or eliminate
abuse of sketchy pointer casts -- that will come later, and will be
easier now that it is not tangled up with struct protosw.


# 1.83 20-Jan-2016 riastradh

Eliminate struct protosw::pr_output.

You can't use this unless you know what it is a priori: the formal
prototype is variadic, and the different instances (e.g., ip_output,
route_output) have different real prototypes.

Convert the only user of it, raw_send in net/raw_cb.c, to take an
explicit callback argument. Convert the only instances of it,
route_output and key_output, to such explicit callbacks for raw_send.
Use assertions to make sure the conversion to explicit callbacks is
warranted.

Discussed on tech-net with no objections:
https://mail-index.netbsd.org/tech-net/2016/01/16/msg005484.html


Revision tags: nick-nhusb-base-20151226 nick-nhusb-base-20150921
# 1.82 24-Aug-2015 pooka

sprinkle _KERNEL_OPT


# 1.81 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


Revision tags: netbsd-7-2-RELEASE netbsd-7-1-2-RELEASE netbsd-7-1-1-RELEASE netbsd-7-1-RELEASE netbsd-7-1-RC2 netbsd-7-nhusb-base-20170116 netbsd-7-1-RC1 netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 nick-nhusb-base-20150606 nick-nhusb-base-20150406 nick-nhusb-base netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.80 12-Jun-2014 christos

branches: 1.80.4;
PR/48901: Fail at compile time when trying to compile stf without inet6,
and print an explanatory message.


# 1.79 05-Jun-2014 rmind

- Implement pktqueue interface for lockless IP input queue.
- Replace ipintrq and ip6intrq with the pktqueue mechanism.
- Eliminate kernel-lock from ipintr() and ip6intr().
- Some preparation work to push softnet_lock out of ipintr().

Discussed on tech-net.


Revision tags: rmind-smpnet-nbase rmind-smpnet-base
# 1.78 18-May-2014 rmind

Add struct pr_usrreqs with a pr_generic function and prepare for the
dismantling of pr_usrreq in the protocols; no functional change intended.
PRU_ATTACH/PRU_DETACH changes will follow soon.

Bump for struct protosw. Welcome to 6.99.62!


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE yamt-pagecache-base9 yamt-pagecache-tag8 netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 agc-symver-base netbsd-6-1-RC2 netbsd-6-1-RC1 yamt-pagecache-base8 netbsd-6-0-1-RELEASE yamt-pagecache-base7 matt-nb6-plus-nbase yamt-pagecache-base6 netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.77 28-Oct-2011 dyoung

branches: 1.77.12; 1.77.16; 1.77.26;
Don't kauth-orize SIOCSIFMTU in pppsioctl() and stf_ioctl(), ifioctl()
has already done that for us.


# 1.76 17-Jul-2011 joerg

Retire varargs.h support. Move machine/stdarg.h logic into MI
sys/stdarg.h and expect compiler to provide proper builtins, defaulting
to the GCC interface. lint still has a special fallback.
Reduce abuse of _BSD_VA_LIST_ by defining __va_list by default and
derive va_list as required by standards.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 rmind-uvmplock-base
# 1.75 05-Apr-2010 joerg

Push the bpf_ops usage back into bpf.h. Push the common ifp->if_bpf
check into the inline functions as well the fourth argument for
bpf_attach.


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.74 19-Jan-2010 pooka

branches: 1.74.2; 1.74.4;
Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


Revision tags: matt-premerge-20091211
# 1.73 08-Nov-2009 christos

PR/42285: PR/41559: Daniel Hagerty: if_stf doesn't count output bytes


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-nbase yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.72 18-Apr-2009 tsutsui

Remove extra whitespace added by a stupid tool.
XXX: more in src/sys/arch


# 1.71 15-Apr-2009 elad

Remove a few KAUTH_GENERIC_ISSUSER in favor of more descriptive
alternatives.

Discussed on tech-kern:

http://mail-index.netbsd.org/tech-kern/2009/04/11/msg004798.html

Input from ad@, christos@, dyoung@, tsutsui@.

Okay ad@.


# 1.70 18-Mar-2009 cegger

bcopy -> memcpy


# 1.69 18-Mar-2009 cegger

bcmp -> memcmp


Revision tags: nick-hppapmap-base2 haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.68 07-Nov-2008 dyoung

branches: 1.68.4;
*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-2-3-RELEASE netbsd-5-1-5-RELEASE netbsd-5-2-2-RELEASE netbsd-5-1-4-RELEASE netbsd-5-2-1-RELEASE netbsd-5-1-3-RELEASE netbsd-5-2-RELEASE netbsd-5-2-RC1 netbsd-5-1-2-RELEASE netbsd-5-1-1-RELEASE matt-nb5-mips64-premerge-20101231 matt-nb5-pq3-base netbsd-5-1-RELEASE netbsd-5-1-RC4 matt-nb5-mips64-k15 netbsd-5-1-RC3 netbsd-5-1-RC2 netbsd-5-1-RC1 netbsd-5-0-2-RELEASE matt-nb5-mips64-premerge-20091211 matt-nb5-mips64-u2-k2-k4-k7-k8-k9 matt-nb4-mips64-k7-u2a-k9b matt-nb5-mips64-u1-k1-k5 netbsd-5-0-1-RELEASE netbsd-5-0-RELEASE netbsd-5-0-RC4 netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2
# 1.67 24-Oct-2008 dyoung

branches: 1.67.2;
Constify the rt_addrinfo argument to the ifa_rtrequest member
function of struct ifaddr.


Revision tags: haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base wrstuden-revivesa-base
# 1.66 15-Jun-2008 christos

branches: 1.66.2;
- add if_alloc (ours just mallocs), and if_initname and use them (from FreeBSD)
- kill memsets where M_ZERO can be used.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase hpcarm-cleanup-base
# 1.65 20-Feb-2008 matt

branches: 1.65.6; 1.65.8; 1.65.10; 1.65.12; 1.65.14;
s/u_\(int[0-9]*_t\)/u\1/g
(change u_int*_t to uint*_t)


Revision tags: mjf-devfs-base
# 1.64 07-Feb-2008 dyoung

Start patching up the kernel so that a network driver always has
the opportunity to handle an ioctl before generic ifioctl handling
occurs. This will ease extending the kernel and sharing of code
between drivers.

First steps: Make the signature of ifioctl_common() match struct
ifinet->if_ioctl. Convert SIOCSIFCAP and SIOCSIFMTU to the new
ifioctl() regime, throughout the kernel.


Revision tags: vmlocking2-base3 bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-base
# 1.63 20-Dec-2007 dyoung

Poison struct route->ro_rt uses in the kernel by changing the name
to _ro_rt. Use rtcache_getrt() to access a route cache's struct
rtentry *.

Introduce struct ifnet->if_dl that always points at the interface
identifier/link-layer address. Make code that treated the first
ifaddr on struct ifnet->if_addrlist as the interface address use
if_dl, instead.

Remove stale debugging code from net/route.c. Move the rtflush()
code into rtcache_clear() and delete rtflush(). Delete rtalloc(),
because nothing uses it any more.

Make ND6_HINT an inline, lowercase subroutine, nd6_hint.

I've done my best to convert IP Filter, the ISO stack, and the
AppleTalk stack to rtcache_getrt(). They compile, but I have not
tested them. I have given the changes to PF, GRE, IPv4 and IPv6
stacks a lot of exercise.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base jmcneill-pm-base reinoud-bufcleanup-base
# 1.62 19-Oct-2007 ad

branches: 1.62.2; 1.62.4; 1.62.8;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base4 yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base vmlocking-base
# 1.61 01-Sep-2007 dyoung

branches: 1.61.4;
Use ifreq_setaddr(), ifreq_getaddr(), sockaddr_in_init(), and
sockaddr_copy(). Constify. Compare pointers with NULL, not 0.
Don't "test truth" of pointers, but compare with NULL.


Revision tags: matt-mips64-base nick-csl-alignment-base yamt-idlelwp-base8 mjf-ufs-trans-base
# 1.60 02-May-2007 dyoung

branches: 1.60.2; 1.60.6; 1.60.8;
Eliminate address family-specific route caches (struct route, struct
route_in6, struct route_iso), replacing all caches with a struct
route.

The principle benefit of this change is that all of the protocol
families can benefit from route cache-invalidation, which is
necessary for correct routing. Route-cache invalidation fixes an
ancient PR, kern/3508, at long last; it fixes various other PRs,
also.

Discussions with and ideas from Joerg Sonnenberger influenced this
work tremendously. Of course, all design oversights and bugs are
mine.

DETAILS

1 I added to each address family a pool of sockaddrs. I have
introduced routines for allocating, copying, and duplicating,
and freeing sockaddrs:

struct sockaddr *sockaddr_alloc(sa_family_t af, int flags);
struct sockaddr *sockaddr_copy(struct sockaddr *dst,
const struct sockaddr *src);
struct sockaddr *sockaddr_dup(const struct sockaddr *src, int flags);
void sockaddr_free(struct sockaddr *sa);

sockaddr_alloc() returns either a sockaddr from the pool belonging
to the specified family, or NULL if the pool is exhausted. The
returned sockaddr has the right size for that family; sa_family
and sa_len fields are initialized to the family and sockaddr
length---e.g., sa_family = AF_INET and sa_len = sizeof(struct
sockaddr_in). sockaddr_free() puts the given sockaddr back into
its family's pool.

sockaddr_dup() and sockaddr_copy() work analogously to strdup()
and strcpy(), respectively. sockaddr_copy() KASSERTs that the
family of the destination and source sockaddrs are alike.

The 'flags' argumet for sockaddr_alloc() and sockaddr_dup() is
passed directly to pool_get(9).

2 I added routines for initializing sockaddrs in each address
family, sockaddr_in_init(), sockaddr_in6_init(), sockaddr_iso_init(),
etc. They are fairly self-explanatory.

3 structs route_in6 and route_iso are no more. All protocol families
use struct route. I have changed the route cache, 'struct route',
so that it does not contain storage space for a sockaddr. Instead,
struct route points to a sockaddr coming from the pool the sockaddr
belongs to. I added a new method to struct route, rtcache_setdst(),
for setting the cache destination:

int rtcache_setdst(struct route *, const struct sockaddr *);

rtcache_setdst() returns 0 on success, or ENOMEM if no memory is
available to create the sockaddr storage.

It is now possible for rtcache_getdst() to return NULL if, say,
rtcache_setdst() failed. I check the return value for NULL
everywhere in the kernel.

4 Each routing domain (struct domain) has a list of live route
caches, dom_rtcache. rtflushall(sa_family_t af) looks up the
domain indicated by 'af', walks the domain's list of route caches
and invalidates each one.


Revision tags: thorpej-atomic-base
# 1.59 29-Mar-2007 ad

lwp::l_acflag is no longer used.


# 1.58 04-Mar-2007 christos

branches: 1.58.2; 1.58.4; 1.58.6;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.57 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


Revision tags: post-newlock2-merge newlock2-nbase yamt-splraiseipl-base5 yamt-splraiseipl-base4 newlock2-base
# 1.56 15-Dec-2006 joerg

branches: 1.56.2;
Introduce new helper functions to abstract the route caching.
rtcache_init and rtcache_init_noclone lookup ro_dst and store
the result in ro_rt, taking care of the reference counting and
calling the domain specific route cache.
rtcache_free checks if a route was cashed and frees the reference.
rtcache_copy copies ro_dst of the given struct route, checking that
enough space is available and incrementing the reference count of the
cached rtentry if necessary.
rtcache_check validates that the cached route is still up. If it isn't,
it tries to look it up again. Afterwards ro_rt is either a valid again
or NULL.
rtcache_copy is used internally.

Adjust to callers of rtalloc/rtflush in the tree to check the sanity of
ro_dst first (if necessary). If it doesn't fit the expectations, free
the cache, otherwise check if the cached route is still valid. After
that combination, a single check for ro_rt == NULL is enough to decide
whether a new lookup needs to be done with a different ro_dst.
Make the route checking in gre stricter by repeating the loop check
after revalidation.
Remove some unused RADIX_MPATH code in in6_src.c. The logic is slightly
changed here to first validate the route and check RTF_GATEWAY
afterwards. This is sementically equivalent though.
etherip doesn't need sc_route_expire similiar to the gif changes from
dyoung@ earlier.

Based on the earlier patch from dyoung@, reviewed and discussed with
him.


Revision tags: yamt-splraiseipl-base3
# 1.55 09-Dec-2006 dyoung

Here are various changes designed to protect against bad IPv4
routing caused by stale route caches (struct route). Route caches
are sprinkled throughout PCBs, the IP fast-forwarding table, and
IP tunnel interfaces (gre, gif, stf).

Stale IPv6 and ISO route caches will be treated by separate patches.

Thank you to Christoph Badura for suggesting the general approach
to invalidating route caches that I take here.

Here are the details:

Add hooks to struct domain for tracking and for invalidating each
domain's route caches: dom_rtcache, dom_rtflush, and dom_rtflushall.

Introduce helper subroutines, rtflush(ro) for invalidating a route
cache, rtflushall(family) for invalidating all route caches in a
routing domain, and rtcache(ro) for notifying the domain of a new
cached route.

Chain together all IPv4 route caches where ro_rt != NULL. Provide
in_rtcache() for adding a route to the chain. Provide in_rtflush()
and in_rtflushall() for invalidating IPv4 route caches. In
in_rtflush(), set ro_rt to NULL, and remove the route from the
chain. In in_rtflushall(), walk the chain and remove every route
cache.

In rtrequest1(), call rtflushall() to invalidate route caches when
a route is added.

In gif(4), discard the workaround for stale caches that involves
expiring them every so often.

Replace the pattern 'RTFREE(ro->ro_rt); ro->ro_rt = NULL;' with a
call to rtflush(ro).

Update ipflow_fastforward() and all other users of route caches so
that they expect a cached route, ro->ro_rt, to turn to NULL.

Take care when moving a 'struct route' to rtflush() the source and
to rtcache() the destination.

In domain initializers, use .dom_xxx tags.

KNF here and there.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.54 16-Nov-2006 christos

__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.53 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


Revision tags: abandoned-netbsd-4-base yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.52 23-Jul-2006 ad

branches: 1.52.4; 1.52.6;
Use the LWP cached credentials where sane.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.51 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.50 11-Dec-2005 thorpej

branches: 1.50.4; 1.50.6; 1.50.8; 1.50.10; 1.50.12;
ANSI function decls and application of static.


# 1.49 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.48 02-Jun-2005 tron

branches: 1.48.2;
Change the first argument of the encapsulation check function from
"const struct mbuf *" to "struct mbuf *". Without this change the
actual implementation cannot even use m_copydata() on the mbuf chain
which is broken.


# 1.47 02-Jun-2005 tron

Remove type casts and lint directives which are now longer necessary
because the first argument of m_copydata() is "const struct mbuf *" now.


Revision tags: netbsd-3-1-1-RELEASE netbsd-3-0-3-RELEASE netbsd-3-1-RELEASE netbsd-3-0-2-RELEASE netbsd-3-1-RC4 netbsd-3-1-RC3 netbsd-3-1-RC2 netbsd-3-1-RC1 netbsd-3-0-1-RELEASE netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.46 11-Mar-2005 tron

Add support for changing the MTU to stf(4).


# 1.45 26-Feb-2005 perry

nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.44 25-Jan-2005 matt

Switch to using ifa for ifaddr's instead of ia (which are traditionally
used for in_ifaddr's) which could lead to confusion.


Revision tags: yamt-km-base
# 1.43 25-Jan-2005 tron

branches: 1.43.2;
Fix cut and paste error in last commit.


# 1.42 24-Jan-2005 matt

Add IFNET_FOREACH and IFADDR_FOREACH macros and start using them.


Revision tags: kent-audio1-beforemerge kent-audio1-base
# 1.41 04-Dec-2004 peter

branches: 1.41.4;
Change ifc_destroy to return an int instead of void, so that it
can pass back errors to ifconfig.


# 1.40 19-Aug-2004 christos

Factor out the hand-crafting of mbufs from the interface files. Reviewed by
gimpy. XXX: I could have used bpf_mtap2 on some of the new functions, but I
chose not to, because I just wanted to do what amounts to a code move.


# 1.39 26-Apr-2004 matt

Remove #else of #if __STDC__


# 1.38 22-Apr-2004 matt

Constify protosw arrays. This can reduce the kernel .data section by
over 4K (if all the network protocols) are loaded.


# 1.37 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.36 12-Nov-2003 cl

branches: 1.36.4;
catch up with in_ifaddr -> in_ifaddrhead rename


# 1.35 22-Aug-2003 itojun

change the additional arg to be passed to ip{,6}_output to struct socket *.

this fixes KAME policy lookup which was broken by the previous commit.


# 1.34 15-Aug-2003 jonathan

(fast-ipsec): Add hooks to pass IPv4 IPsec traffic into fast-ipsec, if
configured with ``options FAST_IPSEC''. Kernels with KAME IPsec or
with no IPsec should work as before.

All calls to ip_output() now always pass an additional compulsory
argument: the inpcb associated with the packet being sent,
or 0 if no inpcb is available.

Fast-ipsec tested with ICMP or UDP over ESP. TCP doesn't work, yet.


# 1.33 01-May-2003 itojun

branches: 1.33.2;
bpf_mtap() does not care about M_PKTHDR at the top. M_COPY_PKTHDR has some
consequences, so avoid it. if we need to attach dummy headers, we should
use M_PREPEND instead.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.32 17-Nov-2002 itojun

more pickier packet validation, based on
draft-savola-v6ops-6to4-security-00.txt. sync w/kame


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.31 17-Sep-2002 itojun

fix comment, sync with kame


# 1.30 17-Sep-2002 itojun

reject SIOCAIFADDR if embedded address is in private address range. sync w/kame


Revision tags: gehenna-devsw-base
# 1.29 14-Aug-2002 itojun

avoid swapping endian of ip_len and ip_off on mbuf, to meet with M_LEADINGSPACE
optimization made last year. should solve PR 17867 and 10195.

IP_HDRINCL behavior of raw ip socket is kept unchanged. we may want to
provide IP_HDRINCL variant that does not swap endian.


# 1.28 06-Aug-2002 itojun

backout previous. i was looking at the wrong RFC.


# 1.27 05-Aug-2002 itojun

based on RFC2529, stf(4) should have 1480 as MTU, not 1280.
tron found it, sync w/kame


# 1.26 23-Jul-2002 tron

Increase interface output error count in case of a failure.


# 1.25 23-Jul-2002 tron

Increase interface output counter for every encapsulated packet sent to IP.


# 1.24 20-Jun-2002 itojun

reject packets with IPv4 private address range. sync w/kame


Revision tags: netbsd-1-6-base eeh-devprop-base newlock-base ifpoll-base
# 1.23 21-Dec-2001 itojun

branches: 1.23.8; 1.23.10;
move protosw fragment for gif/stf to their own source code.
reduce #ifdef in stf code. sync with kame


# 1.22 13-Nov-2001 lukem

remove unnecessary #if NFOO > 0 .... #endif wrappers


# 1.21 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base
# 1.20 06-Nov-2001 itojun

too many curly brace.


# 1.19 06-Nov-2001 matt

Fix pr#14481


# 1.18 05-Nov-2001 matt

Switch to using queue access macros instead of refering to the member
fields explicitly.


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.17 18-Jul-2001 thorpej

branches: 1.17.4;
bzero -> memset


# 1.16 08-Jun-2001 itojun

branches: 1.16.2;
inject outgoing packet to bpf. KAME PR 358.


# 1.15 10-May-2001 itojun

correct ecn consideration on tunnel encap/decap. sync with kame.


# 1.14 29-Apr-2001 itojun

correct outbound outer IPv4 destination address selection.
IFF_LINK0 disables inbound path, removes security worries.
more examples in manpage.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.13 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.12 20-Feb-2001 itojun

branches: 1.12.2;
explicitly use u_int32_t for DLT_NULL encapsulation.

correct gif address family. from chopps, sync with kame.


# 1.11 17-Feb-2001 itojun

update comment to meet 6to4 RFC. sync with kame


# 1.10 22-Jan-2001 itojun

make it possible to turn off ingress filter on gif/stf tunnel egress,
by using IFF_LINK2. (part of) PR 11163 from Ken Raeburn.


# 1.9 17-Jan-2001 itojun

pull post-4.4BSD change to sys/net/route.c from BSD/OS 4.2 (UCB copyrighted).

have sys/net/route.c:rtrequest1(), which takes rt_addrinfo * as the argument.
pass rt_addrinfo all the way down to rtrequest, and ifa->ifa_rtrequest.
3rd arg of ifa->ifa_rtrequest is now rt_addrinfo * instead of sockaddr *
(almost noone is using it anyways).

benefit: the follwoing command now works. previously we need two route(8)
invocations, "add" then "change".
# route add -inet6 default ::1 -ifp gif0

remove unsafe typecast in rtrequest(), from rtentry * to sockaddr *. it was
introduced by 4.3BSD-reno and never corrected.

XXX is eon_rtrequest() change correct regarding to 3rd arg?
eon_rtrequest() and rtrequest() were incorrect since 4.3BSD-reno,
so i do not have correct answer in the source code.
someone with more clue about netiso-over-ip, please help.


# 1.8 17-Jan-2001 thorpej

Fix a rather annoying problem where the sockaddr_dl which holds
the link level name for the interface (ifp->if_sadl) is allocated
before ifp->if_addrlen is initialized, which could lead to allocating
too little space for the link level address.

Do this by splitting allocation of the link level name out of
if_attach() and into if_alloc_sadl(), which is normally called
by functions like ether_ifattach(). Network interfaces which
don't have a link-specific attach routine must call if_alloc_sadl()
themselves (example: gif).

Link level names are freed by if_free_sadl(), which can be called
from e.g. ether_ifdetach(). Drivers never need call if_free_sadl()
themselves as if_detach() will do it if it is not already done.

While here, add the ability to pass an AF_LINK address to
SIOCSIFADDR in ether_ioctl() (this is what caused me to notice
the problem that the above fixes).


# 1.7 18-Dec-2000 thorpej

Fill in if_dlt.


# 1.6 12-Dec-2000 thorpej

Adapt to bpfattach() changes, and further centralize the bpfattach()
and bpfdetach() calls into link-type subroutines where possible.


# 1.5 05-Jul-2000 thorpej

branches: 1.5.2;
stf(4) is now a cloning network interface (although, only one is allowed
to be created).


Revision tags: netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base
# 1.4 10-Jun-2000 itojun

branches: 1.4.2;
update i-d #. (sync with kame)


Revision tags: minoura-xpg4dl-base
# 1.3 14-May-2000 itojun

branches: 1.3.2;
sync IPv4 rogue address filter with RFC1122. (sync with kame)


# 1.2 21-Apr-2000 itojun

update comment (analysis on 04 draft)


# 1.1 19-Apr-2000 itojun

introduce sys/netinet/ip_encap.c, to dispatch inbound packets
to protocol handlers, based on src/dst (for ip proto #4/41).
see comment in ip_encap.c for details of the problem we have.
there are too many protocol specs for ip proto #4/41.
backward compatibility with MROUTING case is now provided in ip_encap.c.

fix ipip to work with gif (using ip_encap.c). sorry for breakage.

gif now uses ip_encap.c.

introduce stf pseudo interface (implements 6to4, another IPv6-over-IPv4 code
with ip proto #41).


# 1.107 29-Jan-2020 thorpej

Adopt <net/if_stats.h>.


Revision tags: ad-namecache-base2 ad-namecache-base1 ad-namecache-base netbsd-9-0-RC1 phil-wifi-20191119 netbsd-9-base phil-wifi-20190609
# 1.106 26-Apr-2019 pgoyette

Some more empty-string --> NULL conversions for module dependencies


Revision tags: isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base
# 1.105 26-Jun-2018 msaitoh

branches: 1.105.2;
Implement the BPF direction filter (BIOC[GS]DIRECTION). It provides backward
compatibility with BIOC[GS]SEESENT ioctl. The userland interface is the same
as FreeBSD.

This change also fixes a bug that the direction is misunderstand on some
environment by passing the direction to bpf_mtap*() instead of checking
m->m_pkthdr.rcvif.


Revision tags: pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502
# 1.104 01-May-2018 maxv

Remove now unused net_osdep.h includes, the other BSDs did the same.


Revision tags: pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base tls-maxphys-base-20171202
# 1.103 15-Nov-2017 knakahara

branches: 1.103.2;
Add argument to encapsw->pr_input() instead of m_tag.


# 1.102 23-Oct-2017 msaitoh

- If if_attach() failed in the attach function, free resources and return.
- KNF


Revision tags: matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107
# 1.101 12-Dec-2016 ozaki-r

branches: 1.101.8;
Make the routing table and rtcaches MP-safe

See the following descriptions for details.

Proposed on tech-kern and tech-net


Overview


# 1.100 08-Dec-2016 ozaki-r

Add rtcache_unref to release points of rtentry stemming from rtcache

In the MP-safe world, a rtentry stemming from a rtcache can be freed at any
points. So we need to protect rtentries somehow say by reference couting or
passive references. Regardless of the method, we need to call some release
function of a rtentry after using it.

The change adds a new function rtcache_unref to release a rtentry. At this
point, this function does nothing because for now we don't add a reference
to a rtentry when we get one from a rtcache. We will add something useful
in a further commit.

This change is a part of changes for MP-safe routing table. It is separated
to avoid one big change that makes difficult to debug by bisecting.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914
# 1.99 18-Aug-2016 knakahara

eliminate stf(4)'s dependency on gif(4).

stf(4) depends on not gif(4) but ip_encap.


# 1.98 07-Aug-2016 christos

modularize some more drivers and merge the module glue


Revision tags: pgoyette-localcount-20160806
# 1.97 01-Aug-2016 ozaki-r

Apply pserialize and psref to struct ifaddr and its variants

This change makes struct ifaddr and its variants (in_ifaddr and in6_ifaddr)
MP-safe by using pserialize and psref. At this moment, pserialize_perform
and psref_target_destroy are disabled because (1) we don't need them
because of softnet_lock (2) they cause a deadlock because of softnet_lock.
So we'll enable them when we remove softnet_lock in the future.


Revision tags: pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.96 08-Jul-2016 ozaki-r

branches: 1.96.2;
Replace macros to get an IP address with proper inline functions

The inline functions are more friendly for applying psz/psref;
they consist of only simple interations.


# 1.95 07-Jul-2016 ozaki-r

Switch the address list of intefaces to pslist(9)

As usual, we leave the old list to avoid breaking kvm(3) users.


# 1.94 06-Jul-2016 ozaki-r

Switch the IPv4 address list to pslist(9)

Note that we leave the old list just in case; it seems there are some
kvm(3) users accessing the list. We can remove it later if we confirmed
nobody does actually.


# 1.93 04-Jul-2016 knakahara

make encap_lock_{enter,exit} interruptable.


# 1.92 04-Jul-2016 knakahara

let gif(4) promise softint(9) contract (2/2) : ip_encap side

The last commit does not care encaptab. This commit fixes encaptab race which
is used not only gif(4).


# 1.91 22-Jun-2016 ozaki-r

Remove unnecessary NULL checks of ifa->ifa_addr

If it's NULL, it should be a bug. There many IFADDR_FOREACH that don't do
NULL check. If it can be NULL, they should fire already.


# 1.90 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


# 1.89 10-Jun-2016 ozaki-r

Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.88 28-Apr-2016 ozaki-r

Constify rtentry of if_output

We no longer need to change rtentry below if_output.

The change makes it clear where rtentries are changed (or not)
and helps forthcoming locking (os psrefing) rtentries.


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.87 28-Jan-2016 knakahara

fix my wrong modification


# 1.86 26-Jan-2016 knakahara

implement encapsw instead of protosw and uniform prototype.

suggested and advised by riastradh@n.o, thanks.

BTW, It seems in_stf_input() had bugs...


# 1.85 22-Jan-2016 riastradh

Back out previous change to introduce struct encapsw.

This change was intended, but Nakahara-san had already made a better
one locally! So I'll let him commit that one, and I'll try not to
step on anyone's toes again.


# 1.84 22-Jan-2016 riastradh

Don't abuse struct protosw for ip_encap -- introduce struct encapsw.

Mostly mechanical change to replace it, culling some now-needless
boilerplate around all the users.

This does not substantively change the ip_encap API or eliminate
abuse of sketchy pointer casts -- that will come later, and will be
easier now that it is not tangled up with struct protosw.


# 1.83 20-Jan-2016 riastradh

Eliminate struct protosw::pr_output.

You can't use this unless you know what it is a priori: the formal
prototype is variadic, and the different instances (e.g., ip_output,
route_output) have different real prototypes.

Convert the only user of it, raw_send in net/raw_cb.c, to take an
explicit callback argument. Convert the only instances of it,
route_output and key_output, to such explicit callbacks for raw_send.
Use assertions to make sure the conversion to explicit callbacks is
warranted.

Discussed on tech-net with no objections:
https://mail-index.netbsd.org/tech-net/2016/01/16/msg005484.html


Revision tags: nick-nhusb-base-20151226 nick-nhusb-base-20150921
# 1.82 24-Aug-2015 pooka

sprinkle _KERNEL_OPT


# 1.81 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


Revision tags: netbsd-7-2-RELEASE netbsd-7-1-2-RELEASE netbsd-7-1-1-RELEASE netbsd-7-1-RELEASE netbsd-7-1-RC2 netbsd-7-nhusb-base-20170116 netbsd-7-1-RC1 netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 nick-nhusb-base-20150606 nick-nhusb-base-20150406 nick-nhusb-base netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.80 12-Jun-2014 christos

branches: 1.80.4;
PR/48901: Fail at compile time when trying to compile stf without inet6,
and print an explanatory message.


# 1.79 05-Jun-2014 rmind

- Implement pktqueue interface for lockless IP input queue.
- Replace ipintrq and ip6intrq with the pktqueue mechanism.
- Eliminate kernel-lock from ipintr() and ip6intr().
- Some preparation work to push softnet_lock out of ipintr().

Discussed on tech-net.


Revision tags: rmind-smpnet-nbase rmind-smpnet-base
# 1.78 18-May-2014 rmind

Add struct pr_usrreqs with a pr_generic function and prepare for the
dismantling of pr_usrreq in the protocols; no functional change intended.
PRU_ATTACH/PRU_DETACH changes will follow soon.

Bump for struct protosw. Welcome to 6.99.62!


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE yamt-pagecache-base9 yamt-pagecache-tag8 netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 agc-symver-base netbsd-6-1-RC2 netbsd-6-1-RC1 yamt-pagecache-base8 netbsd-6-0-1-RELEASE yamt-pagecache-base7 matt-nb6-plus-nbase yamt-pagecache-base6 netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.77 28-Oct-2011 dyoung

branches: 1.77.12; 1.77.16; 1.77.26;
Don't kauth-orize SIOCSIFMTU in pppsioctl() and stf_ioctl(), ifioctl()
has already done that for us.


# 1.76 17-Jul-2011 joerg

Retire varargs.h support. Move machine/stdarg.h logic into MI
sys/stdarg.h and expect compiler to provide proper builtins, defaulting
to the GCC interface. lint still has a special fallback.
Reduce abuse of _BSD_VA_LIST_ by defining __va_list by default and
derive va_list as required by standards.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 rmind-uvmplock-base
# 1.75 05-Apr-2010 joerg

Push the bpf_ops usage back into bpf.h. Push the common ifp->if_bpf
check into the inline functions as well the fourth argument for
bpf_attach.


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.74 19-Jan-2010 pooka

branches: 1.74.2; 1.74.4;
Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


Revision tags: matt-premerge-20091211
# 1.73 08-Nov-2009 christos

PR/42285: PR/41559: Daniel Hagerty: if_stf doesn't count output bytes


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-nbase yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.72 18-Apr-2009 tsutsui

Remove extra whitespace added by a stupid tool.
XXX: more in src/sys/arch


# 1.71 15-Apr-2009 elad

Remove a few KAUTH_GENERIC_ISSUSER in favor of more descriptive
alternatives.

Discussed on tech-kern:

http://mail-index.netbsd.org/tech-kern/2009/04/11/msg004798.html

Input from ad@, christos@, dyoung@, tsutsui@.

Okay ad@.


# 1.70 18-Mar-2009 cegger

bcopy -> memcpy


# 1.69 18-Mar-2009 cegger

bcmp -> memcmp


Revision tags: nick-hppapmap-base2 haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.68 07-Nov-2008 dyoung

branches: 1.68.4;
*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-2-3-RELEASE netbsd-5-1-5-RELEASE netbsd-5-2-2-RELEASE netbsd-5-1-4-RELEASE netbsd-5-2-1-RELEASE netbsd-5-1-3-RELEASE netbsd-5-2-RELEASE netbsd-5-2-RC1 netbsd-5-1-2-RELEASE netbsd-5-1-1-RELEASE matt-nb5-mips64-premerge-20101231 matt-nb5-pq3-base netbsd-5-1-RELEASE netbsd-5-1-RC4 matt-nb5-mips64-k15 netbsd-5-1-RC3 netbsd-5-1-RC2 netbsd-5-1-RC1 netbsd-5-0-2-RELEASE matt-nb5-mips64-premerge-20091211 matt-nb5-mips64-u2-k2-k4-k7-k8-k9 matt-nb4-mips64-k7-u2a-k9b matt-nb5-mips64-u1-k1-k5 netbsd-5-0-1-RELEASE netbsd-5-0-RELEASE netbsd-5-0-RC4 netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2
# 1.67 24-Oct-2008 dyoung

branches: 1.67.2;
Constify the rt_addrinfo argument to the ifa_rtrequest member
function of struct ifaddr.


Revision tags: haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base wrstuden-revivesa-base
# 1.66 15-Jun-2008 christos

branches: 1.66.2;
- add if_alloc (ours just mallocs), and if_initname and use them (from FreeBSD)
- kill memsets where M_ZERO can be used.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase hpcarm-cleanup-base
# 1.65 20-Feb-2008 matt

branches: 1.65.6; 1.65.8; 1.65.10; 1.65.12; 1.65.14;
s/u_\(int[0-9]*_t\)/u\1/g
(change u_int*_t to uint*_t)


Revision tags: mjf-devfs-base
# 1.64 07-Feb-2008 dyoung

Start patching up the kernel so that a network driver always has
the opportunity to handle an ioctl before generic ifioctl handling
occurs. This will ease extending the kernel and sharing of code
between drivers.

First steps: Make the signature of ifioctl_common() match struct
ifinet->if_ioctl. Convert SIOCSIFCAP and SIOCSIFMTU to the new
ifioctl() regime, throughout the kernel.


Revision tags: vmlocking2-base3 bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-base
# 1.63 20-Dec-2007 dyoung

Poison struct route->ro_rt uses in the kernel by changing the name
to _ro_rt. Use rtcache_getrt() to access a route cache's struct
rtentry *.

Introduce struct ifnet->if_dl that always points at the interface
identifier/link-layer address. Make code that treated the first
ifaddr on struct ifnet->if_addrlist as the interface address use
if_dl, instead.

Remove stale debugging code from net/route.c. Move the rtflush()
code into rtcache_clear() and delete rtflush(). Delete rtalloc(),
because nothing uses it any more.

Make ND6_HINT an inline, lowercase subroutine, nd6_hint.

I've done my best to convert IP Filter, the ISO stack, and the
AppleTalk stack to rtcache_getrt(). They compile, but I have not
tested them. I have given the changes to PF, GRE, IPv4 and IPv6
stacks a lot of exercise.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base jmcneill-pm-base reinoud-bufcleanup-base
# 1.62 19-Oct-2007 ad

branches: 1.62.2; 1.62.4; 1.62.8;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base4 yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base vmlocking-base
# 1.61 01-Sep-2007 dyoung

branches: 1.61.4;
Use ifreq_setaddr(), ifreq_getaddr(), sockaddr_in_init(), and
sockaddr_copy(). Constify. Compare pointers with NULL, not 0.
Don't "test truth" of pointers, but compare with NULL.


Revision tags: matt-mips64-base nick-csl-alignment-base yamt-idlelwp-base8 mjf-ufs-trans-base
# 1.60 02-May-2007 dyoung

branches: 1.60.2; 1.60.6; 1.60.8;
Eliminate address family-specific route caches (struct route, struct
route_in6, struct route_iso), replacing all caches with a struct
route.

The principle benefit of this change is that all of the protocol
families can benefit from route cache-invalidation, which is
necessary for correct routing. Route-cache invalidation fixes an
ancient PR, kern/3508, at long last; it fixes various other PRs,
also.

Discussions with and ideas from Joerg Sonnenberger influenced this
work tremendously. Of course, all design oversights and bugs are
mine.

DETAILS

1 I added to each address family a pool of sockaddrs. I have
introduced routines for allocating, copying, and duplicating,
and freeing sockaddrs:

struct sockaddr *sockaddr_alloc(sa_family_t af, int flags);
struct sockaddr *sockaddr_copy(struct sockaddr *dst,
const struct sockaddr *src);
struct sockaddr *sockaddr_dup(const struct sockaddr *src, int flags);
void sockaddr_free(struct sockaddr *sa);

sockaddr_alloc() returns either a sockaddr from the pool belonging
to the specified family, or NULL if the pool is exhausted. The
returned sockaddr has the right size for that family; sa_family
and sa_len fields are initialized to the family and sockaddr
length---e.g., sa_family = AF_INET and sa_len = sizeof(struct
sockaddr_in). sockaddr_free() puts the given sockaddr back into
its family's pool.

sockaddr_dup() and sockaddr_copy() work analogously to strdup()
and strcpy(), respectively. sockaddr_copy() KASSERTs that the
family of the destination and source sockaddrs are alike.

The 'flags' argumet for sockaddr_alloc() and sockaddr_dup() is
passed directly to pool_get(9).

2 I added routines for initializing sockaddrs in each address
family, sockaddr_in_init(), sockaddr_in6_init(), sockaddr_iso_init(),
etc. They are fairly self-explanatory.

3 structs route_in6 and route_iso are no more. All protocol families
use struct route. I have changed the route cache, 'struct route',
so that it does not contain storage space for a sockaddr. Instead,
struct route points to a sockaddr coming from the pool the sockaddr
belongs to. I added a new method to struct route, rtcache_setdst(),
for setting the cache destination:

int rtcache_setdst(struct route *, const struct sockaddr *);

rtcache_setdst() returns 0 on success, or ENOMEM if no memory is
available to create the sockaddr storage.

It is now possible for rtcache_getdst() to return NULL if, say,
rtcache_setdst() failed. I check the return value for NULL
everywhere in the kernel.

4 Each routing domain (struct domain) has a list of live route
caches, dom_rtcache. rtflushall(sa_family_t af) looks up the
domain indicated by 'af', walks the domain's list of route caches
and invalidates each one.


Revision tags: thorpej-atomic-base
# 1.59 29-Mar-2007 ad

lwp::l_acflag is no longer used.


# 1.58 04-Mar-2007 christos

branches: 1.58.2; 1.58.4; 1.58.6;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.57 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


Revision tags: post-newlock2-merge newlock2-nbase yamt-splraiseipl-base5 yamt-splraiseipl-base4 newlock2-base
# 1.56 15-Dec-2006 joerg

branches: 1.56.2;
Introduce new helper functions to abstract the route caching.
rtcache_init and rtcache_init_noclone lookup ro_dst and store
the result in ro_rt, taking care of the reference counting and
calling the domain specific route cache.
rtcache_free checks if a route was cashed and frees the reference.
rtcache_copy copies ro_dst of the given struct route, checking that
enough space is available and incrementing the reference count of the
cached rtentry if necessary.
rtcache_check validates that the cached route is still up. If it isn't,
it tries to look it up again. Afterwards ro_rt is either a valid again
or NULL.
rtcache_copy is used internally.

Adjust to callers of rtalloc/rtflush in the tree to check the sanity of
ro_dst first (if necessary). If it doesn't fit the expectations, free
the cache, otherwise check if the cached route is still valid. After
that combination, a single check for ro_rt == NULL is enough to decide
whether a new lookup needs to be done with a different ro_dst.
Make the route checking in gre stricter by repeating the loop check
after revalidation.
Remove some unused RADIX_MPATH code in in6_src.c. The logic is slightly
changed here to first validate the route and check RTF_GATEWAY
afterwards. This is sementically equivalent though.
etherip doesn't need sc_route_expire similiar to the gif changes from
dyoung@ earlier.

Based on the earlier patch from dyoung@, reviewed and discussed with
him.


Revision tags: yamt-splraiseipl-base3
# 1.55 09-Dec-2006 dyoung

Here are various changes designed to protect against bad IPv4
routing caused by stale route caches (struct route). Route caches
are sprinkled throughout PCBs, the IP fast-forwarding table, and
IP tunnel interfaces (gre, gif, stf).

Stale IPv6 and ISO route caches will be treated by separate patches.

Thank you to Christoph Badura for suggesting the general approach
to invalidating route caches that I take here.

Here are the details:

Add hooks to struct domain for tracking and for invalidating each
domain's route caches: dom_rtcache, dom_rtflush, and dom_rtflushall.

Introduce helper subroutines, rtflush(ro) for invalidating a route
cache, rtflushall(family) for invalidating all route caches in a
routing domain, and rtcache(ro) for notifying the domain of a new
cached route.

Chain together all IPv4 route caches where ro_rt != NULL. Provide
in_rtcache() for adding a route to the chain. Provide in_rtflush()
and in_rtflushall() for invalidating IPv4 route caches. In
in_rtflush(), set ro_rt to NULL, and remove the route from the
chain. In in_rtflushall(), walk the chain and remove every route
cache.

In rtrequest1(), call rtflushall() to invalidate route caches when
a route is added.

In gif(4), discard the workaround for stale caches that involves
expiring them every so often.

Replace the pattern 'RTFREE(ro->ro_rt); ro->ro_rt = NULL;' with a
call to rtflush(ro).

Update ipflow_fastforward() and all other users of route caches so
that they expect a cached route, ro->ro_rt, to turn to NULL.

Take care when moving a 'struct route' to rtflush() the source and
to rtcache() the destination.

In domain initializers, use .dom_xxx tags.

KNF here and there.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.54 16-Nov-2006 christos

__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.53 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


Revision tags: abandoned-netbsd-4-base yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.52 23-Jul-2006 ad

branches: 1.52.4; 1.52.6;
Use the LWP cached credentials where sane.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.51 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.50 11-Dec-2005 thorpej

branches: 1.50.4; 1.50.6; 1.50.8; 1.50.10; 1.50.12;
ANSI function decls and application of static.


# 1.49 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.48 02-Jun-2005 tron

branches: 1.48.2;
Change the first argument of the encapsulation check function from
"const struct mbuf *" to "struct mbuf *". Without this change the
actual implementation cannot even use m_copydata() on the mbuf chain
which is broken.


# 1.47 02-Jun-2005 tron

Remove type casts and lint directives which are now longer necessary
because the first argument of m_copydata() is "const struct mbuf *" now.


Revision tags: netbsd-3-1-1-RELEASE netbsd-3-0-3-RELEASE netbsd-3-1-RELEASE netbsd-3-0-2-RELEASE netbsd-3-1-RC4 netbsd-3-1-RC3 netbsd-3-1-RC2 netbsd-3-1-RC1 netbsd-3-0-1-RELEASE netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.46 11-Mar-2005 tron

Add support for changing the MTU to stf(4).


# 1.45 26-Feb-2005 perry

nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.44 25-Jan-2005 matt

Switch to using ifa for ifaddr's instead of ia (which are traditionally
used for in_ifaddr's) which could lead to confusion.


Revision tags: yamt-km-base
# 1.43 25-Jan-2005 tron

branches: 1.43.2;
Fix cut and paste error in last commit.


# 1.42 24-Jan-2005 matt

Add IFNET_FOREACH and IFADDR_FOREACH macros and start using them.


Revision tags: kent-audio1-beforemerge kent-audio1-base
# 1.41 04-Dec-2004 peter

branches: 1.41.4;
Change ifc_destroy to return an int instead of void, so that it
can pass back errors to ifconfig.


# 1.40 19-Aug-2004 christos

Factor out the hand-crafting of mbufs from the interface files. Reviewed by
gimpy. XXX: I could have used bpf_mtap2 on some of the new functions, but I
chose not to, because I just wanted to do what amounts to a code move.


# 1.39 26-Apr-2004 matt

Remove #else of #if __STDC__


# 1.38 22-Apr-2004 matt

Constify protosw arrays. This can reduce the kernel .data section by
over 4K (if all the network protocols) are loaded.


# 1.37 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.36 12-Nov-2003 cl

branches: 1.36.4;
catch up with in_ifaddr -> in_ifaddrhead rename


# 1.35 22-Aug-2003 itojun

change the additional arg to be passed to ip{,6}_output to struct socket *.

this fixes KAME policy lookup which was broken by the previous commit.


# 1.34 15-Aug-2003 jonathan

(fast-ipsec): Add hooks to pass IPv4 IPsec traffic into fast-ipsec, if
configured with ``options FAST_IPSEC''. Kernels with KAME IPsec or
with no IPsec should work as before.

All calls to ip_output() now always pass an additional compulsory
argument: the inpcb associated with the packet being sent,
or 0 if no inpcb is available.

Fast-ipsec tested with ICMP or UDP over ESP. TCP doesn't work, yet.


# 1.33 01-May-2003 itojun

branches: 1.33.2;
bpf_mtap() does not care about M_PKTHDR at the top. M_COPY_PKTHDR has some
consequences, so avoid it. if we need to attach dummy headers, we should
use M_PREPEND instead.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.32 17-Nov-2002 itojun

more pickier packet validation, based on
draft-savola-v6ops-6to4-security-00.txt. sync w/kame


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.31 17-Sep-2002 itojun

fix comment, sync with kame


# 1.30 17-Sep-2002 itojun

reject SIOCAIFADDR if embedded address is in private address range. sync w/kame


Revision tags: gehenna-devsw-base
# 1.29 14-Aug-2002 itojun

avoid swapping endian of ip_len and ip_off on mbuf, to meet with M_LEADINGSPACE
optimization made last year. should solve PR 17867 and 10195.

IP_HDRINCL behavior of raw ip socket is kept unchanged. we may want to
provide IP_HDRINCL variant that does not swap endian.


# 1.28 06-Aug-2002 itojun

backout previous. i was looking at the wrong RFC.


# 1.27 05-Aug-2002 itojun

based on RFC2529, stf(4) should have 1480 as MTU, not 1280.
tron found it, sync w/kame


# 1.26 23-Jul-2002 tron

Increase interface output error count in case of a failure.


# 1.25 23-Jul-2002 tron

Increase interface output counter for every encapsulated packet sent to IP.


# 1.24 20-Jun-2002 itojun

reject packets with IPv4 private address range. sync w/kame


Revision tags: netbsd-1-6-base eeh-devprop-base newlock-base ifpoll-base
# 1.23 21-Dec-2001 itojun

branches: 1.23.8; 1.23.10;
move protosw fragment for gif/stf to their own source code.
reduce #ifdef in stf code. sync with kame


# 1.22 13-Nov-2001 lukem

remove unnecessary #if NFOO > 0 .... #endif wrappers


# 1.21 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base
# 1.20 06-Nov-2001 itojun

too many curly brace.


# 1.19 06-Nov-2001 matt

Fix pr#14481


# 1.18 05-Nov-2001 matt

Switch to using queue access macros instead of refering to the member
fields explicitly.


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.17 18-Jul-2001 thorpej

branches: 1.17.4;
bzero -> memset


# 1.16 08-Jun-2001 itojun

branches: 1.16.2;
inject outgoing packet to bpf. KAME PR 358.


# 1.15 10-May-2001 itojun

correct ecn consideration on tunnel encap/decap. sync with kame.


# 1.14 29-Apr-2001 itojun

correct outbound outer IPv4 destination address selection.
IFF_LINK0 disables inbound path, removes security worries.
more examples in manpage.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.13 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.12 20-Feb-2001 itojun

branches: 1.12.2;
explicitly use u_int32_t for DLT_NULL encapsulation.

correct gif address family. from chopps, sync with kame.


# 1.11 17-Feb-2001 itojun

update comment to meet 6to4 RFC. sync with kame


# 1.10 22-Jan-2001 itojun

make it possible to turn off ingress filter on gif/stf tunnel egress,
by using IFF_LINK2. (part of) PR 11163 from Ken Raeburn.


# 1.9 17-Jan-2001 itojun

pull post-4.4BSD change to sys/net/route.c from BSD/OS 4.2 (UCB copyrighted).

have sys/net/route.c:rtrequest1(), which takes rt_addrinfo * as the argument.
pass rt_addrinfo all the way down to rtrequest, and ifa->ifa_rtrequest.
3rd arg of ifa->ifa_rtrequest is now rt_addrinfo * instead of sockaddr *
(almost noone is using it anyways).

benefit: the follwoing command now works. previously we need two route(8)
invocations, "add" then "change".
# route add -inet6 default ::1 -ifp gif0

remove unsafe typecast in rtrequest(), from rtentry * to sockaddr *. it was
introduced by 4.3BSD-reno and never corrected.

XXX is eon_rtrequest() change correct regarding to 3rd arg?
eon_rtrequest() and rtrequest() were incorrect since 4.3BSD-reno,
so i do not have correct answer in the source code.
someone with more clue about netiso-over-ip, please help.


# 1.8 17-Jan-2001 thorpej

Fix a rather annoying problem where the sockaddr_dl which holds
the link level name for the interface (ifp->if_sadl) is allocated
before ifp->if_addrlen is initialized, which could lead to allocating
too little space for the link level address.

Do this by splitting allocation of the link level name out of
if_attach() and into if_alloc_sadl(), which is normally called
by functions like ether_ifattach(). Network interfaces which
don't have a link-specific attach routine must call if_alloc_sadl()
themselves (example: gif).

Link level names are freed by if_free_sadl(), which can be called
from e.g. ether_ifdetach(). Drivers never need call if_free_sadl()
themselves as if_detach() will do it if it is not already done.

While here, add the ability to pass an AF_LINK address to
SIOCSIFADDR in ether_ioctl() (this is what caused me to notice
the problem that the above fixes).


# 1.7 18-Dec-2000 thorpej

Fill in if_dlt.


# 1.6 12-Dec-2000 thorpej

Adapt to bpfattach() changes, and further centralize the bpfattach()
and bpfdetach() calls into link-type subroutines where possible.


# 1.5 05-Jul-2000 thorpej

branches: 1.5.2;
stf(4) is now a cloning network interface (although, only one is allowed
to be created).


Revision tags: netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base
# 1.4 10-Jun-2000 itojun

branches: 1.4.2;
update i-d #. (sync with kame)


Revision tags: minoura-xpg4dl-base
# 1.3 14-May-2000 itojun

branches: 1.3.2;
sync IPv4 rogue address filter with RFC1122. (sync with kame)


# 1.2 21-Apr-2000 itojun

update comment (analysis on 04 draft)


# 1.1 19-Apr-2000 itojun

introduce sys/netinet/ip_encap.c, to dispatch inbound packets
to protocol handlers, based on src/dst (for ip proto #4/41).
see comment in ip_encap.c for details of the problem we have.
there are too many protocol specs for ip proto #4/41.
backward compatibility with MROUTING case is now provided in ip_encap.c.

fix ipip to work with gif (using ip_encap.c). sorry for breakage.

gif now uses ip_encap.c.

introduce stf pseudo interface (implements 6to4, another IPv6-over-IPv4 code
with ip proto #41).


# 1.106 26-Apr-2019 pgoyette

Some more empty-string --> NULL conversions for module dependencies


Revision tags: isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base
# 1.105 26-Jun-2018 msaitoh

Implement the BPF direction filter (BIOC[GS]DIRECTION). It provides backward
compatibility with BIOC[GS]SEESENT ioctl. The userland interface is the same
as FreeBSD.

This change also fixes a bug that the direction is misunderstand on some
environment by passing the direction to bpf_mtap*() instead of checking
m->m_pkthdr.rcvif.


Revision tags: pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502
# 1.104 01-May-2018 maxv

Remove now unused net_osdep.h includes, the other BSDs did the same.


Revision tags: pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base tls-maxphys-base-20171202
# 1.103 15-Nov-2017 knakahara

branches: 1.103.2;
Add argument to encapsw->pr_input() instead of m_tag.


# 1.102 23-Oct-2017 msaitoh

- If if_attach() failed in the attach function, free resources and return.
- KNF


Revision tags: matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107
# 1.101 12-Dec-2016 ozaki-r

branches: 1.101.8;
Make the routing table and rtcaches MP-safe

See the following descriptions for details.

Proposed on tech-kern and tech-net


Overview


# 1.100 08-Dec-2016 ozaki-r

Add rtcache_unref to release points of rtentry stemming from rtcache

In the MP-safe world, a rtentry stemming from a rtcache can be freed at any
points. So we need to protect rtentries somehow say by reference couting or
passive references. Regardless of the method, we need to call some release
function of a rtentry after using it.

The change adds a new function rtcache_unref to release a rtentry. At this
point, this function does nothing because for now we don't add a reference
to a rtentry when we get one from a rtcache. We will add something useful
in a further commit.

This change is a part of changes for MP-safe routing table. It is separated
to avoid one big change that makes difficult to debug by bisecting.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914
# 1.99 18-Aug-2016 knakahara

eliminate stf(4)'s dependency on gif(4).

stf(4) depends on not gif(4) but ip_encap.


# 1.98 07-Aug-2016 christos

modularize some more drivers and merge the module glue


Revision tags: pgoyette-localcount-20160806
# 1.97 01-Aug-2016 ozaki-r

Apply pserialize and psref to struct ifaddr and its variants

This change makes struct ifaddr and its variants (in_ifaddr and in6_ifaddr)
MP-safe by using pserialize and psref. At this moment, pserialize_perform
and psref_target_destroy are disabled because (1) we don't need them
because of softnet_lock (2) they cause a deadlock because of softnet_lock.
So we'll enable them when we remove softnet_lock in the future.


Revision tags: pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.96 08-Jul-2016 ozaki-r

branches: 1.96.2;
Replace macros to get an IP address with proper inline functions

The inline functions are more friendly for applying psz/psref;
they consist of only simple interations.


# 1.95 07-Jul-2016 ozaki-r

Switch the address list of intefaces to pslist(9)

As usual, we leave the old list to avoid breaking kvm(3) users.


# 1.94 06-Jul-2016 ozaki-r

Switch the IPv4 address list to pslist(9)

Note that we leave the old list just in case; it seems there are some
kvm(3) users accessing the list. We can remove it later if we confirmed
nobody does actually.


# 1.93 04-Jul-2016 knakahara

make encap_lock_{enter,exit} interruptable.


# 1.92 04-Jul-2016 knakahara

let gif(4) promise softint(9) contract (2/2) : ip_encap side

The last commit does not care encaptab. This commit fixes encaptab race which
is used not only gif(4).


# 1.91 22-Jun-2016 ozaki-r

Remove unnecessary NULL checks of ifa->ifa_addr

If it's NULL, it should be a bug. There many IFADDR_FOREACH that don't do
NULL check. If it can be NULL, they should fire already.


# 1.90 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


# 1.89 10-Jun-2016 ozaki-r

Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.88 28-Apr-2016 ozaki-r

Constify rtentry of if_output

We no longer need to change rtentry below if_output.

The change makes it clear where rtentries are changed (or not)
and helps forthcoming locking (os psrefing) rtentries.


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.87 28-Jan-2016 knakahara

fix my wrong modification


# 1.86 26-Jan-2016 knakahara

implement encapsw instead of protosw and uniform prototype.

suggested and advised by riastradh@n.o, thanks.

BTW, It seems in_stf_input() had bugs...


# 1.85 22-Jan-2016 riastradh

Back out previous change to introduce struct encapsw.

This change was intended, but Nakahara-san had already made a better
one locally! So I'll let him commit that one, and I'll try not to
step on anyone's toes again.


# 1.84 22-Jan-2016 riastradh

Don't abuse struct protosw for ip_encap -- introduce struct encapsw.

Mostly mechanical change to replace it, culling some now-needless
boilerplate around all the users.

This does not substantively change the ip_encap API or eliminate
abuse of sketchy pointer casts -- that will come later, and will be
easier now that it is not tangled up with struct protosw.


# 1.83 20-Jan-2016 riastradh

Eliminate struct protosw::pr_output.

You can't use this unless you know what it is a priori: the formal
prototype is variadic, and the different instances (e.g., ip_output,
route_output) have different real prototypes.

Convert the only user of it, raw_send in net/raw_cb.c, to take an
explicit callback argument. Convert the only instances of it,
route_output and key_output, to such explicit callbacks for raw_send.
Use assertions to make sure the conversion to explicit callbacks is
warranted.

Discussed on tech-net with no objections:
https://mail-index.netbsd.org/tech-net/2016/01/16/msg005484.html


Revision tags: nick-nhusb-base-20151226 nick-nhusb-base-20150921
# 1.82 24-Aug-2015 pooka

sprinkle _KERNEL_OPT


# 1.81 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


Revision tags: netbsd-7-2-RELEASE netbsd-7-1-2-RELEASE netbsd-7-1-1-RELEASE netbsd-7-1-RELEASE netbsd-7-1-RC2 netbsd-7-nhusb-base-20170116 netbsd-7-1-RC1 netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 nick-nhusb-base-20150606 nick-nhusb-base-20150406 nick-nhusb-base netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.80 12-Jun-2014 christos

branches: 1.80.4;
PR/48901: Fail at compile time when trying to compile stf without inet6,
and print an explanatory message.


# 1.79 05-Jun-2014 rmind

- Implement pktqueue interface for lockless IP input queue.
- Replace ipintrq and ip6intrq with the pktqueue mechanism.
- Eliminate kernel-lock from ipintr() and ip6intr().
- Some preparation work to push softnet_lock out of ipintr().

Discussed on tech-net.


Revision tags: rmind-smpnet-nbase rmind-smpnet-base
# 1.78 18-May-2014 rmind

Add struct pr_usrreqs with a pr_generic function and prepare for the
dismantling of pr_usrreq in the protocols; no functional change intended.
PRU_ATTACH/PRU_DETACH changes will follow soon.

Bump for struct protosw. Welcome to 6.99.62!


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE yamt-pagecache-base9 yamt-pagecache-tag8 netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 agc-symver-base netbsd-6-1-RC2 netbsd-6-1-RC1 yamt-pagecache-base8 netbsd-6-0-1-RELEASE yamt-pagecache-base7 matt-nb6-plus-nbase yamt-pagecache-base6 netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.77 28-Oct-2011 dyoung

branches: 1.77.12; 1.77.16; 1.77.26;
Don't kauth-orize SIOCSIFMTU in pppsioctl() and stf_ioctl(), ifioctl()
has already done that for us.


# 1.76 17-Jul-2011 joerg

Retire varargs.h support. Move machine/stdarg.h logic into MI
sys/stdarg.h and expect compiler to provide proper builtins, defaulting
to the GCC interface. lint still has a special fallback.
Reduce abuse of _BSD_VA_LIST_ by defining __va_list by default and
derive va_list as required by standards.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 rmind-uvmplock-base
# 1.75 05-Apr-2010 joerg

Push the bpf_ops usage back into bpf.h. Push the common ifp->if_bpf
check into the inline functions as well the fourth argument for
bpf_attach.


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.74 19-Jan-2010 pooka

branches: 1.74.2; 1.74.4;
Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


Revision tags: matt-premerge-20091211
# 1.73 08-Nov-2009 christos

PR/42285: PR/41559: Daniel Hagerty: if_stf doesn't count output bytes


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-nbase yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.72 18-Apr-2009 tsutsui

Remove extra whitespace added by a stupid tool.
XXX: more in src/sys/arch


# 1.71 15-Apr-2009 elad

Remove a few KAUTH_GENERIC_ISSUSER in favor of more descriptive
alternatives.

Discussed on tech-kern:

http://mail-index.netbsd.org/tech-kern/2009/04/11/msg004798.html

Input from ad@, christos@, dyoung@, tsutsui@.

Okay ad@.


# 1.70 18-Mar-2009 cegger

bcopy -> memcpy


# 1.69 18-Mar-2009 cegger

bcmp -> memcmp


Revision tags: nick-hppapmap-base2 haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.68 07-Nov-2008 dyoung

branches: 1.68.4;
*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-2-3-RELEASE netbsd-5-1-5-RELEASE netbsd-5-2-2-RELEASE netbsd-5-1-4-RELEASE netbsd-5-2-1-RELEASE netbsd-5-1-3-RELEASE netbsd-5-2-RELEASE netbsd-5-2-RC1 netbsd-5-1-2-RELEASE netbsd-5-1-1-RELEASE matt-nb5-mips64-premerge-20101231 matt-nb5-pq3-base netbsd-5-1-RELEASE netbsd-5-1-RC4 matt-nb5-mips64-k15 netbsd-5-1-RC3 netbsd-5-1-RC2 netbsd-5-1-RC1 netbsd-5-0-2-RELEASE matt-nb5-mips64-premerge-20091211 matt-nb5-mips64-u2-k2-k4-k7-k8-k9 matt-nb4-mips64-k7-u2a-k9b matt-nb5-mips64-u1-k1-k5 netbsd-5-0-1-RELEASE netbsd-5-0-RELEASE netbsd-5-0-RC4 netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2
# 1.67 24-Oct-2008 dyoung

branches: 1.67.2;
Constify the rt_addrinfo argument to the ifa_rtrequest member
function of struct ifaddr.


Revision tags: haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base wrstuden-revivesa-base
# 1.66 15-Jun-2008 christos

branches: 1.66.2;
- add if_alloc (ours just mallocs), and if_initname and use them (from FreeBSD)
- kill memsets where M_ZERO can be used.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase hpcarm-cleanup-base
# 1.65 20-Feb-2008 matt

branches: 1.65.6; 1.65.8; 1.65.10; 1.65.12; 1.65.14;
s/u_\(int[0-9]*_t\)/u\1/g
(change u_int*_t to uint*_t)


Revision tags: mjf-devfs-base
# 1.64 07-Feb-2008 dyoung

Start patching up the kernel so that a network driver always has
the opportunity to handle an ioctl before generic ifioctl handling
occurs. This will ease extending the kernel and sharing of code
between drivers.

First steps: Make the signature of ifioctl_common() match struct
ifinet->if_ioctl. Convert SIOCSIFCAP and SIOCSIFMTU to the new
ifioctl() regime, throughout the kernel.


Revision tags: vmlocking2-base3 bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-base
# 1.63 20-Dec-2007 dyoung

Poison struct route->ro_rt uses in the kernel by changing the name
to _ro_rt. Use rtcache_getrt() to access a route cache's struct
rtentry *.

Introduce struct ifnet->if_dl that always points at the interface
identifier/link-layer address. Make code that treated the first
ifaddr on struct ifnet->if_addrlist as the interface address use
if_dl, instead.

Remove stale debugging code from net/route.c. Move the rtflush()
code into rtcache_clear() and delete rtflush(). Delete rtalloc(),
because nothing uses it any more.

Make ND6_HINT an inline, lowercase subroutine, nd6_hint.

I've done my best to convert IP Filter, the ISO stack, and the
AppleTalk stack to rtcache_getrt(). They compile, but I have not
tested them. I have given the changes to PF, GRE, IPv4 and IPv6
stacks a lot of exercise.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base jmcneill-pm-base reinoud-bufcleanup-base
# 1.62 19-Oct-2007 ad

branches: 1.62.2; 1.62.4; 1.62.8;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base4 yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base vmlocking-base
# 1.61 01-Sep-2007 dyoung

branches: 1.61.4;
Use ifreq_setaddr(), ifreq_getaddr(), sockaddr_in_init(), and
sockaddr_copy(). Constify. Compare pointers with NULL, not 0.
Don't "test truth" of pointers, but compare with NULL.


Revision tags: matt-mips64-base nick-csl-alignment-base yamt-idlelwp-base8 mjf-ufs-trans-base
# 1.60 02-May-2007 dyoung

branches: 1.60.2; 1.60.6; 1.60.8;
Eliminate address family-specific route caches (struct route, struct
route_in6, struct route_iso), replacing all caches with a struct
route.

The principle benefit of this change is that all of the protocol
families can benefit from route cache-invalidation, which is
necessary for correct routing. Route-cache invalidation fixes an
ancient PR, kern/3508, at long last; it fixes various other PRs,
also.

Discussions with and ideas from Joerg Sonnenberger influenced this
work tremendously. Of course, all design oversights and bugs are
mine.

DETAILS

1 I added to each address family a pool of sockaddrs. I have
introduced routines for allocating, copying, and duplicating,
and freeing sockaddrs:

struct sockaddr *sockaddr_alloc(sa_family_t af, int flags);
struct sockaddr *sockaddr_copy(struct sockaddr *dst,
const struct sockaddr *src);
struct sockaddr *sockaddr_dup(const struct sockaddr *src, int flags);
void sockaddr_free(struct sockaddr *sa);

sockaddr_alloc() returns either a sockaddr from the pool belonging
to the specified family, or NULL if the pool is exhausted. The
returned sockaddr has the right size for that family; sa_family
and sa_len fields are initialized to the family and sockaddr
length---e.g., sa_family = AF_INET and sa_len = sizeof(struct
sockaddr_in). sockaddr_free() puts the given sockaddr back into
its family's pool.

sockaddr_dup() and sockaddr_copy() work analogously to strdup()
and strcpy(), respectively. sockaddr_copy() KASSERTs that the
family of the destination and source sockaddrs are alike.

The 'flags' argumet for sockaddr_alloc() and sockaddr_dup() is
passed directly to pool_get(9).

2 I added routines for initializing sockaddrs in each address
family, sockaddr_in_init(), sockaddr_in6_init(), sockaddr_iso_init(),
etc. They are fairly self-explanatory.

3 structs route_in6 and route_iso are no more. All protocol families
use struct route. I have changed the route cache, 'struct route',
so that it does not contain storage space for a sockaddr. Instead,
struct route points to a sockaddr coming from the pool the sockaddr
belongs to. I added a new method to struct route, rtcache_setdst(),
for setting the cache destination:

int rtcache_setdst(struct route *, const struct sockaddr *);

rtcache_setdst() returns 0 on success, or ENOMEM if no memory is
available to create the sockaddr storage.

It is now possible for rtcache_getdst() to return NULL if, say,
rtcache_setdst() failed. I check the return value for NULL
everywhere in the kernel.

4 Each routing domain (struct domain) has a list of live route
caches, dom_rtcache. rtflushall(sa_family_t af) looks up the
domain indicated by 'af', walks the domain's list of route caches
and invalidates each one.


Revision tags: thorpej-atomic-base
# 1.59 29-Mar-2007 ad

lwp::l_acflag is no longer used.


# 1.58 04-Mar-2007 christos

branches: 1.58.2; 1.58.4; 1.58.6;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.57 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


Revision tags: post-newlock2-merge newlock2-nbase yamt-splraiseipl-base5 yamt-splraiseipl-base4 newlock2-base
# 1.56 15-Dec-2006 joerg

branches: 1.56.2;
Introduce new helper functions to abstract the route caching.
rtcache_init and rtcache_init_noclone lookup ro_dst and store
the result in ro_rt, taking care of the reference counting and
calling the domain specific route cache.
rtcache_free checks if a route was cashed and frees the reference.
rtcache_copy copies ro_dst of the given struct route, checking that
enough space is available and incrementing the reference count of the
cached rtentry if necessary.
rtcache_check validates that the cached route is still up. If it isn't,
it tries to look it up again. Afterwards ro_rt is either a valid again
or NULL.
rtcache_copy is used internally.

Adjust to callers of rtalloc/rtflush in the tree to check the sanity of
ro_dst first (if necessary). If it doesn't fit the expectations, free
the cache, otherwise check if the cached route is still valid. After
that combination, a single check for ro_rt == NULL is enough to decide
whether a new lookup needs to be done with a different ro_dst.
Make the route checking in gre stricter by repeating the loop check
after revalidation.
Remove some unused RADIX_MPATH code in in6_src.c. The logic is slightly
changed here to first validate the route and check RTF_GATEWAY
afterwards. This is sementically equivalent though.
etherip doesn't need sc_route_expire similiar to the gif changes from
dyoung@ earlier.

Based on the earlier patch from dyoung@, reviewed and discussed with
him.


Revision tags: yamt-splraiseipl-base3
# 1.55 09-Dec-2006 dyoung

Here are various changes designed to protect against bad IPv4
routing caused by stale route caches (struct route). Route caches
are sprinkled throughout PCBs, the IP fast-forwarding table, and
IP tunnel interfaces (gre, gif, stf).

Stale IPv6 and ISO route caches will be treated by separate patches.

Thank you to Christoph Badura for suggesting the general approach
to invalidating route caches that I take here.

Here are the details:

Add hooks to struct domain for tracking and for invalidating each
domain's route caches: dom_rtcache, dom_rtflush, and dom_rtflushall.

Introduce helper subroutines, rtflush(ro) for invalidating a route
cache, rtflushall(family) for invalidating all route caches in a
routing domain, and rtcache(ro) for notifying the domain of a new
cached route.

Chain together all IPv4 route caches where ro_rt != NULL. Provide
in_rtcache() for adding a route to the chain. Provide in_rtflush()
and in_rtflushall() for invalidating IPv4 route caches. In
in_rtflush(), set ro_rt to NULL, and remove the route from the
chain. In in_rtflushall(), walk the chain and remove every route
cache.

In rtrequest1(), call rtflushall() to invalidate route caches when
a route is added.

In gif(4), discard the workaround for stale caches that involves
expiring them every so often.

Replace the pattern 'RTFREE(ro->ro_rt); ro->ro_rt = NULL;' with a
call to rtflush(ro).

Update ipflow_fastforward() and all other users of route caches so
that they expect a cached route, ro->ro_rt, to turn to NULL.

Take care when moving a 'struct route' to rtflush() the source and
to rtcache() the destination.

In domain initializers, use .dom_xxx tags.

KNF here and there.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.54 16-Nov-2006 christos

__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.53 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


Revision tags: abandoned-netbsd-4-base yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.52 23-Jul-2006 ad

branches: 1.52.4; 1.52.6;
Use the LWP cached credentials where sane.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.51 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.50 11-Dec-2005 thorpej

branches: 1.50.4; 1.50.6; 1.50.8; 1.50.10; 1.50.12;
ANSI function decls and application of static.


# 1.49 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.48 02-Jun-2005 tron

branches: 1.48.2;
Change the first argument of the encapsulation check function from
"const struct mbuf *" to "struct mbuf *". Without this change the
actual implementation cannot even use m_copydata() on the mbuf chain
which is broken.


# 1.47 02-Jun-2005 tron

Remove type casts and lint directives which are now longer necessary
because the first argument of m_copydata() is "const struct mbuf *" now.


Revision tags: netbsd-3-1-1-RELEASE netbsd-3-0-3-RELEASE netbsd-3-1-RELEASE netbsd-3-0-2-RELEASE netbsd-3-1-RC4 netbsd-3-1-RC3 netbsd-3-1-RC2 netbsd-3-1-RC1 netbsd-3-0-1-RELEASE netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.46 11-Mar-2005 tron

Add support for changing the MTU to stf(4).


# 1.45 26-Feb-2005 perry

nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.44 25-Jan-2005 matt

Switch to using ifa for ifaddr's instead of ia (which are traditionally
used for in_ifaddr's) which could lead to confusion.


Revision tags: yamt-km-base
# 1.43 25-Jan-2005 tron

branches: 1.43.2;
Fix cut and paste error in last commit.


# 1.42 24-Jan-2005 matt

Add IFNET_FOREACH and IFADDR_FOREACH macros and start using them.


Revision tags: kent-audio1-beforemerge kent-audio1-base
# 1.41 04-Dec-2004 peter

branches: 1.41.4;
Change ifc_destroy to return an int instead of void, so that it
can pass back errors to ifconfig.


# 1.40 19-Aug-2004 christos

Factor out the hand-crafting of mbufs from the interface files. Reviewed by
gimpy. XXX: I could have used bpf_mtap2 on some of the new functions, but I
chose not to, because I just wanted to do what amounts to a code move.


# 1.39 26-Apr-2004 matt

Remove #else of #if __STDC__


# 1.38 22-Apr-2004 matt

Constify protosw arrays. This can reduce the kernel .data section by
over 4K (if all the network protocols) are loaded.


# 1.37 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.36 12-Nov-2003 cl

branches: 1.36.4;
catch up with in_ifaddr -> in_ifaddrhead rename


# 1.35 22-Aug-2003 itojun

change the additional arg to be passed to ip{,6}_output to struct socket *.

this fixes KAME policy lookup which was broken by the previous commit.


# 1.34 15-Aug-2003 jonathan

(fast-ipsec): Add hooks to pass IPv4 IPsec traffic into fast-ipsec, if
configured with ``options FAST_IPSEC''. Kernels with KAME IPsec or
with no IPsec should work as before.

All calls to ip_output() now always pass an additional compulsory
argument: the inpcb associated with the packet being sent,
or 0 if no inpcb is available.

Fast-ipsec tested with ICMP or UDP over ESP. TCP doesn't work, yet.


# 1.33 01-May-2003 itojun

branches: 1.33.2;
bpf_mtap() does not care about M_PKTHDR at the top. M_COPY_PKTHDR has some
consequences, so avoid it. if we need to attach dummy headers, we should
use M_PREPEND instead.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.32 17-Nov-2002 itojun

more pickier packet validation, based on
draft-savola-v6ops-6to4-security-00.txt. sync w/kame


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.31 17-Sep-2002 itojun

fix comment, sync with kame


# 1.30 17-Sep-2002 itojun

reject SIOCAIFADDR if embedded address is in private address range. sync w/kame


Revision tags: gehenna-devsw-base
# 1.29 14-Aug-2002 itojun

avoid swapping endian of ip_len and ip_off on mbuf, to meet with M_LEADINGSPACE
optimization made last year. should solve PR 17867 and 10195.

IP_HDRINCL behavior of raw ip socket is kept unchanged. we may want to
provide IP_HDRINCL variant that does not swap endian.


# 1.28 06-Aug-2002 itojun

backout previous. i was looking at the wrong RFC.


# 1.27 05-Aug-2002 itojun

based on RFC2529, stf(4) should have 1480 as MTU, not 1280.
tron found it, sync w/kame


# 1.26 23-Jul-2002 tron

Increase interface output error count in case of a failure.


# 1.25 23-Jul-2002 tron

Increase interface output counter for every encapsulated packet sent to IP.


# 1.24 20-Jun-2002 itojun

reject packets with IPv4 private address range. sync w/kame


Revision tags: netbsd-1-6-base eeh-devprop-base newlock-base ifpoll-base
# 1.23 21-Dec-2001 itojun

branches: 1.23.8; 1.23.10;
move protosw fragment for gif/stf to their own source code.
reduce #ifdef in stf code. sync with kame


# 1.22 13-Nov-2001 lukem

remove unnecessary #if NFOO > 0 .... #endif wrappers


# 1.21 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base
# 1.20 06-Nov-2001 itojun

too many curly brace.


# 1.19 06-Nov-2001 matt

Fix pr#14481


# 1.18 05-Nov-2001 matt

Switch to using queue access macros instead of refering to the member
fields explicitly.


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.17 18-Jul-2001 thorpej

branches: 1.17.4;
bzero -> memset


# 1.16 08-Jun-2001 itojun

branches: 1.16.2;
inject outgoing packet to bpf. KAME PR 358.


# 1.15 10-May-2001 itojun

correct ecn consideration on tunnel encap/decap. sync with kame.


# 1.14 29-Apr-2001 itojun

correct outbound outer IPv4 destination address selection.
IFF_LINK0 disables inbound path, removes security worries.
more examples in manpage.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.13 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.12 20-Feb-2001 itojun

branches: 1.12.2;
explicitly use u_int32_t for DLT_NULL encapsulation.

correct gif address family. from chopps, sync with kame.


# 1.11 17-Feb-2001 itojun

update comment to meet 6to4 RFC. sync with kame


# 1.10 22-Jan-2001 itojun

make it possible to turn off ingress filter on gif/stf tunnel egress,
by using IFF_LINK2. (part of) PR 11163 from Ken Raeburn.


# 1.9 17-Jan-2001 itojun

pull post-4.4BSD change to sys/net/route.c from BSD/OS 4.2 (UCB copyrighted).

have sys/net/route.c:rtrequest1(), which takes rt_addrinfo * as the argument.
pass rt_addrinfo all the way down to rtrequest, and ifa->ifa_rtrequest.
3rd arg of ifa->ifa_rtrequest is now rt_addrinfo * instead of sockaddr *
(almost noone is using it anyways).

benefit: the follwoing command now works. previously we need two route(8)
invocations, "add" then "change".
# route add -inet6 default ::1 -ifp gif0

remove unsafe typecast in rtrequest(), from rtentry * to sockaddr *. it was
introduced by 4.3BSD-reno and never corrected.

XXX is eon_rtrequest() change correct regarding to 3rd arg?
eon_rtrequest() and rtrequest() were incorrect since 4.3BSD-reno,
so i do not have correct answer in the source code.
someone with more clue about netiso-over-ip, please help.


# 1.8 17-Jan-2001 thorpej

Fix a rather annoying problem where the sockaddr_dl which holds
the link level name for the interface (ifp->if_sadl) is allocated
before ifp->if_addrlen is initialized, which could lead to allocating
too little space for the link level address.

Do this by splitting allocation of the link level name out of
if_attach() and into if_alloc_sadl(), which is normally called
by functions like ether_ifattach(). Network interfaces which
don't have a link-specific attach routine must call if_alloc_sadl()
themselves (example: gif).

Link level names are freed by if_free_sadl(), which can be called
from e.g. ether_ifdetach(). Drivers never need call if_free_sadl()
themselves as if_detach() will do it if it is not already done.

While here, add the ability to pass an AF_LINK address to
SIOCSIFADDR in ether_ioctl() (this is what caused me to notice
the problem that the above fixes).


# 1.7 18-Dec-2000 thorpej

Fill in if_dlt.


# 1.6 12-Dec-2000 thorpej

Adapt to bpfattach() changes, and further centralize the bpfattach()
and bpfdetach() calls into link-type subroutines where possible.


# 1.5 05-Jul-2000 thorpej

branches: 1.5.2;
stf(4) is now a cloning network interface (although, only one is allowed
to be created).


Revision tags: netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base
# 1.4 10-Jun-2000 itojun

branches: 1.4.2;
update i-d #. (sync with kame)


Revision tags: minoura-xpg4dl-base
# 1.3 14-May-2000 itojun

branches: 1.3.2;
sync IPv4 rogue address filter with RFC1122. (sync with kame)


# 1.2 21-Apr-2000 itojun

update comment (analysis on 04 draft)


# 1.1 19-Apr-2000 itojun

introduce sys/netinet/ip_encap.c, to dispatch inbound packets
to protocol handlers, based on src/dst (for ip proto #4/41).
see comment in ip_encap.c for details of the problem we have.
there are too many protocol specs for ip proto #4/41.
backward compatibility with MROUTING case is now provided in ip_encap.c.

fix ipip to work with gif (using ip_encap.c). sorry for breakage.

gif now uses ip_encap.c.

introduce stf pseudo interface (implements 6to4, another IPv6-over-IPv4 code
with ip proto #41).


Revision tags: tls-maxphys-base-20171202
# 1.103 15-Nov-2017 knakahara

Add argument to encapsw->pr_input() instead of m_tag.


# 1.102 23-Oct-2017 msaitoh

- If if_attach() failed in the attach function, free resources and return.
- KNF


Revision tags: matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107
# 1.101 12-Dec-2016 ozaki-r

branches: 1.101.8;
Make the routing table and rtcaches MP-safe

See the following descriptions for details.

Proposed on tech-kern and tech-net


Overview


# 1.100 08-Dec-2016 ozaki-r

Add rtcache_unref to release points of rtentry stemming from rtcache

In the MP-safe world, a rtentry stemming from a rtcache can be freed at any
points. So we need to protect rtentries somehow say by reference couting or
passive references. Regardless of the method, we need to call some release
function of a rtentry after using it.

The change adds a new function rtcache_unref to release a rtentry. At this
point, this function does nothing because for now we don't add a reference
to a rtentry when we get one from a rtcache. We will add something useful
in a further commit.

This change is a part of changes for MP-safe routing table. It is separated
to avoid one big change that makes difficult to debug by bisecting.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914
# 1.99 18-Aug-2016 knakahara

eliminate stf(4)'s dependency on gif(4).

stf(4) depends on not gif(4) but ip_encap.


# 1.98 07-Aug-2016 christos

modularize some more drivers and merge the module glue


Revision tags: pgoyette-localcount-20160806
# 1.97 01-Aug-2016 ozaki-r

Apply pserialize and psref to struct ifaddr and its variants

This change makes struct ifaddr and its variants (in_ifaddr and in6_ifaddr)
MP-safe by using pserialize and psref. At this moment, pserialize_perform
and psref_target_destroy are disabled because (1) we don't need them
because of softnet_lock (2) they cause a deadlock because of softnet_lock.
So we'll enable them when we remove softnet_lock in the future.


Revision tags: pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.96 08-Jul-2016 ozaki-r

branches: 1.96.2;
Replace macros to get an IP address with proper inline functions

The inline functions are more friendly for applying psz/psref;
they consist of only simple interations.


# 1.95 07-Jul-2016 ozaki-r

Switch the address list of intefaces to pslist(9)

As usual, we leave the old list to avoid breaking kvm(3) users.


# 1.94 06-Jul-2016 ozaki-r

Switch the IPv4 address list to pslist(9)

Note that we leave the old list just in case; it seems there are some
kvm(3) users accessing the list. We can remove it later if we confirmed
nobody does actually.


# 1.93 04-Jul-2016 knakahara

make encap_lock_{enter,exit} interruptable.


# 1.92 04-Jul-2016 knakahara

let gif(4) promise softint(9) contract (2/2) : ip_encap side

The last commit does not care encaptab. This commit fixes encaptab race which
is used not only gif(4).


# 1.91 22-Jun-2016 ozaki-r

Remove unnecessary NULL checks of ifa->ifa_addr

If it's NULL, it should be a bug. There many IFADDR_FOREACH that don't do
NULL check. If it can be NULL, they should fire already.


# 1.90 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


# 1.89 10-Jun-2016 ozaki-r

Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.88 28-Apr-2016 ozaki-r

Constify rtentry of if_output

We no longer need to change rtentry below if_output.

The change makes it clear where rtentries are changed (or not)
and helps forthcoming locking (os psrefing) rtentries.


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.87 28-Jan-2016 knakahara

fix my wrong modification


# 1.86 26-Jan-2016 knakahara

implement encapsw instead of protosw and uniform prototype.

suggested and advised by riastradh@n.o, thanks.

BTW, It seems in_stf_input() had bugs...


# 1.85 22-Jan-2016 riastradh

Back out previous change to introduce struct encapsw.

This change was intended, but Nakahara-san had already made a better
one locally! So I'll let him commit that one, and I'll try not to
step on anyone's toes again.


# 1.84 22-Jan-2016 riastradh

Don't abuse struct protosw for ip_encap -- introduce struct encapsw.

Mostly mechanical change to replace it, culling some now-needless
boilerplate around all the users.

This does not substantively change the ip_encap API or eliminate
abuse of sketchy pointer casts -- that will come later, and will be
easier now that it is not tangled up with struct protosw.


# 1.83 20-Jan-2016 riastradh

Eliminate struct protosw::pr_output.

You can't use this unless you know what it is a priori: the formal
prototype is variadic, and the different instances (e.g., ip_output,
route_output) have different real prototypes.

Convert the only user of it, raw_send in net/raw_cb.c, to take an
explicit callback argument. Convert the only instances of it,
route_output and key_output, to such explicit callbacks for raw_send.
Use assertions to make sure the conversion to explicit callbacks is
warranted.

Discussed on tech-net with no objections:
https://mail-index.netbsd.org/tech-net/2016/01/16/msg005484.html


Revision tags: nick-nhusb-base-20151226 nick-nhusb-base-20150921
# 1.82 24-Aug-2015 pooka

sprinkle _KERNEL_OPT


# 1.81 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


Revision tags: netbsd-7-1-1-RELEASE netbsd-7-1-RELEASE netbsd-7-1-RC2 netbsd-7-nhusb-base-20170116 netbsd-7-1-RC1 netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 nick-nhusb-base-20150606 nick-nhusb-base-20150406 nick-nhusb-base netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.80 12-Jun-2014 christos

branches: 1.80.4;
PR/48901: Fail at compile time when trying to compile stf without inet6,
and print an explanatory message.


# 1.79 05-Jun-2014 rmind

- Implement pktqueue interface for lockless IP input queue.
- Replace ipintrq and ip6intrq with the pktqueue mechanism.
- Eliminate kernel-lock from ipintr() and ip6intr().
- Some preparation work to push softnet_lock out of ipintr().

Discussed on tech-net.


Revision tags: rmind-smpnet-nbase rmind-smpnet-base
# 1.78 18-May-2014 rmind

Add struct pr_usrreqs with a pr_generic function and prepare for the
dismantling of pr_usrreq in the protocols; no functional change intended.
PRU_ATTACH/PRU_DETACH changes will follow soon.

Bump for struct protosw. Welcome to 6.99.62!


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE yamt-pagecache-base9 yamt-pagecache-tag8 netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 agc-symver-base netbsd-6-1-RC2 netbsd-6-1-RC1 yamt-pagecache-base8 netbsd-6-0-1-RELEASE yamt-pagecache-base7 matt-nb6-plus-nbase yamt-pagecache-base6 netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.77 28-Oct-2011 dyoung

branches: 1.77.12; 1.77.16; 1.77.26;
Don't kauth-orize SIOCSIFMTU in pppsioctl() and stf_ioctl(), ifioctl()
has already done that for us.


# 1.76 17-Jul-2011 joerg

Retire varargs.h support. Move machine/stdarg.h logic into MI
sys/stdarg.h and expect compiler to provide proper builtins, defaulting
to the GCC interface. lint still has a special fallback.
Reduce abuse of _BSD_VA_LIST_ by defining __va_list by default and
derive va_list as required by standards.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 rmind-uvmplock-base
# 1.75 05-Apr-2010 joerg

Push the bpf_ops usage back into bpf.h. Push the common ifp->if_bpf
check into the inline functions as well the fourth argument for
bpf_attach.


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.74 19-Jan-2010 pooka

branches: 1.74.2; 1.74.4;
Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


Revision tags: matt-premerge-20091211
# 1.73 08-Nov-2009 christos

PR/42285: PR/41559: Daniel Hagerty: if_stf doesn't count output bytes


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-nbase yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.72 18-Apr-2009 tsutsui

Remove extra whitespace added by a stupid tool.
XXX: more in src/sys/arch


# 1.71 15-Apr-2009 elad

Remove a few KAUTH_GENERIC_ISSUSER in favor of more descriptive
alternatives.

Discussed on tech-kern:

http://mail-index.netbsd.org/tech-kern/2009/04/11/msg004798.html

Input from ad@, christos@, dyoung@, tsutsui@.

Okay ad@.


# 1.70 18-Mar-2009 cegger

bcopy -> memcpy


# 1.69 18-Mar-2009 cegger

bcmp -> memcmp


Revision tags: nick-hppapmap-base2 haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.68 07-Nov-2008 dyoung

branches: 1.68.4;
*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-2-3-RELEASE netbsd-5-1-5-RELEASE netbsd-5-2-2-RELEASE netbsd-5-1-4-RELEASE netbsd-5-2-1-RELEASE netbsd-5-1-3-RELEASE netbsd-5-2-RELEASE netbsd-5-2-RC1 netbsd-5-1-2-RELEASE netbsd-5-1-1-RELEASE matt-nb5-mips64-premerge-20101231 matt-nb5-pq3-base netbsd-5-1-RELEASE netbsd-5-1-RC4 matt-nb5-mips64-k15 netbsd-5-1-RC3 netbsd-5-1-RC2 netbsd-5-1-RC1 netbsd-5-0-2-RELEASE matt-nb5-mips64-premerge-20091211 matt-nb5-mips64-u2-k2-k4-k7-k8-k9 matt-nb4-mips64-k7-u2a-k9b matt-nb5-mips64-u1-k1-k5 netbsd-5-0-1-RELEASE netbsd-5-0-RELEASE netbsd-5-0-RC4 netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2
# 1.67 24-Oct-2008 dyoung

branches: 1.67.2;
Constify the rt_addrinfo argument to the ifa_rtrequest member
function of struct ifaddr.


Revision tags: haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base wrstuden-revivesa-base
# 1.66 15-Jun-2008 christos

branches: 1.66.2;
- add if_alloc (ours just mallocs), and if_initname and use them (from FreeBSD)
- kill memsets where M_ZERO can be used.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase hpcarm-cleanup-base
# 1.65 20-Feb-2008 matt

branches: 1.65.6; 1.65.8; 1.65.10; 1.65.12; 1.65.14;
s/u_\(int[0-9]*_t\)/u\1/g
(change u_int*_t to uint*_t)


Revision tags: mjf-devfs-base
# 1.64 07-Feb-2008 dyoung

Start patching up the kernel so that a network driver always has
the opportunity to handle an ioctl before generic ifioctl handling
occurs. This will ease extending the kernel and sharing of code
between drivers.

First steps: Make the signature of ifioctl_common() match struct
ifinet->if_ioctl. Convert SIOCSIFCAP and SIOCSIFMTU to the new
ifioctl() regime, throughout the kernel.


Revision tags: vmlocking2-base3 bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-base
# 1.63 20-Dec-2007 dyoung

Poison struct route->ro_rt uses in the kernel by changing the name
to _ro_rt. Use rtcache_getrt() to access a route cache's struct
rtentry *.

Introduce struct ifnet->if_dl that always points at the interface
identifier/link-layer address. Make code that treated the first
ifaddr on struct ifnet->if_addrlist as the interface address use
if_dl, instead.

Remove stale debugging code from net/route.c. Move the rtflush()
code into rtcache_clear() and delete rtflush(). Delete rtalloc(),
because nothing uses it any more.

Make ND6_HINT an inline, lowercase subroutine, nd6_hint.

I've done my best to convert IP Filter, the ISO stack, and the
AppleTalk stack to rtcache_getrt(). They compile, but I have not
tested them. I have given the changes to PF, GRE, IPv4 and IPv6
stacks a lot of exercise.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base jmcneill-pm-base reinoud-bufcleanup-base
# 1.62 19-Oct-2007 ad

branches: 1.62.2; 1.62.4; 1.62.8;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base4 yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base vmlocking-base
# 1.61 01-Sep-2007 dyoung

branches: 1.61.4;
Use ifreq_setaddr(), ifreq_getaddr(), sockaddr_in_init(), and
sockaddr_copy(). Constify. Compare pointers with NULL, not 0.
Don't "test truth" of pointers, but compare with NULL.


Revision tags: matt-mips64-base nick-csl-alignment-base yamt-idlelwp-base8 mjf-ufs-trans-base
# 1.60 02-May-2007 dyoung

branches: 1.60.2; 1.60.6; 1.60.8;
Eliminate address family-specific route caches (struct route, struct
route_in6, struct route_iso), replacing all caches with a struct
route.

The principle benefit of this change is that all of the protocol
families can benefit from route cache-invalidation, which is
necessary for correct routing. Route-cache invalidation fixes an
ancient PR, kern/3508, at long last; it fixes various other PRs,
also.

Discussions with and ideas from Joerg Sonnenberger influenced this
work tremendously. Of course, all design oversights and bugs are
mine.

DETAILS

1 I added to each address family a pool of sockaddrs. I have
introduced routines for allocating, copying, and duplicating,
and freeing sockaddrs:

struct sockaddr *sockaddr_alloc(sa_family_t af, int flags);
struct sockaddr *sockaddr_copy(struct sockaddr *dst,
const struct sockaddr *src);
struct sockaddr *sockaddr_dup(const struct sockaddr *src, int flags);
void sockaddr_free(struct sockaddr *sa);

sockaddr_alloc() returns either a sockaddr from the pool belonging
to the specified family, or NULL if the pool is exhausted. The
returned sockaddr has the right size for that family; sa_family
and sa_len fields are initialized to the family and sockaddr
length---e.g., sa_family = AF_INET and sa_len = sizeof(struct
sockaddr_in). sockaddr_free() puts the given sockaddr back into
its family's pool.

sockaddr_dup() and sockaddr_copy() work analogously to strdup()
and strcpy(), respectively. sockaddr_copy() KASSERTs that the
family of the destination and source sockaddrs are alike.

The 'flags' argumet for sockaddr_alloc() and sockaddr_dup() is
passed directly to pool_get(9).

2 I added routines for initializing sockaddrs in each address
family, sockaddr_in_init(), sockaddr_in6_init(), sockaddr_iso_init(),
etc. They are fairly self-explanatory.

3 structs route_in6 and route_iso are no more. All protocol families
use struct route. I have changed the route cache, 'struct route',
so that it does not contain storage space for a sockaddr. Instead,
struct route points to a sockaddr coming from the pool the sockaddr
belongs to. I added a new method to struct route, rtcache_setdst(),
for setting the cache destination:

int rtcache_setdst(struct route *, const struct sockaddr *);

rtcache_setdst() returns 0 on success, or ENOMEM if no memory is
available to create the sockaddr storage.

It is now possible for rtcache_getdst() to return NULL if, say,
rtcache_setdst() failed. I check the return value for NULL
everywhere in the kernel.

4 Each routing domain (struct domain) has a list of live route
caches, dom_rtcache. rtflushall(sa_family_t af) looks up the
domain indicated by 'af', walks the domain's list of route caches
and invalidates each one.


Revision tags: thorpej-atomic-base
# 1.59 29-Mar-2007 ad

lwp::l_acflag is no longer used.


# 1.58 04-Mar-2007 christos

branches: 1.58.2; 1.58.4; 1.58.6;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.57 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


Revision tags: post-newlock2-merge newlock2-nbase yamt-splraiseipl-base5 yamt-splraiseipl-base4 newlock2-base
# 1.56 15-Dec-2006 joerg

branches: 1.56.2;
Introduce new helper functions to abstract the route caching.
rtcache_init and rtcache_init_noclone lookup ro_dst and store
the result in ro_rt, taking care of the reference counting and
calling the domain specific route cache.
rtcache_free checks if a route was cashed and frees the reference.
rtcache_copy copies ro_dst of the given struct route, checking that
enough space is available and incrementing the reference count of the
cached rtentry if necessary.
rtcache_check validates that the cached route is still up. If it isn't,
it tries to look it up again. Afterwards ro_rt is either a valid again
or NULL.
rtcache_copy is used internally.

Adjust to callers of rtalloc/rtflush in the tree to check the sanity of
ro_dst first (if necessary). If it doesn't fit the expectations, free
the cache, otherwise check if the cached route is still valid. After
that combination, a single check for ro_rt == NULL is enough to decide
whether a new lookup needs to be done with a different ro_dst.
Make the route checking in gre stricter by repeating the loop check
after revalidation.
Remove some unused RADIX_MPATH code in in6_src.c. The logic is slightly
changed here to first validate the route and check RTF_GATEWAY
afterwards. This is sementically equivalent though.
etherip doesn't need sc_route_expire similiar to the gif changes from
dyoung@ earlier.

Based on the earlier patch from dyoung@, reviewed and discussed with
him.


Revision tags: yamt-splraiseipl-base3
# 1.55 09-Dec-2006 dyoung

Here are various changes designed to protect against bad IPv4
routing caused by stale route caches (struct route). Route caches
are sprinkled throughout PCBs, the IP fast-forwarding table, and
IP tunnel interfaces (gre, gif, stf).

Stale IPv6 and ISO route caches will be treated by separate patches.

Thank you to Christoph Badura for suggesting the general approach
to invalidating route caches that I take here.

Here are the details:

Add hooks to struct domain for tracking and for invalidating each
domain's route caches: dom_rtcache, dom_rtflush, and dom_rtflushall.

Introduce helper subroutines, rtflush(ro) for invalidating a route
cache, rtflushall(family) for invalidating all route caches in a
routing domain, and rtcache(ro) for notifying the domain of a new
cached route.

Chain together all IPv4 route caches where ro_rt != NULL. Provide
in_rtcache() for adding a route to the chain. Provide in_rtflush()
and in_rtflushall() for invalidating IPv4 route caches. In
in_rtflush(), set ro_rt to NULL, and remove the route from the
chain. In in_rtflushall(), walk the chain and remove every route
cache.

In rtrequest1(), call rtflushall() to invalidate route caches when
a route is added.

In gif(4), discard the workaround for stale caches that involves
expiring them every so often.

Replace the pattern 'RTFREE(ro->ro_rt); ro->ro_rt = NULL;' with a
call to rtflush(ro).

Update ipflow_fastforward() and all other users of route caches so
that they expect a cached route, ro->ro_rt, to turn to NULL.

Take care when moving a 'struct route' to rtflush() the source and
to rtcache() the destination.

In domain initializers, use .dom_xxx tags.

KNF here and there.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.54 16-Nov-2006 christos

__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.53 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


Revision tags: abandoned-netbsd-4-base yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.52 23-Jul-2006 ad

branches: 1.52.4; 1.52.6;
Use the LWP cached credentials where sane.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.51 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.50 11-Dec-2005 thorpej

branches: 1.50.4; 1.50.6; 1.50.8; 1.50.10; 1.50.12;
ANSI function decls and application of static.


# 1.49 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.48 02-Jun-2005 tron

branches: 1.48.2;
Change the first argument of the encapsulation check function from
"const struct mbuf *" to "struct mbuf *". Without this change the
actual implementation cannot even use m_copydata() on the mbuf chain
which is broken.


# 1.47 02-Jun-2005 tron

Remove type casts and lint directives which are now longer necessary
because the first argument of m_copydata() is "const struct mbuf *" now.


Revision tags: netbsd-3-1-1-RELEASE netbsd-3-0-3-RELEASE netbsd-3-1-RELEASE netbsd-3-0-2-RELEASE netbsd-3-1-RC4 netbsd-3-1-RC3 netbsd-3-1-RC2 netbsd-3-1-RC1 netbsd-3-0-1-RELEASE netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.46 11-Mar-2005 tron

Add support for changing the MTU to stf(4).


# 1.45 26-Feb-2005 perry

nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.44 25-Jan-2005 matt

Switch to using ifa for ifaddr's instead of ia (which are traditionally
used for in_ifaddr's) which could lead to confusion.


Revision tags: yamt-km-base
# 1.43 25-Jan-2005 tron

branches: 1.43.2;
Fix cut and paste error in last commit.


# 1.42 24-Jan-2005 matt

Add IFNET_FOREACH and IFADDR_FOREACH macros and start using them.


Revision tags: kent-audio1-beforemerge kent-audio1-base
# 1.41 04-Dec-2004 peter

branches: 1.41.4;
Change ifc_destroy to return an int instead of void, so that it
can pass back errors to ifconfig.


# 1.40 19-Aug-2004 christos

Factor out the hand-crafting of mbufs from the interface files. Reviewed by
gimpy. XXX: I could have used bpf_mtap2 on some of the new functions, but I
chose not to, because I just wanted to do what amounts to a code move.


# 1.39 26-Apr-2004 matt

Remove #else of #if __STDC__


# 1.38 22-Apr-2004 matt

Constify protosw arrays. This can reduce the kernel .data section by
over 4K (if all the network protocols) are loaded.


# 1.37 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.36 12-Nov-2003 cl

branches: 1.36.4;
catch up with in_ifaddr -> in_ifaddrhead rename


# 1.35 22-Aug-2003 itojun

change the additional arg to be passed to ip{,6}_output to struct socket *.

this fixes KAME policy lookup which was broken by the previous commit.


# 1.34 15-Aug-2003 jonathan

(fast-ipsec): Add hooks to pass IPv4 IPsec traffic into fast-ipsec, if
configured with ``options FAST_IPSEC''. Kernels with KAME IPsec or
with no IPsec should work as before.

All calls to ip_output() now always pass an additional compulsory
argument: the inpcb associated with the packet being sent,
or 0 if no inpcb is available.

Fast-ipsec tested with ICMP or UDP over ESP. TCP doesn't work, yet.


# 1.33 01-May-2003 itojun

branches: 1.33.2;
bpf_mtap() does not care about M_PKTHDR at the top. M_COPY_PKTHDR has some
consequences, so avoid it. if we need to attach dummy headers, we should
use M_PREPEND instead.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.32 17-Nov-2002 itojun

more pickier packet validation, based on
draft-savola-v6ops-6to4-security-00.txt. sync w/kame


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.31 17-Sep-2002 itojun

fix comment, sync with kame


# 1.30 17-Sep-2002 itojun

reject SIOCAIFADDR if embedded address is in private address range. sync w/kame


Revision tags: gehenna-devsw-base
# 1.29 14-Aug-2002 itojun

avoid swapping endian of ip_len and ip_off on mbuf, to meet with M_LEADINGSPACE
optimization made last year. should solve PR 17867 and 10195.

IP_HDRINCL behavior of raw ip socket is kept unchanged. we may want to
provide IP_HDRINCL variant that does not swap endian.


# 1.28 06-Aug-2002 itojun

backout previous. i was looking at the wrong RFC.


# 1.27 05-Aug-2002 itojun

based on RFC2529, stf(4) should have 1480 as MTU, not 1280.
tron found it, sync w/kame


# 1.26 23-Jul-2002 tron

Increase interface output error count in case of a failure.


# 1.25 23-Jul-2002 tron

Increase interface output counter for every encapsulated packet sent to IP.


# 1.24 20-Jun-2002 itojun

reject packets with IPv4 private address range. sync w/kame


Revision tags: netbsd-1-6-base eeh-devprop-base newlock-base ifpoll-base
# 1.23 21-Dec-2001 itojun

branches: 1.23.8; 1.23.10;
move protosw fragment for gif/stf to their own source code.
reduce #ifdef in stf code. sync with kame


# 1.22 13-Nov-2001 lukem

remove unnecessary #if NFOO > 0 .... #endif wrappers


# 1.21 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base
# 1.20 06-Nov-2001 itojun

too many curly brace.


# 1.19 06-Nov-2001 matt

Fix pr#14481


# 1.18 05-Nov-2001 matt

Switch to using queue access macros instead of refering to the member
fields explicitly.


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.17 18-Jul-2001 thorpej

branches: 1.17.4;
bzero -> memset


# 1.16 08-Jun-2001 itojun

branches: 1.16.2;
inject outgoing packet to bpf. KAME PR 358.


# 1.15 10-May-2001 itojun

correct ecn consideration on tunnel encap/decap. sync with kame.


# 1.14 29-Apr-2001 itojun

correct outbound outer IPv4 destination address selection.
IFF_LINK0 disables inbound path, removes security worries.
more examples in manpage.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.13 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.12 20-Feb-2001 itojun

branches: 1.12.2;
explicitly use u_int32_t for DLT_NULL encapsulation.

correct gif address family. from chopps, sync with kame.


# 1.11 17-Feb-2001 itojun

update comment to meet 6to4 RFC. sync with kame


# 1.10 22-Jan-2001 itojun

make it possible to turn off ingress filter on gif/stf tunnel egress,
by using IFF_LINK2. (part of) PR 11163 from Ken Raeburn.


# 1.9 17-Jan-2001 itojun

pull post-4.4BSD change to sys/net/route.c from BSD/OS 4.2 (UCB copyrighted).

have sys/net/route.c:rtrequest1(), which takes rt_addrinfo * as the argument.
pass rt_addrinfo all the way down to rtrequest, and ifa->ifa_rtrequest.
3rd arg of ifa->ifa_rtrequest is now rt_addrinfo * instead of sockaddr *
(almost noone is using it anyways).

benefit: the follwoing command now works. previously we need two route(8)
invocations, "add" then "change".
# route add -inet6 default ::1 -ifp gif0

remove unsafe typecast in rtrequest(), from rtentry * to sockaddr *. it was
introduced by 4.3BSD-reno and never corrected.

XXX is eon_rtrequest() change correct regarding to 3rd arg?
eon_rtrequest() and rtrequest() were incorrect since 4.3BSD-reno,
so i do not have correct answer in the source code.
someone with more clue about netiso-over-ip, please help.


# 1.8 17-Jan-2001 thorpej

Fix a rather annoying problem where the sockaddr_dl which holds
the link level name for the interface (ifp->if_sadl) is allocated
before ifp->if_addrlen is initialized, which could lead to allocating
too little space for the link level address.

Do this by splitting allocation of the link level name out of
if_attach() and into if_alloc_sadl(), which is normally called
by functions like ether_ifattach(). Network interfaces which
don't have a link-specific attach routine must call if_alloc_sadl()
themselves (example: gif).

Link level names are freed by if_free_sadl(), which can be called
from e.g. ether_ifdetach(). Drivers never need call if_free_sadl()
themselves as if_detach() will do it if it is not already done.

While here, add the ability to pass an AF_LINK address to
SIOCSIFADDR in ether_ioctl() (this is what caused me to notice
the problem that the above fixes).


# 1.7 18-Dec-2000 thorpej

Fill in if_dlt.


# 1.6 12-Dec-2000 thorpej

Adapt to bpfattach() changes, and further centralize the bpfattach()
and bpfdetach() calls into link-type subroutines where possible.


# 1.5 05-Jul-2000 thorpej

branches: 1.5.2;
stf(4) is now a cloning network interface (although, only one is allowed
to be created).


Revision tags: netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base
# 1.4 10-Jun-2000 itojun

branches: 1.4.2;
update i-d #. (sync with kame)


Revision tags: minoura-xpg4dl-base
# 1.3 14-May-2000 itojun

branches: 1.3.2;
sync IPv4 rogue address filter with RFC1122. (sync with kame)


# 1.2 21-Apr-2000 itojun

update comment (analysis on 04 draft)


# 1.1 19-Apr-2000 itojun

introduce sys/netinet/ip_encap.c, to dispatch inbound packets
to protocol handlers, based on src/dst (for ip proto #4/41).
see comment in ip_encap.c for details of the problem we have.
there are too many protocol specs for ip proto #4/41.
backward compatibility with MROUTING case is now provided in ip_encap.c.

fix ipip to work with gif (using ip_encap.c). sorry for breakage.

gif now uses ip_encap.c.

introduce stf pseudo interface (implements 6to4, another IPv6-over-IPv4 code
with ip proto #41).


# 1.102 23-Oct-2017 msaitoh

- If if_attach() failed in the attach function, free resources and return.
- KNF


Revision tags: nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107
# 1.101 12-Dec-2016 ozaki-r

Make the routing table and rtcaches MP-safe

See the following descriptions for details.

Proposed on tech-kern and tech-net


Overview


# 1.100 08-Dec-2016 ozaki-r

Add rtcache_unref to release points of rtentry stemming from rtcache

In the MP-safe world, a rtentry stemming from a rtcache can be freed at any
points. So we need to protect rtentries somehow say by reference couting or
passive references. Regardless of the method, we need to call some release
function of a rtentry after using it.

The change adds a new function rtcache_unref to release a rtentry. At this
point, this function does nothing because for now we don't add a reference
to a rtentry when we get one from a rtcache. We will add something useful
in a further commit.

This change is a part of changes for MP-safe routing table. It is separated
to avoid one big change that makes difficult to debug by bisecting.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914
# 1.99 18-Aug-2016 knakahara

eliminate stf(4)'s dependency on gif(4).

stf(4) depends on not gif(4) but ip_encap.


# 1.98 07-Aug-2016 christos

modularize some more drivers and merge the module glue


Revision tags: pgoyette-localcount-20160806
# 1.97 01-Aug-2016 ozaki-r

Apply pserialize and psref to struct ifaddr and its variants

This change makes struct ifaddr and its variants (in_ifaddr and in6_ifaddr)
MP-safe by using pserialize and psref. At this moment, pserialize_perform
and psref_target_destroy are disabled because (1) we don't need them
because of softnet_lock (2) they cause a deadlock because of softnet_lock.
So we'll enable them when we remove softnet_lock in the future.


Revision tags: pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.96 08-Jul-2016 ozaki-r

branches: 1.96.2;
Replace macros to get an IP address with proper inline functions

The inline functions are more friendly for applying psz/psref;
they consist of only simple interations.


# 1.95 07-Jul-2016 ozaki-r

Switch the address list of intefaces to pslist(9)

As usual, we leave the old list to avoid breaking kvm(3) users.


# 1.94 06-Jul-2016 ozaki-r

Switch the IPv4 address list to pslist(9)

Note that we leave the old list just in case; it seems there are some
kvm(3) users accessing the list. We can remove it later if we confirmed
nobody does actually.


# 1.93 04-Jul-2016 knakahara

make encap_lock_{enter,exit} interruptable.


# 1.92 04-Jul-2016 knakahara

let gif(4) promise softint(9) contract (2/2) : ip_encap side

The last commit does not care encaptab. This commit fixes encaptab race which
is used not only gif(4).


# 1.91 22-Jun-2016 ozaki-r

Remove unnecessary NULL checks of ifa->ifa_addr

If it's NULL, it should be a bug. There many IFADDR_FOREACH that don't do
NULL check. If it can be NULL, they should fire already.


# 1.90 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


# 1.89 10-Jun-2016 ozaki-r

Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.88 28-Apr-2016 ozaki-r

Constify rtentry of if_output

We no longer need to change rtentry below if_output.

The change makes it clear where rtentries are changed (or not)
and helps forthcoming locking (os psrefing) rtentries.


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.87 28-Jan-2016 knakahara

fix my wrong modification


# 1.86 26-Jan-2016 knakahara

implement encapsw instead of protosw and uniform prototype.

suggested and advised by riastradh@n.o, thanks.

BTW, It seems in_stf_input() had bugs...


# 1.85 22-Jan-2016 riastradh

Back out previous change to introduce struct encapsw.

This change was intended, but Nakahara-san had already made a better
one locally! So I'll let him commit that one, and I'll try not to
step on anyone's toes again.


# 1.84 22-Jan-2016 riastradh

Don't abuse struct protosw for ip_encap -- introduce struct encapsw.

Mostly mechanical change to replace it, culling some now-needless
boilerplate around all the users.

This does not substantively change the ip_encap API or eliminate
abuse of sketchy pointer casts -- that will come later, and will be
easier now that it is not tangled up with struct protosw.


# 1.83 20-Jan-2016 riastradh

Eliminate struct protosw::pr_output.

You can't use this unless you know what it is a priori: the formal
prototype is variadic, and the different instances (e.g., ip_output,
route_output) have different real prototypes.

Convert the only user of it, raw_send in net/raw_cb.c, to take an
explicit callback argument. Convert the only instances of it,
route_output and key_output, to such explicit callbacks for raw_send.
Use assertions to make sure the conversion to explicit callbacks is
warranted.

Discussed on tech-net with no objections:
https://mail-index.netbsd.org/tech-net/2016/01/16/msg005484.html


Revision tags: nick-nhusb-base-20151226 nick-nhusb-base-20150921
# 1.82 24-Aug-2015 pooka

sprinkle _KERNEL_OPT


# 1.81 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


Revision tags: netbsd-7-1-RELEASE netbsd-7-1-RC2 netbsd-7-nhusb-base-20170116 netbsd-7-1-RC1 netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 nick-nhusb-base-20150606 nick-nhusb-base-20150406 nick-nhusb-base netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.80 12-Jun-2014 christos

branches: 1.80.4;
PR/48901: Fail at compile time when trying to compile stf without inet6,
and print an explanatory message.


# 1.79 05-Jun-2014 rmind

- Implement pktqueue interface for lockless IP input queue.
- Replace ipintrq and ip6intrq with the pktqueue mechanism.
- Eliminate kernel-lock from ipintr() and ip6intr().
- Some preparation work to push softnet_lock out of ipintr().

Discussed on tech-net.


Revision tags: rmind-smpnet-nbase rmind-smpnet-base
# 1.78 18-May-2014 rmind

Add struct pr_usrreqs with a pr_generic function and prepare for the
dismantling of pr_usrreq in the protocols; no functional change intended.
PRU_ATTACH/PRU_DETACH changes will follow soon.

Bump for struct protosw. Welcome to 6.99.62!


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE yamt-pagecache-base9 yamt-pagecache-tag8 netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 agc-symver-base netbsd-6-1-RC2 netbsd-6-1-RC1 yamt-pagecache-base8 netbsd-6-0-1-RELEASE yamt-pagecache-base7 matt-nb6-plus-nbase yamt-pagecache-base6 netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.77 28-Oct-2011 dyoung

branches: 1.77.12; 1.77.16; 1.77.26;
Don't kauth-orize SIOCSIFMTU in pppsioctl() and stf_ioctl(), ifioctl()
has already done that for us.


# 1.76 17-Jul-2011 joerg

Retire varargs.h support. Move machine/stdarg.h logic into MI
sys/stdarg.h and expect compiler to provide proper builtins, defaulting
to the GCC interface. lint still has a special fallback.
Reduce abuse of _BSD_VA_LIST_ by defining __va_list by default and
derive va_list as required by standards.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 rmind-uvmplock-base
# 1.75 05-Apr-2010 joerg

Push the bpf_ops usage back into bpf.h. Push the common ifp->if_bpf
check into the inline functions as well the fourth argument for
bpf_attach.


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.74 19-Jan-2010 pooka

branches: 1.74.2; 1.74.4;
Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


Revision tags: matt-premerge-20091211
# 1.73 08-Nov-2009 christos

PR/42285: PR/41559: Daniel Hagerty: if_stf doesn't count output bytes


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-nbase yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.72 18-Apr-2009 tsutsui

Remove extra whitespace added by a stupid tool.
XXX: more in src/sys/arch


# 1.71 15-Apr-2009 elad

Remove a few KAUTH_GENERIC_ISSUSER in favor of more descriptive
alternatives.

Discussed on tech-kern:

http://mail-index.netbsd.org/tech-kern/2009/04/11/msg004798.html

Input from ad@, christos@, dyoung@, tsutsui@.

Okay ad@.


# 1.70 18-Mar-2009 cegger

bcopy -> memcpy


# 1.69 18-Mar-2009 cegger

bcmp -> memcmp


Revision tags: nick-hppapmap-base2 haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.68 07-Nov-2008 dyoung

branches: 1.68.4;
*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-2-3-RELEASE netbsd-5-1-5-RELEASE netbsd-5-2-2-RELEASE netbsd-5-1-4-RELEASE netbsd-5-2-1-RELEASE netbsd-5-1-3-RELEASE netbsd-5-2-RELEASE netbsd-5-2-RC1 netbsd-5-1-2-RELEASE netbsd-5-1-1-RELEASE matt-nb5-mips64-premerge-20101231 matt-nb5-pq3-base netbsd-5-1-RELEASE netbsd-5-1-RC4 matt-nb5-mips64-k15 netbsd-5-1-RC3 netbsd-5-1-RC2 netbsd-5-1-RC1 netbsd-5-0-2-RELEASE matt-nb5-mips64-premerge-20091211 matt-nb5-mips64-u2-k2-k4-k7-k8-k9 matt-nb4-mips64-k7-u2a-k9b matt-nb5-mips64-u1-k1-k5 netbsd-5-0-1-RELEASE netbsd-5-0-RELEASE netbsd-5-0-RC4 netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2
# 1.67 24-Oct-2008 dyoung

branches: 1.67.2;
Constify the rt_addrinfo argument to the ifa_rtrequest member
function of struct ifaddr.


Revision tags: haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base wrstuden-revivesa-base
# 1.66 15-Jun-2008 christos

branches: 1.66.2;
- add if_alloc (ours just mallocs), and if_initname and use them (from FreeBSD)
- kill memsets where M_ZERO can be used.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase hpcarm-cleanup-base
# 1.65 20-Feb-2008 matt

branches: 1.65.6; 1.65.8; 1.65.10; 1.65.12; 1.65.14;
s/u_\(int[0-9]*_t\)/u\1/g
(change u_int*_t to uint*_t)


Revision tags: mjf-devfs-base
# 1.64 07-Feb-2008 dyoung

Start patching up the kernel so that a network driver always has
the opportunity to handle an ioctl before generic ifioctl handling
occurs. This will ease extending the kernel and sharing of code
between drivers.

First steps: Make the signature of ifioctl_common() match struct
ifinet->if_ioctl. Convert SIOCSIFCAP and SIOCSIFMTU to the new
ifioctl() regime, throughout the kernel.


Revision tags: vmlocking2-base3 bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-base
# 1.63 20-Dec-2007 dyoung

Poison struct route->ro_rt uses in the kernel by changing the name
to _ro_rt. Use rtcache_getrt() to access a route cache's struct
rtentry *.

Introduce struct ifnet->if_dl that always points at the interface
identifier/link-layer address. Make code that treated the first
ifaddr on struct ifnet->if_addrlist as the interface address use
if_dl, instead.

Remove stale debugging code from net/route.c. Move the rtflush()
code into rtcache_clear() and delete rtflush(). Delete rtalloc(),
because nothing uses it any more.

Make ND6_HINT an inline, lowercase subroutine, nd6_hint.

I've done my best to convert IP Filter, the ISO stack, and the
AppleTalk stack to rtcache_getrt(). They compile, but I have not
tested them. I have given the changes to PF, GRE, IPv4 and IPv6
stacks a lot of exercise.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base jmcneill-pm-base reinoud-bufcleanup-base
# 1.62 19-Oct-2007 ad

branches: 1.62.2; 1.62.4; 1.62.8;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base4 yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base vmlocking-base
# 1.61 01-Sep-2007 dyoung

branches: 1.61.4;
Use ifreq_setaddr(), ifreq_getaddr(), sockaddr_in_init(), and
sockaddr_copy(). Constify. Compare pointers with NULL, not 0.
Don't "test truth" of pointers, but compare with NULL.


Revision tags: matt-mips64-base nick-csl-alignment-base yamt-idlelwp-base8 mjf-ufs-trans-base
# 1.60 02-May-2007 dyoung

branches: 1.60.2; 1.60.6; 1.60.8;
Eliminate address family-specific route caches (struct route, struct
route_in6, struct route_iso), replacing all caches with a struct
route.

The principle benefit of this change is that all of the protocol
families can benefit from route cache-invalidation, which is
necessary for correct routing. Route-cache invalidation fixes an
ancient PR, kern/3508, at long last; it fixes various other PRs,
also.

Discussions with and ideas from Joerg Sonnenberger influenced this
work tremendously. Of course, all design oversights and bugs are
mine.

DETAILS

1 I added to each address family a pool of sockaddrs. I have
introduced routines for allocating, copying, and duplicating,
and freeing sockaddrs:

struct sockaddr *sockaddr_alloc(sa_family_t af, int flags);
struct sockaddr *sockaddr_copy(struct sockaddr *dst,
const struct sockaddr *src);
struct sockaddr *sockaddr_dup(const struct sockaddr *src, int flags);
void sockaddr_free(struct sockaddr *sa);

sockaddr_alloc() returns either a sockaddr from the pool belonging
to the specified family, or NULL if the pool is exhausted. The
returned sockaddr has the right size for that family; sa_family
and sa_len fields are initialized to the family and sockaddr
length---e.g., sa_family = AF_INET and sa_len = sizeof(struct
sockaddr_in). sockaddr_free() puts the given sockaddr back into
its family's pool.

sockaddr_dup() and sockaddr_copy() work analogously to strdup()
and strcpy(), respectively. sockaddr_copy() KASSERTs that the
family of the destination and source sockaddrs are alike.

The 'flags' argumet for sockaddr_alloc() and sockaddr_dup() is
passed directly to pool_get(9).

2 I added routines for initializing sockaddrs in each address
family, sockaddr_in_init(), sockaddr_in6_init(), sockaddr_iso_init(),
etc. They are fairly self-explanatory.

3 structs route_in6 and route_iso are no more. All protocol families
use struct route. I have changed the route cache, 'struct route',
so that it does not contain storage space for a sockaddr. Instead,
struct route points to a sockaddr coming from the pool the sockaddr
belongs to. I added a new method to struct route, rtcache_setdst(),
for setting the cache destination:

int rtcache_setdst(struct route *, const struct sockaddr *);

rtcache_setdst() returns 0 on success, or ENOMEM if no memory is
available to create the sockaddr storage.

It is now possible for rtcache_getdst() to return NULL if, say,
rtcache_setdst() failed. I check the return value for NULL
everywhere in the kernel.

4 Each routing domain (struct domain) has a list of live route
caches, dom_rtcache. rtflushall(sa_family_t af) looks up the
domain indicated by 'af', walks the domain's list of route caches
and invalidates each one.


Revision tags: thorpej-atomic-base
# 1.59 29-Mar-2007 ad

lwp::l_acflag is no longer used.


# 1.58 04-Mar-2007 christos

branches: 1.58.2; 1.58.4; 1.58.6;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.57 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


Revision tags: post-newlock2-merge newlock2-nbase yamt-splraiseipl-base5 yamt-splraiseipl-base4 newlock2-base
# 1.56 15-Dec-2006 joerg

branches: 1.56.2;
Introduce new helper functions to abstract the route caching.
rtcache_init and rtcache_init_noclone lookup ro_dst and store
the result in ro_rt, taking care of the reference counting and
calling the domain specific route cache.
rtcache_free checks if a route was cashed and frees the reference.
rtcache_copy copies ro_dst of the given struct route, checking that
enough space is available and incrementing the reference count of the
cached rtentry if necessary.
rtcache_check validates that the cached route is still up. If it isn't,
it tries to look it up again. Afterwards ro_rt is either a valid again
or NULL.
rtcache_copy is used internally.

Adjust to callers of rtalloc/rtflush in the tree to check the sanity of
ro_dst first (if necessary). If it doesn't fit the expectations, free
the cache, otherwise check if the cached route is still valid. After
that combination, a single check for ro_rt == NULL is enough to decide
whether a new lookup needs to be done with a different ro_dst.
Make the route checking in gre stricter by repeating the loop check
after revalidation.
Remove some unused RADIX_MPATH code in in6_src.c. The logic is slightly
changed here to first validate the route and check RTF_GATEWAY
afterwards. This is sementically equivalent though.
etherip doesn't need sc_route_expire similiar to the gif changes from
dyoung@ earlier.

Based on the earlier patch from dyoung@, reviewed and discussed with
him.


Revision tags: yamt-splraiseipl-base3
# 1.55 09-Dec-2006 dyoung

Here are various changes designed to protect against bad IPv4
routing caused by stale route caches (struct route). Route caches
are sprinkled throughout PCBs, the IP fast-forwarding table, and
IP tunnel interfaces (gre, gif, stf).

Stale IPv6 and ISO route caches will be treated by separate patches.

Thank you to Christoph Badura for suggesting the general approach
to invalidating route caches that I take here.

Here are the details:

Add hooks to struct domain for tracking and for invalidating each
domain's route caches: dom_rtcache, dom_rtflush, and dom_rtflushall.

Introduce helper subroutines, rtflush(ro) for invalidating a route
cache, rtflushall(family) for invalidating all route caches in a
routing domain, and rtcache(ro) for notifying the domain of a new
cached route.

Chain together all IPv4 route caches where ro_rt != NULL. Provide
in_rtcache() for adding a route to the chain. Provide in_rtflush()
and in_rtflushall() for invalidating IPv4 route caches. In
in_rtflush(), set ro_rt to NULL, and remove the route from the
chain. In in_rtflushall(), walk the chain and remove every route
cache.

In rtrequest1(), call rtflushall() to invalidate route caches when
a route is added.

In gif(4), discard the workaround for stale caches that involves
expiring them every so often.

Replace the pattern 'RTFREE(ro->ro_rt); ro->ro_rt = NULL;' with a
call to rtflush(ro).

Update ipflow_fastforward() and all other users of route caches so
that they expect a cached route, ro->ro_rt, to turn to NULL.

Take care when moving a 'struct route' to rtflush() the source and
to rtcache() the destination.

In domain initializers, use .dom_xxx tags.

KNF here and there.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.54 16-Nov-2006 christos

__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.53 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


Revision tags: abandoned-netbsd-4-base yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.52 23-Jul-2006 ad

branches: 1.52.4; 1.52.6;
Use the LWP cached credentials where sane.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.51 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.50 11-Dec-2005 thorpej

branches: 1.50.4; 1.50.6; 1.50.8; 1.50.10; 1.50.12;
ANSI function decls and application of static.


# 1.49 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.48 02-Jun-2005 tron

branches: 1.48.2;
Change the first argument of the encapsulation check function from
"const struct mbuf *" to "struct mbuf *". Without this change the
actual implementation cannot even use m_copydata() on the mbuf chain
which is broken.


# 1.47 02-Jun-2005 tron

Remove type casts and lint directives which are now longer necessary
because the first argument of m_copydata() is "const struct mbuf *" now.


Revision tags: netbsd-3-1-1-RELEASE netbsd-3-0-3-RELEASE netbsd-3-1-RELEASE netbsd-3-0-2-RELEASE netbsd-3-1-RC4 netbsd-3-1-RC3 netbsd-3-1-RC2 netbsd-3-1-RC1 netbsd-3-0-1-RELEASE netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.46 11-Mar-2005 tron

Add support for changing the MTU to stf(4).


# 1.45 26-Feb-2005 perry

nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.44 25-Jan-2005 matt

Switch to using ifa for ifaddr's instead of ia (which are traditionally
used for in_ifaddr's) which could lead to confusion.


Revision tags: yamt-km-base
# 1.43 25-Jan-2005 tron

branches: 1.43.2;
Fix cut and paste error in last commit.


# 1.42 24-Jan-2005 matt

Add IFNET_FOREACH and IFADDR_FOREACH macros and start using them.


Revision tags: kent-audio1-beforemerge kent-audio1-base
# 1.41 04-Dec-2004 peter

branches: 1.41.4;
Change ifc_destroy to return an int instead of void, so that it
can pass back errors to ifconfig.


# 1.40 19-Aug-2004 christos

Factor out the hand-crafting of mbufs from the interface files. Reviewed by
gimpy. XXX: I could have used bpf_mtap2 on some of the new functions, but I
chose not to, because I just wanted to do what amounts to a code move.


# 1.39 26-Apr-2004 matt

Remove #else of #if __STDC__


# 1.38 22-Apr-2004 matt

Constify protosw arrays. This can reduce the kernel .data section by
over 4K (if all the network protocols) are loaded.


# 1.37 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.36 12-Nov-2003 cl

branches: 1.36.4;
catch up with in_ifaddr -> in_ifaddrhead rename


# 1.35 22-Aug-2003 itojun

change the additional arg to be passed to ip{,6}_output to struct socket *.

this fixes KAME policy lookup which was broken by the previous commit.


# 1.34 15-Aug-2003 jonathan

(fast-ipsec): Add hooks to pass IPv4 IPsec traffic into fast-ipsec, if
configured with ``options FAST_IPSEC''. Kernels with KAME IPsec or
with no IPsec should work as before.

All calls to ip_output() now always pass an additional compulsory
argument: the inpcb associated with the packet being sent,
or 0 if no inpcb is available.

Fast-ipsec tested with ICMP or UDP over ESP. TCP doesn't work, yet.


# 1.33 01-May-2003 itojun

branches: 1.33.2;
bpf_mtap() does not care about M_PKTHDR at the top. M_COPY_PKTHDR has some
consequences, so avoid it. if we need to attach dummy headers, we should
use M_PREPEND instead.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.32 17-Nov-2002 itojun

more pickier packet validation, based on
draft-savola-v6ops-6to4-security-00.txt. sync w/kame


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.31 17-Sep-2002 itojun

fix comment, sync with kame


# 1.30 17-Sep-2002 itojun

reject SIOCAIFADDR if embedded address is in private address range. sync w/kame


Revision tags: gehenna-devsw-base
# 1.29 14-Aug-2002 itojun

avoid swapping endian of ip_len and ip_off on mbuf, to meet with M_LEADINGSPACE
optimization made last year. should solve PR 17867 and 10195.

IP_HDRINCL behavior of raw ip socket is kept unchanged. we may want to
provide IP_HDRINCL variant that does not swap endian.


# 1.28 06-Aug-2002 itojun

backout previous. i was looking at the wrong RFC.


# 1.27 05-Aug-2002 itojun

based on RFC2529, stf(4) should have 1480 as MTU, not 1280.
tron found it, sync w/kame


# 1.26 23-Jul-2002 tron

Increase interface output error count in case of a failure.


# 1.25 23-Jul-2002 tron

Increase interface output counter for every encapsulated packet sent to IP.


# 1.24 20-Jun-2002 itojun

reject packets with IPv4 private address range. sync w/kame


Revision tags: netbsd-1-6-base eeh-devprop-base newlock-base ifpoll-base
# 1.23 21-Dec-2001 itojun

branches: 1.23.8; 1.23.10;
move protosw fragment for gif/stf to their own source code.
reduce #ifdef in stf code. sync with kame


# 1.22 13-Nov-2001 lukem

remove unnecessary #if NFOO > 0 .... #endif wrappers


# 1.21 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base
# 1.20 06-Nov-2001 itojun

too many curly brace.


# 1.19 06-Nov-2001 matt

Fix pr#14481


# 1.18 05-Nov-2001 matt

Switch to using queue access macros instead of refering to the member
fields explicitly.


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.17 18-Jul-2001 thorpej

branches: 1.17.4;
bzero -> memset


# 1.16 08-Jun-2001 itojun

branches: 1.16.2;
inject outgoing packet to bpf. KAME PR 358.


# 1.15 10-May-2001 itojun

correct ecn consideration on tunnel encap/decap. sync with kame.


# 1.14 29-Apr-2001 itojun

correct outbound outer IPv4 destination address selection.
IFF_LINK0 disables inbound path, removes security worries.
more examples in manpage.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.13 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.12 20-Feb-2001 itojun

branches: 1.12.2;
explicitly use u_int32_t for DLT_NULL encapsulation.

correct gif address family. from chopps, sync with kame.


# 1.11 17-Feb-2001 itojun

update comment to meet 6to4 RFC. sync with kame


# 1.10 22-Jan-2001 itojun

make it possible to turn off ingress filter on gif/stf tunnel egress,
by using IFF_LINK2. (part of) PR 11163 from Ken Raeburn.


# 1.9 17-Jan-2001 itojun

pull post-4.4BSD change to sys/net/route.c from BSD/OS 4.2 (UCB copyrighted).

have sys/net/route.c:rtrequest1(), which takes rt_addrinfo * as the argument.
pass rt_addrinfo all the way down to rtrequest, and ifa->ifa_rtrequest.
3rd arg of ifa->ifa_rtrequest is now rt_addrinfo * instead of sockaddr *
(almost noone is using it anyways).

benefit: the follwoing command now works. previously we need two route(8)
invocations, "add" then "change".
# route add -inet6 default ::1 -ifp gif0

remove unsafe typecast in rtrequest(), from rtentry * to sockaddr *. it was
introduced by 4.3BSD-reno and never corrected.

XXX is eon_rtrequest() change correct regarding to 3rd arg?
eon_rtrequest() and rtrequest() were incorrect since 4.3BSD-reno,
so i do not have correct answer in the source code.
someone with more clue about netiso-over-ip, please help.


# 1.8 17-Jan-2001 thorpej

Fix a rather annoying problem where the sockaddr_dl which holds
the link level name for the interface (ifp->if_sadl) is allocated
before ifp->if_addrlen is initialized, which could lead to allocating
too little space for the link level address.

Do this by splitting allocation of the link level name out of
if_attach() and into if_alloc_sadl(), which is normally called
by functions like ether_ifattach(). Network interfaces which
don't have a link-specific attach routine must call if_alloc_sadl()
themselves (example: gif).

Link level names are freed by if_free_sadl(), which can be called
from e.g. ether_ifdetach(). Drivers never need call if_free_sadl()
themselves as if_detach() will do it if it is not already done.

While here, add the ability to pass an AF_LINK address to
SIOCSIFADDR in ether_ioctl() (this is what caused me to notice
the problem that the above fixes).


# 1.7 18-Dec-2000 thorpej

Fill in if_dlt.


# 1.6 12-Dec-2000 thorpej

Adapt to bpfattach() changes, and further centralize the bpfattach()
and bpfdetach() calls into link-type subroutines where possible.


# 1.5 05-Jul-2000 thorpej

branches: 1.5.2;
stf(4) is now a cloning network interface (although, only one is allowed
to be created).


Revision tags: netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base
# 1.4 10-Jun-2000 itojun

branches: 1.4.2;
update i-d #. (sync with kame)


Revision tags: minoura-xpg4dl-base
# 1.3 14-May-2000 itojun

branches: 1.3.2;
sync IPv4 rogue address filter with RFC1122. (sync with kame)


# 1.2 21-Apr-2000 itojun

update comment (analysis on 04 draft)


# 1.1 19-Apr-2000 itojun

introduce sys/netinet/ip_encap.c, to dispatch inbound packets
to protocol handlers, based on src/dst (for ip proto #4/41).
see comment in ip_encap.c for details of the problem we have.
there are too many protocol specs for ip proto #4/41.
backward compatibility with MROUTING case is now provided in ip_encap.c.

fix ipip to work with gif (using ip_encap.c). sorry for breakage.

gif now uses ip_encap.c.

introduce stf pseudo interface (implements 6to4, another IPv6-over-IPv4 code
with ip proto #41).


# 1.101 12-Dec-2016 ozaki-r

Make the routing table and rtcaches MP-safe

See the following descriptions for details.

Proposed on tech-kern and tech-net


Overview


# 1.100 08-Dec-2016 ozaki-r

Add rtcache_unref to release points of rtentry stemming from rtcache

In the MP-safe world, a rtentry stemming from a rtcache can be freed at any
points. So we need to protect rtentries somehow say by reference couting or
passive references. Regardless of the method, we need to call some release
function of a rtentry after using it.

The change adds a new function rtcache_unref to release a rtentry. At this
point, this function does nothing because for now we don't add a reference
to a rtentry when we get one from a rtcache. We will add something useful
in a further commit.

This change is a part of changes for MP-safe routing table. It is separated
to avoid one big change that makes difficult to debug by bisecting.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914
# 1.99 18-Aug-2016 knakahara

eliminate stf(4)'s dependency on gif(4).

stf(4) depends on not gif(4) but ip_encap.


# 1.98 07-Aug-2016 christos

modularize some more drivers and merge the module glue


Revision tags: pgoyette-localcount-20160806
# 1.97 01-Aug-2016 ozaki-r

Apply pserialize and psref to struct ifaddr and its variants

This change makes struct ifaddr and its variants (in_ifaddr and in6_ifaddr)
MP-safe by using pserialize and psref. At this moment, pserialize_perform
and psref_target_destroy are disabled because (1) we don't need them
because of softnet_lock (2) they cause a deadlock because of softnet_lock.
So we'll enable them when we remove softnet_lock in the future.


Revision tags: pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.96 08-Jul-2016 ozaki-r

branches: 1.96.2;
Replace macros to get an IP address with proper inline functions

The inline functions are more friendly for applying psz/psref;
they consist of only simple interations.


# 1.95 07-Jul-2016 ozaki-r

Switch the address list of intefaces to pslist(9)

As usual, we leave the old list to avoid breaking kvm(3) users.


# 1.94 06-Jul-2016 ozaki-r

Switch the IPv4 address list to pslist(9)

Note that we leave the old list just in case; it seems there are some
kvm(3) users accessing the list. We can remove it later if we confirmed
nobody does actually.


# 1.93 04-Jul-2016 knakahara

make encap_lock_{enter,exit} interruptable.


# 1.92 04-Jul-2016 knakahara

let gif(4) promise softint(9) contract (2/2) : ip_encap side

The last commit does not care encaptab. This commit fixes encaptab race which
is used not only gif(4).


# 1.91 22-Jun-2016 ozaki-r

Remove unnecessary NULL checks of ifa->ifa_addr

If it's NULL, it should be a bug. There many IFADDR_FOREACH that don't do
NULL check. If it can be NULL, they should fire already.


# 1.90 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


# 1.89 10-Jun-2016 ozaki-r

Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.88 28-Apr-2016 ozaki-r

Constify rtentry of if_output

We no longer need to change rtentry below if_output.

The change makes it clear where rtentries are changed (or not)
and helps forthcoming locking (os psrefing) rtentries.


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.87 28-Jan-2016 knakahara

fix my wrong modification


# 1.86 26-Jan-2016 knakahara

implement encapsw instead of protosw and uniform prototype.

suggested and advised by riastradh@n.o, thanks.

BTW, It seems in_stf_input() had bugs...


# 1.85 22-Jan-2016 riastradh

Back out previous change to introduce struct encapsw.

This change was intended, but Nakahara-san had already made a better
one locally! So I'll let him commit that one, and I'll try not to
step on anyone's toes again.


# 1.84 22-Jan-2016 riastradh

Don't abuse struct protosw for ip_encap -- introduce struct encapsw.

Mostly mechanical change to replace it, culling some now-needless
boilerplate around all the users.

This does not substantively change the ip_encap API or eliminate
abuse of sketchy pointer casts -- that will come later, and will be
easier now that it is not tangled up with struct protosw.


# 1.83 20-Jan-2016 riastradh

Eliminate struct protosw::pr_output.

You can't use this unless you know what it is a priori: the formal
prototype is variadic, and the different instances (e.g., ip_output,
route_output) have different real prototypes.

Convert the only user of it, raw_send in net/raw_cb.c, to take an
explicit callback argument. Convert the only instances of it,
route_output and key_output, to such explicit callbacks for raw_send.
Use assertions to make sure the conversion to explicit callbacks is
warranted.

Discussed on tech-net with no objections:
https://mail-index.netbsd.org/tech-net/2016/01/16/msg005484.html


Revision tags: nick-nhusb-base-20151226 nick-nhusb-base-20150921
# 1.82 24-Aug-2015 pooka

sprinkle _KERNEL_OPT


# 1.81 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


Revision tags: netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 nick-nhusb-base-20150606 nick-nhusb-base-20150406 nick-nhusb-base netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.80 12-Jun-2014 christos

branches: 1.80.4;
PR/48901: Fail at compile time when trying to compile stf without inet6,
and print an explanatory message.


# 1.79 05-Jun-2014 rmind

- Implement pktqueue interface for lockless IP input queue.
- Replace ipintrq and ip6intrq with the pktqueue mechanism.
- Eliminate kernel-lock from ipintr() and ip6intr().
- Some preparation work to push softnet_lock out of ipintr().

Discussed on tech-net.


Revision tags: rmind-smpnet-nbase rmind-smpnet-base
# 1.78 18-May-2014 rmind

Add struct pr_usrreqs with a pr_generic function and prepare for the
dismantling of pr_usrreq in the protocols; no functional change intended.
PRU_ATTACH/PRU_DETACH changes will follow soon.

Bump for struct protosw. Welcome to 6.99.62!


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE yamt-pagecache-base9 yamt-pagecache-tag8 netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 agc-symver-base netbsd-6-1-RC2 netbsd-6-1-RC1 yamt-pagecache-base8 netbsd-6-0-1-RELEASE yamt-pagecache-base7 matt-nb6-plus-nbase yamt-pagecache-base6 netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.77 28-Oct-2011 dyoung

branches: 1.77.12; 1.77.16; 1.77.26;
Don't kauth-orize SIOCSIFMTU in pppsioctl() and stf_ioctl(), ifioctl()
has already done that for us.


# 1.76 17-Jul-2011 joerg

Retire varargs.h support. Move machine/stdarg.h logic into MI
sys/stdarg.h and expect compiler to provide proper builtins, defaulting
to the GCC interface. lint still has a special fallback.
Reduce abuse of _BSD_VA_LIST_ by defining __va_list by default and
derive va_list as required by standards.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 rmind-uvmplock-base
# 1.75 05-Apr-2010 joerg

Push the bpf_ops usage back into bpf.h. Push the common ifp->if_bpf
check into the inline functions as well the fourth argument for
bpf_attach.


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.74 19-Jan-2010 pooka

branches: 1.74.2; 1.74.4;
Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


Revision tags: matt-premerge-20091211
# 1.73 08-Nov-2009 christos

PR/42285: PR/41559: Daniel Hagerty: if_stf doesn't count output bytes


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-nbase yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.72 18-Apr-2009 tsutsui

Remove extra whitespace added by a stupid tool.
XXX: more in src/sys/arch


# 1.71 15-Apr-2009 elad

Remove a few KAUTH_GENERIC_ISSUSER in favor of more descriptive
alternatives.

Discussed on tech-kern:

http://mail-index.netbsd.org/tech-kern/2009/04/11/msg004798.html

Input from ad@, christos@, dyoung@, tsutsui@.

Okay ad@.


# 1.70 18-Mar-2009 cegger

bcopy -> memcpy


# 1.69 18-Mar-2009 cegger

bcmp -> memcmp


Revision tags: nick-hppapmap-base2 haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.68 07-Nov-2008 dyoung

branches: 1.68.4;
*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-2-3-RELEASE netbsd-5-1-5-RELEASE netbsd-5-2-2-RELEASE netbsd-5-1-4-RELEASE netbsd-5-2-1-RELEASE netbsd-5-1-3-RELEASE netbsd-5-2-RELEASE netbsd-5-2-RC1 netbsd-5-1-2-RELEASE netbsd-5-1-1-RELEASE matt-nb5-mips64-premerge-20101231 matt-nb5-pq3-base netbsd-5-1-RELEASE netbsd-5-1-RC4 matt-nb5-mips64-k15 netbsd-5-1-RC3 netbsd-5-1-RC2 netbsd-5-1-RC1 netbsd-5-0-2-RELEASE matt-nb5-mips64-premerge-20091211 matt-nb5-mips64-u2-k2-k4-k7-k8-k9 matt-nb4-mips64-k7-u2a-k9b matt-nb5-mips64-u1-k1-k5 netbsd-5-0-1-RELEASE netbsd-5-0-RELEASE netbsd-5-0-RC4 netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2
# 1.67 24-Oct-2008 dyoung

branches: 1.67.2;
Constify the rt_addrinfo argument to the ifa_rtrequest member
function of struct ifaddr.


Revision tags: haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base wrstuden-revivesa-base
# 1.66 15-Jun-2008 christos

branches: 1.66.2;
- add if_alloc (ours just mallocs), and if_initname and use them (from FreeBSD)
- kill memsets where M_ZERO can be used.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase hpcarm-cleanup-base
# 1.65 20-Feb-2008 matt

branches: 1.65.6; 1.65.8; 1.65.10; 1.65.12; 1.65.14;
s/u_\(int[0-9]*_t\)/u\1/g
(change u_int*_t to uint*_t)


Revision tags: mjf-devfs-base
# 1.64 07-Feb-2008 dyoung

Start patching up the kernel so that a network driver always has
the opportunity to handle an ioctl before generic ifioctl handling
occurs. This will ease extending the kernel and sharing of code
between drivers.

First steps: Make the signature of ifioctl_common() match struct
ifinet->if_ioctl. Convert SIOCSIFCAP and SIOCSIFMTU to the new
ifioctl() regime, throughout the kernel.


Revision tags: vmlocking2-base3 bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-base
# 1.63 20-Dec-2007 dyoung

Poison struct route->ro_rt uses in the kernel by changing the name
to _ro_rt. Use rtcache_getrt() to access a route cache's struct
rtentry *.

Introduce struct ifnet->if_dl that always points at the interface
identifier/link-layer address. Make code that treated the first
ifaddr on struct ifnet->if_addrlist as the interface address use
if_dl, instead.

Remove stale debugging code from net/route.c. Move the rtflush()
code into rtcache_clear() and delete rtflush(). Delete rtalloc(),
because nothing uses it any more.

Make ND6_HINT an inline, lowercase subroutine, nd6_hint.

I've done my best to convert IP Filter, the ISO stack, and the
AppleTalk stack to rtcache_getrt(). They compile, but I have not
tested them. I have given the changes to PF, GRE, IPv4 and IPv6
stacks a lot of exercise.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base jmcneill-pm-base reinoud-bufcleanup-base
# 1.62 19-Oct-2007 ad

branches: 1.62.2; 1.62.4; 1.62.8;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base4 yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base vmlocking-base
# 1.61 01-Sep-2007 dyoung

branches: 1.61.4;
Use ifreq_setaddr(), ifreq_getaddr(), sockaddr_in_init(), and
sockaddr_copy(). Constify. Compare pointers with NULL, not 0.
Don't "test truth" of pointers, but compare with NULL.


Revision tags: matt-mips64-base nick-csl-alignment-base yamt-idlelwp-base8 mjf-ufs-trans-base
# 1.60 02-May-2007 dyoung

branches: 1.60.2; 1.60.6; 1.60.8;
Eliminate address family-specific route caches (struct route, struct
route_in6, struct route_iso), replacing all caches with a struct
route.

The principle benefit of this change is that all of the protocol
families can benefit from route cache-invalidation, which is
necessary for correct routing. Route-cache invalidation fixes an
ancient PR, kern/3508, at long last; it fixes various other PRs,
also.

Discussions with and ideas from Joerg Sonnenberger influenced this
work tremendously. Of course, all design oversights and bugs are
mine.

DETAILS

1 I added to each address family a pool of sockaddrs. I have
introduced routines for allocating, copying, and duplicating,
and freeing sockaddrs:

struct sockaddr *sockaddr_alloc(sa_family_t af, int flags);
struct sockaddr *sockaddr_copy(struct sockaddr *dst,
const struct sockaddr *src);
struct sockaddr *sockaddr_dup(const struct sockaddr *src, int flags);
void sockaddr_free(struct sockaddr *sa);

sockaddr_alloc() returns either a sockaddr from the pool belonging
to the specified family, or NULL if the pool is exhausted. The
returned sockaddr has the right size for that family; sa_family
and sa_len fields are initialized to the family and sockaddr
length---e.g., sa_family = AF_INET and sa_len = sizeof(struct
sockaddr_in). sockaddr_free() puts the given sockaddr back into
its family's pool.

sockaddr_dup() and sockaddr_copy() work analogously to strdup()
and strcpy(), respectively. sockaddr_copy() KASSERTs that the
family of the destination and source sockaddrs are alike.

The 'flags' argumet for sockaddr_alloc() and sockaddr_dup() is
passed directly to pool_get(9).

2 I added routines for initializing sockaddrs in each address
family, sockaddr_in_init(), sockaddr_in6_init(), sockaddr_iso_init(),
etc. They are fairly self-explanatory.

3 structs route_in6 and route_iso are no more. All protocol families
use struct route. I have changed the route cache, 'struct route',
so that it does not contain storage space for a sockaddr. Instead,
struct route points to a sockaddr coming from the pool the sockaddr
belongs to. I added a new method to struct route, rtcache_setdst(),
for setting the cache destination:

int rtcache_setdst(struct route *, const struct sockaddr *);

rtcache_setdst() returns 0 on success, or ENOMEM if no memory is
available to create the sockaddr storage.

It is now possible for rtcache_getdst() to return NULL if, say,
rtcache_setdst() failed. I check the return value for NULL
everywhere in the kernel.

4 Each routing domain (struct domain) has a list of live route
caches, dom_rtcache. rtflushall(sa_family_t af) looks up the
domain indicated by 'af', walks the domain's list of route caches
and invalidates each one.


Revision tags: thorpej-atomic-base
# 1.59 29-Mar-2007 ad

lwp::l_acflag is no longer used.


# 1.58 04-Mar-2007 christos

branches: 1.58.2; 1.58.4; 1.58.6;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.57 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


Revision tags: post-newlock2-merge newlock2-nbase yamt-splraiseipl-base5 yamt-splraiseipl-base4 newlock2-base
# 1.56 15-Dec-2006 joerg

branches: 1.56.2;
Introduce new helper functions to abstract the route caching.
rtcache_init and rtcache_init_noclone lookup ro_dst and store
the result in ro_rt, taking care of the reference counting and
calling the domain specific route cache.
rtcache_free checks if a route was cashed and frees the reference.
rtcache_copy copies ro_dst of the given struct route, checking that
enough space is available and incrementing the reference count of the
cached rtentry if necessary.
rtcache_check validates that the cached route is still up. If it isn't,
it tries to look it up again. Afterwards ro_rt is either a valid again
or NULL.
rtcache_copy is used internally.

Adjust to callers of rtalloc/rtflush in the tree to check the sanity of
ro_dst first (if necessary). If it doesn't fit the expectations, free
the cache, otherwise check if the cached route is still valid. After
that combination, a single check for ro_rt == NULL is enough to decide
whether a new lookup needs to be done with a different ro_dst.
Make the route checking in gre stricter by repeating the loop check
after revalidation.
Remove some unused RADIX_MPATH code in in6_src.c. The logic is slightly
changed here to first validate the route and check RTF_GATEWAY
afterwards. This is sementically equivalent though.
etherip doesn't need sc_route_expire similiar to the gif changes from
dyoung@ earlier.

Based on the earlier patch from dyoung@, reviewed and discussed with
him.


Revision tags: yamt-splraiseipl-base3
# 1.55 09-Dec-2006 dyoung

Here are various changes designed to protect against bad IPv4
routing caused by stale route caches (struct route). Route caches
are sprinkled throughout PCBs, the IP fast-forwarding table, and
IP tunnel interfaces (gre, gif, stf).

Stale IPv6 and ISO route caches will be treated by separate patches.

Thank you to Christoph Badura for suggesting the general approach
to invalidating route caches that I take here.

Here are the details:

Add hooks to struct domain for tracking and for invalidating each
domain's route caches: dom_rtcache, dom_rtflush, and dom_rtflushall.

Introduce helper subroutines, rtflush(ro) for invalidating a route
cache, rtflushall(family) for invalidating all route caches in a
routing domain, and rtcache(ro) for notifying the domain of a new
cached route.

Chain together all IPv4 route caches where ro_rt != NULL. Provide
in_rtcache() for adding a route to the chain. Provide in_rtflush()
and in_rtflushall() for invalidating IPv4 route caches. In
in_rtflush(), set ro_rt to NULL, and remove the route from the
chain. In in_rtflushall(), walk the chain and remove every route
cache.

In rtrequest1(), call rtflushall() to invalidate route caches when
a route is added.

In gif(4), discard the workaround for stale caches that involves
expiring them every so often.

Replace the pattern 'RTFREE(ro->ro_rt); ro->ro_rt = NULL;' with a
call to rtflush(ro).

Update ipflow_fastforward() and all other users of route caches so
that they expect a cached route, ro->ro_rt, to turn to NULL.

Take care when moving a 'struct route' to rtflush() the source and
to rtcache() the destination.

In domain initializers, use .dom_xxx tags.

KNF here and there.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.54 16-Nov-2006 christos

__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.53 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


Revision tags: abandoned-netbsd-4-base yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.52 23-Jul-2006 ad

branches: 1.52.4; 1.52.6;
Use the LWP cached credentials where sane.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.51 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.50 11-Dec-2005 thorpej

branches: 1.50.4; 1.50.6; 1.50.8; 1.50.10; 1.50.12;
ANSI function decls and application of static.


# 1.49 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.48 02-Jun-2005 tron

branches: 1.48.2;
Change the first argument of the encapsulation check function from
"const struct mbuf *" to "struct mbuf *". Without this change the
actual implementation cannot even use m_copydata() on the mbuf chain
which is broken.


# 1.47 02-Jun-2005 tron

Remove type casts and lint directives which are now longer necessary
because the first argument of m_copydata() is "const struct mbuf *" now.


Revision tags: netbsd-3-1-1-RELEASE netbsd-3-0-3-RELEASE netbsd-3-1-RELEASE netbsd-3-0-2-RELEASE netbsd-3-1-RC4 netbsd-3-1-RC3 netbsd-3-1-RC2 netbsd-3-1-RC1 netbsd-3-0-1-RELEASE netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.46 11-Mar-2005 tron

Add support for changing the MTU to stf(4).


# 1.45 26-Feb-2005 perry

nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.44 25-Jan-2005 matt

Switch to using ifa for ifaddr's instead of ia (which are traditionally
used for in_ifaddr's) which could lead to confusion.


Revision tags: yamt-km-base
# 1.43 25-Jan-2005 tron

branches: 1.43.2;
Fix cut and paste error in last commit.


# 1.42 24-Jan-2005 matt

Add IFNET_FOREACH and IFADDR_FOREACH macros and start using them.


Revision tags: kent-audio1-beforemerge kent-audio1-base
# 1.41 04-Dec-2004 peter

branches: 1.41.4;
Change ifc_destroy to return an int instead of void, so that it
can pass back errors to ifconfig.


# 1.40 19-Aug-2004 christos

Factor out the hand-crafting of mbufs from the interface files. Reviewed by
gimpy. XXX: I could have used bpf_mtap2 on some of the new functions, but I
chose not to, because I just wanted to do what amounts to a code move.


# 1.39 26-Apr-2004 matt

Remove #else of #if __STDC__


# 1.38 22-Apr-2004 matt

Constify protosw arrays. This can reduce the kernel .data section by
over 4K (if all the network protocols) are loaded.


# 1.37 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.36 12-Nov-2003 cl

branches: 1.36.4;
catch up with in_ifaddr -> in_ifaddrhead rename


# 1.35 22-Aug-2003 itojun

change the additional arg to be passed to ip{,6}_output to struct socket *.

this fixes KAME policy lookup which was broken by the previous commit.


# 1.34 15-Aug-2003 jonathan

(fast-ipsec): Add hooks to pass IPv4 IPsec traffic into fast-ipsec, if
configured with ``options FAST_IPSEC''. Kernels with KAME IPsec or
with no IPsec should work as before.

All calls to ip_output() now always pass an additional compulsory
argument: the inpcb associated with the packet being sent,
or 0 if no inpcb is available.

Fast-ipsec tested with ICMP or UDP over ESP. TCP doesn't work, yet.


# 1.33 01-May-2003 itojun

branches: 1.33.2;
bpf_mtap() does not care about M_PKTHDR at the top. M_COPY_PKTHDR has some
consequences, so avoid it. if we need to attach dummy headers, we should
use M_PREPEND instead.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.32 17-Nov-2002 itojun

more pickier packet validation, based on
draft-savola-v6ops-6to4-security-00.txt. sync w/kame


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.31 17-Sep-2002 itojun

fix comment, sync with kame


# 1.30 17-Sep-2002 itojun

reject SIOCAIFADDR if embedded address is in private address range. sync w/kame


Revision tags: gehenna-devsw-base
# 1.29 14-Aug-2002 itojun

avoid swapping endian of ip_len and ip_off on mbuf, to meet with M_LEADINGSPACE
optimization made last year. should solve PR 17867 and 10195.

IP_HDRINCL behavior of raw ip socket is kept unchanged. we may want to
provide IP_HDRINCL variant that does not swap endian.


# 1.28 06-Aug-2002 itojun

backout previous. i was looking at the wrong RFC.


# 1.27 05-Aug-2002 itojun

based on RFC2529, stf(4) should have 1480 as MTU, not 1280.
tron found it, sync w/kame


# 1.26 23-Jul-2002 tron

Increase interface output error count in case of a failure.


# 1.25 23-Jul-2002 tron

Increase interface output counter for every encapsulated packet sent to IP.


# 1.24 20-Jun-2002 itojun

reject packets with IPv4 private address range. sync w/kame


Revision tags: netbsd-1-6-base eeh-devprop-base newlock-base ifpoll-base
# 1.23 21-Dec-2001 itojun

branches: 1.23.8; 1.23.10;
move protosw fragment for gif/stf to their own source code.
reduce #ifdef in stf code. sync with kame


# 1.22 13-Nov-2001 lukem

remove unnecessary #if NFOO > 0 .... #endif wrappers


# 1.21 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base
# 1.20 06-Nov-2001 itojun

too many curly brace.


# 1.19 06-Nov-2001 matt

Fix pr#14481


# 1.18 05-Nov-2001 matt

Switch to using queue access macros instead of refering to the member
fields explicitly.


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.17 18-Jul-2001 thorpej

branches: 1.17.4;
bzero -> memset


# 1.16 08-Jun-2001 itojun

branches: 1.16.2;
inject outgoing packet to bpf. KAME PR 358.


# 1.15 10-May-2001 itojun

correct ecn consideration on tunnel encap/decap. sync with kame.


# 1.14 29-Apr-2001 itojun

correct outbound outer IPv4 destination address selection.
IFF_LINK0 disables inbound path, removes security worries.
more examples in manpage.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.13 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.12 20-Feb-2001 itojun

branches: 1.12.2;
explicitly use u_int32_t for DLT_NULL encapsulation.

correct gif address family. from chopps, sync with kame.


# 1.11 17-Feb-2001 itojun

update comment to meet 6to4 RFC. sync with kame


# 1.10 22-Jan-2001 itojun

make it possible to turn off ingress filter on gif/stf tunnel egress,
by using IFF_LINK2. (part of) PR 11163 from Ken Raeburn.


# 1.9 17-Jan-2001 itojun

pull post-4.4BSD change to sys/net/route.c from BSD/OS 4.2 (UCB copyrighted).

have sys/net/route.c:rtrequest1(), which takes rt_addrinfo * as the argument.
pass rt_addrinfo all the way down to rtrequest, and ifa->ifa_rtrequest.
3rd arg of ifa->ifa_rtrequest is now rt_addrinfo * instead of sockaddr *
(almost noone is using it anyways).

benefit: the follwoing command now works. previously we need two route(8)
invocations, "add" then "change".
# route add -inet6 default ::1 -ifp gif0

remove unsafe typecast in rtrequest(), from rtentry * to sockaddr *. it was
introduced by 4.3BSD-reno and never corrected.

XXX is eon_rtrequest() change correct regarding to 3rd arg?
eon_rtrequest() and rtrequest() were incorrect since 4.3BSD-reno,
so i do not have correct answer in the source code.
someone with more clue about netiso-over-ip, please help.


# 1.8 17-Jan-2001 thorpej

Fix a rather annoying problem where the sockaddr_dl which holds
the link level name for the interface (ifp->if_sadl) is allocated
before ifp->if_addrlen is initialized, which could lead to allocating
too little space for the link level address.

Do this by splitting allocation of the link level name out of
if_attach() and into if_alloc_sadl(), which is normally called
by functions like ether_ifattach(). Network interfaces which
don't have a link-specific attach routine must call if_alloc_sadl()
themselves (example: gif).

Link level names are freed by if_free_sadl(), which can be called
from e.g. ether_ifdetach(). Drivers never need call if_free_sadl()
themselves as if_detach() will do it if it is not already done.

While here, add the ability to pass an AF_LINK address to
SIOCSIFADDR in ether_ioctl() (this is what caused me to notice
the problem that the above fixes).


# 1.7 18-Dec-2000 thorpej

Fill in if_dlt.


# 1.6 12-Dec-2000 thorpej

Adapt to bpfattach() changes, and further centralize the bpfattach()
and bpfdetach() calls into link-type subroutines where possible.


# 1.5 05-Jul-2000 thorpej

branches: 1.5.2;
stf(4) is now a cloning network interface (although, only one is allowed
to be created).


Revision tags: netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base
# 1.4 10-Jun-2000 itojun

branches: 1.4.2;
update i-d #. (sync with kame)


Revision tags: minoura-xpg4dl-base
# 1.3 14-May-2000 itojun

branches: 1.3.2;
sync IPv4 rogue address filter with RFC1122. (sync with kame)


# 1.2 21-Apr-2000 itojun

update comment (analysis on 04 draft)


# 1.1 19-Apr-2000 itojun

introduce sys/netinet/ip_encap.c, to dispatch inbound packets
to protocol handlers, based on src/dst (for ip proto #4/41).
see comment in ip_encap.c for details of the problem we have.
there are too many protocol specs for ip proto #4/41.
backward compatibility with MROUTING case is now provided in ip_encap.c.

fix ipip to work with gif (using ip_encap.c). sorry for breakage.

gif now uses ip_encap.c.

introduce stf pseudo interface (implements 6to4, another IPv6-over-IPv4 code
with ip proto #41).