History log of /openbsd-current/sys/netinet/ip_mroute.c
Revision Date Author Comments
# 1.142 06-Apr-2024 bluhm

IP multicast sysctl mrtmfc must not write outside of allocation.

Reading sysctl mrt_sysctl_mfc() allocates memory to be copied back
to user. Chunks of struct mfcinfo are copied from routing table
to linear heap memory. If the allocated memory was not a multiple of
the struct size, a struct mfcinfo could be copied to a partially
unallocated destination. Check that the end of the struct is within
the allocation.

From Alfredo Ortega; OK claudio@
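
The check described above can be illustrated with a minimal userland
sketch. It is not the kernel code: struct record is a hypothetical
stand-in for struct mfcinfo, and copy_records() is an assumed helper name.
It shows only the idea of verifying that the end of each record lies within
the allocation before copying.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct mfcinfo; the fields are illustrative. */
struct record {
	unsigned int	origin;
	unsigned int	group;
	unsigned long	pkt_cnt;
};

/*
 * Copy up to nsrc records into buf, which holds buflen bytes.
 * A record is written only when its end lies within the buffer, so a
 * buflen that is not a multiple of sizeof(struct record) never leads
 * to a write past the allocation.  Returns the number of bytes written.
 */
size_t
copy_records(char *buf, size_t buflen, const struct record *src, size_t nsrc)
{
	size_t used = 0, i;

	for (i = 0; i < nsrc; i++) {
		/* Check the end of the record, not just its start. */
		if (buflen - used < sizeof(struct record))
			break;
		memcpy(buf + used, &src[i], sizeof(struct record));
		used += sizeof(struct record);
	}
	return used;
}
```

Because used never exceeds buflen, the subtraction cannot wrap, and a
trailing fragment smaller than one record is simply left unused.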


Revision tags: OPENBSD_7_5_BASE
# 1.141 11-Feb-2024 mvs

Use `sb_mtx' instead of `inp_mtx' in receive path for inet sockets.

In soreceive(), we only touch `so_rcv' socket buffer, which has its own
`sb_mtx' mutex(9) for protection. So, we can avoid solock() in this
path - it's enough to hold `sb_mtx' in soreceive() and around
corresponding sbappend*(). But not right now :)

This time we use shared netlock for some inet sockets in the soreceive()
path. To protect `so_rcv' buffer we use `inp_mtx' mutex(9) and the
pru_lock() to acquire this mutex(9) in socket layer. But the `inp_mtx'
mutex belongs to the PCB. We initialize socket before PCB, tcp(4)
sockets could exist without PCB, so use `sb_mtx' mutex(9) to protect
sockbuf stuff.

This diff mechanically replaces `inp_mtx' by `sb_mtx' in the receive
path. Only for sockets which already use `inp_mtx'. All other sockets
left as is. They will be converted later.

Since the `sb_mtx' is optional, the new SB_MTXLOCK flag is introduced. If
this flag is set on `sb_flags', the `sb_mtx' mutex(9) should be taken.
New sb_mtx_lock() and sb_mtx_unlock() were introduced to hide this check.
They are temporary and will be replaced by mtx_enter() when all this
area is converted to `sb_mtx' mutex(9).

Also, the new sbmtxassertlocked() function is introduced to throw
corresponding assertion for SB_MTXLOCK marked buffers. This time only
sbappendaddr() calls it. This function is also temporary and will be
replaced by MTX_ASSERT_LOCKED() later.

ok bluhm


# 1.140 06-Dec-2023 bluhm

Protect socket receive buffer in IP multicast routing.

Since soreceive() runs in parallel for raw sockets, sbappendaddr()
has to be protected by inpcb mutex. This was missing in multicast
forwarding which is running with a combination of shared net lock
and kernel lock. soreceive() uses shared net lock and mutex per
inpcb. Grab mutex before sbappendaddr() in socket_send() and
socket6_send().

panic receive 1 reported by Jo Geraerts
OK mvs@ claudio@


Revision tags: OPENBSD_7_4_BASE
# 1.139 14-Jun-2023 mvs

Add missing kernel lock around (*if_ioctl)().

ok bluhm


# 1.138 19-Apr-2023 kn

move kernel lock into multicast ioctl handlers; OK mvs


Revision tags: OPENBSD_7_2_BASE OPENBSD_7_3_BASE
# 1.137 08-Sep-2022 kn

Rename global ifnet TAILQ

Naming the list like the struct itself makes for awful grepping.
Call the global variable "ifnetlist" from now on.

There used to be kvm(3) consumers in base picking up this symbol, but those
have long been converted to other interfaces.

A few potential ports users remain, same deal as sys/net/if_var.h r1.116
"Remove struct ifnet's unused if_switchport member": they get bumped.

Previous users pointed out by deraadt
OK bluhm


# 1.136 06-Aug-2022 bluhm

Clean up the netlock macros. Merge NET_RLOCK_IN_SOFTNET and
NET_RLOCK_IN_IOCTL, which have the same implementation. The R and
W are hard to see, call the new macro NET_LOCK_SHARED. Rename the
opposite assertion from NET_ASSERT_WLOCKED to NET_ASSERT_LOCKED_EXCLUSIVE.
Update some outdated comments about net locking.
OK mpi@ mvs@


# 1.135 05-May-2022 claudio

Use static objects for struct rttimer_queue instead of dynamically
allocate them.

Currently there are 6 rttimer_queues and not many more will follow. So
change rt_timer_queue_create() to rt_timer_queue_init() which now takes
a struct rttimer_queue * as argument which will be initialized.
Since this changes the global vars from pointer to struct adjust other
callers as well.
OK bluhm@


# 1.134 04-May-2022 claudio

Move rttimer callback function from the rttimer itself to rttimer_queue.
All users use the same callback per queue so that makes sense.
Also replace rt_timer_queue_destroy() with rt_timer_queue_flush().
OK bluhm@


# 1.133 30-Apr-2022 claudio

Convert the 2nd rttimer callback from struct rttimer to u_int rtableid.
The callback only needs to know the rtableid all the other info from
struct rtableid is not needed.
Also change the default rttimer callback to only delete routes that are
RTF_HOST and RTF_DYNAMIC. This way 2 of the ICMP handlers can use NULL
as the callback.
OK bluhm@


# 1.132 28-Apr-2022 claudio

In the multicast router code don't allocate a rt timer queue for each
rdomain. The rttimer API is rtable/rdomain aware and so there is no need
to have so many queues.
Also init the two queues (one for IPv4 and one for IPv6) early on. This
will allow the rttable code to become simpler.
OK bluhm@


Revision tags: OPENBSD_7_1_BASE
# 1.131 15-Dec-2021 deraadt

structure pads can leak uninitialized memory to userland via copyout,
therefore the mandatory idiom is completely clearing structs before
building them for copyout -- that means ALMOST ALL STRUCTS, because
we never know when some architecture will pad a struct.. In two more
cases, the clearing wasn't performed.
from Reno Robert ZDI
ok millert bluhm
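
The idiom described above can be sketched in userland C as follows. This
is an illustration under assumptions, not the code touched by this
revision: struct reply and build_reply() are hypothetical names, and the
padding between the two members depends on the ABI.

```c
#include <string.h>

/* Hypothetical struct; common ABIs insert padding after flags. */
struct reply {
	char	flags;
	long	count;
};

/*
 * Build a reply in caller-provided storage.  Clearing the whole struct
 * first zeroes any padding bytes, so no stale memory contents would be
 * exposed if the struct were later copied out byte for byte.
 */
void
build_reply(struct reply *r, char flags, long count)
{
	memset(r, 0, sizeof(*r));
	r->flags = flags;
	r->count = count;
}
```

Clearing with memset() rather than assigning each member matters because
member assignment leaves padding bytes untouched on typical compilers.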


Revision tags: OPENBSD_6_8_BASE OPENBSD_6_9_BASE OPENBSD_7_0_BASE
# 1.130 27-May-2020 mpi

branches: 1.130.2; 1.130.6;
Document the various flavors of NET_LOCK() and rename the reader version.

Since our last concurrency mistake only ioctl(2) and sysctl(2) code path
take the reader lock. This is mostly for documentation purpose as long as
the softnet thread is converted back to use a read lock.

dlg@ said that comments should be good enough.

ok sashan@


Revision tags: OPENBSD_6_7_BASE
# 1.129 15-Mar-2020 visa

Guard SIOCDELMULTI if_ioctl calls with KERNEL_LOCK() where the call is
made from socket close path. Most device drivers are not MP-safe yet,
and the closing of AF_INET and AF_INET6 sockets is no longer under the
kernel lock.

This fixes a panic seen by jcs@.

OK mpi@


Revision tags: OPENBSD_6_6_BASE
# 1.128 02-Sep-2019 bluhm

Fix a route use after free in multicast route. Move the rt_mcast_del()
out of the rtable_walk(). This avoids recursion to prevent stack
overflow. Also it allows freeing the route outside of the walk.
Now mrt_mcast_del() frees the route only when it is deleted from
the routing table. If that fails, it must not be freed. After the
route is returned by mfc_find(), it is reference counted. Then we
need a rtfree(), but not in the other cases.
Move rt_timer_remove_all() into rt_mcast_del().
OK mpi@


# 1.127 21-Jun-2019 mpi

Prevent recursions by not deleting entries inside rtable_walk(9).

rtable_walk(9) now passes a routing entry back to the caller when
a non zero value is returned and if it asked for it.
This allows us to call rtdeletemsg()/rtrequest_delete() from the
caller without creating a recursion because of rtflushclone().

Multicast code hasn't been adapted and is still possibly creating
recursions. However multicast route entries aren't cloned so if
a recursion exists it isn't because of rtflushclone().

Fix stack exhaustion triggered by the use of "-msave-args".

Issue reported by Dániel Lévai on bugs@ confirmed by and ok bluhm@.


# 1.126 04-Jun-2019 anton

Add missing NULL check for the protocol control block (pcb) pointer in
mrt{6,}_ioctl. Calling shutdown(2) on the socket prior to the ioctl
command can cause it to be NULL.

ok bluhm@ claudio@

Reported-by: syzbot+bdc489ecb509995a21ed@syzkaller.appspotmail.com
Reported-by: syzbot+156405fdea9f2ab15d40@syzkaller.appspotmail.com


Revision tags: OPENBSD_6_5_BASE
# 1.125 13-Feb-2019 dlg

change rt_ifa_add and rt_ifa_del so they take an rdomain argument.

this allows mpls interfaces (mpe, mpw) to pass the rdomain they
wish the local label to be in, rather than have it implicitly forced
to 0 by these functions. right now they'll pass 0, but it will soon
be possible to have them rx packets in other rdomains.

previously the functions used ifp->if_rdomain for the rdomain.
everything other than mpls still passes ifp->if_rdomain.

ok mpi@


# 1.124 10-Feb-2019 dlg

remove the implicit RTF_MPATH flag that rt_ifa_add() sets on new routes.

MPLS interfaces (ab)use rt_ifa_add for adding the local MPLS label
that they listen on for incoming packets, while every other use of
rt_ifa_add is for adding addresses on local interfaces. MPLS does
this cos the addresses involved are in basically the same shape as
ones used for setting up local addresses.

It is appropriate for interfaces to want RTF_MPATH on local addresses,
but in the MPLS case it means you can have multiple local things
listening on the same label, which doesn't actually work. mpe in
particular keeps track of in use labels so it can handle collisions,
however, mpw does not. It is currently possible to have multiple
mpw interfaces on the same local label, and sharing the same label
as mpe or possible normal forwarding labels.

Moving the RTF_MPATH flag out of rt_ifa_add means all the callers
that still want it need to pass it themselves. The mpe and mpw
callers are left alone without the flag, and will now get EEXIST
from rt_ifa_add when a label is already in use.

ok (and a huge amount of patience and help) mpi@
claudio@ is ok with the idea, but saw a much much earlier solution
to the problem


Revision tags: OPENBSD_6_4_BASE
# 1.123 10-Oct-2018 reyk

RT_TABLEID_MAX is 255, fix places that assumed that it is less than 255.

rtable 255 is a valid routing table or domain id that wasn't handled
by the ip[6]_mroute code or by snmpd. The arrays in the ip[6]_mroute
code were off by one and didn't allocate space for rtable 255; snmpd
simply ignored rtable 255. All other places in the tree seem to
handle RT_TABLEID_MAX correctly.

OK florian@ benno@ henning@ deraadt@


# 1.122 30-Apr-2018 tb

Reduce the scope of the NET_LOCK() in in_control(). Two functions were
protected: mrt_ioctl() and in_ioctl(). The former has no other callers
and only needs a read lock. The latter will need refactoring to reduce
the lock's scope further. In a first step, establish a single exit point
and protect most of the function body with the NET_LOCK() while removing
the NET_LOCK() from a handful of callers.

suggested by & ok mpi, ok visa


Revision tags: OPENBSD_6_2_BASE OPENBSD_6_3_BASE
# 1.121 01-Sep-2017 mpi

Change sosetopt() to no longer free the mbuf it receives and change
all the callers to call m_freem(9).

Support from deraadt@ and tedu@, ok visa@, bluhm@


# 1.120 26-Jun-2017 mpi

Assert that the corresponding socket is locked when manipulating socket
buffers.

This is one step towards unlocking TCP input path. Note that all the
functions asserting for the socket lock are not necessarily MP-safe.
All the fields of 'struct socket' aren't protected.

Introduce a new kernel-only kqueue hint, NOTE_SUBMIT, to be able to
tell when a filter needs to lock the underlying data structures. Logic
and name taken from NetBSD.

Tested by Hrvoje Popovski.

ok claudio@, bluhm@, mikeb@


# 1.119 19-Jun-2017 bluhm

The IP multicast forward functions return an errno, call the variable
error. Make the ip_mforward() return value consistent. Simplify
the caller logic in ipv6_input() like in IPv4.
OK mpi@


# 1.118 16-May-2017 rzalamena

Sync three changes that were caught by IPv6 multicast routing review:

* use a variable to allow disabling debugs on run-time
* fix a potential memory leak on copyout() failure
* don't just blindly use the first address provided by ifalist

ok bluhm@


# 1.117 16-May-2017 rzalamena

Make return values more meaningful by using errno instead of -1 or 1.

ok bluhm@


# 1.116 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.115 16-May-2017 rzalamena

Let malloc() block when the caller of the add route function is
setsockopt(), otherwise use non-blocking malloc() for network stack
calls.

ok bluhm@


# 1.114 16-May-2017 rzalamena

Call rtfree() after each use of routes and make sure the route is valid
when finding one. Since rtfree() is being called and rt_llinfo being
removed, add checks everywhere to make sure we are using a route that is
not being removed.

ok bluhm@


# 1.113 06-Apr-2017 dhill

Convert bcopy to memcpy where the memory does not overlap, otherwise,
use memmove. While here, change some previous conversions to a simple
assignment.

ok deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.112 17-Mar-2017 rzalamena

Be more strict on all route iterations, lets always make sure that we
are not going to get a unicast route by accident.

ok mpi@


# 1.111 14-Mar-2017 rzalamena

Make mfc_find() more strict when looking for routes, fixes a problem
causing ip_mforward() not to send packets to the userland multicast
routing daemon.

Reported and tested by Paul de Weerd.

ok bluhm@, claudio@


# 1.110 09-Feb-2017 rzalamena

Unbreak 'netstat -g' and make multicast route stats sysctl more robust.

ok mpi@


# 1.109 08-Feb-2017 jsg

Test for NULL before dereferencing a pointer not after.
ok krw@


# 1.108 01-Feb-2017 dhill

In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with
the netlock held. This also changes the prototypes of the *ctloutput
functions to take an mbuf instead of an mbuf pointer.

help, guidance from bluhm@ and mpi@
ok bluhm@


# 1.107 12-Jan-2017 rzalamena

Clean up multicast files from unused definitions and comments.

ok mpi@


# 1.106 11-Jan-2017 rzalamena

Remove mfc hash tables and use the OpenBSD routing table for multicast
routes. Beside the code simplification and removal, we also get to see
the multicast routes now in the route(8) utility.

ok mpi@


# 1.105 06-Jan-2017 rzalamena

Remove the global viftable vector that holds the virtual interfaces
configuration and instead use ifnet to store the configuration and
counters. With this we can safely use multicast routing daemons on
multiple domains without vif id collisions.

ok mpi@


# 1.104 06-Jan-2017 rzalamena

Simplify code by removing some old pullup macro, killing some variables
and using m_dup_pkt() instead of m_copym() with max_linkhdr space adjust
on packet sending to avoid more mbuf allocations.

with input from millert@ and mikeb@,
ok mikeb@


# 1.103 06-Jan-2017 mpi

Kill various splsoftnet().

ok rzalamena@, visa@


# 1.102 05-Jan-2017 rzalamena

Remove some unnecessary code abstractions and while here remove a
splsoftnet.

ok mikeb@


# 1.101 22-Dec-2016 rzalamena

Remove PIM support from the multicast stack.

ok mpi@


# 1.100 21-Dec-2016 mpi

Fix build without PIM defined.


# 1.99 21-Dec-2016 rzalamena

Fix PIM compilation even though it is disabled.

ok bluhm@


# 1.98 20-Dec-2016 rzalamena

Call the multicast timer callback per domain instead of for all domains
this way we save doing big tables walk and iterating tables that we don't
need to.

ok mpi@


# 1.97 20-Dec-2016 rzalamena

Remove unused timeout that was never being set.

ok reyk@


# 1.96 19-Dec-2016 rzalamena

Kill unused function.

ok mpi@


# 1.95 19-Dec-2016 rzalamena

Extend the multicast sockets and multicast hash table support to multiple
domains. This is one step towards supporting to run more than one multicast
socket in different domains at the same time.

ok mpi@


# 1.94 13-Dec-2016 rzalamena

Propagate the routing table id in ip_mrouter_set() so the MRT_ADD_VIF
calls won't fail anymore when doing from a different rdomain.

ok mpi@


# 1.93 29-Nov-2016 mpi

Kill unused 'struct route'.


# 1.92 29-Nov-2016 jsg

m_free() and m_freem() test for NULL. Simplify callers which had their own
NULL tests.

ok mpi@


# 1.91 24-Sep-2016 tedu

use hashfree. from Mathieu -
ok guenther


Revision tags: OPENBSD_6_0_BASE
# 1.90 07-Mar-2016 naddy

Sync no-argument function declaration and definition by adding (void).
ok mpi@ millert@


Revision tags: OPENBSD_5_9_BASE
# 1.89 14-Nov-2015 mpi

Remove mrtdebug and reduce differences with the v6 version.

Debug informations can already be accessed via mrtstat and pimstat.


# 1.88 13-Nov-2015 mpi

Do not cast malloc(9) results.


# 1.87 13-Nov-2015 mpi

Kill another tunnel leftover and keep PIM stuff inside #ifdef PIM.


# 1.86 12-Nov-2015 mpi

Kill another leftover from the tunnel support removal and add more PIM.


# 1.85 12-Nov-2015 mpi

Sync headers and get rid of #ifdef MROUTING.


# 1.84 12-Nov-2015 mpi

Remove VIFF_TUNNEL leftovers, tunnels aren't supported since 2006.

Even pimd(8) no longer support them.


# 1.83 12-Nov-2015 mpi

Fix PIM build.


# 1.82 12-Sep-2015 mpi

Introduce if_input_local() a function to feed local traffic back to
the protocol queues.

It basically does what looutput() was doing but having a generic
function will allow us to get rid of the loopback hack overwriting
the rt_ifp field of RTF_LOCAL routes.

ok mikeb@, dlg@, claudio@


# 1.81 01-Sep-2015 bluhm

Replace sockaddr casts with the proper satosin(), ... calls.
From David Hill; OK mpi@; tested kspillner@; tweaks bluhm@


# 1.80 24-Aug-2015 bluhm

In kernel initialize struct sockaddr_in and sockaddr_in6 to zero
everywhere to avoid passing around pointers to uninitialized stack
memory. While there, fix the call to in6_recoverscope() in
fill_drlist().
OK deraadt@ mpi@


Revision tags: OPENBSD_5_8_BASE
# 1.79 15-Jul-2015 deraadt

rename mbuf ** parameter from m to mp, to match other similar code


# 1.78 30-Jun-2015 mpi

Get rid of the undocumented & temporary* m_copy() macro added for
compatibility with 4.3BSD in September 1989.

*Pick your own definition for "temporary".

ok bluhm@, claudio@, dlg@


Revision tags: OPENBSD_5_7_BASE
# 1.77 09-Feb-2015 claudio

Implement 2 sysctl to retrieve the multicast forwarding cache (mfc) and the
virtual interface table (vif). Will be used by netstat soon.
Looked over by guenther@


# 1.76 08-Feb-2015 claudio

De-static to make ddb hangman harder. OK phessler, henning


# 1.75 07-Feb-2015 dlg

mechanical conversion of this code to using siphash instead of some xors.

ok tedu@ claudio@


# 1.74 17-Dec-2014 mpi

Remove the "multicast_" prefix from the fields a multicast-only struct.

Prodded by claudio@ and mikeb@


# 1.73 17-Dec-2014 mpi

Use an interface index instead of a pointer for multicast options.

Output interface (port) selection for multicast traffic is not done via
route lookups. Instead the output ifp is registered when setsockopt(2)
is called with the IP{V6,}_MULTICAST_IF option. But since there is no
mechanism to invalidate such pointer stored in a pcb when an interface
is destroyed/removed, it might lead your kernel to fault.

Prevent a fault upon resume reported by frantisek holop, thanks!

ok mikeb@, claudio@


# 1.72 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


# 1.71 30-Sep-2014 jsg

add back the sys/sysctl.h include removed in rev 1.60
fixes the kernel build when PIM is defined


# 1.70 14-Aug-2014 mpi

No need for raw_cb.h


# 1.69 14-Aug-2014 mpi

Kill MRT_{ADD,DEL}_BW_UPCALL interfaces and the bandwidth monitoring
code that comes with them.

ok mikeb@, henning@


Revision tags: OPENBSD_5_6_BASE
# 1.68 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.67 12-Jul-2014 tedu

add a size argument to free. will be used soon, but for now default to 0.
after discussions with beck deraadt kettenis.


# 1.66 21-Apr-2014 henning

ip_output() using varargs always struck me as bizarre, esp since it's only
ever used to pass on uint32 (for ipsec). stop that madness and just pass
the uint32, 0 in all cases but the two that pass the ipsec flowinfo.
ok deraadt reyk guenther


# 1.65 21-Apr-2014 henning

we'll do fine without casting NULL to struct foo * / void *
ok gcc & md5 (alas, no binary change)


Revision tags: OPENBSD_5_5_BASE
# 1.64 09-Jan-2014 tedu

bzero/bcmp -> memset/memcmp. ok matthew


# 1.63 27-Oct-2013 deraadt

delete UPCALL_TIMING debug code from the dark ages


# 1.62 23-Oct-2013 mpi

Remove the number of in_var.h inclusions by moving some functions and
global variables to in.h.

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_4_BASE
# 1.61 02-May-2013 mpi

tedu broken Resource Reservation Protocol code that was ifdef RSVP_ISI.

ok deraadt@, tedu@ (implicit)


# 1.60 28-Mar-2013 tedu

no need for a lot of code to include proc.h


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.59 04-Apr-2011 henning

de-guttenberg our stack a bit
we don't need 7 f***ing copies of the same code to do the protocol checksums
(or not, depending on hw capabilities). claudio ok


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.58 02-Jul-2010 blambert

m_copyback can fail to allocate memory, but is a void function so gymnastics
are required to detect that.

Change the function to take a wait argument (used in nfs server, but
M_NOWAIT everywhere else for now) and to return an error

ok claudio@ henning@ krw@


# 1.57 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


Revision tags: OPENBSD_4_7_BASE
# 1.56 01-Aug-2009 blambert

timeout_add -> timeout_add_msec

ok michele@ claudio@


# 1.55 13-Jul-2009 michele

Get rid of the token bucket filter.
Traffic shaping code should not be inside routing code.
If you want to rate-limit use altq instead.

ok claudio@ henning@ dlg@


# 1.54 09-Jul-2009 michele

Use MAXTTL instead of the hardcoded value.


Revision tags: OPENBSD_4_6_BASE
# 1.53 05-Jun-2009 claudio

Initial support for routing domains. This allows to bind interfaces to
alternate routing table and separate them from other interfaces in distinct
routing tables. The same network can now be used in any domain at the same
time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks across
net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


Revision tags: OPENBSD_4_5_BASE
# 1.52 16-Sep-2008 chl

remove another dead store.

spotted by markus@

ok henning@ mpf@


# 1.51 15-Sep-2008 chl

remove dead stores and newly created unused variables.

Found by LLVM/Clang Static Analyzer.

ok mpf@ looks good mk@ ok henning@


Revision tags: OPENBSD_4_3_BASE OPENBSD_4_4_BASE
# 1.50 02-Jan-2008 brad

return with ENOTTY instead of EINVAL for unknown ioctl requests.

ok claudio@ krw@ dlg@


# 1.49 14-Dec-2007 deraadt

add sysctl entry points into various network layers, in particular to
provide netstat(1) with data it needs; ok claudio reyk


Revision tags: OPENBSD_4_2_BASE
# 1.48 22-May-2007 michele

ip_mroute.c is in bad shape.
This first step makes it style(9) compliant.
Just a whitespace diff, no binary change.

OK claudio@ norby@


# 1.47 10-Apr-2007 miod

``it's'' -> ``its'' when the grammar gods require this change.


Revision tags: OPENBSD_4_1_BASE
# 1.46 14-Feb-2007 jsg

Consistently spell FALLTHROUGH to appease lint.
ok kettenis@ cloder@ tom@ henning@


Revision tags: OPENBSD_4_0_BASE
# 1.45 15-Jun-2006 pascoe

Change cast of last vararg to ip_output to match what ip_output expects,
for clarity.

henning@ claudio@ ok


# 1.44 11-May-2006 hshoexer

fix corruption of pim register packets. From Hideki ONO, thanks!

ok mcbride@ itojun@


# 1.43 25-Apr-2006 claudio

Remove virtual tunnel support from the mrouting code. The virtual tunnel
code breaks multicast on gif(4) interfaces and it is far better to configure
a real gif(4) tunnel instead of a multicast tunnel as the latter is almost
not manageable. OK norby@, mblamer@


Revision tags: OPENBSD_3_8_BASE OPENBSD_3_9_BASE
# 1.42 25-Apr-2005 brad

csum -> csum_flags

ok krw@ canacar@


Revision tags: OPENBSD_3_7_BASE
# 1.41 15-Jan-2005 brad

fix comment


# 1.40 14-Jan-2005 mcbride

Duplicate nested if statement in PIM code.

From brad@


# 1.39 14-Jan-2005 mcbride

Add kernel support for Protocol Independent Multicast (PIM)
Information: http://netweb.usc.edu/pim/

From Pavlin Radoslavov <pavlin@icir.org>

ok deraadt@ brad@


# 1.38 24-Nov-2004 mcbride

Multicast routing cleanup from Pavlin Radoslavov
- sync ip_mroute.c with NetBSD
- import some FreeBSD changes to MFC entry handling
- set im->im_vif correctly when sending IGMPMSG_WRONGVIF
- increment mrtstat.mrts_upcalls correctly
- return error from get_sg_cnt() if there is no matching forwarding entry

ok henning@ brad@ naddy@


Revision tags: OPENBSD_3_6_BASE
# 1.37 24-Aug-2004 brad

Don't allow SIOCGET{VIF,SG}CNT from sockets other than the multicast router.

From NetBSD
Fixes PR 3825

ok mcbride@ canacar@ claudio@


Revision tags: OPENBSD_3_5_BASE SMP_SYNC_A SMP_SYNC_B
# 1.36 06-Jan-2004 markus

fix vlan destroy for MROUTING; report spamme@wouz.dk via tedu; ok itojun


# 1.35 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


# 1.34 10-Dec-2003 itojun

de-register. deraadt ok


Revision tags: OPENBSD_3_4_BASE
# 1.33 09-Jul-2003 itojun

do not flip ip_len/ip_off in netinet stack. deraadt ok.
(please test, especially PF portion)


# 1.32 09-Jul-2003 itojun

better vif_delete (no dangling ref to struct ifnet). deraadt ok
it won't affect default GENERIC build - as MROUTING is not defined


# 1.31 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: UBC_SYNC_A
# 1.30 14-May-2003 itojun

KNF. markus ok


# 1.29 06-May-2003 deraadt

string cleaning; tedu ok


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_B
# 1.28 28-Aug-2002 pefo

Fix a problem where passing NULL as a pointer with varargs does not promote
NULL to full 64 bits on a 64 bit address system. Solution is to add a
(void *) cast before NULL. This makes a 64 bit MIPS kernel work and will
probably help future 64 bit ports as well.

OK from art@


# 1.27 31-Jul-2002 itojun

remove $Id$


# 1.26 09-Jun-2002 itojun

whitespace


Revision tags: OPENBSD_3_1_BASE
# 1.25 15-Mar-2002 millert

Kill #if __STDC__ used to do K&R vs. ANSI varargs/stdarg; just do things
the ANSI way.


# 1.24 14-Mar-2002 millert

First round of __P removal in sys


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.23 26-Sep-2001 deraadt

branches: 1.23.4;
bring back the old copyright notice


# 1.22 19-Aug-2001 miod

More old timeouts removal, mainly affected unused/unmaintained code.


# 1.21 23-Jun-2001 fgsch

Remove unneeded ip_id conversions.
Instead of using HTONS macro in some places, use htons directly in the
struct member and save us a few bytes.
Fix comment.


Revision tags: OPENBSD_2_9_BASE
# 1.20 10-Nov-2000 provos

seperate -> separate, okay aaron@


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE SMP_BASE
# 1.19 21-Jan-2000 angelos

branches: 1.19.2;
Rename the ip4_* routines to ipip_*, make it so GIF tunnels are not
affected by net.inet.ipip.allow (the sysctl formerly known as
net.inet.ip4.allow), rename the VIF ipip_input to ipip_mroute_input.


Revision tags: OPENBSD_2_6_BASE kame_19991208
# 1.18 08-Aug-1999 niklas

undeclared variable


# 1.17 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.16 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


# 1.15 20-Apr-1999 niklas

Merge MROUTING and IPSEC wrt handling of IP-in-IP tunnelled packets.
Fix a panic case in the MROUTING code too. Drop M_TUNNEL support, nothing
ever uses it.


Revision tags: OPENBSD_2_5_BASE
# 1.14 05-Feb-1999 angelos

Clear mfchashtbl after deallocation (mycroft@netbsd)


# 1.13 08-Jan-1999 provos

dont call ip_randomid() in htons().


# 1.12 08-Jan-1999 deraadt

rip_input() should be called with a 0 terminator; cmetz


# 1.11 26-Dec-1998 provos

make ip_id random but ensure that ids dont repeat for some period.


Revision tags: OPENBSD_2_4_BASE
# 1.10 29-Jul-1998 angelos

Proper handling of IP in IP and checksumming.


# 1.9 03-Jul-1998 deraadt

wrong endian conversion caused vif stats to be wrong; jonny@jonny.eng.br


# 1.8 18-May-1998 provos

first step to the setsockopt/getsockopt interface as described in
draft-mcdonald-simple-ipsec-api, kernel notifies (EMT_REQUESTSA) signal
userland key management applications when security services are requested.
this is only for outgoing connections at the moment, incoming packets
are not yet checked against the selected socket policy.


Revision tags: OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 28-Sep-1997 deraadt

more \n in log()


Revision tags: OPENBSD_2_1_BASE
# 1.6 21-Feb-1997 angelos

Couple of missing ifdefs.


# 1.5 20-Feb-1997 deraadt

IPSEC package by John Ioannidis and Angelos D. Keromytis. Written in
Greece. From ftp.funet.fi:/pub/unix/security/net/ip/BSDipsec.tar.gz


Revision tags: OPENBSD_2_0_BASE
# 1.4 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.3 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.141 11-Feb-2024 mvs

Use `sb_mtx' instead of `inp_mtx' in receive path for inet sockets.

In soreceve(), we only touch `so_rcv' socket buffer, which has it's own
`sb_mtx' mutex(9) for protection. So, we can avoid solock() in this
path - it's enough to hold `sb_mtx' in soreceive() and around
corresponding sbappend*(). But not right now :)

This time we use shared netlock for some inet sockets in the soreceive()
path. To protect `so_rcv' buffer we use `inp_mtx' mutex(9) and the
pru_lock() to acquire this mutex(9) in socket layer. But the `inp_mtx'
mutex belongs to the PCB. We initialize socket before PCB, tcp(4)
sockets could exist without PCB, so use `sb_mtx' mutex(9) to protect
sockbuf stuff.

This diff mechanically replaces `inp_mtx' by `sb_mtx' in the receive
path. Only for sockets which already use `inp_mtx'. All other sockets
left as is. They will be converted later.

Since the `sb_mtx' is optional, the new SB_MTXLOCK flag introduced. If
this flag is set on `sb_flags', the `sb_mtx' mutex(9) should be taken.
New sb_mtx_lock() and sb_mtx_unlock() was introduced to hide this check.
They are temporary and will be replaced by mtx_enter() when all this
area will be converted to `sb_mtx' mutex(9).

Also, the new sbmtxassertlocked() function introduced to throw
corresponding assertion for SB_MTXLOCK marked buffers. This time only
sbappendaddr() calls it. This function is also temporary and will be
replaced by MTX_ASSERT_LOCKED() later.

ok bluhm


# 1.140 06-Dec-2023 bluhm

Protect socket receive buffer in IP multicast routing.

Since soreceive() runs in parallel for raw sockets, sbappendaddr()
has to be protected by inpcb mutex. This was missing in multicast
forwarding which is running with a combination of shared net lock
and kernel lock. soreceive() uses shared net lock and mutex per
inpcb. Grab mutex before sbappendaddr() in socket_send() and
socket6_send().

panic receive 1 reported by Jo Geraerts
OK mvs@ claudio@


Revision tags: OPENBSD_7_4_BASE
# 1.139 14-Jun-2023 mvs

Add missing kernel lock around (*if_ioctl)().

ok bluhm


# 1.138 19-Apr-2023 kn

move kernel lock into multicast ioctl handlers; OK mvs


Revision tags: OPENBSD_7_2_BASE OPENBSD_7_3_BASE
# 1.137 08-Sep-2022 kn

Rename global ifnet TAILQ

Naming the list like the struct itself makes for awful grepping.
Call the global variable "ifnetlist" from now on.

There used to be kvm(3) consumers in base picking up this symbol, but those
have long been converted to other interfaces.

A few potential ports users remain, same deal as sys/net/if_var.h r1.116
"Remove struct ifnet's unused if_switchport member": they get bumped.

Previous users pointed out by deraadt
OK bluhm


# 1.136 06-Aug-2022 bluhm

Clean up the netlock macros. Merge NET_RLOCK_IN_SOFTNET and
NET_RLOCK_IN_IOCTL, which have the same implementation. The R and
W are hard to see, call the new macro NET_LOCK_SHARED. Rename the
opposite assertion from NET_ASSERT_WLOCKED to NET_ASSERT_LOCKED_EXCLUSIVE.
Update some outdated comments about net locking.
OK mpi@ mvs@


# 1.135 05-May-2022 claudio

Use static objects for struct rttimer_queue instead of dynamically
allocating them.

Currently there are 6 rttimer_queues and not many more will follow. So
change rt_timer_queue_create() to rt_timer_queue_init() which now takes
a struct rttimer_queue * as argument which will be initialized.
Since this changes the global vars from pointer to struct, adjust other
callers as well.
OK bluhm@


# 1.134 04-May-2022 claudio

Move rttimer callback function from the rttimer itself to rttimer_queue.
All users use the same callback per queue so that makes sense.
Also replace rt_timer_queue_destroy() with rt_timer_queue_flush().
OK bluhm@


# 1.133 30-Apr-2022 claudio

Convert the 2nd rttimer callback from struct rttimer to u_int rtableid.
The callback only needs to know the rtableid; all the other info from
struct rttimer is not needed.
Also change the default rttimer callback to only delete routes that are
RTF_HOST and RTF_DYNAMIC. This way 2 of the ICMP handlers can use NULL
as the callback.
OK bluhm@


# 1.132 28-Apr-2022 claudio

In the multicast router code don't allocate a rt timer queue for each
rdomain. The rttimer API is rtable/rdomain aware and so there is no need
to have so many queues.
Also init the two queues (one for IPv4 and one for IPv6) early on. This
will allow the rttable code to become simpler.
OK bluhm@


Revision tags: OPENBSD_7_1_BASE
# 1.131 15-Dec-2021 deraadt

structure pads can leak uninitialized memory to userland via copyout,
therefore the mandatory idiom is completely clearing structs before
building them for copyout -- that means ALMOST ALL STRUCTS, because
we never know when some architecture will pad a struct. In two more
cases, the clearing wasn't performed.
from Reno Robert ZDI
ok millert bluhm
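
The clearing idiom described in this entry can be sketched in userland C as
follows. This is only an illustration: struct vif_report and
build_vif_report() are hypothetical names, not taken from ip_mroute.c, and
memcpy() into a caller buffer stands in for copyout().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical reply struct; most ABIs insert padding after 'flags'. */
struct vif_report {
	uint8_t		flags;
	uint32_t	pkt_in;
	uint32_t	pkt_out;
};

/*
 * Build a reply and copy it to 'dst' (standing in for copyout()).
 * The memset() clears the whole object first, padding included, so
 * stale stack bytes are not left in the gaps between members.
 */
static void
build_vif_report(void *dst, uint8_t flags, uint32_t in, uint32_t out)
{
	struct vif_report r;

	assert(dst != NULL);
	memset(&r, 0, sizeof(r));
	r.flags = flags;
	r.pkt_in = in;
	r.pkt_out = out;
	memcpy(dst, &r, sizeof(r));
}
```

Without the memset(), only the three named members would be initialized and
whatever was on the stack in the padding bytes would be copied out as well.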


Revision tags: OPENBSD_6_8_BASE OPENBSD_6_9_BASE OPENBSD_7_0_BASE
# 1.130 27-May-2020 mpi

branches: 1.130.2; 1.130.6;
Document the various flavors of NET_LOCK() and rename the reader version.

Since our last concurrency mistake only ioctl(2) and sysctl(2) code paths
take the reader lock. This is mostly for documentation purpose as long as
the softnet thread is converted back to use a read lock.

dlg@ said that comments should be good enough.

ok sashan@


Revision tags: OPENBSD_6_7_BASE
# 1.129 15-Mar-2020 visa

Guard SIOCDELMULTI if_ioctl calls with KERNEL_LOCK() where the call is
made from socket close path. Most device drivers are not MP-safe yet,
and the closing of AF_INET and AF_INET6 sockets is no longer under the
kernel lock.

This fixes a panic seen by jcs@.

OK mpi@


Revision tags: OPENBSD_6_6_BASE
# 1.128 02-Sep-2019 bluhm

Fix a route use after free in multicast route. Move the rt_mcast_del()
out of the rtable_walk(). This avoids recursion to prevent stack
overflow. Also it allows freeing the route outside of the walk.
Now mrt_mcast_del() frees the route only when it is deleted from
the routing table. If that fails, it must not be freed. After the
route is returned by mfc_find(), it is reference counted. Then we
need a rtfree(), but not in the other cases.
Move rt_timer_remove_all() into rt_mcast_del().
OK mpi@


# 1.127 21-Jun-2019 mpi

Prevent recursions by not deleting entries inside rtable_walk(9).

rtable_walk(9) now passes a routing entry back to the caller when
a non zero value is returned and if it asked for it.
This allows us to call rtdeletemsg()/rtrequest_delete() from the
caller without creating a recursion because of rtflushclone().

Multicast code hasn't been adapted and is still possibly creating
recursions. However multicast route entries aren't cloned so if
a recursion exists it isn't because of rtflushclone().

Fix stack exhaustion triggered by the use of "-msave-args".

Issue reported by Dániel Lévai on bugs@, confirmed by and ok bluhm@.
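
The pattern described in this entry (the walker hands a matching entry back
and the caller deletes it outside the walk) might look roughly like the
sketch below. It uses a plain linked list instead of the real routing table,
and every name in it (struct entry, table_walk(), table_delete(),
table_flush_odd()) is hypothetical; it is not the rtable_walk(9) interface.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical linked list standing in for a routing table. */
struct entry {
	struct entry	*next;
	int		 key;
};

/* Prepend a new entry; returns 0 or ENOMEM. */
static int
table_insert(struct entry **headp, int key)
{
	struct entry *e;

	if ((e = malloc(sizeof(*e))) == NULL)
		return (ENOMEM);
	e->key = key;
	e->next = *headp;
	*headp = e;
	return (0);
}

/*
 * Walk the list and call 'match' on each entry. The walker never deletes
 * anything itself: when 'match' returns non-zero and the caller passed
 * 'found', the entry is handed back and the walk stops.
 */
static int
table_walk(struct entry *head, int (*match)(const struct entry *),
    struct entry **found)
{
	struct entry *e;
	int error;

	if (found != NULL)
		*found = NULL;
	for (e = head; e != NULL; e = e->next) {
		if ((error = match(e)) != 0) {
			if (found != NULL)
				*found = e;
			return (error);
		}
	}
	return (0);
}

/* Unlink and free one entry; runs in the caller, outside the walk. */
static void
table_delete(struct entry **headp, struct entry *victim)
{
	struct entry **pp;

	assert(victim != NULL);
	for (pp = headp; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == victim) {
			*pp = victim->next;
			free(victim);
			return;
		}
	}
}

/* Example predicate: select entries with an odd key. */
static int
match_odd(const struct entry *e)
{
	return ((e->key & 1) ? EEXIST : 0);
}

/* Delete every match by restarting the walk after each removal. */
static int
table_flush_odd(struct entry **headp)
{
	struct entry *e;
	int removed = 0;

	while (table_walk(*headp, match_odd, &e) == EEXIST) {
		table_delete(headp, e);
		removed++;
	}
	return (removed);
}
```

Restarting the walk after each deletion costs extra iterations, but the
delete never runs on the walker's stack, so it cannot recurse into the walk.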


# 1.126 04-Jun-2019 anton

Add missing NULL check for the protocol control block (pcb) pointer in
mrt{6,}_ioctl. Calling shutdown(2) on the socket prior to the ioctl
command can cause it to be NULL.

ok bluhm@ claudio@

Reported-by: syzbot+bdc489ecb509995a21ed@syzkaller.appspotmail.com
Reported-by: syzbot+156405fdea9f2ab15d40@syzkaller.appspotmail.com


Revision tags: OPENBSD_6_5_BASE
# 1.125 13-Feb-2019 dlg

change rt_ifa_add and rt_ifa_del so they take an rdomain argument.

this allows mpls interfaces (mpe, mpw) to pass the rdomain they
wish the local label to be in, rather than have it implicitly forced
to 0 by these functions. right now they'll pass 0, but it will soon
be possible to have them rx packets in other rdomains.

previously the functions used ifp->if_rdomain for the rdomain.
everything other than mpls still passes ifp->if_rdomain.

ok mpi@


# 1.124 10-Feb-2019 dlg

remove the implicit RTF_MPATH flag that rt_ifa_add() sets on new routes.

MPLS interfaces (ab)use rt_ifa_add for adding the local MPLS label
that they listen on for incoming packets, while every other use of
rt_ifa_add is for adding addresses on local interfaces. MPLS does
this cos the addresses involved are in basically the same shape as
ones used for setting up local addresses.

It is appropriate for interfaces to want RTF_MPATH on local addresses,
but in the MPLS case it means you can have multiple local things
listening on the same label, which doesn't actually work. mpe in
particular keeps track of in-use labels so it can handle collisions,
however, mpw does not. It is currently possible to have multiple
mpw interfaces on the same local label, and sharing the same label
as mpe or possible normal forwarding labels.

Moving the RTF_MPATH flag out of rt_ifa_add means all the callers
that still want it need to pass it themselves. The mpe and mpw
callers are left alone without the flag, and will now get EEXIST
from rt_ifa_add when a label is already in use.

ok (and a huge amount of patience and help) mpi@
claudio@ is ok with the idea, but saw a much much earlier solution
to the problem


Revision tags: OPENBSD_6_4_BASE
# 1.123 10-Oct-2018 reyk

RT_TABLEID_MAX is 255, fix places that assumed that it is less than 255.

rtable 255 is a valid routing table or domain id that wasn't handled
by the ip[6]_mroute code or by snmpd. The arrays in the ip[6]_mroute
code were off by one and didn't allocate space for rtable 255; snmpd
simply ignored rtable 255. All other places in the tree seem to
handle RT_TABLEID_MAX correctly.

OK florian@ benno@ henning@ deraadt@
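
A small sketch of the off-by-one this entry describes, assuming only that
RT_TABLEID_MAX is 255 as stated above. The array and helper names
(mrt_count, mrt_slot()) are made up for illustration and do not come from
the ip[6]_mroute code.

```c
#include <assert.h>
#include <stddef.h>

#define RT_TABLEID_MAX	255	/* highest valid table id, per this entry */

/* One slot per table id, 0..RT_TABLEID_MAX inclusive, hence the "+ 1". */
static unsigned long mrt_count[RT_TABLEID_MAX + 1];

/* Return the slot for 'rtableid', or NULL when the id is out of range. */
static unsigned long *
mrt_slot(unsigned int rtableid)
{
	if (rtableid > RT_TABLEID_MAX)	/* not ">=": 255 itself is valid */
		return (NULL);
	return (&mrt_count[rtableid]);
}
```

An array declared with RT_TABLEID_MAX elements, or a check written with
">=", would both leave rtable 255 without a slot.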


# 1.122 30-Apr-2018 tb

Reduce the scope of the NET_LOCK() in in_control(). Two functions were
protected: mrt_ioctl() and in_ioctl(). The former has no other callers
and only needs a read lock. The latter will need refactoring to reduce
the lock's scope further. In a first step, establish a single exit point
and protect most of the function body with the NET_LOCK() while removing
the NET_LOCK() from a handful of callers.

suggested by & ok mpi, ok visa


Revision tags: OPENBSD_6_2_BASE OPENBSD_6_3_BASE
# 1.121 01-Sep-2017 mpi

Change sosetopt() to no longer free the mbuf it receives and change
all the callers to call m_freem(9).

Support from deraadt@ and tedu@, ok visa@, bluhm@


# 1.120 26-Jun-2017 mpi

Assert that the corresponding socket is locked when manipulating socket
buffers.

This is one step towards unlocking TCP input path. Note that all the
functions asserting for the socket lock are not necessarily MP-safe.
All the fields of 'struct socket' aren't protected.

Introduce a new kernel-only kqueue hint, NOTE_SUBMIT, to be able to
tell when a filter needs to lock the underlying data structures. Logic
and name taken from NetBSD.

Tested by Hrvoje Popovski.

ok claudio@, bluhm@, mikeb@


# 1.119 19-Jun-2017 bluhm

The IP multicast forward functions return an errno, call the variable
error. Make the ip_mforward() return value consistent. Simplify
the caller logic in ipv6_input() like in IPv4.
OK mpi@


# 1.118 16-May-2017 rzalamena

Sync three changes that were caught by IPv6 multicast routing review:

* use a variable to allow disabling debugs on run-time
* fix a potential memory leak on copyout() failure
* don't just blindly use the first address provided by ifalist

ok bluhm@


# 1.117 16-May-2017 rzalamena

Make return values more meaningful by using errno instead of -1 or 1.

ok bluhm@


# 1.116 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.115 16-May-2017 rzalamena

Let malloc() block when the caller of the add route function is
setsockopt(), otherwise use non-blocking malloc() for network stack
calls.

ok bluhm@


# 1.114 16-May-2017 rzalamena

Call rtfree() after each use of routes and make sure the route is valid
when finding one. Since rtfree() is being called and rt_llinfo being
removed, add checks everywhere to make sure we are using a route that is
not being removed.

ok bluhm@


# 1.113 06-Apr-2017 dhill

Convert bcopy to memcpy where the memory does not overlap, otherwise,
use memmove. While here, change some previous conversions to a simple
assignment.

ok deraadt@
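
The three cases named in this entry can be illustrated as below. The helper
names and struct addr4 are hypothetical and only serve the example. Note
that bcopy() takes (src, dst, len) while memcpy() and memmove() take
(dst, src, len), so such a conversion also swaps the first two arguments.

```c
#include <assert.h>
#include <string.h>

struct addr4 {			/* hypothetical 4-byte address */
	unsigned char	b[4];
};

/* Distinct objects never overlap: memcpy() is fine. */
static void
copy_payload(unsigned char *dst, const unsigned char *src, size_t len)
{
	memcpy(dst, src, len);
}

/* Source and destination share one buffer: memmove() handles the overlap. */
static void
strip_header(unsigned char *buf, size_t hdrlen, size_t total)
{
	assert(hdrlen <= total);
	memmove(buf, buf + hdrlen, total - hdrlen);
}

/* Same-typed objects: a plain assignment replaces the byte copy. */
static void
set_addr(struct addr4 *dst, const struct addr4 *src)
{
	*dst = *src;
}
```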


Revision tags: OPENBSD_6_1_BASE
# 1.112 17-Mar-2017 rzalamena

Be more strict on all route iterations, let's always make sure that we
are not going to get a unicast route by accident.

ok mpi@


# 1.111 14-Mar-2017 rzalamena

Make mfc_find() more strict when looking for routes, fixes a problem
causing ip_mforward() not to send packets to the userland multicast
routing daemon.

Reported and tested by Paul de Weerd.

ok bluhm@, claudio@


# 1.110 09-Feb-2017 rzalamena

Unbreak 'netstat -g' and make multicast route stats sysctl more robust.

ok mpi@


# 1.109 08-Feb-2017 jsg

Test for NULL before dereferencing a pointer not after.
ok krw@


# 1.108 01-Feb-2017 dhill

In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with
the netlock held. This also changes the prototypes of the *ctloutput
functions to take an mbuf instead of an mbuf pointer.

help, guidance from bluhm@ and mpi@
ok bluhm@


# 1.107 12-Jan-2017 rzalamena

Clean up multicast files from unused definitions and comments.

ok mpi@


# 1.106 11-Jan-2017 rzalamena

Remove mfc hash tables and use the OpenBSD routing table for multicast
routes. Beside the code simplification and removal, we also get to see
the multicast routes now in the route(8) utility.

ok mpi@


# 1.105 06-Jan-2017 rzalamena

Remove the global viftable vector that holds the virtual interfaces
configuration and instead use ifnet to store the configuration and
counters. With this we can safely use multicast routing daemons on
multiple domains without vif id collisions.

ok mpi@


# 1.104 06-Jan-2017 rzalamena

Simplify code by removing some old pullup macro, killing some variables
and using m_dup_pkt() instead of m_copym() with max_linkhdr space adjust
on packet sending to avoid more mbuf allocations.

with input from millert@ and mikeb@,
ok mikeb@


# 1.103 06-Jan-2017 mpi

Kill various splsoftnet().

ok rzalamena@, visa@


# 1.102 05-Jan-2017 rzalamena

Remove some unnecessary code abstractions and while here remove a
splsoftnet.

ok mikeb@


# 1.101 22-Dec-2016 rzalamena

Remove PIM support from the multicast stack.

ok mpi@


# 1.100 21-Dec-2016 mpi

Fix build without PIM defined.


# 1.99 21-Dec-2016 rzalamena

Fix PIM compilation even though it is disabled.

ok bluhm@


# 1.98 20-Dec-2016 rzalamena

Call the multicast timer callback per domain instead of for all domains.
This way we avoid walking big tables and iterating over tables that we
don't need to.

ok mpi@


# 1.97 20-Dec-2016 rzalamena

Remove unused timeout that was never being set.

ok reyk@


# 1.96 19-Dec-2016 rzalamena

Kill unused function.

ok mpi@


# 1.95 19-Dec-2016 rzalamena

Extend the multicast sockets and multicast hash table support to multiple
domains. This is one step towards supporting to run more than one multicast
socket in different domains at the same time.

ok mpi@


# 1.94 13-Dec-2016 rzalamena

Propagate the routing table id in ip_mrouter_set() so the MRT_ADD_VIF
calls won't fail anymore when done from a different rdomain.

ok mpi@


# 1.93 29-Nov-2016 mpi

Kill unused 'struct route'.


# 1.92 29-Nov-2016 jsg

m_free() and m_freem() test for NULL. Simplify callers which had their own
NULL tests.

ok mpi@


# 1.91 24-Sep-2016 tedu

use hashfree. from Mathieu -
ok guenther


Revision tags: OPENBSD_6_0_BASE
# 1.90 07-Mar-2016 naddy

Sync no-argument function declaration and definition by adding (void).
ok mpi@ millert@


Revision tags: OPENBSD_5_9_BASE
# 1.89 14-Nov-2015 mpi

Remove mrtdebug and reduce differences with the v6 version.

Debug information can already be accessed via mrtstat and pimstat.


# 1.88 13-Nov-2015 mpi

Do not cast malloc(9) results.


# 1.87 13-Nov-2015 mpi

Kill another tunnel leftover and keep PIM stuff inside #ifdef PIM.


# 1.86 12-Nov-2015 mpi

Kill another leftover from the tunnel support removal and add more PIM.


# 1.85 12-Nov-2015 mpi

Sync headers and get rid of #ifdef MROUTING.


# 1.84 12-Nov-2015 mpi

Remove VIFF_TUNNEL leftovers, tunnels aren't supported since 2006.

Even pimd(8) no longer supports them.


# 1.83 12-Nov-2015 mpi

Fix PIM build.


# 1.82 12-Sep-2015 mpi

Introduce if_input_local() a function to feed local traffic back to
the protocol queues.

It basically does what looutput() was doing but having a generic
function will allow us to get rid of the loopback hack overwriting
the rt_ifp field of RTF_LOCAL routes.

ok mikeb@, dlg@, claudio@


# 1.81 01-Sep-2015 bluhm

Replace sockaddr casts with the proper satosin(), ... calls.
From David Hill; OK mpi@; tested kspillner@; tweaks bluhm@


# 1.80 24-Aug-2015 bluhm

In kernel initialize struct sockaddr_in and sockaddr_in6 to zero
everywhere to avoid passing around pointers to uninitialized stack
memory. While there, fix the call to in6_recoverscope() in
fill_drlist().
OK deraadt@ mpi@


Revision tags: OPENBSD_5_8_BASE
# 1.79 15-Jul-2015 deraadt

rename mbuf ** parameter from m to mp, to match other similar code


# 1.78 30-Jun-2015 mpi

Get rid of the undocumented & temporary* m_copy() macro added for
compatibility with 4.3BSD in September 1989.

*Pick your own definition for "temporary".

ok bluhm@, claudio@, dlg@


Revision tags: OPENBSD_5_7_BASE
# 1.77 09-Feb-2015 claudio

Implement 2 sysctl to retrieve the multicast forwarding cache (mfc) and the
virtual interface table (vif). Will be used by netstat soon.
Looked over by guenther@


# 1.76 08-Feb-2015 claudio

De-static to make ddb hangman harder. OK phessler, henning


# 1.75 07-Feb-2015 dlg

mechanical conversion of this code to using siphash instead of some xors.

ok tedu@ claudio@


# 1.74 17-Dec-2014 mpi

Remove the "multicast_" prefix from the fields of a multicast-only struct.

Prodded by claudio@ and mikeb@


# 1.73 17-Dec-2014 mpi

Use an interface index instead of a pointer for multicast options.

Output interface (port) selection for multicast traffic is not done via
route lookups. Instead the output ifp is registered when setsockopt(2)
is called with the IP{V6,}_MULTICAST_IF option. But since there is no
mechanism to invalidate such pointer stored in a pcb when an interface
is destroyed/removed, it might lead your kernel to fault.

Prevent a fault upon resume reported by frantisek holop, thanks!

ok mikeb@, claudio@


# 1.72 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


# 1.71 30-Sep-2014 jsg

add back the sys/sysctl.h include removed in rev 1.60
fixes the kernel build when PIM is defined


# 1.70 14-Aug-2014 mpi

No need for raw_cb.h


# 1.69 14-Aug-2014 mpi

Kill MRT_{ADD,DEL}_BW_UPCALL interfaces and the bandwidth monitoring
code that comes with them.

ok mikeb@, henning@


Revision tags: OPENBSD_5_6_BASE
# 1.68 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.67 12-Jul-2014 tedu

add a size argument to free. will be used soon, but for now default to 0.
after discussions with beck deraadt kettenis.


# 1.66 21-Apr-2014 henning

ip_output() using varargs always struck me as bizarre, esp since it's only
ever used to pass on uint32 (for ipsec). stop that madness and just pass
the uint32, 0 in all cases but the two that pass the ipsec flowinfo.
ok deraadt reyk guenther


# 1.65 21-Apr-2014 henning

we'll do fine without casting NULL to struct foo * / void *
ok gcc & md5 (alas, no binary change)


Revision tags: OPENBSD_5_5_BASE
# 1.64 09-Jan-2014 tedu

bzero/bcmp -> memset/memcmp. ok matthew


# 1.63 27-Oct-2013 deraadt

delete UPCALL_TIMING debug code from the dark ages


# 1.62 23-Oct-2013 mpi

Reduce the number of in_var.h inclusions by moving some functions and
global variables to in.h.

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_4_BASE
# 1.61 02-May-2013 mpi

tedu broken Resource Reservation Protocol code that was ifdef RSVP_ISI.

ok deraadt@, tedu@ (implicit)


# 1.60 28-Mar-2013 tedu

no need for a lot of code to include proc.h


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.59 04-Apr-2011 henning

de-guttenberg our stack a bit
we don't need 7 f***ing copies of the same code to do the protocol checksums
(or not, depending on hw capabilities). claudio ok


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.58 02-Jul-2010 blambert

m_copyback can fail to allocate memory, but is a void function so gymnastics
are required to detect that.

Change the function to take a wait argument (used in nfs server, but
M_NOWAIT everywhere else for now) and to return an error

ok claudio@ henning@ krw@


# 1.57 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


Revision tags: OPENBSD_4_7_BASE
# 1.56 01-Aug-2009 blambert

timeout_add -> timeout_add_msec

ok michele@ claudio@


# 1.55 13-Jul-2009 michele

Get rid of the token bucket filter.
Traffic shaping code should not be inside routing code.
If you want to rate-limit use altq instead.

ok claudio@ henning@ dlg@


# 1.54 09-Jul-2009 michele

Use MAXTTL instead of the hardcoded value.


Revision tags: OPENBSD_4_6_BASE
# 1.53 05-Jun-2009 claudio

Initial support for routing domains. This allows binding interfaces to
an alternate routing table and separating them from other interfaces in
distinct routing tables. The same network can now be used in any domain at
the same time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks across
net and netinet. L2 and IPv4 are mostly covered; pf and IPv6 are still missing.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


Revision tags: OPENBSD_4_5_BASE
# 1.52 16-Sep-2008 chl

remove another dead store.

spotted by markus@

ok henning@ mpf@


# 1.51 15-Sep-2008 chl

remove dead stores and newly created unused variables.

Found by LLVM/Clang Static Analyzer.

ok mpf@ looks good mk@ ok henning@


Revision tags: OPENBSD_4_3_BASE OPENBSD_4_4_BASE
# 1.50 02-Jan-2008 brad

return with ENOTTY instead of EINVAL for unknown ioctl requests.

ok claudio@ krw@ dlg@


# 1.49 14-Dec-2007 deraadt

add sysctl entry points into various network layers, in particular to
provide netstat(1) with data it needs; ok claudio reyk


Revision tags: OPENBSD_4_2_BASE
# 1.48 22-May-2007 michele

ip_mroute.c is in bad shape.
This first step makes it style(9) compliant.
Just a whitespace diff, no binary change.

OK claudio@ norby@


# 1.47 10-Apr-2007 miod

``it's'' -> ``its'' when the grammar gods require this change.


Revision tags: OPENBSD_4_1_BASE
# 1.46 14-Feb-2007 jsg

Consistently spell FALLTHROUGH to appease lint.
ok kettenis@ cloder@ tom@ henning@


Revision tags: OPENBSD_4_0_BASE
# 1.45 15-Jun-2006 pascoe

Change cast of last vararg to ip_output to match what ip_output expects,
for clarity.

henning@ claudio@ ok


# 1.44 11-May-2006 hshoexer

fix corruption of pim register packets. From Hideki ONO, thanks!

ok mcbride@ itojun@


# 1.43 25-Apr-2006 claudio

Remove virtual tunnel support from the mrouting code. The virtual tunnel
code breaks multicast on gif(4) interfaces and it is far better to configure
a real gif(4) tunnel instead of a multicast tunnel as the latter is almost
not manageable. OK norby@, mblamer@


Revision tags: OPENBSD_3_8_BASE OPENBSD_3_9_BASE
# 1.42 25-Apr-2005 brad

csum -> csum_flags

ok krw@ canacar@


Revision tags: OPENBSD_3_7_BASE
# 1.41 15-Jan-2005 brad

fix comment


# 1.40 14-Jan-2005 mcbride

Duplicate nested if statement in PIM code.

From brad@


# 1.39 14-Jan-2005 mcbride

Add kernel support for Protocol Independent Multicast (PIM)
Information: http://netweb.usc.edu/pim/

From Pavlin Radoslavov <pavlin@icir.org>

ok deraadt@ brad@


# 1.38 24-Nov-2004 mcbride

Multicast routing cleanup from Pavlin Radoslavov
- sync ip_mroute.c with NetBSD
- import some FreeBSD changes to MFC entry handling
- set im->im_vif correctly when sending IGMPMSG_WRONGVIF
- increment mrtstat.mrts_upcalls correctly
- return error from get_sg_cnt() if there is no matching forwarding entry

ok henning@ brad@ naddy@


Revision tags: OPENBSD_3_6_BASE
# 1.37 24-Aug-2004 brad

Don't allow SIOCGET{VIF,SG}CNT from sockets other than the multicast router.

From NetBSD
Fixes PR 3825

ok mcbride@ canacar@ claudio@


Revision tags: OPENBSD_3_5_BASE SMP_SYNC_A SMP_SYNC_B
# 1.36 06-Jan-2004 markus

fix vlan destroy for MROUTING; report spamme@wouz.dk via tedu; ok itojun


# 1.35 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


# 1.34 10-Dec-2003 itojun

de-register. deraadt ok


Revision tags: OPENBSD_3_4_BASE
# 1.33 09-Jul-2003 itojun

do not flip ip_len/ip_off in netinet stack. deraadt ok.
(please test, especially PF portion)


# 1.32 09-Jul-2003 itojun

better vif_delete (no dangling ref to struct ifnet). deraadt ok
it won't affect default GENERIC build - as MROUTING is not defined


# 1.31 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: UBC_SYNC_A
# 1.30 14-May-2003 itojun

KNF. markus ok


# 1.29 06-May-2003 deraadt

string cleaning; tedu ok


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_B
# 1.28 28-Aug-2002 pefo

Fix a problem where passing NULL as a pointer with varargs does not promote
NULL to full 64 bits on a 64 bit address system. Solution is to add a
(void *) cast before NULL. This makes a 64 bit MIPS kernel work and will
probably help future 64 bit ports as well.

OK from art@
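
A userland sketch of why the cast matters for a variadic call. The function
count_args() is hypothetical and only illustrates the sentinel; it is not
how ip_output() or any kernel function is declared. If NULL expands to a
plain integer 0, an uncast NULL is passed as an int, which on a 64 bit
system need not fill a full pointer-sized argument.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/*
 * Count the string arguments up to the terminating null pointer.
 * Each variadic argument is read back as a pointer, so the caller
 * must really pass a pointer-sized value as the terminator.
 */
static int
count_args(const char *first, ...)
{
	va_list ap;
	const char *s;
	int n = 0;

	va_start(ap, first);
	for (s = first; s != NULL; s = va_arg(ap, const char *))
		n++;
	va_end(ap);
	return (n);
}

/* The cast guarantees a pointer argument even if NULL is a plain 0. */
static int
count_three(void)
{
	return (count_args("a", "b", "c", (void *)NULL));
}
```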


# 1.27 31-Jul-2002 itojun

remove $Id$


# 1.26 09-Jun-2002 itojun

whitespace


Revision tags: OPENBSD_3_1_BASE
# 1.25 15-Mar-2002 millert

Kill #if __STDC__ used to do K&R vs. ANSI varargs/stdarg; just do things
the ANSI way.


# 1.24 14-Mar-2002 millert

First round of __P removal in sys


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.23 26-Sep-2001 deraadt

branches: 1.23.4;
bring back the old copyright notice


# 1.22 19-Aug-2001 miod

More old timeouts removal, mainly affected unused/unmaintained code.


# 1.21 23-Jun-2001 fgsch

Remove unneeded ip_id conversions.
Instead of using HTONS macro in some places, use htons directly in the
struct member and save us a few bytes.
Fix comment.


Revision tags: OPENBSD_2_9_BASE
# 1.20 10-Nov-2000 provos

seperate -> separate, okay aaron@


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE SMP_BASE
# 1.19 21-Jan-2000 angelos

branches: 1.19.2;
Rename the ip4_* routines to ipip_*, make it so GIF tunnels are not
affected by net.inet.ipip.allow (the sysctl formerly known as
net.inet.ip4.allow), rename the VIF ipip_input to ipip_mroute_input.


Revision tags: OPENBSD_2_6_BASE kame_19991208
# 1.18 08-Aug-1999 niklas

undeclared variable


# 1.17 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.16 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


# 1.15 20-Apr-1999 niklas

Merge MROUTING and IPSEC wrt handling of IP-in-IP tunnelled packets.
Fix a panic case in the MROUTING code too. Drop M_TUNNEL support, nothing
ever uses it.


Revision tags: OPENBSD_2_5_BASE
# 1.14 05-Feb-1999 angelos

Clear mfchashtbl after deallocation (mycroft@netbsd)


# 1.13 08-Jan-1999 provos

don't call ip_randomid() in htons().


# 1.12 08-Jan-1999 deraadt

rip_input() should be called with a 0 terminator; cmetz


# 1.11 26-Dec-1998 provos

make ip_id random but ensure that ids don't repeat for some period.


Revision tags: OPENBSD_2_4_BASE
# 1.10 29-Jul-1998 angelos

Proper handling of IP in IP and checksumming.


# 1.9 03-Jul-1998 deraadt

wrong endian conversion caused vif stats to be wrong; jonny@jonny.eng.br


# 1.8 18-May-1998 provos

first step to the setsockopt/getsockopt interface as described in
draft-mcdonald-simple-ipsec-api, kernel notifies (EMT_REQUESTSA) signal
userland key management applications when security services are requested.
this is only for outgoing connections at the moment, incoming packets
are not yet checked against the selected socket policy.


Revision tags: OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 28-Sep-1997 deraadt

more \n in log()


Revision tags: OPENBSD_2_1_BASE
# 1.6 21-Feb-1997 angelos

Couple of missing ifdefs.


# 1.5 20-Feb-1997 deraadt

IPSEC package by John Ioannidis and Angelos D. Keromytis. Written in
Greece. From ftp.funet.fi:/pub/unix/security/net/ip/BSDipsec.tar.gz


Revision tags: OPENBSD_2_0_BASE
# 1.4 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.3 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.140 06-Dec-2023 bluhm

Protect socket receive buffer in IP multicast routing.

Since soreceive() runs in parallel for raw sockets, sbappendaddr()
has to be protected by inpcb mutex. This was missing in multicast
forwarding which is running with a combination of shared net lock
and kernel lock. soreceive() uses shared net lock and mutex per
inpcb. Grab mutex before sbappendaddr() in socket_send() and
socket6_send().

panic receive 1 reported by Jo Geraerts
OK mvs@ claudio@


Revision tags: OPENBSD_7_4_BASE
# 1.139 14-Jun-2023 mvs

Add missing kernel lock around (*if_ioctl)().

ok bluhm


# 1.138 19-Apr-2023 kn

move kernel lock into multicast ioctl handlers; OK mvs


Revision tags: OPENBSD_7_2_BASE OPENBSD_7_3_BASE
# 1.137 08-Sep-2022 kn

Rename global ifnet TAILQ

Naming the list like the struct itself makes for awful grepping.
Call the global variable "ifnetlist" from now on.

There used to be kvm(3) consumers in base picking up this symbol, but those
have long been converted to other interfaces.

A few potential ports users remain, same deal as sys/net/if_var.h r1.116
"Remove struct ifnet's unused if_switchport member": they get bumped.

Previous users pointed out by deraadt
OK bluhm


# 1.136 06-Aug-2022 bluhm

Clean up the netlock macros. Merge NET_RLOCK_IN_SOFTNET and
NET_RLOCK_IN_IOCTL, which have the same implementation. The R and
W are hard to see, call the new macro NET_LOCK_SHARED. Rename the
opposite assertion from NET_ASSERT_WLOCKED to NET_ASSERT_LOCKED_EXCLUSIVE.
Update some outdated comments about net locking.
OK mpi@ mvs@


# 1.135 05-May-2022 claudio

Use static objects for struct rttimer_queue instead of dynamically
allocate them.

Currently there are 6 rttimer_queues and not many more will follow. So
change rt_timer_queue_create() to rt_timer_queue_init() which now takes
a struct rttimer_queue * as argument which will be initialized.
Since this changes the gloabl vars from pointer to struct adjust other
callers as well.
OK bluhm@


# 1.134 04-May-2022 claudio

Move rttimer callback function from the rttimer itself to rttimer_queue.
All users use the same callback per queue so that makes sense.
Also replace rt_timer_queue_destroy() with rt_timer_queue_flush().
OK bluhm@


# 1.133 30-Apr-2022 claudio

Convert the 2nd rttimer callback from struct rttimer to u_int rtableid.
The callback only needs to know the rtableid all the other info from
struct rtableid is not needed.
Also change the default rttimer callback to only delete routes that are
RTF_HOST and RTF_DYNAMIC. This way 2 of the ICMP handlers can use NULL
as the callback.
OK bluhm@


# 1.132 28-Apr-2022 claudio

In the multicast router code don't allocate a rt timer queue for each
rdomain. The rttimer API is rtable/rdomain aware and so there is no need
to have so many queues.
Also init the two queues (one for IPv4 and one for IPv6) early on. This
will allow the rttable code to become simpler.
OK bluhm@


Revision tags: OPENBSD_7_1_BASE
# 1.131 15-Dec-2021 deraadt

structure pads can leak uninitialized memory to userland via copyout,
therefore the mandatory idiom is completely clearing structs before
building them for copyout -- that means ALMOST ALL STRUCTS, because
we never know when some architecture will pad a struct.. In two more
cases, the clearing wasn't performed.
from Reno Robert ZDI
ok millert bluhm


Revision tags: OPENBSD_6_8_BASE OPENBSD_6_9_BASE OPENBSD_7_0_BASE
# 1.130 27-May-2020 mpi

branches: 1.130.2; 1.130.6;
Document the various flavors of NET_LOCK() and rename the reader version.

Since our last concurrency mistake only ioctl(2) ans sysctl(2) code path
take the reader lock. This is mostly for documentation purpose as long as
the softnet thread is converted back to use a read lock.

dlg@ said that comments should be good enough.

ok sashan@


Revision tags: OPENBSD_6_7_BASE
# 1.129 15-Mar-2020 visa

Guard SIOCDELMULTI if_ioctl calls with KERNEL_LOCK() where the call is
made from socket close path. Most device drivers are not MP-safe yet,
and the closing of AF_INET and AF_INET6 sockets is no longer under the
kernel lock.

This fixes a panic seen by jcs@.

OK mpi@


Revision tags: OPENBSD_6_6_BASE
# 1.128 02-Sep-2019 bluhm

Fix a route use after free in multicast route. Move the rt_mcast_del()
out of the rtable_walk(). This avoids recursion to prevent stack
overflow. Also it allows freeing the route outside of the walk.
Now mrt_mcast_del() frees the route only when it is deleted from
the routing table. If that fails, it must not be freed. After the
route is returned by mfc_find(), it is reference counted. Then we
need a rtfree(), but not in the other caes.
Move rt_timer_remove_all() into rt_mcast_del().
OK mpi@


# 1.127 21-Jun-2019 mpi

Prevent recursions by not deleting entries inside rtable_walk(9).

rtable_walk(9) now passes a routing entry back to the caller when
a non zero value is returned and if it asked for it.
This allows us to call rtdeletemsg()/rtrequest_delete() from the
caller without creating a recursion because of rtflushclone().

Multicast code hasn't been adapted and is still possibly creating
recursions. However multicast route entries aren't cloned so if
a recursion exists it isn't because of rtflushclone().

Fix stack exhaustion triggered by the use of "-msave-args".

Issue reported by D��niel L��vai on bugs@ confirmed by and ok bluhm@.


# 1.126 04-Jun-2019 anton

Add missing NULL check for the protocol control block (pcb) pointer in
mrt{6,}_ioctl. Calling shutdown(2) on the socket prior to the ioctl
command can cause it to be NULL.

ok bluhm@ claudio@

Reported-by: syzbot+bdc489ecb509995a21ed@syzkaller.appspotmail.com
Reported-by: syzbot+156405fdea9f2ab15d40@syzkaller.appspotmail.com


Revision tags: OPENBSD_6_5_BASE
# 1.125 13-Feb-2019 dlg

change rt_ifa_add and rt_ifa_del so they take an rdomain argument.

this allows mpls interfaces (mpe, mpw) to pass the rdomain they
wish the local label to be in, rather than have it implicitly forced
to 0 by these functions. right now they'll pass 0, but it will soon
be possible to have them rx packets in other rdomains.

previously the functions used ifp->if_rdomain for the rdomain.
everything other than mpls still passes ifp->if_rdomain.

ok mpi@


# 1.124 10-Feb-2019 dlg

remove the implicit RTF_MPATH flag that rt_ifa_add() sets on new routes.

MPLS interfaces (ab)use rt_ifa_add for adding the local MPLS label
that they listen on for incoming packets, while every other use of
rt_ifa_add is for adding addresses on local interfaces. MPLS does
this cos the addresses involved are in basically the same shape as
ones used for setting up local addresses.

It is appropriate for interfaces to want RTF_MPATH on local addresses,
but in the MPLS case it means you can have multiple local things
listening on the same label, which doesn't actually work. mpe in
particular keeps track of in use labels so it can handle collisions,
however, mpw does not. It is currently possible to have multiple
mpw interfaces on the same local label, and sharing the same label
as mpe or possibly normal forwarding labels.

Moving the RTF_MPATH flag out of rt_ifa_add means all the callers
that still want it need to pass it themselves. The mpe and mpw
callers are left alone without the flag, and will now get EEXIST
from rt_ifa_add when a label is already in use.

ok (and a huge amount of patience and help) mpi@
claudio@ is ok with the idea, but saw a much much earlier solution
to the problem


Revision tags: OPENBSD_6_4_BASE
# 1.123 10-Oct-2018 reyk

RT_TABLEID_MAX is 255, fix places that assumed that it is less than 255.

rtable 255 is a valid routing table or domain id that wasn't handled
by the ip[6]_mroute code or by snmpd. The arrays in the ip[6]_mroute
code were off by one and didn't allocate space for rtable 255; snmpd
simply ignored rtable 255. All other places in the tree seem to
handle RT_TABLEID_MAX correctly.

OK florian@ benno@ henning@ deraadt@
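
The off-by-one described in this entry can be sketched in plain C. This is
a hypothetical userland illustration, not the kernel code: TABLEID_MAX
stands in for RT_TABLEID_MAX (255 according to the message above), and
struct per_table, tables and table_slot() are names invented for the
example.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for RT_TABLEID_MAX; the commit message gives its value as 255. */
#define TABLEID_MAX 255

/* Hypothetical per-table state; the real kernel structures differ. */
struct per_table {
	int in_use;
};

/*
 * Valid IDs run from 0 to TABLEID_MAX inclusive, so the array needs
 * TABLEID_MAX + 1 slots. Sizing it with TABLEID_MAX alone would leave
 * no slot for table 255.
 */
static struct per_table tables[TABLEID_MAX + 1];

/* Return the slot for a table ID, or NULL if the ID is out of range. */
struct per_table *
table_slot(unsigned int id)
{
	if (id > TABLEID_MAX)	/* '>' rather than '>=': 255 is valid */
		return NULL;
	return &tables[id];
}
```

With this sizing, table_slot(255) returns a valid slot and table_slot(256)
is rejected; a bounds check written with '>=' would wrongly refuse the last
valid table.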


# 1.122 30-Apr-2018 tb

Reduce the scope of the NET_LOCK() in in_control(). Two functions were
protected: mrt_ioctl() and in_ioctl(). The former has no other callers
and only needs a read lock. The latter will need refactoring to reduce
the lock's scope further. In a first step, establish a single exit point
and protect most of the function body with the NET_LOCK() while removing
the NET_LOCK() from a handful of callers.

suggested by & ok mpi, ok visa


Revision tags: OPENBSD_6_2_BASE OPENBSD_6_3_BASE
# 1.121 01-Sep-2017 mpi

Change sosetopt() to no longer free the mbuf it receives and change
all the callers to call m_freem(9).

Support from deraadt@ and tedu@, ok visa@, bluhm@


# 1.120 26-Jun-2017 mpi

Assert that the corresponding socket is locked when manipulating socket
buffers.

This is one step towards unlocking TCP input path. Note that all the
functions asserting for the socket lock are not necessarily MP-safe.
All the fields of 'struct socket' aren't protected.

Introduce a new kernel-only kqueue hint, NOTE_SUBMIT, to be able to
tell when a filter needs to lock the underlying data structures. Logic
and name taken from NetBSD.

Tested by Hrvoje Popovski.

ok claudio@, bluhm@, mikeb@


# 1.119 19-Jun-2017 bluhm

The IP multicast forward functions return an errno, call the variable
error. Make the ip_mforward() return value consistent. Simplify
the caller logic in ipv6_input() like in IPv4.
OK mpi@


# 1.118 16-May-2017 rzalamena

Sync three changes that were caught by IPv6 multicast routing review:

* use a variable to allow disabling debugs on run-time
* fix a potential memory leak on copyout() failure
* don't just blindly use the first address provided by ifalist

ok bluhm@


# 1.117 16-May-2017 rzalamena

Make return values more meaningful by using errno instead of -1 or 1.

ok bluhm@


# 1.116 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.115 16-May-2017 rzalamena

Let malloc() block when the caller of the add route function is
setsockopt(), otherwise use non-blocking malloc() for network stack
calls.

ok bluhm@


# 1.114 16-May-2017 rzalamena

Call rtfree() after each use of routes and make sure the route is valid
when finding one. Since rtfree() is being called and rt_llinfo being
removed, add checks everywhere to make sure we are using a route that is
not being removed.

ok bluhm@


# 1.113 06-Apr-2017 dhill

Convert bcopy to memcpy where the memory does not overlap, otherwise,
use memmove. While here, change some previous conversions to a simple
assignment.

ok deraadt@
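
The three cases named in this entry can be sketched in userland C. The
helper names shift_right() and copy_pair() and the struct pair type are
invented for illustration and do not come from ip_mroute.c.

```c
#include <assert.h>
#include <string.h>

/*
 * Shift the first len bytes of buf right by one position.
 * Source and destination overlap, so memmove() is required;
 * memcpy() has undefined behavior for overlapping regions.
 * The caller must ensure buf has room for len + 1 bytes.
 */
void
shift_right(char *buf, size_t len)
{
	memmove(buf + 1, buf, len);
}

/* Hypothetical record type, used only for this illustration. */
struct pair {
	int a;
	int b;
};

/*
 * Copy one struct to another. The objects are distinct, so memcpy()
 * would be valid here, but a plain assignment expresses the same copy
 * and lets the compiler check the types.
 */
void
copy_pair(struct pair *dst, const struct pair *src)
{
	*dst = *src;
}
```

The assignment form is likely what "a simple assignment" refers to above:
when both sides have the same struct type, no byte-copy call is needed.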


Revision tags: OPENBSD_6_1_BASE
# 1.112 17-Mar-2017 rzalamena

Be more strict on all route iterations, let's always make sure that we
are not going to get a unicast route by accident.

ok mpi@


# 1.111 14-Mar-2017 rzalamena

Make mfc_find() more strict when looking for routes, fixes a problem
causing ip_mforward() not to send packets to the userland multicast
routing daemon.

Reported and tested by Paul de Weerd.

ok bluhm@, claudio@


# 1.110 09-Feb-2017 rzalamena

Unbreak 'netstat -g' and make multicast route stats sysctl more robust.

ok mpi@


# 1.109 08-Feb-2017 jsg

Test for NULL before dereferencing a pointer not after.
ok krw@


# 1.108 01-Feb-2017 dhill

In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with
the netlock held. This also changes the prototypes of the *ctloutput
functions to take an mbuf instead of an mbuf pointer.

help, guidance from bluhm@ and mpi@
ok bluhm@


# 1.107 12-Jan-2017 rzalamena

Clean up multicast files from unused definitions and comments.

ok mpi@


# 1.106 11-Jan-2017 rzalamena

Remove mfc hash tables and use the OpenBSD routing table for multicast
routes. Beside the code simplification and removal, we also get to see
the multicast routes now in the route(8) utility.

ok mpi@


# 1.105 06-Jan-2017 rzalamena

Remove the global viftable vector that holds the virtual interfaces
configuration and instead use ifnet to store the configuration and
counters. With this we can safely use multicast routing daemons on
multiple domains without vif id collisions.

ok mpi@


# 1.104 06-Jan-2017 rzalamena

Simplify code by removing some old pullup macro, killing some variables
and using m_dup_pkt() instead of m_copym() with max_linkhdr space adjust
on packet sending to avoid more mbuf allocations.

with input from millert@ and mikeb@,
ok mikeb@


# 1.103 06-Jan-2017 mpi

Kill various splsoftnet().

ok rzalamena@, visa@


# 1.102 05-Jan-2017 rzalamena

Remove some unnecessary code abstractions and while here remove a
splsoftnet.

ok mikeb@


# 1.101 22-Dec-2016 rzalamena

Remove PIM support from the multicast stack.

ok mpi@


# 1.100 21-Dec-2016 mpi

Fix build without PIM defined.


# 1.99 21-Dec-2016 rzalamena

Fix PIM compilation even though it is disabled.

ok bluhm@


# 1.98 20-Dec-2016 rzalamena

Call the multicast timer callback per domain instead of for all domains
this way we save doing big tables walk and iterating tables that we don't
need to.

ok mpi@


# 1.97 20-Dec-2016 rzalamena

Remove unused timeout that was never being set.

ok reyk@


# 1.96 19-Dec-2016 rzalamena

Kill unused function.

ok mpi@


# 1.95 19-Dec-2016 rzalamena

Extend the multicast sockets and multicast hash table support to multiple
domains. This is one step towards supporting to run more than one multicast
socket in different domains at the same time.

ok mpi@


# 1.94 13-Dec-2016 rzalamena

Propagate the routing table id in ip_mrouter_set() so the MRT_ADD_VIF
calls won't fail anymore when done from a different rdomain.

ok mpi@


# 1.93 29-Nov-2016 mpi

Kill unused 'struct route'.


# 1.92 29-Nov-2016 jsg

m_free() and m_freem() test for NULL. Simplify callers which had their own
NULL tests.

ok mpi@


# 1.91 24-Sep-2016 tedu

use hashfree. from Mathieu -
ok guenther


Revision tags: OPENBSD_6_0_BASE
# 1.90 07-Mar-2016 naddy

Sync no-argument function declaration and definition by adding (void).
ok mpi@ millert@


Revision tags: OPENBSD_5_9_BASE
# 1.89 14-Nov-2015 mpi

Remove mrtdebug and reduce differences with the v6 version.

Debug information can already be accessed via mrtstat and pimstat.


# 1.88 13-Nov-2015 mpi

Do not cast malloc(9) results.


# 1.87 13-Nov-2015 mpi

Kill another tunnel leftover and keep PIM stuff inside #ifdef PIM.


# 1.86 12-Nov-2015 mpi

Kill another leftover from the tunnel support removal and add more PIM.


# 1.85 12-Nov-2015 mpi

Sync headers and get rid of #ifdef MROUTING.


# 1.84 12-Nov-2015 mpi

Remove VIFF_TUNNEL leftovers, tunnels aren't supported since 2006.

Even pimd(8) no longer supports them.


# 1.83 12-Nov-2015 mpi

Fix PIM build.


# 1.82 12-Sep-2015 mpi

Introduce if_input_local() a function to feed local traffic back to
the protocol queues.

It basically does what looutput() was doing but having a generic
function will allow us to get rid of the loopback hack overwriting
the rt_ifp field of RTF_LOCAL routes.

ok mikeb@, dlg@, claudio@


# 1.81 01-Sep-2015 bluhm

Replace sockaddr casts with the proper satosin(), ... calls.
From David Hill; OK mpi@; tested kspillner@; tweaks bluhm@


# 1.80 24-Aug-2015 bluhm

In kernel initialize struct sockaddr_in and sockaddr_in6 to zero
everywhere to avoid passing around pointers to uninitialized stack
memory. While there, fix the call to in6_recoverscope() in
fill_drlist().
OK deraadt@ mpi@


Revision tags: OPENBSD_5_8_BASE
# 1.79 15-Jul-2015 deraadt

rename mbuf ** parameter from m to mp, to match other similar code


# 1.78 30-Jun-2015 mpi

Get rid of the undocumented & temporary* m_copy() macro added for
compatibility with 4.3BSD in September 1989.

*Pick your own definition for "temporary".

ok bluhm@, claudio@, dlg@


Revision tags: OPENBSD_5_7_BASE
# 1.77 09-Feb-2015 claudio

Implement 2 sysctl to retrieve the multicast forwarding cache (mfc) and the
virtual interface table (vif). Will be used by netstat soon.
Looked over by guenther@


# 1.76 08-Feb-2015 claudio

De-static to make ddb hangman harder. OK phessler, henning


# 1.75 07-Feb-2015 dlg

mechanical conversion of this code to using siphash instead of some xors.

ok tedu@ claudio@


# 1.74 17-Dec-2014 mpi

Remove the "multicast_" prefix from the fields of a multicast-only struct.

Prodded by claudio@ and mikeb@


# 1.73 17-Dec-2014 mpi

Use an interface index instead of a pointer for multicast options.

Output interface (port) selection for multicast traffic is not done via
route lookups. Instead the output ifp is registered when setsockopt(2)
is called with the IP{V6,}_MULTICAST_IF option. But since there is no
mechanism to invalidate such pointer stored in a pcb when an interface
is destroyed/removed, it might lead your kernel to fault.

Prevent a fault upon resume reported by frantisek holop, thanks!

ok mikeb@, claudio@


# 1.72 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


# 1.71 30-Sep-2014 jsg

add back the sys/sysctl.h include removed in rev 1.60
fixes the kernel build when PIM is defined


# 1.70 14-Aug-2014 mpi

No need for raw_cb.h


# 1.69 14-Aug-2014 mpi

Kill MRT_{ADD,DEL}_BW_UPCALL interfaces and the bandwidth monitoring
code that comes with them.

ok mikeb@, henning@


Revision tags: OPENBSD_5_6_BASE
# 1.68 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.67 12-Jul-2014 tedu

add a size argument to free. will be used soon, but for now default to 0.
after discussions with beck deraadt kettenis.


# 1.66 21-Apr-2014 henning

ip_output() using varargs always struck me as bizarre, esp since it's only
ever used to pass on uint32 (for ipsec). stop that madness and just pass
the uint32, 0 in all cases but the two that pass the ipsec flowinfo.
ok deraadt reyk guenther


# 1.65 21-Apr-2014 henning

we'll do fine without casting NULL to struct foo * / void *
ok gcc & md5 (alas, no binary change)


Revision tags: OPENBSD_5_5_BASE
# 1.64 09-Jan-2014 tedu

bzero/bcmp -> memset/memcmp. ok matthew


# 1.63 27-Oct-2013 deraadt

delete UPCALL_TIMING debug code from the dark ages


# 1.62 23-Oct-2013 mpi

Remove the number of in_var.h inclusions by moving some functions and
global variables to in.h.

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_4_BASE
# 1.61 02-May-2013 mpi

tedu broken Resource Reservation Protocol code that was ifdef RSVP_ISI.

ok deraadt@, tedu@ (implicit)


# 1.60 28-Mar-2013 tedu

no need for a lot of code to include proc.h


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.59 04-Apr-2011 henning

de-guttenberg our stack a bit
we don't need 7 f***ing copies of the same code to do the protocol checksums
(or not, depending on hw capabilities). claudio ok


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.58 02-Jul-2010 blambert

m_copyback can fail to allocate memory, but is a void function so gymnastics
are required to detect that.

Change the function to take a wait argument (used in nfs server, but
M_NOWAIT everywhere else for now) and to return an error

ok claudio@ henning@ krw@


# 1.57 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


Revision tags: OPENBSD_4_7_BASE
# 1.56 01-Aug-2009 blambert

timeout_add -> timeout_add_msec

ok michele@ claudio@


# 1.55 13-Jul-2009 michele

Get rid of the token bucket filter.
Traffic shaping code should not be inside routing code.
If you want to rate-limit use altq instead.

ok claudio@ henning@ dlg@


# 1.54 09-Jul-2009 michele

Use MAXTTL instead of the hardcoded value.


Revision tags: OPENBSD_4_6_BASE
# 1.53 05-Jun-2009 claudio

Initial support for routing domains. This allows to bind interfaces to
alternate routing table and separate them from other interfaces in distinct
routing tables. The same network can now be used in any domain at the same
time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks across
net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


Revision tags: OPENBSD_4_5_BASE
# 1.52 16-Sep-2008 chl

remove another dead store.

spotted by markus@

ok henning@ mpf@


# 1.51 15-Sep-2008 chl

remove dead stores and newly created unused variables.

Found by LLVM/Clang Static Analyzer.

ok mpf@ looks good mk@ ok henning@


Revision tags: OPENBSD_4_3_BASE OPENBSD_4_4_BASE
# 1.50 02-Jan-2008 brad

return with ENOTTY instead of EINVAL for unknown ioctl requests.

ok claudio@ krw@ dlg@


# 1.49 14-Dec-2007 deraadt

add sysctl entry points into various network layers, in particular to
provide netstat(1) with data it needs; ok claudio reyk


Revision tags: OPENBSD_4_2_BASE
# 1.48 22-May-2007 michele

ip_mroute.c is in bad shape.
This first step makes it style(9) compliant.
Just a whitespace diff, no binary change.

OK claudio@ norby@


# 1.47 10-Apr-2007 miod

``it's'' -> ``its'' when the grammar gods require this change.


Revision tags: OPENBSD_4_1_BASE
# 1.46 14-Feb-2007 jsg

Consistently spell FALLTHROUGH to appease lint.
ok kettenis@ cloder@ tom@ henning@


Revision tags: OPENBSD_4_0_BASE
# 1.45 15-Jun-2006 pascoe

Change cast of last vararg to ip_output to match what ip_output expects,
for clarity.

henning@ claudio@ ok


# 1.44 11-May-2006 hshoexer

fix corruption of pim register packets. From Hideki ONO, thanks!

ok mcbride@ itojun@


# 1.43 25-Apr-2006 claudio

Remove virtual tunnel support from the mrouting code. The virtual tunnel
code breaks multicast on gif(4) interfaces and it is far better to configure
a real gif(4) tunnel instead of a multicast tunnel as the latter is almost
not manageable. OK norby@, mblamer@


Revision tags: OPENBSD_3_8_BASE OPENBSD_3_9_BASE
# 1.42 25-Apr-2005 brad

csum -> csum_flags

ok krw@ canacar@


Revision tags: OPENBSD_3_7_BASE
# 1.41 15-Jan-2005 brad

fix comment


# 1.40 14-Jan-2005 mcbride

Duplicate nested if statement in PIM code.

From brad@


# 1.39 14-Jan-2005 mcbride

Add kernel support for Protocol Independent Multicast (PIM)
Information: http://netweb.usc.edu/pim/

From Pavlin Radoslavov <pavlin@icir.org>

ok deraadt@ brad@


# 1.38 24-Nov-2004 mcbride

Multicast routing cleanup from Pavlin Radoslavov
- sync ip_mroute.c with NetBSD
- import some FreeBSD changes to MFC entry handling
- set im->im_vif correctly when sending IGMPMSG_WRONGVIF
- increment mrtstat.mrts_upcalls correctly
- return error from get_sg_cnt() if there is no matching forwarding entry

ok henning@ brad@ naddy@


Revision tags: OPENBSD_3_6_BASE
# 1.37 24-Aug-2004 brad

Don't allow SIOCGET{VIF,SG}CNT from sockets other than the multicast router.

From NetBSD
Fixes PR 3825

ok mcbride@ canacar@ claudio@


Revision tags: OPENBSD_3_5_BASE SMP_SYNC_A SMP_SYNC_B
# 1.36 06-Jan-2004 markus

fix vlan destroy for MROUTING; report spamme@wouz.dk via tedu; ok itojun


# 1.35 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


# 1.34 10-Dec-2003 itojun

de-register. deraadt ok


Revision tags: OPENBSD_3_4_BASE
# 1.33 09-Jul-2003 itojun

do not flip ip_len/ip_off in netinet stack. deraadt ok.
(please test, especially PF portion)


# 1.32 09-Jul-2003 itojun

better vif_delete (no dangling ref to struct ifnet). deraadt ok
it won't affect default GENERIC build - as MROUTING is not defined


# 1.31 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: UBC_SYNC_A
# 1.30 14-May-2003 itojun

KNF. markus ok


# 1.29 06-May-2003 deraadt

string cleaning; tedu ok


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_B
# 1.28 28-Aug-2002 pefo

Fix a problem where passing NULL as a pointer with varargs does not promote
NULL to full 64 bits on a 64 bit address system. Solution is to add a
(void *) cast before NULL. This makes a 64 bit MIPS kernel work and will
probably help future 64 bit ports as well.

OK from art@
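
The varargs problem described in this entry can be sketched as follows.
Where NULL is defined as a plain integer 0, passing it through "..." passes
an int, which may be narrower than a pointer on a 64 bit system; the cast
makes the argument pointer-sized. count_args() is a hypothetical helper
written for this illustration, not a function from the kernel.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/*
 * Count the string arguments up to a terminating null pointer.
 * Each variadic argument is read back as a pointer, so the caller
 * must pass a real pointer as the terminator, not a bare integer 0.
 */
int
count_args(const char *first, ...)
{
	va_list ap;
	const char *s;
	int n = 0;

	va_start(ap, first);
	for (s = first; s != NULL; s = va_arg(ap, const char *))
		n++;
	va_end(ap);
	return n;
}
```

A call would then look like count_args("a", "b", (void *)NULL), with the
terminator cast explicitly so that it has pointer width regardless of how
NULL is defined.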


# 1.27 31-Jul-2002 itojun

remove $Id$


# 1.26 09-Jun-2002 itojun

whitespace


Revision tags: OPENBSD_3_1_BASE
# 1.25 15-Mar-2002 millert

Kill #if __STDC__ used to do K&R vs. ANSI varargs/stdarg; just do things
the ANSI way.


# 1.24 14-Mar-2002 millert

First round of __P removal in sys


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.23 26-Sep-2001 deraadt

branches: 1.23.4;
bring back the old copyright notice


# 1.22 19-Aug-2001 miod

More old timeouts removal, mainly affected unused/unmaintained code.


# 1.21 23-Jun-2001 fgsch

Remove unneeded ip_id conversions.
Instead of using HTONS macro in some places, use htons directly in the
struct member and save us a few bytes.
Fix comment.


Revision tags: OPENBSD_2_9_BASE
# 1.20 10-Nov-2000 provos

seperate -> separate, okay aaron@


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE SMP_BASE
# 1.19 21-Jan-2000 angelos

branches: 1.19.2;
Rename the ip4_* routines to ipip_*, make it so GIF tunnels are not
affected by net.inet.ipip.allow (the sysctl formerly known as
net.inet.ip4.allow), rename the VIF ipip_input to ipip_mroute_input.


Revision tags: OPENBSD_2_6_BASE kame_19991208
# 1.18 08-Aug-1999 niklas

undeclared variable


# 1.17 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.16 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


# 1.15 20-Apr-1999 niklas

Merge MROUTING and IPSEC wrt handling of IP-in-IP tunnelled packets.
Fix a panic case in the MROUTING code too. Drop M_TUNNEL support, nothing
ever uses it.


Revision tags: OPENBSD_2_5_BASE
# 1.14 05-Feb-1999 angelos

Clear mfchashtbl after deallocation (mycroft@netbsd)


# 1.13 08-Jan-1999 provos

dont call ip_randomid() in htons().


# 1.12 08-Jan-1999 deraadt

rip_input() should be called with a 0 terminator; cmetz


# 1.11 26-Dec-1998 provos

make ip_id random but ensure that ids dont repeat for some period.


Revision tags: OPENBSD_2_4_BASE
# 1.10 29-Jul-1998 angelos

Proper handling of IP in IP and checksumming.


# 1.9 03-Jul-1998 deraadt

wrong endian conversion caused vif stats to be wrong; jonny@jonny.eng.br


# 1.8 18-May-1998 provos

first step to the setsockopt/getsockopt interface as described in
draft-mcdonald-simple-ipsec-api, kernel notifies (EMT_REQUESTSA) signal
userland key management applications when security services are requested.
this is only for outgoing connections at the moment, incoming packets
are not yet checked against the selected socket policy.


Revision tags: OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 28-Sep-1997 deraadt

more \n in log()


Revision tags: OPENBSD_2_1_BASE
# 1.6 21-Feb-1997 angelos

Couple of missing ifdefs.


# 1.5 20-Feb-1997 deraadt

IPSEC package by John Ioannidis and Angelos D. Keromytis. Written in
Greece. From ftp.funet.fi:/pub/unix/security/net/ip/BSDipsec.tar.gz


Revision tags: OPENBSD_2_0_BASE
# 1.4 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.3 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.139 14-Jun-2023 mvs

Add missing kernel lock around (*if_ioctl)().

ok bluhm


# 1.138 19-Apr-2023 kn

move kernel lock into multicast ioctl handlers; OK mvs


Revision tags: OPENBSD_7_2_BASE OPENBSD_7_3_BASE
# 1.137 08-Sep-2022 kn

Rename global ifnet TAILQ

Naming the list like the struct itself makes for awful grepping.
Call the global variable "ifnetlist" from now on.

There used to be kvm(3) consumers in base picking up this symbol, but those
have long been converted to other interfaces.

A few potential ports users remain, same deal as sys/net/if_var.h r1.116
"Remove struct ifnet's unused if_switchport member": they get bumped.

Previous users pointed out by deraadt
OK bluhm


# 1.136 06-Aug-2022 bluhm

Clean up the netlock macros. Merge NET_RLOCK_IN_SOFTNET and
NET_RLOCK_IN_IOCTL, which have the same implementation. The R and
W are hard to see, call the new macro NET_LOCK_SHARED. Rename the
opposite assertion from NET_ASSERT_WLOCKED to NET_ASSERT_LOCKED_EXCLUSIVE.
Update some outdated comments about net locking.
OK mpi@ mvs@


# 1.135 05-May-2022 claudio

Use static objects for struct rttimer_queue instead of dynamically
allocate them.

Currently there are 6 rttimer_queues and not many more will follow. So
change rt_timer_queue_create() to rt_timer_queue_init() which now takes
a struct rttimer_queue * as argument which will be initialized.
Since this changes the gloabl vars from pointer to struct adjust other
callers as well.
OK bluhm@


# 1.134 04-May-2022 claudio

Move rttimer callback function from the rttimer itself to rttimer_queue.
All users use the same callback per queue so that makes sense.
Also replace rt_timer_queue_destroy() with rt_timer_queue_flush().
OK bluhm@


# 1.133 30-Apr-2022 claudio

Convert the 2nd rttimer callback from struct rttimer to u_int rtableid.
The callback only needs to know the rtableid all the other info from
struct rtableid is not needed.
Also change the default rttimer callback to only delete routes that are
RTF_HOST and RTF_DYNAMIC. This way 2 of the ICMP handlers can use NULL
as the callback.
OK bluhm@


# 1.132 28-Apr-2022 claudio

In the multicast router code don't allocate a rt timer queue for each
rdomain. The rttimer API is rtable/rdomain aware and so there is no need
to have so many queues.
Also init the two queues (one for IPv4 and one for IPv6) early on. This
will allow the rttable code to become simpler.
OK bluhm@


Revision tags: OPENBSD_7_1_BASE
# 1.131 15-Dec-2021 deraadt

structure pads can leak uninitialized memory to userland via copyout,
therefore the mandatory idiom is completely clearing structs before
building them for copyout -- that means ALMOST ALL STRUCTS, because
we never know when some architecture will pad a struct.. In two more
cases, the clearing wasn't performed.
from Reno Robert ZDI
ok millert bluhm


Revision tags: OPENBSD_6_8_BASE OPENBSD_6_9_BASE OPENBSD_7_0_BASE
# 1.130 27-May-2020 mpi

branches: 1.130.2; 1.130.6;
Document the various flavors of NET_LOCK() and rename the reader version.

Since our last concurrency mistake only ioctl(2) ans sysctl(2) code path
take the reader lock. This is mostly for documentation purpose as long as
the softnet thread is converted back to use a read lock.

dlg@ said that comments should be good enough.

ok sashan@


Revision tags: OPENBSD_6_7_BASE
# 1.129 15-Mar-2020 visa

Guard SIOCDELMULTI if_ioctl calls with KERNEL_LOCK() where the call is
made from socket close path. Most device drivers are not MP-safe yet,
and the closing of AF_INET and AF_INET6 sockets is no longer under the
kernel lock.

This fixes a panic seen by jcs@.

OK mpi@


Revision tags: OPENBSD_6_6_BASE
# 1.128 02-Sep-2019 bluhm

Fix a route use after free in multicast route. Move the rt_mcast_del()
out of the rtable_walk(). This avoids recursion to prevent stack
overflow. Also it allows freeing the route outside of the walk.
Now mrt_mcast_del() frees the route only when it is deleted from
the routing table. If that fails, it must not be freed. After the
route is returned by mfc_find(), it is reference counted. Then we
need a rtfree(), but not in the other caes.
Move rt_timer_remove_all() into rt_mcast_del().
OK mpi@


# 1.127 21-Jun-2019 mpi

Prevent recursions by not deleting entries inside rtable_walk(9).

rtable_walk(9) now passes a routing entry back to the caller when
a non zero value is returned and if it asked for it.
This allows us to call rtdeletemsg()/rtrequest_delete() from the
caller without creating a recursion because of rtflushclone().

Multicast code hasn't been adapted and is still possibly creating
recursions. However multicast route entries aren't cloned so if
a recursion exists it isn't because of rtflushclone().

Fix stack exhaustion triggered by the use of "-msave-args".

Issue reported by D��niel L��vai on bugs@ confirmed by and ok bluhm@.


# 1.126 04-Jun-2019 anton

Add missing NULL check for the protocol control block (pcb) pointer in
mrt{6,}_ioctl. Calling shutdown(2) on the socket prior to the ioctl
command can cause it to be NULL.

ok bluhm@ claudio@

Reported-by: syzbot+bdc489ecb509995a21ed@syzkaller.appspotmail.com
Reported-by: syzbot+156405fdea9f2ab15d40@syzkaller.appspotmail.com


Revision tags: OPENBSD_6_5_BASE
# 1.125 13-Feb-2019 dlg

change rt_ifa_add and rt_ifa_del so they take an rdomain argument.

this allows mpls interfaces (mpe, mpw) to pass the rdomain they
wish the local label to be in, rather than have it implicitly forced
to 0 by these functions. right now they'll pass 0, but it will soon
be possible to have them rx packets in other rdomains.

previously the functions used ifp->if_rdomain for the rdomain.
everything other than mpls still passes ifp->if_rdomain.

ok mpi@


# 1.124 10-Feb-2019 dlg

remove the implict RTF_MPATH flag that rt_ifa_add() sets on new routes.

MPLS interfaces (ab)use rt_ifa_add for adding the local MPLS label
that they listen on for incoming packets, while every other use of
rt_ifa_add is for adding addresses on local interfaces. MPLS does
this cos the addresses involved are in basically the same shape as
ones used for setting up local addresses.

It is appropriate for interfaces to want RTF_MPATH on local addresses,
but in the MPLS case it means you can have multiple local things
listening on the same label, which doesn't actually work. mpe in
particular keeps track of in use labels to it can handle collisions,
however, mpw does not. It is currently possible to have multiple
mpw interfaces on the same local label, and sharing the same label
as mpe or possible normal forwarding labels.

Moving the RTF_MPATH flag out of rt_ifa_add means all the callers
that still want it need to pass it themselves. The mpe and mpw
callers are left alone without the flag, and will now get EEXIST
from rt_ifa_add when a label is already in use.

ok (and a huge amount of patience and help) mpi@
claudio@ is ok with the idea, but saw a much much earlier solution
to the problem


Revision tags: OPENBSD_6_4_BASE
# 1.123 10-Oct-2018 reyk

RT_TABLEID_MAX is 255, fix places that assumed that it is less than 255.

rtable 255 is a valid routing table or domain id that wasn't handled
by the ip[6]_mroute code or by snmpd. The arrays in the ip[6]_mroute
code where off by one and didn't allocate space for rtable 255; snmpd
simply ignored rtable 255. All other places in the tree seem to
handle RT_TABLEID_MAX correctly.

OK florian@ benno@ henning@ deraadt@


# 1.122 30-Apr-2018 tb

Reduce the scope of the NET_LOCK() in in_control(). Two functions were
protected: mrt_ioctl() and in_ioctl(). The former has no other callers
and only needs a read lock. The latter will need refactoring to reduce
the lock's scope further. In a first step, establish a single exit point
and protect most of the function body with the NET_LOCK() while removing
the NET_LOCK() from a handful of callers.

suggested by & ok mpi, ok visa


Revision tags: OPENBSD_6_2_BASE OPENBSD_6_3_BASE
# 1.121 01-Sep-2017 mpi

Change sosetopt() to no longer free the mbuf it receives and change
all the callers to call m_freem(9).

Support from deraadt@ and tedu@, ok visa@, bluhm@


# 1.120 26-Jun-2017 mpi

Assert that the corresponding socket is locked when manipulating socket
buffers.

This is one step towards unlocking TCP input path. Note that all the
functions asserting for the socket lock are not necessarilly MP-safe.
All the fields of 'struct socket' aren't protected.

Introduce a new kernel-only kqueue hint, NOTE_SUBMIT, to be able to
tell when a filter needs to lock the underlying data structures. Logic
and name taken from NetBSD.

Tested by Hrvoje Popovski.

ok claudio@, bluhm@, mikeb@


# 1.119 19-Jun-2017 bluhm

The IP multicast forward functions return an errno, call the variable
error. Make the ip_mforward() return value consistent. Simplify
the caller logic in ipv6_input() like in IPv4.
OK mpi@


# 1.118 16-May-2017 rzalamena

Sync three changes that were caught by IPv6 multicast routing review:

* use a variable to allow disabling debugs on run-time
* fix a potential memory leak on copyout() failure
* don't just blindly use the first address provided by ifalist

ok bluhm@


# 1.117 16-May-2017 rzalamena

Make return values more meaningful by using errno instead of -1 or 1.

ok bluhm@


# 1.116 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.115 16-May-2017 rzalamena

Let malloc() block when the caller of the add route function is
setsockopt(), otherwise use non-blocking malloc() for network stack
calls.

ok bluhm@


# 1.114 16-May-2017 rzalamena

Call rtfree() after each use of routes and make sure the route is valid
when finding one. Since rtfree() is being called and rt_llinfo being
removed, add checks everywhere to make sure we are using a route that is
not being removed.

ok bluhm@


# 1.113 06-Apr-2017 dhill

Convert bcopy to memcpy where the memory does not overlap, otherwise,
use memmove. While here, change some previous conversions to a simple
assignment.

ok deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.112 17-Mar-2017 rzalamena

Be more strict on all route iterations; let's always make sure that we
are not going to get a unicast route by accident.

ok mpi@


# 1.111 14-Mar-2017 rzalamena

Make mfc_find() more strict when looking for routes, fixes a problem
causing ip_mforward() not to send packets to the userland multicast
routing daemon.

Reported and tested by Paul de Weerd.

ok bluhm@, claudio@


# 1.110 09-Feb-2017 rzalamena

Unbreak 'netstat -g' and make multicast route stats sysctl more robust.

ok mpi@


# 1.109 08-Feb-2017 jsg

Test for NULL before dereferencing a pointer not after.
ok krw@


# 1.108 01-Feb-2017 dhill

In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with
the netlock held. This also changes the prototypes of the *ctloutput
functions to take an mbuf instead of an mbuf pointer.

help, guidance from bluhm@ and mpi@
ok bluhm@


# 1.107 12-Jan-2017 rzalamena

Clean up multicast files from unused definitions and comments.

ok mpi@


# 1.106 11-Jan-2017 rzalamena

Remove mfc hash tables and use the OpenBSD routing table for multicast
routes. Beside the code simplification and removal, we also get to see
the multicast routes now in the route(8) utility.

ok mpi@


# 1.105 06-Jan-2017 rzalamena

Remove the global viftable vector that holds the virtual interfaces
configuration and instead use ifnet to store the configuration and
counters. With this we can safely use multicast routing daemons on
multiple domains without vif id collisions.

ok mpi@


# 1.104 06-Jan-2017 rzalamena

Simplify code by removing some old pullup macro, killing some variables
and using m_dup_pkt() instead of m_copym() with max_linkhdr space adjust
on packet sending to avoid more mbuf allocations.

with input from millert@ and mikeb@,
ok mikeb@


# 1.103 06-Jan-2017 mpi

Kill various splsoftnet().

ok rzalamena@, visa@


# 1.102 05-Jan-2017 rzalamena

Remove some unnecessary code abstractions and while here remove a
splsoftnet.

ok mikeb@


# 1.101 22-Dec-2016 rzalamena

Remove PIM support from the multicast stack.

ok mpi@


# 1.100 21-Dec-2016 mpi

Fix build without PIM defined.


# 1.99 21-Dec-2016 rzalamena

Fix PIM compilation even though it is disabled.

ok bluhm@


# 1.98 20-Dec-2016 rzalamena

Call the multicast timer callback per domain instead of for all domains.
This way we avoid walking big tables and iterating over tables that we
don't need to.

ok mpi@


# 1.97 20-Dec-2016 rzalamena

Remove unused timeout that was never being set.

ok reyk@


# 1.96 19-Dec-2016 rzalamena

Kill unused function.

ok mpi@


# 1.95 19-Dec-2016 rzalamena

Extend the multicast sockets and multicast hash table support to multiple
domains. This is one step towards running more than one multicast
socket in different domains at the same time.

ok mpi@


# 1.94 13-Dec-2016 rzalamena

Propagate the routing table id in ip_mrouter_set() so the MRT_ADD_VIF
calls won't fail anymore when issued from a different rdomain.

ok mpi@


# 1.93 29-Nov-2016 mpi

Kill unused 'struct route'.


# 1.92 29-Nov-2016 jsg

m_free() and m_freem() test for NULL. Simplify callers which had their own
NULL tests.

ok mpi@


# 1.91 24-Sep-2016 tedu

use hashfree. from Mathieu -
ok guenther


Revision tags: OPENBSD_6_0_BASE
# 1.90 07-Mar-2016 naddy

Sync no-argument function declaration and definition by adding (void).
ok mpi@ millert@


Revision tags: OPENBSD_5_9_BASE
# 1.89 14-Nov-2015 mpi

Remove mrtdebug and reduce differences with the v6 version.

Debug information can already be accessed via mrtstat and pimstat.


# 1.88 13-Nov-2015 mpi

Do not cast malloc(9) results.


# 1.87 13-Nov-2015 mpi

Kill another tunnel leftover and keep PIM stuff inside #ifdef PIM.


# 1.86 12-Nov-2015 mpi

Kill another leftover from the tunnel support removal and add more PIM.


# 1.85 12-Nov-2015 mpi

Sync headers and get rid of #ifdef MROUTING.


# 1.84 12-Nov-2015 mpi

Remove VIFF_TUNNEL leftovers, tunnels aren't supported since 2006.

Even pimd(8) no longer supports them.


# 1.83 12-Nov-2015 mpi

Fix PIM build.


# 1.82 12-Sep-2015 mpi

Introduce if_input_local(), a function to feed local traffic back to
the protocol queues.

It basically does what looutput() was doing, but having a generic
function will allow us to get rid of the loopback hack overwriting
the rt_ifp field of RTF_LOCAL routes.

ok mikeb@, dlg@, claudio@


# 1.81 01-Sep-2015 bluhm

Replace sockaddr casts with the proper satosin(), ... calls.
From David Hill; OK mpi@; tested kspillner@; tweaks bluhm@


# 1.80 24-Aug-2015 bluhm

In kernel initialize struct sockaddr_in and sockaddr_in6 to zero
everywhere to avoid passing around pointers to uninitialized stack
memory. While there, fix the call to in6_recoverscope() in
fill_drlist().
OK deraadt@ mpi@


Revision tags: OPENBSD_5_8_BASE
# 1.79 15-Jul-2015 deraadt

rename mbuf ** parameter from m to mp, to match other similar code


# 1.78 30-Jun-2015 mpi

Get rid of the undocumented & temporary* m_copy() macro added for
compatibility with 4.3BSD in September 1989.

*Pick your own definition for "temporary".

ok bluhm@, claudio@, dlg@


Revision tags: OPENBSD_5_7_BASE
# 1.77 09-Feb-2015 claudio

Implement 2 sysctl to retrieve the multicast forwarding cache (mfc) and the
virtual interface table (vif). Will be used by netstat soon.
Looked over by guenther@


# 1.76 08-Feb-2015 claudio

De-static to make ddb hangman harder. OK phessler, henning


# 1.75 07-Feb-2015 dlg

mechanical conversion of this code to using siphash instead of some xors.

ok tedu@ claudio@


# 1.74 17-Dec-2014 mpi

Remove the "multicast_" prefix from the fields of a multicast-only struct.

Prodded by claudio@ and mikeb@


# 1.73 17-Dec-2014 mpi

Use an interface index instead of a pointer for multicast options.

Output interface (port) selection for multicast traffic is not done via
route lookups. Instead the output ifp is registered when setsockopt(2)
is called with the IP{V6,}_MULTICAST_IF option. But since there is no
mechanism to invalidate such pointer stored in a pcb when an interface
is destroyed/removed, it might lead your kernel to fault.

Prevent a fault upon resume reported by frantisek holop, thanks!

ok mikeb@, claudio@
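
The reasoning in this entry can be sketched as follows. This is a minimal
userland illustration, not the kernel's interface code; the table and the
demo_* names are hypothetical. Storing an index and resolving it at time
of use means a removed interface yields NULL instead of a dangling pointer.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAXIF	8	/* hypothetical table size */

struct demo_if {
	int	in_use;		/* nonzero while the interface exists */
	int	mtu;
};

static struct demo_if demo_iftable[DEMO_MAXIF];

/* Register an interface in slot `idx`. */
void
demo_if_attach(unsigned int idx, int mtu)
{
	assert(idx < DEMO_MAXIF);
	demo_iftable[idx].in_use = 1;
	demo_iftable[idx].mtu = mtu;
}

/* Mark slot `idx` as gone; stored indexes now resolve to NULL. */
void
demo_if_detach(unsigned int idx)
{
	assert(idx < DEMO_MAXIF);
	demo_iftable[idx].in_use = 0;
}

/* Resolve a stored index at time of use. */
struct demo_if *
demo_if_lookup(unsigned int idx)
{
	if (idx >= DEMO_MAXIF || !demo_iftable[idx].in_use)
		return NULL;
	return &demo_iftable[idx];
}
```

A caller that kept only the index checks the lookup result on every use.
The sketch ignores index reuse and reference counting, which a real
implementation has to deal with.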


# 1.72 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


# 1.71 30-Sep-2014 jsg

add back the sys/sysctl.h include removed in rev 1.60
fixes the kernel build when PIM is defined


# 1.70 14-Aug-2014 mpi

No need for raw_cb.h


# 1.69 14-Aug-2014 mpi

Kill MRT_{ADD,DEL}_BW_UPCALL interfaces and the bandwidth monitoring
code that comes with them.

ok mikeb@, henning@


Revision tags: OPENBSD_5_6_BASE
# 1.68 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.67 12-Jul-2014 tedu

add a size argument to free. will be used soon, but for now default to 0.
after discussions with beck deraadt kettenis.


# 1.66 21-Apr-2014 henning

ip_output() using varargs always struck me as bizarre, esp since it's only
ever used to pass on uint32 (for ipsec). stop that madness and just pass
the uint32, 0 in all cases but the two that pass the ipsec flowinfo.
ok deraadt reyk guenther


# 1.65 21-Apr-2014 henning

we'll do fine without casting NULL to struct foo * / void *
ok gcc & md5 (alas, no binary change)


Revision tags: OPENBSD_5_5_BASE
# 1.64 09-Jan-2014 tedu

bzero/bcmp -> memset/memcmp. ok matthew


# 1.63 27-Oct-2013 deraadt

delete UPCALL_TIMING debug code from the dark ages


# 1.62 23-Oct-2013 mpi

Reduce the number of in_var.h inclusions by moving some functions and
global variables to in.h.

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_4_BASE
# 1.61 02-May-2013 mpi

tedu broken Resource Reservation Protocol code that was ifdef RSVP_ISI.

ok deraadt@, tedu@ (implicit)


# 1.60 28-Mar-2013 tedu

no need for a lot of code to include proc.h


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.59 04-Apr-2011 henning

de-guttenberg our stack a bit
we don't need 7 f***ing copies of the same code to do the protocol checksums
(or not, depending on hw capabilities). claudio ok


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.58 02-Jul-2010 blambert

m_copyback can fail to allocate memory, but is a void function so gymnastics
are required to detect that.

Change the function to take a wait argument (used in nfs server, but
M_NOWAIT everywhere else for now) and to return an error

ok claudio@ henning@ krw@


# 1.57 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


Revision tags: OPENBSD_4_7_BASE
# 1.56 01-Aug-2009 blambert

timeout_add -> timeout_add_msec

ok michele@ claudio@


# 1.55 13-Jul-2009 michele

Get rid of the token bucket filter.
Traffic shaping code should not be inside routing code.
If you want to rate-limit use altq instead.

ok claudio@ henning@ dlg@


# 1.54 09-Jul-2009 michele

Use MAXTTL instead of the hardcoded value.


Revision tags: OPENBSD_4_6_BASE
# 1.53 05-Jun-2009 claudio

Initial support for routing domains. This allows binding interfaces to
alternate routing tables and separating them from other interfaces in
distinct routing tables. The same network can now be used in any domain at
the same time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks across
net and netinet. L2 and IPv4 are mostly covered; pf and IPv6 are still missing.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


Revision tags: OPENBSD_4_5_BASE
# 1.52 16-Sep-2008 chl

remove another dead store.

spotted by markus@

ok henning@ mpf@


# 1.51 15-Sep-2008 chl

remove dead stores and newly created unused variables.

Found by LLVM/Clang Static Analyzer.

ok mpf@ looks good mk@ ok henning@


Revision tags: OPENBSD_4_3_BASE OPENBSD_4_4_BASE
# 1.50 02-Jan-2008 brad

return with ENOTTY instead of EINVAL for unknown ioctl requests.

ok claudio@ krw@ dlg@


# 1.49 14-Dec-2007 deraadt

add sysctl entry points into various network layers, in particular to
provide netstat(1) with data it needs; ok claudio reyk


Revision tags: OPENBSD_4_2_BASE
# 1.48 22-May-2007 michele

ip_mroute.c is in bad shape.
This first step makes it style(9) compliant.
Just a whitespace diff, no binary change.

OK claudio@ norby@


# 1.47 10-Apr-2007 miod

``it's'' -> ``its'' when the grammar gods require this change.


Revision tags: OPENBSD_4_1_BASE
# 1.46 14-Feb-2007 jsg

Consistently spell FALLTHROUGH to appease lint.
ok kettenis@ cloder@ tom@ henning@


Revision tags: OPENBSD_4_0_BASE
# 1.45 15-Jun-2006 pascoe

Change cast of last vararg to ip_output to match what ip_output expects,
for clarity.

henning@ claudio@ ok


# 1.44 11-May-2006 hshoexer

fix corruption of pim register packets. From Hideki ONO, thanks!

ok mcbride@ itojun@


# 1.43 25-Apr-2006 claudio

Remove virtual tunnel support from the mrouting code. The virtual tunnel
code breaks multicast on gif(4) interfaces and it is far better to configure
a real gif(4) tunnel instead of a multicast tunnel as the latter is almost
not manageable. OK norby@, mblamer@


Revision tags: OPENBSD_3_8_BASE OPENBSD_3_9_BASE
# 1.42 25-Apr-2005 brad

csum -> csum_flags

ok krw@ canacar@


Revision tags: OPENBSD_3_7_BASE
# 1.41 15-Jan-2005 brad

fix comment


# 1.40 14-Jan-2005 mcbride

Duplicate nested if statement in PIM code.

From brad@


# 1.39 14-Jan-2005 mcbride

Add kernel support for Protocol Independent Multicast (PIM)
Information: http://netweb.usc.edu/pim/

From Pavlin Radoslavov <pavlin@icir.org>

ok deraadt@ brad@


# 1.38 24-Nov-2004 mcbride

Multicast routing cleanup from Pavlin Radoslavov
- sync ip_mroute.c with NetBSD
- import some FreeBSD changes to MFC entry handling
- set im->im_vif correctly when sending IGMPMSG_WRONGVIF
- increment mrtstat.mrts_upcalls correctly
- return error from get_sg_cnt() if there is no matching forwarding entry

ok henning@ brad@ naddy@


Revision tags: OPENBSD_3_6_BASE
# 1.37 24-Aug-2004 brad

Don't allow SIOCGET{VIF,SG}CNT from sockets other than the multicast router.

From NetBSD
Fixes PR 3825

ok mcbride@ canacar@ claudio@


Revision tags: OPENBSD_3_5_BASE SMP_SYNC_A SMP_SYNC_B
# 1.36 06-Jan-2004 markus

fix vlan destroy for MROUTING; report spamme@wouz.dk via tedu; ok itojun


# 1.35 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


# 1.34 10-Dec-2003 itojun

de-register. deraadt ok


Revision tags: OPENBSD_3_4_BASE
# 1.33 09-Jul-2003 itojun

do not flip ip_len/ip_off in netinet stack. deraadt ok.
(please test, especially PF portion)


# 1.32 09-Jul-2003 itojun

better vif_delete (no dangling ref to struct ifnet). deraadt ok
it won't affect default GENERIC build - as MROUTING is not defined


# 1.31 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: UBC_SYNC_A
# 1.30 14-May-2003 itojun

KNF. markus ok


# 1.29 06-May-2003 deraadt

string cleaning; tedu ok


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_B
# 1.28 28-Aug-2002 pefo

Fix a problem where passing NULL as a pointer with varargs does not promote
NULL to full 64 bits on a 64 bit address system. Solution is to add a
(void *) cast before NULL. This makes a 64 bit MIPS kernel work and will
probably help future 64 bit ports as well.

OK from art@
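
A small userland sketch of the problem described here, assuming a
hypothetical variadic function demo_count_ptrs(). Arguments passed through
"..." are not converted to the parameter type, so a bare NULL (which may be
a plain int 0) is not guaranteed to arrive as a full-width pointer. Casting
the terminator makes the argument type explicit.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Count the pointer arguments that precede a terminating null pointer. */
int
demo_count_ptrs(const char *first, ...)
{
	va_list ap;
	const void *p;
	int n = 0;

	va_start(ap, first);
	/* Each variadic argument is read back as a pointer. */
	for (p = first; p != NULL; p = va_arg(ap, void *))
		n++;
	va_end(ap);
	return n;
}
```

A call then looks like demo_count_ptrs("a", "b", (void *)NULL). The fixed
first parameter does not need the cast because its type is declared.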


# 1.27 31-Jul-2002 itojun

remove $Id$


# 1.26 09-Jun-2002 itojun

whitespace


Revision tags: OPENBSD_3_1_BASE
# 1.25 15-Mar-2002 millert

Kill #if __STDC__ used to do K&R vs. ANSI varargs/stdarg; just do things
the ANSI way.


# 1.24 14-Mar-2002 millert

First round of __P removal in sys


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.23 26-Sep-2001 deraadt

branches: 1.23.4;
bring back the old copyright notice


# 1.22 19-Aug-2001 miod

More old timeouts removal, mainly affected unused/unmaintained code.


# 1.21 23-Jun-2001 fgsch

Remove unneeded ip_id conversions.
Instead of using HTONS macro in some places, use htons directly in the
struct member and save us a few bytes.
Fix comment.


Revision tags: OPENBSD_2_9_BASE
# 1.20 10-Nov-2000 provos

seperate -> separate, okay aaron@


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE SMP_BASE
# 1.19 21-Jan-2000 angelos

branches: 1.19.2;
Rename the ip4_* routines to ipip_*, make it so GIF tunnels are not
affected by net.inet.ipip.allow (the sysctl formerly known as
net.inet.ip4.allow), rename the VIF ipip_input to ipip_mroute_input.


Revision tags: OPENBSD_2_6_BASE kame_19991208
# 1.18 08-Aug-1999 niklas

undeclared variable


# 1.17 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.16 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


# 1.15 20-Apr-1999 niklas

Merge MROUTING and IPSEC wrt handling of IP-in-IP tunnelled packets.
Fix a panic case in the MROUTING code too. Drop M_TUNNEL support, nothing
ever uses it.


Revision tags: OPENBSD_2_5_BASE
# 1.14 05-Feb-1999 angelos

Clear mfchashtbl after deallocation (mycroft@netbsd)


# 1.13 08-Jan-1999 provos

don't call ip_randomid() in htons().


# 1.12 08-Jan-1999 deraadt

rip_input() should be called with a 0 terminator; cmetz


# 1.11 26-Dec-1998 provos

make ip_id random but ensure that ids don't repeat for some period.


Revision tags: OPENBSD_2_4_BASE
# 1.10 29-Jul-1998 angelos

Proper handling of IP in IP and checksumming.


# 1.9 03-Jul-1998 deraadt

wrong endian conversion caused vif stats to be wrong; jonny@jonny.eng.br


# 1.8 18-May-1998 provos

first step to the setsockopt/getsockopt interface as described in
draft-mcdonald-simple-ipsec-api, kernel notifies (EMT_REQUESTSA) signal
userland key management applications when security services are requested.
this is only for outgoing connections at the moment, incoming packets
are not yet checked against the selected socket policy.


Revision tags: OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 28-Sep-1997 deraadt

more \n in log()


Revision tags: OPENBSD_2_1_BASE
# 1.6 21-Feb-1997 angelos

Couple of missing ifdefs.


# 1.5 20-Feb-1997 deraadt

IPSEC package by John Ioannidis and Angelos D. Keromytis. Written in
Greece. From ftp.funet.fi:/pub/unix/security/net/ip/BSDipsec.tar.gz


Revision tags: OPENBSD_2_0_BASE
# 1.4 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.3 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.138 19-Apr-2023 kn

move kernel lock into multicast ioctl handlers; OK mvs


Revision tags: OPENBSD_7_2_BASE OPENBSD_7_3_BASE
# 1.137 08-Sep-2022 kn

Rename global ifnet TAILQ

Naming the list like the struct itself makes for awful grepping.
Call the global variable "ifnetlist" from now on.

There used to be kvm(3) consumers in base picking up this symbol, but those
have long been converted to other interfaces.

A few potential ports users remain, same deal as sys/net/if_var.h r1.116
"Remove struct ifnet's unused if_switchport member": they get bumped.

Previous users pointed out by deraadt
OK bluhm


# 1.136 06-Aug-2022 bluhm

Clean up the netlock macros. Merge NET_RLOCK_IN_SOFTNET and
NET_RLOCK_IN_IOCTL, which have the same implementation. The R and
W are hard to see, call the new macro NET_LOCK_SHARED. Rename the
opposite assertion from NET_ASSERT_WLOCKED to NET_ASSERT_LOCKED_EXCLUSIVE.
Update some outdated comments about net locking.
OK mpi@ mvs@


# 1.135 05-May-2022 claudio

Use static objects for struct rttimer_queue instead of dynamically
allocating them.

Currently there are 6 rttimer_queues and not many more will follow. So
change rt_timer_queue_create() to rt_timer_queue_init() which now takes
a struct rttimer_queue * as argument which will be initialized.
Since this changes the global vars from pointer to struct, adjust other
callers as well.
OK bluhm@


# 1.134 04-May-2022 claudio

Move rttimer callback function from the rttimer itself to rttimer_queue.
All users use the same callback per queue so that makes sense.
Also replace rt_timer_queue_destroy() with rt_timer_queue_flush().
OK bluhm@


# 1.133 30-Apr-2022 claudio

Convert the 2nd rttimer callback from struct rttimer to u_int rtableid.
The callback only needs to know the rtableid; all the other info from
struct rttimer is not needed.
Also change the default rttimer callback to only delete routes that are
RTF_HOST and RTF_DYNAMIC. This way 2 of the ICMP handlers can use NULL
as the callback.
OK bluhm@


# 1.132 28-Apr-2022 claudio

In the multicast router code don't allocate a rt timer queue for each
rdomain. The rttimer API is rtable/rdomain aware and so there is no need
to have so many queues.
Also init the two queues (one for IPv4 and one for IPv6) early on. This
will allow the rttable code to become simpler.
OK bluhm@


Revision tags: OPENBSD_7_1_BASE
# 1.131 15-Dec-2021 deraadt

structure pads can leak uninitialized memory to userland via copyout,
therefore the mandatory idiom is completely clearing structs before
building them for copyout -- that means ALMOST ALL STRUCTS, because
we never know when some architecture will pad a struct. In two more
cases, the clearing wasn't performed.
from Reno Robert ZDI
ok millert bluhm
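
The idiom named in this entry can be sketched in userland C. The struct
and the demo_* names are hypothetical and memset() stands in for the
kernel's clearing step; no copyout() is involved. The point is that the
whole object is zeroed before any member is assigned, so padding bytes
never carry stale memory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record; most ABIs insert padding after `flag`. */
struct demo_info {
	char	flag;
	long	count;
};

/*
 * Fill `out` for export. Clearing the whole struct first means any
 * padding bytes are zero rather than leftover stack or heap contents.
 */
void
demo_fill(struct demo_info *out, char flag, long count)
{
	assert(out != NULL);
	memset(out, 0, sizeof(*out));
	out->flag = flag;
	out->count = count;
}

/* Return 1 if every byte between `flag` and `count` is zero. */
int
demo_padding_is_clear(const struct demo_info *info)
{
	const unsigned char *p = (const unsigned char *)info;
	size_t i;

	for (i = offsetof(struct demo_info, flag) + sizeof(info->flag);
	    i < offsetof(struct demo_info, count); i++) {
		if (p[i] != 0)
			return 0;
	}
	return 1;
}
```

Assigning every named member is not enough on its own, because member
assignment does not touch the padding between them.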


Revision tags: OPENBSD_6_8_BASE OPENBSD_6_9_BASE OPENBSD_7_0_BASE
# 1.130 27-May-2020 mpi

branches: 1.130.2; 1.130.6;
Document the various flavors of NET_LOCK() and rename the reader version.

Since our last concurrency mistake only ioctl(2) and sysctl(2) code paths
take the reader lock. This is mostly for documentation purpose as long as
the softnet thread is converted back to use a read lock.

dlg@ said that comments should be good enough.

ok sashan@


Revision tags: OPENBSD_6_7_BASE
# 1.129 15-Mar-2020 visa

Guard SIOCDELMULTI if_ioctl calls with KERNEL_LOCK() where the call is
made from socket close path. Most device drivers are not MP-safe yet,
and the closing of AF_INET and AF_INET6 sockets is no longer under the
kernel lock.

This fixes a panic seen by jcs@.

OK mpi@


Revision tags: OPENBSD_6_6_BASE
# 1.128 02-Sep-2019 bluhm

Fix a route use after free in multicast route. Move the rt_mcast_del()
out of the rtable_walk(). This avoids recursion to prevent stack
overflow. Also it allows freeing the route outside of the walk.
Now mrt_mcast_del() frees the route only when it is deleted from
the routing table. If that fails, it must not be freed. After the
route is returned by mfc_find(), it is reference counted. Then we
need a rtfree(), but not in the other cases.
Move rt_timer_remove_all() into rt_mcast_del().
OK mpi@


# 1.127 21-Jun-2019 mpi

Prevent recursions by not deleting entries inside rtable_walk(9).

rtable_walk(9) now passes a routing entry back to the caller when
a non zero value is returned and if it asked for it.
This allows us to call rtdeletemsg()/rtrequest_delete() from the
caller without creating a recursion because of rtflushclone().

Multicast code hasn't been adapted and is still possibly creating
recursions. However multicast route entries aren't cloned so if
a recursion exists it isn't because of rtflushclone().

Fix stack exhaustion triggered by the use of "-msave-args".

Issue reported by Dániel Lévai on bugs@ confirmed by and ok bluhm@.
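
The pattern described above, where the walker hands a matching entry back
and the caller deletes it outside the walk, can be sketched on a plain
linked list. This is not the rtable_walk(9) interface; every demo_* name
is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_entry {
	int			 key;
	struct demo_entry	*next;
};

/* Prepend a new entry with `key` to the list. */
void
demo_push(struct demo_entry **headp, int key)
{
	struct demo_entry *e = malloc(sizeof(*e));

	assert(e != NULL);
	e->key = key;
	e->next = *headp;
	*headp = e;
}

/* Example predicate: match entries with an odd key. */
int
demo_is_odd(const struct demo_entry *e)
{
	return e->key % 2 != 0;
}

/*
 * Walk the list, calling `match` on each entry. If `match` returns
 * nonzero, stop and hand that entry back through `found`.
 */
int
demo_walk(struct demo_entry *head, int (*match)(const struct demo_entry *),
    struct demo_entry **found)
{
	struct demo_entry *e;

	for (e = head; e != NULL; e = e->next) {
		if (match(e)) {
			if (found != NULL)
				*found = e;
			return 1;
		}
	}
	return 0;
}

/* Unlink and free `victim`; runs only when no walk is in progress. */
void
demo_remove(struct demo_entry **headp, struct demo_entry *victim)
{
	struct demo_entry **pp;

	for (pp = headp; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == victim) {
			*pp = victim->next;
			free(victim);
			return;
		}
	}
}

/* Delete every matching entry, restarting the walk after each removal. */
int
demo_purge(struct demo_entry **headp, int (*match)(const struct demo_entry *))
{
	struct demo_entry *e;
	int removed = 0;

	while (demo_walk(*headp, match, &e)) {
		demo_remove(headp, e);
		removed++;
	}
	return removed;
}
```

Because the removal happens after the walk has returned, the walker never
recurses into deletion code and never steps through an entry that has just
been freed.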


# 1.126 04-Jun-2019 anton

Add missing NULL check for the protocol control block (pcb) pointer in
mrt{6,}_ioctl. Calling shutdown(2) on the socket prior to the ioctl
command can cause it to be NULL.

ok bluhm@ claudio@

Reported-by: syzbot+bdc489ecb509995a21ed@syzkaller.appspotmail.com
Reported-by: syzbot+156405fdea9f2ab15d40@syzkaller.appspotmail.com


Revision tags: OPENBSD_6_5_BASE
# 1.125 13-Feb-2019 dlg

change rt_ifa_add and rt_ifa_del so they take an rdomain argument.

this allows mpls interfaces (mpe, mpw) to pass the rdomain they
wish the local label to be in, rather than have it implicitly forced
to 0 by these functions. right now they'll pass 0, but it will soon
be possible to have them rx packets in other rdomains.

previously the functions used ifp->if_rdomain for the rdomain.
everything other than mpls still passes ifp->if_rdomain.

ok mpi@


# 1.124 10-Feb-2019 dlg

remove the implicit RTF_MPATH flag that rt_ifa_add() sets on new routes.

MPLS interfaces (ab)use rt_ifa_add for adding the local MPLS label
that they listen on for incoming packets, while every other use of
rt_ifa_add is for adding addresses on local interfaces. MPLS does
this cos the addresses involved are in basically the same shape as
ones used for setting up local addresses.

It is appropriate for interfaces to want RTF_MPATH on local addresses,
but in the MPLS case it means you can have multiple local things
listening on the same label, which doesn't actually work. mpe in
particular keeps track of in use labels so it can handle collisions,
however, mpw does not. It is currently possible to have multiple
mpw interfaces on the same local label, and sharing the same label
as mpe or possibly normal forwarding labels.

Moving the RTF_MPATH flag out of rt_ifa_add means all the callers
that still want it need to pass it themselves. The mpe and mpw
callers are left alone without the flag, and will now get EEXIST
from rt_ifa_add when a label is already in use.

ok (and a huge amount of patience and help) mpi@
claudio@ is ok with the idea, but saw a much much earlier solution
to the problem


Revision tags: OPENBSD_6_4_BASE
# 1.123 10-Oct-2018 reyk

RT_TABLEID_MAX is 255, fix places that assumed that it is less than 255.

rtable 255 is a valid routing table or domain id that wasn't handled
by the ip[6]_mroute code or by snmpd. The arrays in the ip[6]_mroute
code were off by one and didn't allocate space for rtable 255; snmpd
simply ignored rtable 255. All other places in the tree seem to
handle RT_TABLEID_MAX correctly.

OK florian@ benno@ henning@ deraadt@
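
The off-by-one described here can be illustrated with a small sketch. The
demo_* names are hypothetical; only the value 255 comes from the entry.
When the maximum id is itself valid, an array indexed by id needs MAX + 1
slots and the range check must reject only ids greater than MAX.

```c
#include <assert.h>

#define DEMO_TABLEID_MAX	255	/* largest valid id, as in the entry */

/* One slot per valid id 0..DEMO_TABLEID_MAX, hence the "+ 1". */
static int demo_slots[DEMO_TABLEID_MAX + 1];

/* Store a value for `id`; returns 0 on success, -1 if out of range. */
int
demo_slot_set(unsigned int id, int value)
{
	if (id > DEMO_TABLEID_MAX)	/* ">" and not ">=": 255 is valid */
		return -1;
	demo_slots[id] = value;
	return 0;
}

/* Read back the value stored for a valid `id`. */
int
demo_slot_get(unsigned int id)
{
	assert(id <= DEMO_TABLEID_MAX);
	return demo_slots[id];
}
```

Sizing the array with DEMO_TABLEID_MAX alone, or testing with ">=", would
each silently exclude id 255.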


# 1.122 30-Apr-2018 tb

Reduce the scope of the NET_LOCK() in in_control(). Two functions were
protected: mrt_ioctl() and in_ioctl(). The former has no other callers
and only needs a read lock. The latter will need refactoring to reduce
the lock's scope further. In a first step, establish a single exit point
and protect most of the function body with the NET_LOCK() while removing
the NET_LOCK() from a handful of callers.

suggested by & ok mpi, ok visa


Revision tags: OPENBSD_6_2_BASE OPENBSD_6_3_BASE
# 1.121 01-Sep-2017 mpi

Change sosetopt() to no longer free the mbuf it receives and change
all the callers to call m_freem(9).

Support from deraadt@ and tedu@, ok visa@, bluhm@


# 1.120 26-Jun-2017 mpi

Assert that the corresponding socket is locked when manipulating socket
buffers.

This is one step towards unlocking TCP input path. Note that all the
functions asserting for the socket lock are not necessarilly MP-safe.
All the fields of 'struct socket' aren't protected.

Introduce a new kernel-only kqueue hint, NOTE_SUBMIT, to be able to
tell when a filter needs to lock the underlying data structures. Logic
and name taken from NetBSD.

Tested by Hrvoje Popovski.

ok claudio@, bluhm@, mikeb@


# 1.119 19-Jun-2017 bluhm

The IP multicast forward functions return an errno, call the variable
error. Make the ip_mforward() return value consistent. Simplify
the caller logic in ipv6_input() like in IPv4.
OK mpi@


# 1.118 16-May-2017 rzalamena

Sync three changes that were caught by IPv6 multicast routing review:

* use a variable to allow disabling debugs on run-time
* fix a potential memory leak on copyout() failure
* don't just blindly use the first address provided by ifalist

ok bluhm@


# 1.117 16-May-2017 rzalamena

Make return values more meaningful by using errno instead of -1 or 1.

ok bluhm@


# 1.116 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.115 16-May-2017 rzalamena

Let malloc() block when the caller of the add route function is
setsockopt(), otherwise use non-blocking malloc() for network stack
calls.

ok bluhm@


# 1.114 16-May-2017 rzalamena

Call rtfree() after each use of routes and make sure the route is valid
when finding one. Since rtfree() is being called and rt_llinfo being
removed, add checks everywhere to make sure we are using a route that is
not being removed.

ok bluhm@


# 1.113 06-Apr-2017 dhill

Convert bcopy to memcpy where the memory does not overlap, otherwise,
use memmove. While here, change some previous conversions to a simple
assignment.

ok deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.112 17-Mar-2017 rzalamena

Be more strict on all route iterations, lets always make sure that we
are not going to get a unicast route by accident.

ok mpi@


# 1.111 14-Mar-2017 rzalamena

Make mfc_find() more strict when looking for routes, fixes a problem
causing ip_mforward() not to send packets to the userland multicast
routing daemon.

Reported and tested by Paul de Weerd.

ok bluhm@, claudio@


# 1.110 09-Feb-2017 rzalamena

Unbreak 'netstat -g' and make multicast route stats sysctl more robust.

ok mpi@


# 1.109 08-Feb-2017 jsg

Test for NULL before dereferencing a pointer not after.
ok krw@


# 1.108 01-Feb-2017 dhill

In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with
the netlock held. This also changes the prototypes of the *ctloutput
functions to take an mbuf instead of an mbuf pointer.

help, guidance from bluhm@ and mpi@
ok bluhm@


# 1.107 12-Jan-2017 rzalamena

Clean up multicast files from unused definitions and comments.

ok mpi@


# 1.106 11-Jan-2017 rzalamena

Remove mfc hash tables and use the OpenBSD routing table for multicast
routes. Beside the code simplification and removal, we also get to see
the multicast routes now in the route(8) utility.

ok mpi@


# 1.105 06-Jan-2017 rzalamena

Remove the global viftable vector that holds the virtual interfaces
configuration and instead use ifnet to store the configuration and
counters. With this we can safely use multicast routing daemons on
multiple domains without vif id colisions.

ok mpi@


# 1.104 06-Jan-2017 rzalamena

Simplify code by removing some old pullup macro, killing some variables
and using m_dup_pkt() instead of m_copym() with max_linkhdr space adjust
on packet sending to avoid more mbuf allocations.

with input from millert@ and mikeb@,
ok mikeb@


# 1.103 06-Jan-2017 mpi

Kill various splsoftnet().

ok rzalamena@, visa@


# 1.102 05-Jan-2017 rzalamena

Remove some unnecessary code abstractions and while here remove a
splsoftnet.

ok mikeb@


# 1.101 22-Dec-2016 rzalamena

Remove PIM support from the multicast stack.

ok mpi@


# 1.100 21-Dec-2016 mpi

Fix build without PIM defined.


# 1.99 21-Dec-2016 rzalamena

Fix PIM compilation even though it is disabled.

ok bluhm@


# 1.98 20-Dec-2016 rzalamena

Call the multicast timer callback per domain instead of for all domains
this way we save doing big tables walk and iterating tables that we don't
need to.

ok mpi@


# 1.97 20-Dec-2016 rzalamena

Remove unused timeout that was never being set.

ok reyk@


# 1.96 19-Dec-2016 rzalamena

Kill unused function.

ok mpi@


# 1.95 19-Dec-2016 rzalamena

Extend the multicast sockets and multicast hash table support to multiple
domains. This is one step towards supporting to run more than one multicast
socket in different domains at the same time.

ok mpi@


# 1.94 13-Dec-2016 rzalamena

Propagate the routing table id in ip_mrouter_set() so the MRT_ADD_VIF
calls won't fail anymore when doing from a different rdomain.

ok mpi@


# 1.93 29-Nov-2016 mpi

Kill unused 'struct route'.


# 1.92 29-Nov-2016 jsg

m_free() and m_freem() test for NULL. Simplify callers which had their own
NULL tests.

ok mpi@


# 1.91 24-Sep-2016 tedu

use hashfree. from Mathieu -
ok guenther


Revision tags: OPENBSD_6_0_BASE
# 1.90 07-Mar-2016 naddy

Sync no-argument function declaration and definition by adding (void).
ok mpi@ millert@


Revision tags: OPENBSD_5_9_BASE
# 1.89 14-Nov-2015 mpi

Remove mrtdebug and reduce differences with the v6 version.

Debug informations can already be accessed via mrtstat and pimstat.


# 1.88 13-Nov-2015 mpi

Do not cast malloc(9) results.


# 1.87 13-Nov-2015 mpi

Kill another tunnel leftover and keep PIM stuff inside #ifdef PIM.


# 1.86 12-Nov-2015 mpi

Kill another leftover from the tunnel support removal and add more PIM.


# 1.85 12-Nov-2015 mpi

Sync headers and get rid of #ifdef MROUTING.


# 1.84 12-Nov-2015 mpi

Remove VIFF_TUNNEL leftovers, tunnels aren't supported since 2006.

Even pimd(8) no longer support them.


# 1.83 12-Nov-2015 mpi

Fix PIM build.


# 1.82 12-Sep-2015 mpi

Introduce if_input_local() a function to feed local traffic back to
the protocol queues.

It basically does what looutput() was doing but having a generic
function will allow us to get rid of the loopback hack overwwritting
the rt_ifp field of RTF_LOCAL routes.

ok mikeb@, dlg@, claudio@


# 1.81 01-Sep-2015 bluhm

Replace sockaddr casts with the proper satosin(), ... calls.
From David Hill; OK mpi@; tested kspillner@; tweaks bluhm@


# 1.80 24-Aug-2015 bluhm

In kernel initialize struct sockaddr_in and sockaddr_in6 to zero
everywhere to avoid passing around pointers to uninitialized stack
memory. While there, fix the call to in6_recoverscope() in
fill_drlist().
OK deraadt@ mpi@


Revision tags: OPENBSD_5_8_BASE
# 1.79 15-Jul-2015 deraadt

rename mbuf ** parameter from m to mp, to match other similar code


# 1.78 30-Jun-2015 mpi

Get rid of the undocumented & temporary* m_copy() macro added for
compatibility with 4.3BSD in September 1989.

*Pick your own definition for "temporary".

ok bluhm@, claudio@, dlg@


Revision tags: OPENBSD_5_7_BASE
# 1.77 09-Feb-2015 claudio

Implement 2 sysctl to retrieve the multicast forwarding cache (mfc) and the
virtual interface table (vif). Will be used by netstat soon.
Looked over by guenther@


# 1.76 08-Feb-2015 claudio

De-static to make ddb hangman harder. OK phessler, henning


# 1.75 07-Feb-2015 dlg

mechanical conversion of this code to using siphash instead of some xors.

ok tedu@ claudio@


# 1.74 17-Dec-2014 mpi

Remove the "multicast_" prefix from the fields of a multicast-only struct.

Prodded by claudio@ and mikeb@


# 1.73 17-Dec-2014 mpi

Use an interface index instead of a pointer for multicast options.

Output interface (port) selection for multicast traffic is not done via
route lookups. Instead the output ifp is registered when setsockopt(2)
is called with the IP{V6,}_MULTICAST_IF option. But since there is no
mechanism to invalidate such pointer stored in a pcb when an interface
is destroyed/removed, it might lead your kernel to fault.

Prevent a fault upon resume reported by frantisek holop, thanks!

ok mikeb@, claudio@


# 1.72 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


# 1.71 30-Sep-2014 jsg

add back the sys/sysctl.h include removed in rev 1.60
fixes the kernel build when PIM is defined


# 1.70 14-Aug-2014 mpi

No need for raw_cb.h


# 1.69 14-Aug-2014 mpi

Kill MRT_{ADD,DEL}_BW_UPCALL interfaces and the bandwidth monitoring
code that comes with them.

ok mikeb@, henning@


Revision tags: OPENBSD_5_6_BASE
# 1.68 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.67 12-Jul-2014 tedu

add a size argument to free. will be used soon, but for now default to 0.
after discussions with beck deraadt kettenis.


# 1.66 21-Apr-2014 henning

ip_output() using varargs always struck me as bizarre, esp since it's only
ever used to pass on uint32 (for ipsec). stop that madness and just pass
the uint32, 0 in all cases but the two that pass the ipsec flowinfo.
ok deraadt reyk guenther


# 1.65 21-Apr-2014 henning

we'll do fine without casting NULL to struct foo * / void *
ok gcc & md5 (alas, no binary change)


Revision tags: OPENBSD_5_5_BASE
# 1.64 09-Jan-2014 tedu

bzero/bcmp -> memset/memcmp. ok matthew


# 1.63 27-Oct-2013 deraadt

delete UPCALL_TIMING debug code from the dark ages


# 1.62 23-Oct-2013 mpi

Reduce the number of in_var.h inclusions by moving some functions and
global variables to in.h.

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_4_BASE
# 1.61 02-May-2013 mpi

tedu broken Resource Reservation Protocol code that was ifdef RSVP_ISI.

ok deraadt@, tedu@ (implicit)


# 1.60 28-Mar-2013 tedu

no need for a lot of code to include proc.h


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.59 04-Apr-2011 henning

de-guttenberg our stack a bit
we don't need 7 f***ing copies of the same code to do the protocol checksums
(or not, depending on hw capabilities). claudio ok


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.58 02-Jul-2010 blambert

m_copyback can fail to allocate memory, but is a void function so gymnastics
are required to detect that.

Change the function to take a wait argument (used in nfs server, but
M_NOWAIT everywhere else for now) and to return an error

ok claudio@ henning@ krw@


# 1.57 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


Revision tags: OPENBSD_4_7_BASE
# 1.56 01-Aug-2009 blambert

timeout_add -> timeout_add_msec

ok michele@ claudio@


# 1.55 13-Jul-2009 michele

Get rid of the token bucket filter.
Traffic shaping code should not be inside routing code.
If you want to rate-limit use altq instead.

ok claudio@ henning@ dlg@


# 1.54 09-Jul-2009 michele

Use MAXTTL instead of the hardcoded value.


Revision tags: OPENBSD_4_6_BASE
# 1.53 05-Jun-2009 claudio

Initial support for routing domains. This allows binding interfaces to
alternate routing tables and separating them from other interfaces in
distinct routing tables. The same network can now be used in any domain at
the same time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks across
net and netinet. L2 and IPv4 are mostly covered; pf and IPv6 are still missing.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


Revision tags: OPENBSD_4_5_BASE
# 1.52 16-Sep-2008 chl

remove another dead store.

spotted by markus@

ok henning@ mpf@


# 1.51 15-Sep-2008 chl

remove dead stores and newly created unused variables.

Found by LLVM/Clang Static Analyzer.

ok mpf@ looks good mk@ ok henning@


Revision tags: OPENBSD_4_3_BASE OPENBSD_4_4_BASE
# 1.50 02-Jan-2008 brad

return with ENOTTY instead of EINVAL for unknown ioctl requests.

ok claudio@ krw@ dlg@


# 1.49 14-Dec-2007 deraadt

add sysctl entry points into various network layers, in particular to
provide netstat(1) with data it needs; ok claudio reyk


Revision tags: OPENBSD_4_2_BASE
# 1.48 22-May-2007 michele

ip_mroute.c is in bad shape.
This first step makes it style(9) compliant.
Just a whitespace diff, no binary change.

OK claudio@ norby@


# 1.47 10-Apr-2007 miod

``it's'' -> ``its'' when the grammar gods require this change.


Revision tags: OPENBSD_4_1_BASE
# 1.46 14-Feb-2007 jsg

Consistently spell FALLTHROUGH to appease lint.
ok kettenis@ cloder@ tom@ henning@


Revision tags: OPENBSD_4_0_BASE
# 1.45 15-Jun-2006 pascoe

Change cast of last vararg to ip_output to match what ip_output expects,
for clarity.

henning@ claudio@ ok


# 1.44 11-May-2006 hshoexer

fix corruption of pim register packets. From Hideki ONO, thanks!

ok mcbride@ itojun@


# 1.43 25-Apr-2006 claudio

Remove virtual tunnel support from the mrouting code. The virtual tunnel
code breaks multicast on gif(4) interfaces and it is far better to configure
a real gif(4) tunnel instead of a multicast tunnel as the latter is almost
not manageable. OK norby@, mblamer@


Revision tags: OPENBSD_3_8_BASE OPENBSD_3_9_BASE
# 1.42 25-Apr-2005 brad

csum -> csum_flags

ok krw@ canacar@


Revision tags: OPENBSD_3_7_BASE
# 1.41 15-Jan-2005 brad

fix comment


# 1.40 14-Jan-2005 mcbride

Duplicate nested if statement in PIM code.

From brad@


# 1.39 14-Jan-2005 mcbride

Add kernel support for Protocol Independent Multicast (PIM)
Information: http://netweb.usc.edu/pim/

From Pavlin Radoslavov <pavlin@icir.org>

ok deraadt@ brad@


# 1.38 24-Nov-2004 mcbride

Multicast routing cleanup from Pavlin Radoslavov
- sync ip_mroute.c with NetBSD
- import some FreeBSD changes to MFC entry handling
- set im->im_vif correctly when sending IGMPMSG_WRONGVIF
- increment mrtstat.mrts_upcalls correctly
- return error from get_sg_cnt() if there is no matching forwarding entry

ok henning@ brad@ naddy@


Revision tags: OPENBSD_3_6_BASE
# 1.37 24-Aug-2004 brad

Don't allow SIOCGET{VIF,SG}CNT from sockets other than the multicast router.

From NetBSD
Fixes PR 3825

ok mcbride@ canacar@ claudio@


Revision tags: OPENBSD_3_5_BASE SMP_SYNC_A SMP_SYNC_B
# 1.36 06-Jan-2004 markus

fix vlan destroy for MROUTING; report spamme@wouz.dk via tedu; ok itojun


# 1.35 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


# 1.34 10-Dec-2003 itojun

de-register. deraadt ok


Revision tags: OPENBSD_3_4_BASE
# 1.33 09-Jul-2003 itojun

do not flip ip_len/ip_off in netinet stack. deraadt ok.
(please test, especially PF portion)


# 1.32 09-Jul-2003 itojun

better vif_delete (no dangling ref to struct ifnet). deraadt ok
it won't affect default GENERIC build - as MROUTING is not defined


# 1.31 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: UBC_SYNC_A
# 1.30 14-May-2003 itojun

KNF. markus ok


# 1.29 06-May-2003 deraadt

string cleaning; tedu ok


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_B
# 1.28 28-Aug-2002 pefo

Fix a problem where passing NULL as a pointer with varargs does not promote
NULL to full 64 bits on a 64 bit address system. Solution is to add a
(void *) cast before NULL. This makes a 64 bit MIPS kernel work and will
probably help future 64 bit ports as well.

OK from art@


# 1.27 31-Jul-2002 itojun

remove $Id$


# 1.26 09-Jun-2002 itojun

whitespace


Revision tags: OPENBSD_3_1_BASE
# 1.25 15-Mar-2002 millert

Kill #if __STDC__ used to do K&R vs. ANSI varargs/stdarg; just do things
the ANSI way.


# 1.24 14-Mar-2002 millert

First round of __P removal in sys


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.23 26-Sep-2001 deraadt

branches: 1.23.4;
bring back the old copyright notice


# 1.22 19-Aug-2001 miod

More old timeouts removal, mainly affected unused/unmaintained code.


# 1.21 23-Jun-2001 fgsch

Remove unneeded ip_id conversions.
Instead of using HTONS macro in some places, use htons directly in the
struct member and save us a few bytes.
Fix comment.


Revision tags: OPENBSD_2_9_BASE
# 1.20 10-Nov-2000 provos

seperate -> separate, okay aaron@


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE SMP_BASE
# 1.19 21-Jan-2000 angelos

branches: 1.19.2;
Rename the ip4_* routines to ipip_*, make it so GIF tunnels are not
affected by net.inet.ipip.allow (the sysctl formerly known as
net.inet.ip4.allow), rename the VIF ipip_input to ipip_mroute_input.


Revision tags: OPENBSD_2_6_BASE kame_19991208
# 1.18 08-Aug-1999 niklas

undeclared variable


# 1.17 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.16 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


# 1.15 20-Apr-1999 niklas

Merge MROUTING and IPSEC wrt handling of IP-in-IP tunnelled packets.
Fix a panic case in the MROUTING code too. Drop M_TUNNEL support, nothing
ever uses it.


Revision tags: OPENBSD_2_5_BASE
# 1.14 05-Feb-1999 angelos

Clear mfchashtbl after deallocation (mycroft@netbsd)


# 1.13 08-Jan-1999 provos

dont call ip_randomid() in htons().


# 1.12 08-Jan-1999 deraadt

rip_input() should be called with a 0 terminator; cmetz


# 1.11 26-Dec-1998 provos

make ip_id random but ensure that ids dont repeat for some period.


Revision tags: OPENBSD_2_4_BASE
# 1.10 29-Jul-1998 angelos

Proper handling of IP in IP and checksumming.


# 1.9 03-Jul-1998 deraadt

wrong endian conversion caused vif stats to be wrong; jonny@jonny.eng.br


# 1.8 18-May-1998 provos

first step to the setsockopt/getsockopt interface as described in
draft-mcdonald-simple-ipsec-api, kernel notifies (EMT_REQUESTSA) signal
userland key management applications when security services are requested.
this is only for outgoing connections at the moment, incoming packets
are not yet checked against the selected socket policy.


Revision tags: OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 28-Sep-1997 deraadt

more \n in log()


Revision tags: OPENBSD_2_1_BASE
# 1.6 21-Feb-1997 angelos

Couple of missing ifdefs.


# 1.5 20-Feb-1997 deraadt

IPSEC package by John Ioannidis and Angelos D. Keromytis. Written in
Greece. From ftp.funet.fi:/pub/unix/security/net/ip/BSDipsec.tar.gz


Revision tags: OPENBSD_2_0_BASE
# 1.4 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.3 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.137 08-Sep-2022 kn

Rename global ifnet TAILQ

Naming the list like the struct itself makes for awful grepping.
Call the global variable "ifnetlist" from now on.

There used to be kvm(3) consumers in base picking up this symbol, but those
have long been converted to other interfaces.

A few potential ports users remain, same deal as sys/net/if_var.h r1.116
"Remove struct ifnet's unused if_switchport member": they get bumped.

Previous users pointed out by deraadt
OK bluhm


# 1.136 06-Aug-2022 bluhm

Clean up the netlock macros. Merge NET_RLOCK_IN_SOFTNET and
NET_RLOCK_IN_IOCTL, which have the same implementation. The R and
W are hard to see, call the new macro NET_LOCK_SHARED. Rename the
opposite assertion from NET_ASSERT_WLOCKED to NET_ASSERT_LOCKED_EXCLUSIVE.
Update some outdated comments about net locking.
OK mpi@ mvs@


# 1.135 05-May-2022 claudio

Use static objects for struct rttimer_queue instead of dynamically
allocating them.

Currently there are 6 rttimer_queues and not many more will follow. So
change rt_timer_queue_create() to rt_timer_queue_init() which now takes
a struct rttimer_queue * as argument which will be initialized.
Since this changes the global vars from pointer to struct, adjust other
callers as well.
OK bluhm@


# 1.134 04-May-2022 claudio

Move rttimer callback function from the rttimer itself to rttimer_queue.
All users use the same callback per queue so that makes sense.
Also replace rt_timer_queue_destroy() with rt_timer_queue_flush().
OK bluhm@


# 1.133 30-Apr-2022 claudio

Convert the 2nd rttimer callback from struct rttimer to u_int rtableid.
The callback only needs to know the rtableid; all the other info from
struct rttimer is not needed.
Also change the default rttimer callback to only delete routes that are
RTF_HOST and RTF_DYNAMIC. This way 2 of the ICMP handlers can use NULL
as the callback.
OK bluhm@


# 1.132 28-Apr-2022 claudio

In the multicast router code don't allocate a rt timer queue for each
rdomain. The rttimer API is rtable/rdomain aware and so there is no need
to have so many queues.
Also init the two queues (one for IPv4 and one for IPv6) early on. This
will allow the rttable code to become simpler.
OK bluhm@


Revision tags: OPENBSD_7_1_BASE
# 1.131 15-Dec-2021 deraadt

structure pads can leak uninitialized memory to userland via copyout,
therefore the mandatory idiom is completely clearing structs before
building them for copyout -- that means ALMOST ALL STRUCTS, because
we never know when some architecture will pad a struct.. In two more
cases, the clearing wasn't performed.
from Reno Robert ZDI
ok millert bluhm


Revision tags: OPENBSD_6_8_BASE OPENBSD_6_9_BASE OPENBSD_7_0_BASE
# 1.130 27-May-2020 mpi

branches: 1.130.2; 1.130.6;
Document the various flavors of NET_LOCK() and rename the reader version.

Since our last concurrency mistake only ioctl(2) and sysctl(2) code paths
take the reader lock. This is mostly for documentation purpose as long as
the softnet thread is converted back to use a read lock.

dlg@ said that comments should be good enough.

ok sashan@


Revision tags: OPENBSD_6_7_BASE
# 1.129 15-Mar-2020 visa

Guard SIOCDELMULTI if_ioctl calls with KERNEL_LOCK() where the call is
made from socket close path. Most device drivers are not MP-safe yet,
and the closing of AF_INET and AF_INET6 sockets is no longer under the
kernel lock.

This fixes a panic seen by jcs@.

OK mpi@


Revision tags: OPENBSD_6_6_BASE
# 1.128 02-Sep-2019 bluhm

Fix a route use after free in multicast route. Move the rt_mcast_del()
out of the rtable_walk(). This avoids recursion to prevent stack
overflow. Also it allows freeing the route outside of the walk.
Now mrt_mcast_del() frees the route only when it is deleted from
the routing table. If that fails, it must not be freed. After the
route is returned by mfc_find(), it is reference counted. Then we
need a rtfree(), but not in the other cases.
Move rt_timer_remove_all() into rt_mcast_del().
OK mpi@


# 1.127 21-Jun-2019 mpi

Prevent recursions by not deleting entries inside rtable_walk(9).

rtable_walk(9) now passes a routing entry back to the caller when
a non zero value is returned and if it asked for it.
This allows us to call rtdeletemsg()/rtrequest_delete() from the
caller without creating a recursion because of rtflushclone().

Multicast code hasn't been adapted and is still possibly creating
recursions. However multicast route entries aren't cloned so if
a recursion exists it isn't because of rtflushclone().

Fix stack exhaustion triggered by the use of "-msave-args".

Issue reported by Dániel Lévai on bugs@ confirmed by and ok bluhm@.


# 1.126 04-Jun-2019 anton

Add missing NULL check for the protocol control block (pcb) pointer in
mrt{6,}_ioctl. Calling shutdown(2) on the socket prior to the ioctl
command can cause it to be NULL.

ok bluhm@ claudio@

Reported-by: syzbot+bdc489ecb509995a21ed@syzkaller.appspotmail.com
Reported-by: syzbot+156405fdea9f2ab15d40@syzkaller.appspotmail.com


Revision tags: OPENBSD_6_5_BASE
# 1.125 13-Feb-2019 dlg

change rt_ifa_add and rt_ifa_del so they take an rdomain argument.

this allows mpls interfaces (mpe, mpw) to pass the rdomain they
wish the local label to be in, rather than have it implicitly forced
to 0 by these functions. right now they'll pass 0, but it will soon
be possible to have them rx packets in other rdomains.

previously the functions used ifp->if_rdomain for the rdomain.
everything other than mpls still passes ifp->if_rdomain.

ok mpi@


# 1.124 10-Feb-2019 dlg

remove the implicit RTF_MPATH flag that rt_ifa_add() sets on new routes.

MPLS interfaces (ab)use rt_ifa_add for adding the local MPLS label
that they listen on for incoming packets, while every other use of
rt_ifa_add is for adding addresses on local interfaces. MPLS does
this cos the addresses involved are in basically the same shape as
ones used for setting up local addresses.

It is appropriate for interfaces to want RTF_MPATH on local addresses,
but in the MPLS case it means you can have multiple local things
listening on the same label, which doesn't actually work. mpe in
particular keeps track of in use labels so it can handle collisions,
however, mpw does not. It is currently possible to have multiple
mpw interfaces on the same local label, and sharing the same label
as mpe or possibly normal forwarding labels.

Moving the RTF_MPATH flag out of rt_ifa_add means all the callers
that still want it need to pass it themselves. The mpe and mpw
callers are left alone without the flag, and will now get EEXIST
from rt_ifa_add when a label is already in use.

ok (and a huge amount of patience and help) mpi@
claudio@ is ok with the idea, but saw a much much earlier solution
to the problem


Revision tags: OPENBSD_6_4_BASE
# 1.123 10-Oct-2018 reyk

RT_TABLEID_MAX is 255, fix places that assumed that it is less than 255.

rtable 255 is a valid routing table or domain id that wasn't handled
by the ip[6]_mroute code or by snmpd. The arrays in the ip[6]_mroute
code were off by one and didn't allocate space for rtable 255; snmpd
simply ignored rtable 255. All other places in the tree seem to
handle RT_TABLEID_MAX correctly.

OK florian@ benno@ henning@ deraadt@


# 1.122 30-Apr-2018 tb

Reduce the scope of the NET_LOCK() in in_control(). Two functions were
protected: mrt_ioctl() and in_ioctl(). The former has no other callers
and only needs a read lock. The latter will need refactoring to reduce
the lock's scope further. In a first step, establish a single exit point
and protect most of the function body with the NET_LOCK() while removing
the NET_LOCK() from a handful of callers.

suggested by & ok mpi, ok visa


Revision tags: OPENBSD_6_2_BASE OPENBSD_6_3_BASE
# 1.121 01-Sep-2017 mpi

Change sosetopt() to no longer free the mbuf it receives and change
all the callers to call m_freem(9).

Support from deraadt@ and tedu@, ok visa@, bluhm@


# 1.120 26-Jun-2017 mpi

Assert that the corresponding socket is locked when manipulating socket
buffers.

This is one step towards unlocking TCP input path. Note that all the
functions asserting for the socket lock are not necessarily MP-safe.
All the fields of 'struct socket' aren't protected.

Introduce a new kernel-only kqueue hint, NOTE_SUBMIT, to be able to
tell when a filter needs to lock the underlying data structures. Logic
and name taken from NetBSD.

Tested by Hrvoje Popovski.

ok claudio@, bluhm@, mikeb@


# 1.119 19-Jun-2017 bluhm

The IP multicast forward functions return an errno, call the variable
error. Make the ip_mforward() return value consistent. Simplify
the caller logic in ipv6_input() like in IPv4.
OK mpi@


# 1.118 16-May-2017 rzalamena

Sync three changes that were caught by IPv6 multicast routing review:

* use a variable to allow disabling debugs on run-time
* fix a potential memory leak on copyout() failure
* don't just blindly use the first address provided by ifalist

ok bluhm@


# 1.117 16-May-2017 rzalamena

Make return values more meaningful by using errno instead of -1 or 1.

ok bluhm@


# 1.116 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.115 16-May-2017 rzalamena

Let malloc() block when the caller of the add route function is
setsockopt(), otherwise use non-blocking malloc() for network stack
calls.

ok bluhm@


# 1.114 16-May-2017 rzalamena

Call rtfree() after each use of routes and make sure the route is valid
when finding one. Since rtfree() is being called and rt_llinfo being
removed, add checks everywhere to make sure we are using a route that is
not being removed.

ok bluhm@


# 1.113 06-Apr-2017 dhill

Convert bcopy to memcpy where the memory does not overlap, otherwise,
use memmove. While here, change some previous conversions to a simple
assignment.

ok deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.112 17-Mar-2017 rzalamena

Be more strict on all route iterations, let's always make sure that we
are not going to get a unicast route by accident.

ok mpi@


# 1.111 14-Mar-2017 rzalamena

Make mfc_find() more strict when looking for routes, fixes a problem
causing ip_mforward() not to send packets to the userland multicast
routing daemon.

Reported and tested by Paul de Weerd.

ok bluhm@, claudio@


# 1.110 09-Feb-2017 rzalamena

Unbreak 'netstat -g' and make multicast route stats sysctl more robust.

ok mpi@


# 1.109 08-Feb-2017 jsg

Test for NULL before dereferencing a pointer not after.
ok krw@


# 1.108 01-Feb-2017 dhill

In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with
the netlock held. This also changes the prototypes of the *ctloutput
functions to take an mbuf instead of an mbuf pointer.

help, guidance from bluhm@ and mpi@
ok bluhm@


# 1.107 12-Jan-2017 rzalamena

Clean up multicast files from unused definitions and comments.

ok mpi@


# 1.106 11-Jan-2017 rzalamena

Remove mfc hash tables and use the OpenBSD routing table for multicast
routes. Beside the code simplification and removal, we also get to see
the multicast routes now in the route(8) utility.

ok mpi@


# 1.105 06-Jan-2017 rzalamena

Remove the global viftable vector that holds the virtual interfaces
configuration and instead use ifnet to store the configuration and
counters. With this we can safely use multicast routing daemons on
multiple domains without vif id collisions.

ok mpi@


# 1.104 06-Jan-2017 rzalamena

Simplify code by removing some old pullup macro, killing some variables
and using m_dup_pkt() instead of m_copym() with max_linkhdr space adjust
on packet sending to avoid more mbuf allocations.

with input from millert@ and mikeb@,
ok mikeb@


# 1.103 06-Jan-2017 mpi

Kill various splsoftnet().

ok rzalamena@, visa@


# 1.102 05-Jan-2017 rzalamena

Remove some unnecessary code abstractions and while here remove a
splsoftnet.

ok mikeb@


# 1.101 22-Dec-2016 rzalamena

Remove PIM support from the multicast stack.

ok mpi@


# 1.100 21-Dec-2016 mpi

Fix build without PIM defined.


# 1.99 21-Dec-2016 rzalamena

Fix PIM compilation even though it is disabled.

ok bluhm@


# 1.98 20-Dec-2016 rzalamena

Call the multicast timer callback per domain instead of for all domains
this way we save doing big tables walk and iterating tables that we don't
need to.

ok mpi@


# 1.97 20-Dec-2016 rzalamena

Remove unused timeout that was never being set.

ok reyk@


# 1.96 19-Dec-2016 rzalamena

Kill unused function.

ok mpi@


# 1.95 19-Dec-2016 rzalamena

Extend the multicast sockets and multicast hash table support to multiple
domains. This is one step towards supporting to run more than one multicast
socket in different domains at the same time.

ok mpi@


# 1.94 13-Dec-2016 rzalamena

Propagate the routing table id in ip_mrouter_set() so the MRT_ADD_VIF
calls won't fail anymore when done from a different rdomain.

ok mpi@


# 1.93 29-Nov-2016 mpi

Kill unused 'struct route'.


# 1.92 29-Nov-2016 jsg

m_free() and m_freem() test for NULL. Simplify callers which had their own
NULL tests.

ok mpi@


# 1.91 24-Sep-2016 tedu

use hashfree. from Mathieu -
ok guenther


Revision tags: OPENBSD_6_0_BASE
# 1.90 07-Mar-2016 naddy

Sync no-argument function declaration and definition by adding (void).
ok mpi@ millert@


Revision tags: OPENBSD_5_9_BASE
# 1.89 14-Nov-2015 mpi

Remove mrtdebug and reduce differences with the v6 version.

Debug informations can already be accessed via mrtstat and pimstat.


# 1.88 13-Nov-2015 mpi

Do not cast malloc(9) results.


# 1.87 13-Nov-2015 mpi

Kill another tunnel leftover and keep PIM stuff inside #ifdef PIM.


# 1.86 12-Nov-2015 mpi

Kill another leftover from the tunnel support removal and add more PIM.


# 1.85 12-Nov-2015 mpi

Sync headers and get rid of #ifdef MROUTING.


# 1.84 12-Nov-2015 mpi

Remove VIFF_TUNNEL leftovers, tunnels aren't supported since 2006.

Even pimd(8) no longer support them.


# 1.83 12-Nov-2015 mpi

Fix PIM build.


# 1.82 12-Sep-2015 mpi

Introduce if_input_local() a function to feed local traffic back to
the protocol queues.

It basically does what looutput() was doing but having a generic
function will allow us to get rid of the loopback hack overwwritting
the rt_ifp field of RTF_LOCAL routes.

ok mikeb@, dlg@, claudio@


# 1.81 01-Sep-2015 bluhm

Replace sockaddr casts with the proper satosin(), ... calls.
From David Hill; OK mpi@; tested kspillner@; tweaks bluhm@


# 1.80 24-Aug-2015 bluhm

In kernel initialize struct sockaddr_in and sockaddr_in6 to zero
everywhere to avoid passing around pointers to uninitialized stack
memory. While there, fix the call to in6_recoverscope() in
fill_drlist().
OK deraadt@ mpi@


Revision tags: OPENBSD_5_8_BASE
# 1.79 15-Jul-2015 deraadt

rename mbuf ** parameter from m to mp, to match other similar code


# 1.78 30-Jun-2015 mpi

Get rid of the undocumented & temporary* m_copy() macro added for
compatibility with 4.3BSD in September 1989.

*Pick your own definition for "temporary".

ok bluhm@, claudio@, dlg@


Revision tags: OPENBSD_5_7_BASE
# 1.77 09-Feb-2015 claudio

Implement 2 sysctl to retrieve the multicast forwarding cache (mfc) and the
virtual interface table (vif). Will be used by netstat soon.
Looked over by guenther@


# 1.76 08-Feb-2015 claudio

De-static to make ddb hangman harder. OK phessler, henning


# 1.75 07-Feb-2015 dlg

mechanical conversion of this code to using siphash instead of some xors.

ok tedu@ claudio@


# 1.74 17-Dec-2014 mpi

Remove the "multicast_" prefix from the fields a multicast-only struct.

Prodded by claudio@ and mikeb@


# 1.73 17-Dec-2014 mpi

Use an interface index instead of a pointer for multicast options.

Output interface (port) selection for multicast traffic is not done via
route lookups. Instead the output ifp is registred when setsockopt(2)
is called with the IP{V6,}_MULTICAST_IF option. But since there is no
mechanism to invalidate such pointer stored in a pcb when an interface
is destroyed/removed, it might lead your kernel to fault.

Prevent a fault upon resume reported by frantisek holop, thanks!

ok mikeb@, claudio@


# 1.72 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


# 1.71 30-Sep-2014 jsg

add back the sys/sysctl.h include removed in rev 1.60
fixes the kernel build when PIM is defined


# 1.70 14-Aug-2014 mpi

No need for raw_cb.h


# 1.69 14-Aug-2014 mpi

Kill MRT_{ADD,DEL}_BW_UPCALL interfaces and the bandwidth monitoring
code that comes with them.

ok mikeb@, henning@


Revision tags: OPENBSD_5_6_BASE
# 1.68 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.67 12-Jul-2014 tedu

add a size argument to free. will be used soon, but for now default to 0.
after discussions with beck deraadt kettenis.


# 1.66 21-Apr-2014 henning

ip_output() using varargs always struck me as bizarre, esp since it's only
ever used to pass on uint32 (for ipsec). stop that madness and just pass
the uint32, 0 in all cases but the two that pass the ipsec flowinfo.
ok deraadt reyk guenther


# 1.65 21-Apr-2014 henning

we'll do fine without casting NULL to struct foo * / void *
ok gcc & md5 (alas, no binary change)


Revision tags: OPENBSD_5_5_BASE
# 1.64 09-Jan-2014 tedu

bzero/bcmp -> memset/memcmp. ok matthew


# 1.63 27-Oct-2013 deraadt

delete UPCALL_TIMING debug code from a the dark ages


# 1.62 23-Oct-2013 mpi

Remove the number of in_var.h inclusions by moving some functions and
global variables to in.h.

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_4_BASE
# 1.61 02-May-2013 mpi

tedu broken Resource Reservation Protocol code that was ifdef RSVP_ISI.

ok deraadt@, tedu@ (implicit)


# 1.60 28-Mar-2013 tedu

no need for a lot of code to include proc.h


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.59 04-Apr-2011 henning

de-guttenberg our stack a bit
we don't need 7 f***ing copies of the same code to do the protocol checksums
(or not, depending on hw capabilities). claudio ok


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.58 02-Jul-2010 blambert

m_copyback can fail to allocate memory, but is a void fucntion so gymnastics
are required to detect that.

Change the function to take a wait argument (used in nfs server, but
M_NOWAIT everywhere else for now) and to return an error

ok claudio@ henning@ krw@


# 1.57 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


Revision tags: OPENBSD_4_7_BASE
# 1.56 01-Aug-2009 blambert

timeout_add -> timeout_add_msec

ok michele@ claudio@


# 1.55 13-Jul-2009 michele

Get rid of the token bucket filter.
Traffic shaping code should not be inside routing code.
If you want to rate-limit use altq instead.

ok claudio@ henning@ dlg@


# 1.54 09-Jul-2009 michele

Use MAXTTL instead of the hardcoded value.


Revision tags: OPENBSD_4_6_BASE
# 1.53 05-Jun-2009 claudio

Initial support for routing domains. This allows to bind interfaces to
alternate routing table and separate them from other interfaces in distinct
routing tables. The same network can now be used in any doamin at the same
time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks accross
net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


Revision tags: OPENBSD_4_5_BASE
# 1.52 16-Sep-2008 chl

remove another dead store.

spotted by markus@

ok henning@ mpf@


# 1.51 15-Sep-2008 chl

remove dead stores and newly created unused variables.

Found by LLVM/Clang Static Analyzer.

ok mpf@ looks good mk@ ok henning@


Revision tags: OPENBSD_4_3_BASE OPENBSD_4_4_BASE
# 1.50 02-Jan-2008 brad

return with ENOTTY instead of EINVAL for unknown ioctl requests.

ok claudio@ krw@ dlg@


# 1.49 14-Dec-2007 deraadt

add sysctl entry points into various network layers, in particular to
provide netstat(1) with data it needs; ok claudio reyk


Revision tags: OPENBSD_4_2_BASE
# 1.48 22-May-2007 michele

ip_mroute.c is in bad shape.
This first step makes it style(9) compliant.
Just a whitespace diff, no binary change.

OK claudio@ norby@


# 1.47 10-Apr-2007 miod

``it's'' -> ``its'' when the grammar gods require this change.


Revision tags: OPENBSD_4_1_BASE
# 1.46 14-Feb-2007 jsg

Consistently spell FALLTHROUGH to appease lint.
ok kettenis@ cloder@ tom@ henning@


Revision tags: OPENBSD_4_0_BASE
# 1.45 15-Jun-2006 pascoe

Change cast of last vararg to ip_output to match what ip_output expects,
for clarity.

henning@ claudio@ ok


# 1.44 11-May-2006 hshoexer

fix corruption of pim register packets. From Hideki ONO, thanks!

ok mcbride@ itojun@


# 1.43 25-Apr-2006 claudio

Remove virtual tunnel support from the mrouting code. The virtual tunnel
code breaks multicast on gif(4) interfaces and it is far better to configure
a real gif(4) tunnel instead of a multicast tunnel as the latter is almost
not manageable. OK norby@, mblamer@


Revision tags: OPENBSD_3_8_BASE OPENBSD_3_9_BASE
# 1.42 25-Apr-2005 brad

csum -> csum_flags

ok krw@ canacar@


Revision tags: OPENBSD_3_7_BASE
# 1.41 15-Jan-2005 brad

fix comment


# 1.40 14-Jan-2005 mcbride

Duplicate nested if statement in PIM code.

From brad@


# 1.39 14-Jan-2005 mcbride

Add kernel support for Protocol Independent Multicast (PIM)
Information: http://netweb.usc.edu/pim/

From Pavlin Radoslavov <pavlin@icir.org>

ok deraadt@ brad@


# 1.38 24-Nov-2004 mcbride

Multicast routing cleanup from Pavlin Radoslavov
- sync ip_mroute.c with NetBSD
- import some FreeBSD changes to MFC entry handling
- set im->im_vif correctly when sending IGMPMSG_WRONGVIF
- increment mrtstat.mrts_upcalls correctly
- return error from get_sg_cnt() if there is no matching forwarding entry

ok henning@ brad@ naddy@


Revision tags: OPENBSD_3_6_BASE
# 1.37 24-Aug-2004 brad

Don't allow SIOCGET{VIF,SG}CNT from sockets other than the multicast router.

From NetBSD
Fixes PR 3825

ok mcbride@ canacar@ claudio@


Revision tags: OPENBSD_3_5_BASE SMP_SYNC_A SMP_SYNC_B
# 1.36 06-Jan-2004 markus

fix vlan destroy for MROUTING; report spamme@wouz.dk via tedu; ok itojun


# 1.35 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


# 1.34 10-Dec-2003 itojun

de-register. deraadt ok


Revision tags: OPENBSD_3_4_BASE
# 1.33 09-Jul-2003 itojun

do not flip ip_len/ip_off in netinet stack. deraadt ok.
(please test, especially PF portion)


# 1.32 09-Jul-2003 itojun

better vif_delete (no dangling ref to struct ifnet). deraadt ok
it won't affect default GENERIC build - as MROUTING is not defined


# 1.31 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: UBC_SYNC_A
# 1.30 14-May-2003 itojun

KNF. markus ok


# 1.29 06-May-2003 deraadt

string cleaning; tedu ok


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_B
# 1.28 28-Aug-2002 pefo

Fix a problem where passing NULL as a pointer with varargs does not promote
NULL to full 64 bits on a 64 bit address system. Solution is to add a
(void *) cast before NULL. This makes a 64 bit MIPS kernel work and will
probably help future 64 bit ports as well.

OK from art@


# 1.27 31-Jul-2002 itojun

remove $Id$


# 1.26 09-Jun-2002 itojun

whitespace


Revision tags: OPENBSD_3_1_BASE
# 1.25 15-Mar-2002 millert

Kill #if __STDC__ used to do K&R vs. ANSI varargs/stdarg; just do things
the ANSI way.


# 1.24 14-Mar-2002 millert

First round of __P removal in sys


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.23 26-Sep-2001 deraadt

branches: 1.23.4;
bring back the old copyright notice


# 1.22 19-Aug-2001 miod

More old timeouts removal, mainly affected unused/unmaintained code.


# 1.21 23-Jun-2001 fgsch

Remove unneeded ip_id conversions.
Instead of using HTONS macro in some places, use htons directly in the
struct member and save us a few bytes.
Fix comment.


Revision tags: OPENBSD_2_9_BASE
# 1.20 10-Nov-2000 provos

seperate -> separate, okay aaron@


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE SMP_BASE
# 1.19 21-Jan-2000 angelos

branches: 1.19.2;
Rename the ip4_* routines to ipip_*, make it so GIF tunnels are not
affected by net.inet.ipip.allow (the sysctl formerly known as
net.inet.ip4.allow), rename the VIF ipip_input to ipip_mroute_input.


Revision tags: OPENBSD_2_6_BASE kame_19991208
# 1.18 08-Aug-1999 niklas

undeclared variable


# 1.17 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.16 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


# 1.15 20-Apr-1999 niklas

Merge MROUTING and IPSEC wrt handling of IP-in-IP tunnelled packets.
Fix a panic case in the MROUTING code too. Drop M_TUNNEL support, nothing
ever uses it.


Revision tags: OPENBSD_2_5_BASE
# 1.14 05-Feb-1999 angelos

Clear mfchashtbl after deallocation (mycroft@netbsd)


# 1.13 08-Jan-1999 provos

dont call ip_randomid() in htons().


# 1.12 08-Jan-1999 deraadt

rip_input() should be called with a 0 terminator; cmetz


# 1.11 26-Dec-1998 provos

make ip_id random but ensure that ids dont repeat for some period.


Revision tags: OPENBSD_2_4_BASE
# 1.10 29-Jul-1998 angelos

Proper handling of IP in IP and checksumming.


# 1.9 03-Jul-1998 deraadt

wrong endian conversion caused vif stats to be wrong; jonny@jonny.eng.br


# 1.8 18-May-1998 provos

first step to the setsockopt/getsockopt interface as described in
draft-mcdonald-simple-ipsec-api, kernel notifies (EMT_REQUESTSA) signal
userland key management applications when security services are requested.
this is only for outgoing connections at the moment, incoming packets
are not yet checked against the selected socket policy.


Revision tags: OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 28-Sep-1997 deraadt

more \n in log()


Revision tags: OPENBSD_2_1_BASE
# 1.6 21-Feb-1997 angelos

Couple of missing ifdefs.


# 1.5 20-Feb-1997 deraadt

IPSEC package by John Ioannidis and Angelos D. Keromytis. Written in
Greece. From ftp.funet.fi:/pub/unix/security/net/ip/BSDipsec.tar.gz


Revision tags: OPENBSD_2_0_BASE
# 1.4 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.3 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.136 06-Aug-2022 bluhm

Clean up the netlock macros. Merge NET_RLOCK_IN_SOFTNET and
NET_RLOCK_IN_IOCTL, which have the same implementation. The R and
W are hard to see, call the new macro NET_LOCK_SHARED. Rename the
opposite assertion from NET_ASSERT_WLOCKED to NET_ASSERT_LOCKED_EXCLUSIVE.
Update some outdated comments about net locking.
OK mpi@ mvs@


# 1.135 05-May-2022 claudio

Use static objects for struct rttimer_queue instead of dynamically
allocating them.

Currently there are 6 rttimer_queues and not many more will follow. So
change rt_timer_queue_create() to rt_timer_queue_init() which now takes
a struct rttimer_queue * as argument which will be initialized.
Since this changes the global vars from pointer to struct, adjust other
callers as well.
OK bluhm@


# 1.134 04-May-2022 claudio

Move rttimer callback function from the rttimer itself to rttimer_queue.
All users use the same callback per queue so that makes sense.
Also replace rt_timer_queue_destroy() with rt_timer_queue_flush().
OK bluhm@


# 1.133 30-Apr-2022 claudio

Convert the 2nd rttimer callback from struct rttimer to u_int rtableid.
The callback only needs to know the rtableid; all the other info from
struct rttimer is not needed.
Also change the default rttimer callback to only delete routes that are
RTF_HOST and RTF_DYNAMIC. This way 2 of the ICMP handlers can use NULL
as the callback.
OK bluhm@


# 1.132 28-Apr-2022 claudio

In the multicast router code don't allocate a rt timer queue for each
rdomain. The rttimer API is rtable/rdomain aware and so there is no need
to have so many queues.
Also init the two queues (one for IPv4 and one for IPv6) early on. This
will allow the rttable code to become simpler.
OK bluhm@


Revision tags: OPENBSD_7_1_BASE
# 1.131 15-Dec-2021 deraadt

structure pads can leak uninitialized memory to userland via copyout,
therefore the mandatory idiom is completely clearing structs before
building them for copyout -- that means ALMOST ALL STRUCTS, because
we never know when some architecture will pad a struct.. In two more
cases, the clearing wasn't performed.
from Reno Robert ZDI
ok millert bluhm


Revision tags: OPENBSD_6_8_BASE OPENBSD_6_9_BASE OPENBSD_7_0_BASE
# 1.130 27-May-2020 mpi

branches: 1.130.2; 1.130.6;
Document the various flavors of NET_LOCK() and rename the reader version.

Since our last concurrency mistake only ioctl(2) and sysctl(2) code paths
take the reader lock. This is mostly for documentation purpose as long as
the softnet thread is converted back to use a read lock.

dlg@ said that comments should be good enough.

ok sashan@


Revision tags: OPENBSD_6_7_BASE
# 1.129 15-Mar-2020 visa

Guard SIOCDELMULTI if_ioctl calls with KERNEL_LOCK() where the call is
made from socket close path. Most device drivers are not MP-safe yet,
and the closing of AF_INET and AF_INET6 sockets is no longer under the
kernel lock.

This fixes a panic seen by jcs@.

OK mpi@


Revision tags: OPENBSD_6_6_BASE
# 1.128 02-Sep-2019 bluhm

Fix a route use after free in multicast route. Move the rt_mcast_del()
out of the rtable_walk(). This avoids recursion to prevent stack
overflow. Also it allows freeing the route outside of the walk.
Now mrt_mcast_del() frees the route only when it is deleted from
the routing table. If that fails, it must not be freed. After the
route is returned by mfc_find(), it is reference counted. Then we
need a rtfree(), but not in the other cases.
Move rt_timer_remove_all() into rt_mcast_del().
OK mpi@


# 1.127 21-Jun-2019 mpi

Prevent recursions by not deleting entries inside rtable_walk(9).

rtable_walk(9) now passes a routing entry back to the caller when
a non zero value is returned and if it asked for it.
This allows us to call rtdeletemsg()/rtrequest_delete() from the
caller without creating a recursion because of rtflushclone().

Multicast code hasn't been adapted and is still possibly creating
recursions. However multicast route entries aren't cloned so if
a recursion exists it isn't because of rtflushclone().

Fix stack exhaustion triggered by the use of "-msave-args".

Issue reported by Dániel Lévai on bugs@ confirmed by and ok bluhm@.


# 1.126 04-Jun-2019 anton

Add missing NULL check for the protocol control block (pcb) pointer in
mrt{6,}_ioctl. Calling shutdown(2) on the socket prior to the ioctl
command can cause it to be NULL.

ok bluhm@ claudio@

Reported-by: syzbot+bdc489ecb509995a21ed@syzkaller.appspotmail.com
Reported-by: syzbot+156405fdea9f2ab15d40@syzkaller.appspotmail.com


Revision tags: OPENBSD_6_5_BASE
# 1.125 13-Feb-2019 dlg

change rt_ifa_add and rt_ifa_del so they take an rdomain argument.

this allows mpls interfaces (mpe, mpw) to pass the rdomain they
wish the local label to be in, rather than have it implicitly forced
to 0 by these functions. right now they'll pass 0, but it will soon
be possible to have them rx packets in other rdomains.

previously the functions used ifp->if_rdomain for the rdomain.
everything other than mpls still passes ifp->if_rdomain.

ok mpi@


# 1.124 10-Feb-2019 dlg

remove the implicit RTF_MPATH flag that rt_ifa_add() sets on new routes.

MPLS interfaces (ab)use rt_ifa_add for adding the local MPLS label
that they listen on for incoming packets, while every other use of
rt_ifa_add is for adding addresses on local interfaces. MPLS does
this cos the addresses involved are in basically the same shape as
ones used for setting up local addresses.

It is appropriate for interfaces to want RTF_MPATH on local addresses,
but in the MPLS case it means you can have multiple local things
listening on the same label, which doesn't actually work. mpe in
particular keeps track of in use labels so it can handle collisions,
however, mpw does not. It is currently possible to have multiple
mpw interfaces on the same local label, and sharing the same label
as mpe or possibly normal forwarding labels.

Moving the RTF_MPATH flag out of rt_ifa_add means all the callers
that still want it need to pass it themselves. The mpe and mpw
callers are left alone without the flag, and will now get EEXIST
from rt_ifa_add when a label is already in use.

ok (and a huge amount of patience and help) mpi@
claudio@ is ok with the idea, but saw a much much earlier solution
to the problem


Revision tags: OPENBSD_6_4_BASE
# 1.123 10-Oct-2018 reyk

RT_TABLEID_MAX is 255, fix places that assumed that it is less than 255.

rtable 255 is a valid routing table or domain id that wasn't handled
by the ip[6]_mroute code or by snmpd. The arrays in the ip[6]_mroute
code were off by one and didn't allocate space for rtable 255; snmpd
simply ignored rtable 255. All other places in the tree seem to
handle RT_TABLEID_MAX correctly.

OK florian@ benno@ henning@ deraadt@


# 1.122 30-Apr-2018 tb

Reduce the scope of the NET_LOCK() in in_control(). Two functions were
protected: mrt_ioctl() and in_ioctl(). The former has no other callers
and only needs a read lock. The latter will need refactoring to reduce
the lock's scope further. In a first step, establish a single exit point
and protect most of the function body with the NET_LOCK() while removing
the NET_LOCK() from a handful of callers.

suggested by & ok mpi, ok visa


Revision tags: OPENBSD_6_2_BASE OPENBSD_6_3_BASE
# 1.121 01-Sep-2017 mpi

Change sosetopt() to no longer free the mbuf it receives and change
all the callers to call m_freem(9).

Support from deraadt@ and tedu@, ok visa@, bluhm@


# 1.120 26-Jun-2017 mpi

Assert that the corresponding socket is locked when manipulating socket
buffers.

This is one step towards unlocking TCP input path. Note that all the
functions asserting for the socket lock are not necessarily MP-safe.
All the fields of 'struct socket' aren't protected.

Introduce a new kernel-only kqueue hint, NOTE_SUBMIT, to be able to
tell when a filter needs to lock the underlying data structures. Logic
and name taken from NetBSD.

Tested by Hrvoje Popovski.

ok claudio@, bluhm@, mikeb@


# 1.119 19-Jun-2017 bluhm

The IP multicast forward functions return an errno, call the variable
error. Make the ip_mforward() return value consistent. Simplify
the caller logic in ipv6_input() like in IPv4.
OK mpi@


# 1.118 16-May-2017 rzalamena

Sync three changes that were caught by IPv6 multicast routing review:

* use a variable to allow disabling debugs on run-time
* fix a potential memory leak on copyout() failure
* don't just blindly use the first address provided by ifalist

ok bluhm@


# 1.117 16-May-2017 rzalamena

Make return values more meaningful by using errno instead of -1 or 1.

ok bluhm@


# 1.116 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.115 16-May-2017 rzalamena

Let malloc() block when the caller of the add route function is
setsockopt(), otherwise use non-blocking malloc() for network stack
calls.

ok bluhm@


# 1.114 16-May-2017 rzalamena

Call rtfree() after each use of routes and make sure the route is valid
when finding one. Since rtfree() is being called and rt_llinfo being
removed, add checks everywhere to make sure we are using a route that is
not being removed.

ok bluhm@


# 1.113 06-Apr-2017 dhill

Convert bcopy to memcpy where the memory does not overlap, otherwise,
use memmove. While here, change some previous conversions to a simple
assignment.

ok deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.112 17-Mar-2017 rzalamena

Be more strict on all route iterations, let's always make sure that we
are not going to get a unicast route by accident.

ok mpi@


# 1.111 14-Mar-2017 rzalamena

Make mfc_find() more strict when looking for routes, fixes a problem
causing ip_mforward() not to send packets to the userland multicast
routing daemon.

Reported and tested by Paul de Weerd.

ok bluhm@, claudio@


# 1.110 09-Feb-2017 rzalamena

Unbreak 'netstat -g' and make multicast route stats sysctl more robust.

ok mpi@


# 1.109 08-Feb-2017 jsg

Test for NULL before dereferencing a pointer not after.
ok krw@


# 1.108 01-Feb-2017 dhill

In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with
the netlock held. This also changes the prototypes of the *ctloutput
functions to take an mbuf instead of an mbuf pointer.

help, guidance from bluhm@ and mpi@
ok bluhm@


# 1.107 12-Jan-2017 rzalamena

Clean up multicast files from unused definitions and comments.

ok mpi@


# 1.106 11-Jan-2017 rzalamena

Remove mfc hash tables and use the OpenBSD routing table for multicast
routes. Beside the code simplification and removal, we also get to see
the multicast routes now in the route(8) utility.

ok mpi@


# 1.105 06-Jan-2017 rzalamena

Remove the global viftable vector that holds the virtual interfaces
configuration and instead use ifnet to store the configuration and
counters. With this we can safely use multicast routing daemons on
multiple domains without vif id collisions.

ok mpi@


# 1.104 06-Jan-2017 rzalamena

Simplify code by removing some old pullup macro, killing some variables
and using m_dup_pkt() instead of m_copym() with max_linkhdr space adjust
on packet sending to avoid more mbuf allocations.

with input from millert@ and mikeb@,
ok mikeb@


# 1.103 06-Jan-2017 mpi

Kill various splsoftnet().

ok rzalamena@, visa@


# 1.102 05-Jan-2017 rzalamena

Remove some unnecessary code abstractions and while here remove a
splsoftnet.

ok mikeb@


# 1.101 22-Dec-2016 rzalamena

Remove PIM support from the multicast stack.

ok mpi@


# 1.100 21-Dec-2016 mpi

Fix build without PIM defined.


# 1.99 21-Dec-2016 rzalamena

Fix PIM compilation even though it is disabled.

ok bluhm@


# 1.98 20-Dec-2016 rzalamena

Call the multicast timer callback per domain instead of for all domains
this way we save doing big tables walk and iterating tables that we don't
need to.

ok mpi@


# 1.97 20-Dec-2016 rzalamena

Remove unused timeout that was never being set.

ok reyk@


# 1.96 19-Dec-2016 rzalamena

Kill unused function.

ok mpi@


# 1.95 19-Dec-2016 rzalamena

Extend the multicast sockets and multicast hash table support to multiple
domains. This is one step towards supporting to run more than one multicast
socket in different domains at the same time.

ok mpi@


# 1.94 13-Dec-2016 rzalamena

Propagate the routing table id in ip_mrouter_set() so the MRT_ADD_VIF
calls won't fail anymore when done from a different rdomain.

ok mpi@


# 1.93 29-Nov-2016 mpi

Kill unused 'struct route'.


# 1.92 29-Nov-2016 jsg

m_free() and m_freem() test for NULL. Simplify callers which had their own
NULL tests.

ok mpi@


# 1.91 24-Sep-2016 tedu

use hashfree. from Mathieu -
ok guenther


Revision tags: OPENBSD_6_0_BASE
# 1.90 07-Mar-2016 naddy

Sync no-argument function declaration and definition by adding (void).
ok mpi@ millert@


Revision tags: OPENBSD_5_9_BASE
# 1.89 14-Nov-2015 mpi

Remove mrtdebug and reduce differences with the v6 version.

Debug information can already be accessed via mrtstat and pimstat.


# 1.88 13-Nov-2015 mpi

Do not cast malloc(9) results.


# 1.87 13-Nov-2015 mpi

Kill another tunnel leftover and keep PIM stuff inside #ifdef PIM.


# 1.86 12-Nov-2015 mpi

Kill another leftover from the tunnel support removal and add more PIM.


# 1.85 12-Nov-2015 mpi

Sync headers and get rid of #ifdef MROUTING.


# 1.84 12-Nov-2015 mpi

Remove VIFF_TUNNEL leftovers, tunnels aren't supported since 2006.

Even pimd(8) no longer supports them.


# 1.83 12-Nov-2015 mpi

Fix PIM build.


# 1.82 12-Sep-2015 mpi

Introduce if_input_local() a function to feed local traffic back to
the protocol queues.

It basically does what looutput() was doing but having a generic
function will allow us to get rid of the loopback hack overwriting
the rt_ifp field of RTF_LOCAL routes.

ok mikeb@, dlg@, claudio@


# 1.81 01-Sep-2015 bluhm

Replace sockaddr casts with the proper satosin(), ... calls.
From David Hill; OK mpi@; tested kspillner@; tweaks bluhm@


# 1.80 24-Aug-2015 bluhm

In kernel initialize struct sockaddr_in and sockaddr_in6 to zero
everywhere to avoid passing around pointers to uninitialized stack
memory. While there, fix the call to in6_recoverscope() in
fill_drlist().
OK deraadt@ mpi@


Revision tags: OPENBSD_5_8_BASE
# 1.79 15-Jul-2015 deraadt

rename mbuf ** parameter from m to mp, to match other similar code


# 1.78 30-Jun-2015 mpi

Get rid of the undocumented & temporary* m_copy() macro added for
compatibility with 4.3BSD in September 1989.

*Pick your own definition for "temporary".

ok bluhm@, claudio@, dlg@


Revision tags: OPENBSD_5_7_BASE
# 1.77 09-Feb-2015 claudio

Implement 2 sysctl to retrieve the multicast forwarding cache (mfc) and the
virtual interface table (vif). Will be used by netstat soon.
Looked over by guenther@


# 1.76 08-Feb-2015 claudio

De-static to make ddb hangman harder. OK phessler, henning


# 1.75 07-Feb-2015 dlg

mechanical conversion of this code to using siphash instead of some xors.

ok tedu@ claudio@


# 1.74 17-Dec-2014 mpi

Remove the "multicast_" prefix from the fields of a multicast-only struct.

Prodded by claudio@ and mikeb@


# 1.73 17-Dec-2014 mpi

Use an interface index instead of a pointer for multicast options.

Output interface (port) selection for multicast traffic is not done via
route lookups. Instead the output ifp is registered when setsockopt(2)
is called with the IP{V6,}_MULTICAST_IF option. But since there is no
mechanism to invalidate such pointer stored in a pcb when an interface
is destroyed/removed, it might lead your kernel to fault.

Prevent a fault upon resume reported by frantisek holop, thanks!

ok mikeb@, claudio@


# 1.72 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


# 1.71 30-Sep-2014 jsg

add back the sys/sysctl.h include removed in rev 1.60
fixes the kernel build when PIM is defined


# 1.70 14-Aug-2014 mpi

No need for raw_cb.h


# 1.69 14-Aug-2014 mpi

Kill MRT_{ADD,DEL}_BW_UPCALL interfaces and the bandwidth monitoring
code that comes with them.

ok mikeb@, henning@


Revision tags: OPENBSD_5_6_BASE
# 1.68 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.67 12-Jul-2014 tedu

add a size argument to free. will be used soon, but for now default to 0.
after discussions with beck deraadt kettenis.


# 1.66 21-Apr-2014 henning

ip_output() using varargs always struck me as bizarre, esp since it's only
ever used to pass on uint32 (for ipsec). stop that madness and just pass
the uint32, 0 in all cases but the two that pass the ipsec flowinfo.
ok deraadt reyk guenther


# 1.65 21-Apr-2014 henning

we'll do fine without casting NULL to struct foo * / void *
ok gcc & md5 (alas, no binary change)


Revision tags: OPENBSD_5_5_BASE
# 1.64 09-Jan-2014 tedu

bzero/bcmp -> memset/memcmp. ok matthew


# 1.63 27-Oct-2013 deraadt

delete UPCALL_TIMING debug code from the dark ages


# 1.62 23-Oct-2013 mpi

Reduce the number of in_var.h inclusions by moving some functions and
global variables to in.h.

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_4_BASE
# 1.61 02-May-2013 mpi

tedu broken Resource Reservation Protocol code that was ifdef RSVP_ISI.

ok deraadt@, tedu@ (implicit)


# 1.60 28-Mar-2013 tedu

no need for a lot of code to include proc.h


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.59 04-Apr-2011 henning

de-guttenberg our stack a bit
we don't need 7 f***ing copies of the same code to do the protocol checksums
(or not, depending on hw capabilities). claudio ok


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.58 02-Jul-2010 blambert

m_copyback can fail to allocate memory, but is a void function so gymnastics
are required to detect that.

Change the function to take a wait argument (used in nfs server, but
M_NOWAIT everywhere else for now) and to return an error

ok claudio@ henning@ krw@


# 1.135 05-May-2022 claudio

Use static objects for struct rttimer_queue instead of dynamically
allocating them.

Currently there are 6 rttimer_queues and not many more will follow. So
change rt_timer_queue_create() to rt_timer_queue_init() which now takes
a struct rttimer_queue * as argument which will be initialized.
Since this changes the global vars from pointer to struct, adjust other
callers as well.
OK bluhm@


# 1.134 04-May-2022 claudio

Move rttimer callback function from the rttimer itself to rttimer_queue.
All users use the same callback per queue so that makes sense.
Also replace rt_timer_queue_destroy() with rt_timer_queue_flush().
OK bluhm@


# 1.133 30-Apr-2022 claudio

Convert the 2nd rttimer callback from struct rttimer to u_int rtableid.
The callback only needs to know the rtableid all the other info from
struct rtableid is not needed.
Also change the default rttimer callback to only delete routes that are
RTF_HOST and RTF_DYNAMIC. This way 2 of the ICMP handlers can use NULL
as the callback.
OK bluhm@


# 1.132 28-Apr-2022 claudio

In the multicast router code don't allocate a rt timer queue for each
rdomain. The rttimer API is rtable/rdomain aware and so there is no need
to have so many queues.
Also init the two queues (one for IPv4 and one for IPv6) early on. This
will allow the rttable code to become simpler.
OK bluhm@


Revision tags: OPENBSD_7_1_BASE
# 1.131 15-Dec-2021 deraadt

structure pads can leak uninitialized memory to userland via copyout,
therefore the mandatory idiom is completely clearing structs before
building them for copyout -- that means ALMOST ALL STRUCTS, because
we never know when some architecture will pad a struct.. In two more
cases, the clearing wasn't performed.
from Reno Robert ZDI
ok millert bluhm


Revision tags: OPENBSD_6_8_BASE OPENBSD_6_9_BASE OPENBSD_7_0_BASE
# 1.130 27-May-2020 mpi

branches: 1.130.2; 1.130.6;
Document the various flavors of NET_LOCK() and rename the reader version.

Since our last concurrency mistake only ioctl(2) and sysctl(2) code paths
take the reader lock. This is mostly for documentation purpose as long as
the softnet thread is converted back to use a read lock.

dlg@ said that comments should be good enough.

ok sashan@


Revision tags: OPENBSD_6_7_BASE
# 1.129 15-Mar-2020 visa

Guard SIOCDELMULTI if_ioctl calls with KERNEL_LOCK() where the call is
made from socket close path. Most device drivers are not MP-safe yet,
and the closing of AF_INET and AF_INET6 sockets is no longer under the
kernel lock.

This fixes a panic seen by jcs@.

OK mpi@


Revision tags: OPENBSD_6_6_BASE
# 1.128 02-Sep-2019 bluhm

Fix a route use after free in multicast route. Move the rt_mcast_del()
out of the rtable_walk(). This avoids recursion to prevent stack
overflow. Also it allows freeing the route outside of the walk.
Now mrt_mcast_del() frees the route only when it is deleted from
the routing table. If that fails, it must not be freed. After the
route is returned by mfc_find(), it is reference counted. Then we
need a rtfree(), but not in the other cases.
Move rt_timer_remove_all() into rt_mcast_del().
OK mpi@


# 1.127 21-Jun-2019 mpi

Prevent recursions by not deleting entries inside rtable_walk(9).

rtable_walk(9) now passes a routing entry back to the caller when
a non zero value is returned and if it asked for it.
This allows us to call rtdeletemsg()/rtrequest_delete() from the
caller without creating a recursion because of rtflushclone().

Multicast code hasn't been adapted and is still possibly creating
recursions. However multicast route entries aren't cloned so if
a recursion exists it isn't because of rtflushclone().

Fix stack exhaustion triggered by the use of "-msave-args".

Issue reported by Dániel Lévai on bugs@ confirmed by and ok bluhm@.


# 1.126 04-Jun-2019 anton

Add missing NULL check for the protocol control block (pcb) pointer in
mrt{6,}_ioctl. Calling shutdown(2) on the socket prior to the ioctl
command can cause it to be NULL.

ok bluhm@ claudio@

Reported-by: syzbot+bdc489ecb509995a21ed@syzkaller.appspotmail.com
Reported-by: syzbot+156405fdea9f2ab15d40@syzkaller.appspotmail.com


Revision tags: OPENBSD_6_5_BASE
# 1.125 13-Feb-2019 dlg

change rt_ifa_add and rt_ifa_del so they take an rdomain argument.

this allows mpls interfaces (mpe, mpw) to pass the rdomain they
wish the local label to be in, rather than have it implicitly forced
to 0 by these functions. right now they'll pass 0, but it will soon
be possible to have them rx packets in other rdomains.

previously the functions used ifp->if_rdomain for the rdomain.
everything other than mpls still passes ifp->if_rdomain.

ok mpi@


# 1.124 10-Feb-2019 dlg

remove the implicit RTF_MPATH flag that rt_ifa_add() sets on new routes.

MPLS interfaces (ab)use rt_ifa_add for adding the local MPLS label
that they listen on for incoming packets, while every other use of
rt_ifa_add is for adding addresses on local interfaces. MPLS does
this cos the addresses involved are in basically the same shape as
ones used for setting up local addresses.

It is appropriate for interfaces to want RTF_MPATH on local addresses,
but in the MPLS case it means you can have multiple local things
listening on the same label, which doesn't actually work. mpe in
particular keeps track of in use labels so it can handle collisions,
however, mpw does not. It is currently possible to have multiple
mpw interfaces on the same local label, and sharing the same label
as mpe or possible normal forwarding labels.

Moving the RTF_MPATH flag out of rt_ifa_add means all the callers
that still want it need to pass it themselves. The mpe and mpw
callers are left alone without the flag, and will now get EEXIST
from rt_ifa_add when a label is already in use.

ok (and a huge amount of patience and help) mpi@
claudio@ is ok with the idea, but saw a much much earlier solution
to the problem


Revision tags: OPENBSD_6_4_BASE
# 1.123 10-Oct-2018 reyk

RT_TABLEID_MAX is 255, fix places that assumed that it is less than 255.

rtable 255 is a valid routing table or domain id that wasn't handled
by the ip[6]_mroute code or by snmpd. The arrays in the ip[6]_mroute
code were off by one and didn't allocate space for rtable 255; snmpd
simply ignored rtable 255. All other places in the tree seem to
handle RT_TABLEID_MAX correctly.

OK florian@ benno@ henning@ deraadt@


# 1.122 30-Apr-2018 tb

Reduce the scope of the NET_LOCK() in in_control(). Two functions were
protected: mrt_ioctl() and in_ioctl(). The former has no other callers
and only needs a read lock. The latter will need refactoring to reduce
the lock's scope further. In a first step, establish a single exit point
and protect most of the function body with the NET_LOCK() while removing
the NET_LOCK() from a handful of callers.

suggested by & ok mpi, ok visa


Revision tags: OPENBSD_6_2_BASE OPENBSD_6_3_BASE
# 1.121 01-Sep-2017 mpi

Change sosetopt() to no longer free the mbuf it receives and change
all the callers to call m_freem(9).

Support from deraadt@ and tedu@, ok visa@, bluhm@


# 1.120 26-Jun-2017 mpi

Assert that the corresponding socket is locked when manipulating socket
buffers.

This is one step towards unlocking TCP input path. Note that all the
functions asserting for the socket lock are not necessarily MP-safe.
All the fields of 'struct socket' aren't protected.

Introduce a new kernel-only kqueue hint, NOTE_SUBMIT, to be able to
tell when a filter needs to lock the underlying data structures. Logic
and name taken from NetBSD.

Tested by Hrvoje Popovski.

ok claudio@, bluhm@, mikeb@


# 1.119 19-Jun-2017 bluhm

The IP multicast forward functions return an errno, call the variable
error. Make the ip_mforward() return value consistent. Simplify
the caller logic in ipv6_input() like in IPv4.
OK mpi@


# 1.118 16-May-2017 rzalamena

Sync three changes that were caught by IPv6 multicast routing review:

* use a variable to allow disabling debugs on run-time
* fix a potential memory leak on copyout() failure
* don't just blindly use the first address provided by ifalist

ok bluhm@


# 1.117 16-May-2017 rzalamena

Make return values more meaningful by using errno instead of -1 or 1.

ok bluhm@


# 1.116 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.115 16-May-2017 rzalamena

Let malloc() block when the caller of the add route function is
setsockopt(), otherwise use non-blocking malloc() for network stack
calls.

ok bluhm@


# 1.114 16-May-2017 rzalamena

Call rtfree() after each use of routes and make sure the route is valid
when finding one. Since rtfree() is being called and rt_llinfo being
removed, add checks everywhere to make sure we are using a route that is
not being removed.

ok bluhm@


# 1.113 06-Apr-2017 dhill

Convert bcopy to memcpy where the memory does not overlap, otherwise,
use memmove. While here, change some previous conversions to a simple
assignment.

ok deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.112 17-Mar-2017 rzalamena

Be more strict on all route iterations, lets always make sure that we
are not going to get a unicast route by accident.

ok mpi@


# 1.111 14-Mar-2017 rzalamena

Make mfc_find() more strict when looking for routes, fixes a problem
causing ip_mforward() not to send packets to the userland multicast
routing daemon.

Reported and tested by Paul de Weerd.

ok bluhm@, claudio@


# 1.110 09-Feb-2017 rzalamena

Unbreak 'netstat -g' and make multicast route stats sysctl more robust.

ok mpi@


# 1.109 08-Feb-2017 jsg

Test for NULL before dereferencing a pointer not after.
ok krw@


# 1.108 01-Feb-2017 dhill

In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with
the netlock held. This also changes the prototypes of the *ctloutput
functions to take an mbuf instead of an mbuf pointer.

help, guidance from bluhm@ and mpi@
ok bluhm@


# 1.107 12-Jan-2017 rzalamena

Clean up multicast files from unused definitions and comments.

ok mpi@


# 1.106 11-Jan-2017 rzalamena

Remove mfc hash tables and use the OpenBSD routing table for multicast
routes. Beside the code simplification and removal, we also get to see
the multicast routes now in the route(8) utility.

ok mpi@


# 1.105 06-Jan-2017 rzalamena

Remove the global viftable vector that holds the virtual interfaces
configuration and instead use ifnet to store the configuration and
counters. With this we can safely use multicast routing daemons on
multiple domains without vif id collisions.

ok mpi@


# 1.104 06-Jan-2017 rzalamena

Simplify code by removing some old pullup macro, killing some variables
and using m_dup_pkt() instead of m_copym() with max_linkhdr space adjust
on packet sending to avoid more mbuf allocations.

with input from millert@ and mikeb@,
ok mikeb@


# 1.103 06-Jan-2017 mpi

Kill various splsoftnet().

ok rzalamena@, visa@


# 1.102 05-Jan-2017 rzalamena

Remove some unnecessary code abstractions and while here remove a
splsoftnet.

ok mikeb@


# 1.101 22-Dec-2016 rzalamena

Remove PIM support from the multicast stack.

ok mpi@


# 1.100 21-Dec-2016 mpi

Fix build without PIM defined.


# 1.99 21-Dec-2016 rzalamena

Fix PIM compilation even though it is disabled.

ok bluhm@


# 1.98 20-Dec-2016 rzalamena

Call the multicast timer callback per domain instead of for all domains
this way we save doing big tables walk and iterating tables that we don't
need to.

ok mpi@


# 1.97 20-Dec-2016 rzalamena

Remove unused timeout that was never being set.

ok reyk@


# 1.96 19-Dec-2016 rzalamena

Kill unused function.

ok mpi@


# 1.95 19-Dec-2016 rzalamena

Extend the multicast sockets and multicast hash table support to multiple
domains. This is one step towards supporting to run more than one multicast
socket in different domains at the same time.

ok mpi@


# 1.94 13-Dec-2016 rzalamena

Propagate the routing table id in ip_mrouter_set() so the MRT_ADD_VIF
calls won't fail anymore when doing from a different rdomain.

ok mpi@


# 1.93 29-Nov-2016 mpi

Kill unused 'struct route'.


# 1.92 29-Nov-2016 jsg

m_free() and m_freem() test for NULL. Simplify callers which had their own
NULL tests.

ok mpi@


# 1.91 24-Sep-2016 tedu

use hashfree. from Mathieu -
ok guenther


Revision tags: OPENBSD_6_0_BASE
# 1.90 07-Mar-2016 naddy

Sync no-argument function declaration and definition by adding (void).
ok mpi@ millert@


Revision tags: OPENBSD_5_9_BASE
# 1.89 14-Nov-2015 mpi

Remove mrtdebug and reduce differences with the v6 version.

Debug information can already be accessed via mrtstat and pimstat.


# 1.88 13-Nov-2015 mpi

Do not cast malloc(9) results.


# 1.87 13-Nov-2015 mpi

Kill another tunnel leftover and keep PIM stuff inside #ifdef PIM.


# 1.86 12-Nov-2015 mpi

Kill another leftover from the tunnel support removal and add more PIM.


# 1.85 12-Nov-2015 mpi

Sync headers and get rid of #ifdef MROUTING.


# 1.84 12-Nov-2015 mpi

Remove VIFF_TUNNEL leftovers, tunnels aren't supported since 2006.

Even pimd(8) no longer supports them.


# 1.83 12-Nov-2015 mpi

Fix PIM build.


# 1.82 12-Sep-2015 mpi

Introduce if_input_local() a function to feed local traffic back to
the protocol queues.

It basically does what looutput() was doing but having a generic
function will allow us to get rid of the loopback hack overwriting
the rt_ifp field of RTF_LOCAL routes.

ok mikeb@, dlg@, claudio@


# 1.81 01-Sep-2015 bluhm

Replace sockaddr casts with the proper satosin(), ... calls.
From David Hill; OK mpi@; tested kspillner@; tweaks bluhm@


# 1.80 24-Aug-2015 bluhm

In kernel initialize struct sockaddr_in and sockaddr_in6 to zero
everywhere to avoid passing around pointers to uninitialized stack
memory. While there, fix the call to in6_recoverscope() in
fill_drlist().
OK deraadt@ mpi@


Revision tags: OPENBSD_5_8_BASE
# 1.79 15-Jul-2015 deraadt

rename mbuf ** parameter from m to mp, to match other similar code


# 1.78 30-Jun-2015 mpi

Get rid of the undocumented & temporary* m_copy() macro added for
compatibility with 4.3BSD in September 1989.

*Pick your own definition for "temporary".

ok bluhm@, claudio@, dlg@


Revision tags: OPENBSD_5_7_BASE
# 1.77 09-Feb-2015 claudio

Implement 2 sysctl to retrieve the multicast forwarding cache (mfc) and the
virtual interface table (vif). Will be used by netstat soon.
Looked over by guenther@


# 1.76 08-Feb-2015 claudio

De-static to make ddb hangman harder. OK phessler, henning


# 1.75 07-Feb-2015 dlg

mechanical conversion of this code to using siphash instead of some xors.

ok tedu@ claudio@


# 1.74 17-Dec-2014 mpi

Remove the "multicast_" prefix from the fields a multicast-only struct.

Prodded by claudio@ and mikeb@


# 1.73 17-Dec-2014 mpi

Use an interface index instead of a pointer for multicast options.

Output interface (port) selection for multicast traffic is not done via
route lookups. Instead the output ifp is registered when setsockopt(2)
is called with the IP{V6,}_MULTICAST_IF option. But since there is no
mechanism to invalidate such pointer stored in a pcb when an interface
is destroyed/removed, it might lead your kernel to fault.

Prevent a fault upon resume reported by frantisek holop, thanks!

ok mikeb@, claudio@


# 1.72 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


# 1.71 30-Sep-2014 jsg

add back the sys/sysctl.h include removed in rev 1.60
fixes the kernel build when PIM is defined


# 1.70 14-Aug-2014 mpi

No need for raw_cb.h


# 1.69 14-Aug-2014 mpi

Kill MRT_{ADD,DEL}_BW_UPCALL interfaces and the bandwidth monitoring
code that comes with them.

ok mikeb@, henning@


Revision tags: OPENBSD_5_6_BASE
# 1.68 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.67 12-Jul-2014 tedu

add a size argument to free. will be used soon, but for now default to 0.
after discussions with beck deraadt kettenis.


# 1.66 21-Apr-2014 henning

ip_output() using varargs always struck me as bizarre, esp since it's only
ever used to pass on uint32 (for ipsec). stop that madness and just pass
the uint32, 0 in all cases but the two that pass the ipsec flowinfo.
ok deraadt reyk guenther


# 1.65 21-Apr-2014 henning

we'll do fine without casting NULL to struct foo * / void *
ok gcc & md5 (alas, no binary change)


Revision tags: OPENBSD_5_5_BASE
# 1.64 09-Jan-2014 tedu

bzero/bcmp -> memset/memcmp. ok matthew


# 1.63 27-Oct-2013 deraadt

delete UPCALL_TIMING debug code from the dark ages


# 1.62 23-Oct-2013 mpi

Remove the number of in_var.h inclusions by moving some functions and
global variables to in.h.

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_4_BASE
# 1.61 02-May-2013 mpi

tedu broken Resource Reservation Protocol code that was ifdef RSVP_ISI.

ok deraadt@, tedu@ (implicit)


# 1.60 28-Mar-2013 tedu

no need for a lot of code to include proc.h


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.59 04-Apr-2011 henning

de-guttenberg our stack a bit
we don't need 7 f***ing copies of the same code to do the protocol checksums
(or not, depending on hw capabilities). claudio ok


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.58 02-Jul-2010 blambert

m_copyback can fail to allocate memory, but is a void function so gymnastics
are required to detect that.

Change the function to take a wait argument (used in nfs server, but
M_NOWAIT everywhere else for now) and to return an error

ok claudio@ henning@ krw@


# 1.57 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


Revision tags: OPENBSD_4_7_BASE
# 1.56 01-Aug-2009 blambert

timeout_add -> timeout_add_msec

ok michele@ claudio@


# 1.55 13-Jul-2009 michele

Get rid of the token bucket filter.
Traffic shaping code should not be inside routing code.
If you want to rate-limit use altq instead.

ok claudio@ henning@ dlg@


# 1.54 09-Jul-2009 michele

Use MAXTTL instead of the hardcoded value.


Revision tags: OPENBSD_4_6_BASE
# 1.53 05-Jun-2009 claudio

Initial support for routing domains. This allows to bind interfaces to
alternate routing table and separate them from other interfaces in distinct
routing tables. The same network can now be used in any domain at the same
time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks across
net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


Revision tags: OPENBSD_4_5_BASE
# 1.52 16-Sep-2008 chl

remove another dead store.

spotted by markus@

ok henning@ mpf@


# 1.51 15-Sep-2008 chl

remove dead stores and newly created unused variables.

Found by LLVM/Clang Static Analyzer.

ok mpf@ looks good mk@ ok henning@


Revision tags: OPENBSD_4_3_BASE OPENBSD_4_4_BASE
# 1.50 02-Jan-2008 brad

return with ENOTTY instead of EINVAL for unknown ioctl requests.

ok claudio@ krw@ dlg@


# 1.49 14-Dec-2007 deraadt

add sysctl entry points into various network layers, in particular to
provide netstat(1) with data it needs; ok claudio reyk


Revision tags: OPENBSD_4_2_BASE
# 1.48 22-May-2007 michele

ip_mroute.c is in bad shape.
This first step makes it style(9) compliant.
Just a whitespace diff, no binary change.

OK claudio@ norby@


# 1.47 10-Apr-2007 miod

``it's'' -> ``its'' when the grammar gods require this change.


Revision tags: OPENBSD_4_1_BASE
# 1.46 14-Feb-2007 jsg

Consistently spell FALLTHROUGH to appease lint.
ok kettenis@ cloder@ tom@ henning@


Revision tags: OPENBSD_4_0_BASE
# 1.45 15-Jun-2006 pascoe

Change cast of last vararg to ip_output to match what ip_output expects,
for clarity.

henning@ claudio@ ok


# 1.44 11-May-2006 hshoexer

fix corruption of pim register packets. From Hideki ONO, thanks!

ok mcbride@ itojun@


# 1.43 25-Apr-2006 claudio

Remove virtual tunnel support from the mrouting code. The virtual tunnel
code breaks multicast on gif(4) interfaces and it is far better to configure
a real gif(4) tunnel instead of a multicast tunnel as the latter is almost
not manageable. OK norby@, mblamer@


Revision tags: OPENBSD_3_8_BASE OPENBSD_3_9_BASE
# 1.42 25-Apr-2005 brad

csum -> csum_flags

ok krw@ canacar@


Revision tags: OPENBSD_3_7_BASE
# 1.41 15-Jan-2005 brad

fix comment


# 1.40 14-Jan-2005 mcbride

Duplicate nested if statement in PIM code.

From brad@


# 1.39 14-Jan-2005 mcbride

Add kernel support for Protocol Independent Multicast (PIM)
Information: http://netweb.usc.edu/pim/

From Pavlin Radoslavov <pavlin@icir.org>

ok deraadt@ brad@


# 1.38 24-Nov-2004 mcbride

Multicast routing cleanup from Pavlin Radoslavov
- sync ip_mroute.c with NetBSD
- import some FreeBSD changes to MFC entry handling
- set im->im_vif correctly when sending IGMPMSG_WRONGVIF
- increment mrtstat.mrts_upcalls correctly
- return error from get_sg_cnt() if there is no matching forwarding entry

ok henning@ brad@ naddy@


Revision tags: OPENBSD_3_6_BASE
# 1.37 24-Aug-2004 brad

Don't allow SIOCGET{VIF,SG}CNT from sockets other than the multicast router.

From NetBSD
Fixes PR 3825

ok mcbride@ canacar@ claudio@


Revision tags: OPENBSD_3_5_BASE SMP_SYNC_A SMP_SYNC_B
# 1.36 06-Jan-2004 markus

fix vlan destroy for MROUTING; report spamme@wouz.dk via tedu; ok itojun


# 1.35 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


# 1.34 10-Dec-2003 itojun

de-register. deraadt ok


Revision tags: OPENBSD_3_4_BASE
# 1.33 09-Jul-2003 itojun

do not flip ip_len/ip_off in netinet stack. deraadt ok.
(please test, especially PF portion)


# 1.32 09-Jul-2003 itojun

better vif_delete (no dangling ref to struct ifnet). deraadt ok
it won't affect default GENERIC build - as MROUTING is not defined


# 1.31 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: UBC_SYNC_A
# 1.30 14-May-2003 itojun

KNF. markus ok


# 1.29 06-May-2003 deraadt

string cleaning; tedu ok


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_B
# 1.28 28-Aug-2002 pefo

Fix a problem where passing NULL as a pointer with varargs does not promote
NULL to full 64 bits on a 64 bit address system. Solution is to add a
(void *) cast before NULL. This makes a 64 bit MIPS kernel work and will
probably help future 64 bit ports as well.

OK from art@


# 1.27 31-Jul-2002 itojun

remove $Id$


# 1.26 09-Jun-2002 itojun

whitespace


Revision tags: OPENBSD_3_1_BASE
# 1.25 15-Mar-2002 millert

Kill #if __STDC__ used to do K&R vs. ANSI varargs/stdarg; just do things
the ANSI way.


# 1.24 14-Mar-2002 millert

First round of __P removal in sys


# 1.135 05-May-2022 claudio

Use static objects for struct rttimer_queue instead of dynamically
allocate them.

Currently there are 6 rttimer_queues and not many more will follow. So
change rt_timer_queue_create() to rt_timer_queue_init() which now takes
a struct rttimer_queue * as argument which will be initialized.
Since this changes the gloabl vars from pointer to struct adjust other
callers as well.
OK bluhm@


# 1.134 04-May-2022 claudio

Move rttimer callback function from the rttimer itself to rttimer_queue.
All users use the same callback per queue so that makes sense.
Also replace rt_timer_queue_destroy() with rt_timer_queue_flush().
OK bluhm@


# 1.133 30-Apr-2022 claudio

Convert the 2nd rttimer callback from struct rttimer to u_int rtableid.
The callback only needs to know the rtableid all the other info from
struct rtableid is not needed.
Also change the default rttimer callback to only delete routes that are
RTF_HOST and RTF_DYNAMIC. This way 2 of the ICMP handlers can use NULL
as the callback.
OK bluhm@


# 1.132 28-Apr-2022 claudio

In the multicast router code don't allocate a rt timer queue for each
rdomain. The rttimer API is rtable/rdomain aware and so there is no need
to have so many queues.
Also init the two queues (one for IPv4 and one for IPv6) early on. This
will allow the rttable code to become simpler.
OK bluhm@


Revision tags: OPENBSD_7_1_BASE
# 1.131 15-Dec-2021 deraadt

structure pads can leak uninitialized memory to userland via copyout,
therefore the mandatory idiom is completely clearing structs before
building them for copyout -- that means ALMOST ALL STRUCTS, because
we never know when some architecture will pad a struct.. In two more
cases, the clearing wasn't performed.
from Reno Robert ZDI
ok millert bluhm


Revision tags: OPENBSD_6_8_BASE OPENBSD_6_9_BASE OPENBSD_7_0_BASE
# 1.130 27-May-2020 mpi

branches: 1.130.2; 1.130.6;
Document the various flavors of NET_LOCK() and rename the reader version.

Since our last concurrency mistake only ioctl(2) ans sysctl(2) code path
take the reader lock. This is mostly for documentation purpose as long as
the softnet thread is converted back to use a read lock.

dlg@ said that comments should be good enough.

ok sashan@


Revision tags: OPENBSD_6_7_BASE
# 1.129 15-Mar-2020 visa

Guard SIOCDELMULTI if_ioctl calls with KERNEL_LOCK() where the call is
made from socket close path. Most device drivers are not MP-safe yet,
and the closing of AF_INET and AF_INET6 sockets is no longer under the
kernel lock.

This fixes a panic seen by jcs@.

OK mpi@


Revision tags: OPENBSD_6_6_BASE
# 1.128 02-Sep-2019 bluhm

Fix a route use after free in multicast route. Move the rt_mcast_del()
out of the rtable_walk(). This avoids recursion to prevent stack
overflow. Also it allows freeing the route outside of the walk.
Now mrt_mcast_del() frees the route only when it is deleted from
the routing table. If that fails, it must not be freed. After the
route is returned by mfc_find(), it is reference counted. Then we
need a rtfree(), but not in the other caes.
Move rt_timer_remove_all() into rt_mcast_del().
OK mpi@


# 1.127 21-Jun-2019 mpi

Prevent recursions by not deleting entries inside rtable_walk(9).

rtable_walk(9) now passes a routing entry back to the caller when
a non zero value is returned and if it asked for it.
This allows us to call rtdeletemsg()/rtrequest_delete() from the
caller without creating a recursion because of rtflushclone().

Multicast code hasn't been adapted and is still possibly creating
recursions. However multicast route entries aren't cloned so if
a recursion exists it isn't because of rtflushclone().

Fix stack exhaustion triggered by the use of "-msave-args".

Issue reported by D��niel L��vai on bugs@ confirmed by and ok bluhm@.


# 1.126 04-Jun-2019 anton

Add missing NULL check for the protocol control block (pcb) pointer in
mrt{6,}_ioctl. Calling shutdown(2) on the socket prior to the ioctl
command can cause it to be NULL.

ok bluhm@ claudio@

Reported-by: syzbot+bdc489ecb509995a21ed@syzkaller.appspotmail.com
Reported-by: syzbot+156405fdea9f2ab15d40@syzkaller.appspotmail.com


Revision tags: OPENBSD_6_5_BASE
# 1.125 13-Feb-2019 dlg

change rt_ifa_add and rt_ifa_del so they take an rdomain argument.

this allows mpls interfaces (mpe, mpw) to pass the rdomain they
wish the local label to be in, rather than have it implicitly forced
to 0 by these functions. right now they'll pass 0, but it will soon
be possible to have them rx packets in other rdomains.

previously the functions used ifp->if_rdomain for the rdomain.
everything other than mpls still passes ifp->if_rdomain.

ok mpi@


# 1.124 10-Feb-2019 dlg

remove the implict RTF_MPATH flag that rt_ifa_add() sets on new routes.

MPLS interfaces (ab)use rt_ifa_add for adding the local MPLS label
that they listen on for incoming packets, while every other use of
rt_ifa_add is for adding addresses on local interfaces. MPLS does
this cos the addresses involved are in basically the same shape as
ones used for setting up local addresses.

It is appropriate for interfaces to want RTF_MPATH on local addresses,
but in the MPLS case it means you can have multiple local things
listening on the same label, which doesn't actually work. mpe in
particular keeps track of in use labels so it can handle collisions,
however, mpw does not. It is currently possible to have multiple
mpw interfaces on the same local label, and sharing the same label
as mpe or possibly normal forwarding labels.

Moving the RTF_MPATH flag out of rt_ifa_add means all the callers
that still want it need to pass it themselves. The mpe and mpw
callers are left alone without the flag, and will now get EEXIST
from rt_ifa_add when a label is already in use.

ok (and a huge amount of patience and help) mpi@
claudio@ is ok with the idea, but saw a much much earlier solution
to the problem


Revision tags: OPENBSD_6_4_BASE
# 1.123 10-Oct-2018 reyk

RT_TABLEID_MAX is 255, fix places that assumed that it is less than 255.

rtable 255 is a valid routing table or domain id that wasn't handled
by the ip[6]_mroute code or by snmpd. The arrays in the ip[6]_mroute
code were off by one and didn't allocate space for rtable 255; snmpd
simply ignored rtable 255. All other places in the tree seem to
handle RT_TABLEID_MAX correctly.

OK florian@ benno@ henning@ deraadt@


# 1.122 30-Apr-2018 tb

Reduce the scope of the NET_LOCK() in in_control(). Two functions were
protected: mrt_ioctl() and in_ioctl(). The former has no other callers
and only needs a read lock. The latter will need refactoring to reduce
the lock's scope further. In a first step, establish a single exit point
and protect most of the function body with the NET_LOCK() while removing
the NET_LOCK() from a handful of callers.

suggested by & ok mpi, ok visa


Revision tags: OPENBSD_6_2_BASE OPENBSD_6_3_BASE
# 1.121 01-Sep-2017 mpi

Change sosetopt() to no longer free the mbuf it receives and change
all the callers to call m_freem(9).

Support from deraadt@ and tedu@, ok visa@, bluhm@


# 1.120 26-Jun-2017 mpi

Assert that the corresponding socket is locked when manipulating socket
buffers.

This is one step towards unlocking TCP input path. Note that all the
functions asserting for the socket lock are not necessarily MP-safe.
All the fields of 'struct socket' aren't protected.

Introduce a new kernel-only kqueue hint, NOTE_SUBMIT, to be able to
tell when a filter needs to lock the underlying data structures. Logic
and name taken from NetBSD.

Tested by Hrvoje Popovski.

ok claudio@, bluhm@, mikeb@


# 1.119 19-Jun-2017 bluhm

The IP multicast forward functions return an errno, call the variable
error. Make the ip_mforward() return value consistent. Simplify
the caller logic in ipv6_input() like in IPv4.
OK mpi@


# 1.118 16-May-2017 rzalamena

Sync three changes that were caught by IPv6 multicast routing review:

* use a variable to allow disabling debugs on run-time
* fix a potential memory leak on copyout() failure
* don't just blindly use the first address provided by ifalist

ok bluhm@


# 1.117 16-May-2017 rzalamena

Make return values more meaningful by using errno instead of -1 or 1.

ok bluhm@


# 1.116 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.115 16-May-2017 rzalamena

Let malloc() block when the caller of the add route function is
setsockopt(), otherwise use non-blocking malloc() for network stack
calls.

ok bluhm@


# 1.114 16-May-2017 rzalamena

Call rtfree() after each use of routes and make sure the route is valid
when finding one. Since rtfree() is being called and rt_llinfo being
removed, add checks everywhere to make sure we are using a route that is
not being removed.

ok bluhm@


# 1.113 06-Apr-2017 dhill

Convert bcopy to memcpy where the memory does not overlap, otherwise,
use memmove. While here, change some previous conversions to a simple
assignment.

ok deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.112 17-Mar-2017 rzalamena

Be more strict on all route iterations, lets always make sure that we
are not going to get a unicast route by accident.

ok mpi@


# 1.111 14-Mar-2017 rzalamena

Make mfc_find() more strict when looking for routes, fixes a problem
causing ip_mforward() not to send packets to the userland multicast
routing daemon.

Reported and tested by Paul de Weerd.

ok bluhm@, claudio@


# 1.110 09-Feb-2017 rzalamena

Unbreak 'netstat -g' and make multicast route stats sysctl more robust.

ok mpi@


# 1.109 08-Feb-2017 jsg

Test for NULL before dereferencing a pointer not after.
ok krw@


# 1.108 01-Feb-2017 dhill

In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with
the netlock held. This also changes the prototypes of the *ctloutput
functions to take an mbuf instead of an mbuf pointer.

help, guidance from bluhm@ and mpi@
ok bluhm@


# 1.107 12-Jan-2017 rzalamena

Clean up multicast files from unused definitions and comments.

ok mpi@


# 1.106 11-Jan-2017 rzalamena

Remove mfc hash tables and use the OpenBSD routing table for multicast
routes. Beside the code simplification and removal, we also get to see
the multicast routes now in the route(8) utility.

ok mpi@


# 1.105 06-Jan-2017 rzalamena

Remove the global viftable vector that holds the virtual interfaces
configuration and instead use ifnet to store the configuration and
counters. With this we can safely use multicast routing daemons on
multiple domains without vif id collisions.

ok mpi@


# 1.104 06-Jan-2017 rzalamena

Simplify code by removing some old pullup macro, killing some variables
and using m_dup_pkt() instead of m_copym() with max_linkhdr space adjust
on packet sending to avoid more mbuf allocations.

with input from millert@ and mikeb@,
ok mikeb@


# 1.103 06-Jan-2017 mpi

Kill various splsoftnet().

ok rzalamena@, visa@


# 1.102 05-Jan-2017 rzalamena

Remove some unnecessary code abstractions and while here remove a
splsoftnet.

ok mikeb@


# 1.101 22-Dec-2016 rzalamena

Remove PIM support from the multicast stack.

ok mpi@


# 1.100 21-Dec-2016 mpi

Fix build without PIM defined.


# 1.99 21-Dec-2016 rzalamena

Fix PIM compilation even though it is disabled.

ok bluhm@


# 1.98 20-Dec-2016 rzalamena

Call the multicast timer callback per domain instead of for all domains
this way we save doing big tables walk and iterating tables that we don't
need to.

ok mpi@


# 1.97 20-Dec-2016 rzalamena

Remove unused timeout that was never being set.

ok reyk@


# 1.96 19-Dec-2016 rzalamena

Kill unused function.

ok mpi@


# 1.95 19-Dec-2016 rzalamena

Extend the multicast sockets and multicast hash table support to multiple
domains. This is one step towards supporting to run more than one multicast
socket in different domains at the same time.

ok mpi@


# 1.94 13-Dec-2016 rzalamena

Propagate the routing table id in ip_mrouter_set() so the MRT_ADD_VIF
calls won't fail anymore when doing from a different rdomain.

ok mpi@


# 1.93 29-Nov-2016 mpi

Kill unused 'struct route'.


# 1.92 29-Nov-2016 jsg

m_free() and m_freem() test for NULL. Simplify callers which had their own
NULL tests.

ok mpi@


# 1.91 24-Sep-2016 tedu

use hashfree. from Mathieu -
ok guenther


Revision tags: OPENBSD_6_0_BASE
# 1.90 07-Mar-2016 naddy

Sync no-argument function declaration and definition by adding (void).
ok mpi@ millert@


Revision tags: OPENBSD_5_9_BASE
# 1.89 14-Nov-2015 mpi

Remove mrtdebug and reduce differences with the v6 version.

Debug informations can already be accessed via mrtstat and pimstat.


# 1.88 13-Nov-2015 mpi

Do not cast malloc(9) results.


# 1.87 13-Nov-2015 mpi

Kill another tunnel leftover and keep PIM stuff inside #ifdef PIM.


# 1.86 12-Nov-2015 mpi

Kill another leftover from the tunnel support removal and add more PIM.


# 1.85 12-Nov-2015 mpi

Sync headers and get rid of #ifdef MROUTING.


# 1.84 12-Nov-2015 mpi

Remove VIFF_TUNNEL leftovers, tunnels aren't supported since 2006.

Even pimd(8) no longer support them.


# 1.83 12-Nov-2015 mpi

Fix PIM build.


# 1.82 12-Sep-2015 mpi

Introduce if_input_local() a function to feed local traffic back to
the protocol queues.

It basically does what looutput() was doing but having a generic
function will allow us to get rid of the loopback hack overwriting
the rt_ifp field of RTF_LOCAL routes.

ok mikeb@, dlg@, claudio@


# 1.81 01-Sep-2015 bluhm

Replace sockaddr casts with the proper satosin(), ... calls.
From David Hill; OK mpi@; tested kspillner@; tweaks bluhm@


# 1.80 24-Aug-2015 bluhm

In kernel initialize struct sockaddr_in and sockaddr_in6 to zero
everywhere to avoid passing around pointers to uninitialized stack
memory. While there, fix the call to in6_recoverscope() in
fill_drlist().
OK deraadt@ mpi@


Revision tags: OPENBSD_5_8_BASE
# 1.79 15-Jul-2015 deraadt

rename mbuf ** parameter from m to mp, to match other similar code


# 1.78 30-Jun-2015 mpi

Get rid of the undocumented & temporary* m_copy() macro added for
compatibility with 4.3BSD in September 1989.

*Pick your own definition for "temporary".

ok bluhm@, claudio@, dlg@


Revision tags: OPENBSD_5_7_BASE
# 1.77 09-Feb-2015 claudio

Implement 2 sysctl to retrieve the multicast forwarding cache (mfc) and the
virtual interface table (vif). Will be used by netstat soon.
Looked over by guenther@


# 1.76 08-Feb-2015 claudio

De-static to make ddb hangman harder. OK phessler, henning


# 1.75 07-Feb-2015 dlg

mechanical conversion of this code to using siphash instead of some xors.

ok tedu@ claudio@


# 1.74 17-Dec-2014 mpi

Remove the "multicast_" prefix from the fields of a multicast-only struct.

Prodded by claudio@ and mikeb@


# 1.73 17-Dec-2014 mpi

Use an interface index instead of a pointer for multicast options.

Output interface (port) selection for multicast traffic is not done via
route lookups. Instead the output ifp is registered when setsockopt(2)
is called with the IP{V6,}_MULTICAST_IF option. But since there is no
mechanism to invalidate such pointer stored in a pcb when an interface
is destroyed/removed, it might lead your kernel to fault.

Prevent a fault upon resume reported by frantisek holop, thanks!

ok mikeb@, claudio@


# 1.72 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


# 1.71 30-Sep-2014 jsg

add back the sys/sysctl.h include removed in rev 1.60
fixes the kernel build when PIM is defined


# 1.70 14-Aug-2014 mpi

No need for raw_cb.h


# 1.69 14-Aug-2014 mpi

Kill MRT_{ADD,DEL}_BW_UPCALL interfaces and the bandwidth monitoring
code that comes with them.

ok mikeb@, henning@


Revision tags: OPENBSD_5_6_BASE
# 1.68 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.67 12-Jul-2014 tedu

add a size argument to free. will be used soon, but for now default to 0.
after discussions with beck deraadt kettenis.


# 1.66 21-Apr-2014 henning

ip_output() using varargs always struck me as bizarre, esp since it's only
ever used to pass on uint32 (for ipsec). stop that madness and just pass
the uint32, 0 in all cases but the two that pass the ipsec flowinfo.
ok deraadt reyk guenther


# 1.65 21-Apr-2014 henning

we'll do fine without casting NULL to struct foo * / void *
ok gcc & md5 (alas, no binary change)


Revision tags: OPENBSD_5_5_BASE
# 1.64 09-Jan-2014 tedu

bzero/bcmp -> memset/memcmp. ok matthew


# 1.63 27-Oct-2013 deraadt

delete UPCALL_TIMING debug code from the dark ages


# 1.62 23-Oct-2013 mpi

Remove the number of in_var.h inclusions by moving some functions and
global variables to in.h.

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_4_BASE
# 1.61 02-May-2013 mpi

tedu broken Resource Reservation Protocol code that was ifdef RSVP_ISI.

ok deraadt@, tedu@ (implicit)


# 1.60 28-Mar-2013 tedu

no need for a lot of code to include proc.h


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.59 04-Apr-2011 henning

de-guttenberg our stack a bit
we don't need 7 f***ing copies of the same code to do the protocol checksums
(or not, depending on hw capabilities). claudio ok


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.58 02-Jul-2010 blambert

m_copyback can fail to allocate memory, but is a void function so gymnastics
are required to detect that.

Change the function to take a wait argument (used in nfs server, but
M_NOWAIT everywhere else for now) and to return an error

ok claudio@ henning@ krw@


# 1.57 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


Revision tags: OPENBSD_4_7_BASE
# 1.56 01-Aug-2009 blambert

timeout_add -> timeout_add_msec

ok michele@ claudio@


# 1.55 13-Jul-2009 michele

Get rid of the token bucket filter.
Traffic shaping code should not be inside routing code.
If you want to rate-limit use altq instead.

ok claudio@ henning@ dlg@


# 1.54 09-Jul-2009 michele

Use MAXTTL instead of the hardcoded value.


Revision tags: OPENBSD_4_6_BASE
# 1.53 05-Jun-2009 claudio

Initial support for routing domains. This allows to bind interfaces to
alternate routing table and separate them from other interfaces in distinct
routing tables. The same network can now be used in any domain at the same
time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks across
net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


Revision tags: OPENBSD_4_5_BASE
# 1.52 16-Sep-2008 chl

remove another dead store.

spotted by markus@

ok henning@ mpf@


# 1.51 15-Sep-2008 chl

remove dead stores and newly created unused variables.

Found by LLVM/Clang Static Analyzer.

ok mpf@ looks good mk@ ok henning@


Revision tags: OPENBSD_4_3_BASE OPENBSD_4_4_BASE
# 1.50 02-Jan-2008 brad

return with ENOTTY instead of EINVAL for unknown ioctl requests.

ok claudio@ krw@ dlg@


# 1.49 14-Dec-2007 deraadt

add sysctl entry points into various network layers, in particular to
provide netstat(1) with data it needs; ok claudio reyk


Revision tags: OPENBSD_4_2_BASE
# 1.48 22-May-2007 michele

ip_mroute.c is in bad shape.
This first step makes it style(9) compliant.
Just a whitespace diff, no binary change.

OK claudio@ norby@


# 1.47 10-Apr-2007 miod

``it's'' -> ``its'' when the grammar gods require this change.


Revision tags: OPENBSD_4_1_BASE
# 1.46 14-Feb-2007 jsg

Consistently spell FALLTHROUGH to appease lint.
ok kettenis@ cloder@ tom@ henning@


Revision tags: OPENBSD_4_0_BASE
# 1.45 15-Jun-2006 pascoe

Change cast of last vararg to ip_output to match what ip_output expects,
for clarity.

henning@ claudio@ ok


# 1.44 11-May-2006 hshoexer

fix corruption of pim register packets. From Hideki ONO, thanks!

ok mcbride@ itojun@


# 1.43 25-Apr-2006 claudio

Remove virtual tunnel support from the mrouting code. The virtual tunnel
code breaks multicast on gif(4) interfaces and it is far better to configure
a real gif(4) tunnel instead of a multicast tunnel as the latter is almost
not manageable. OK norby@, mblamer@


Revision tags: OPENBSD_3_8_BASE OPENBSD_3_9_BASE
# 1.42 25-Apr-2005 brad

csum -> csum_flags

ok krw@ canacar@


Revision tags: OPENBSD_3_7_BASE
# 1.41 15-Jan-2005 brad

fix comment


# 1.40 14-Jan-2005 mcbride

Duplicate nested if statement in PIM code.

From brad@


# 1.39 14-Jan-2005 mcbride

Add kernel support for Protocol Independent Multicast (PIM)
Information: http://netweb.usc.edu/pim/

From Pavlin Radoslavov <pavlin@icir.org>

ok deraadt@ brad@


# 1.38 24-Nov-2004 mcbride

Multicast routing cleanup from Pavlin Radoslavov
- sync ip_mroute.c with NetBSD
- import some FreeBSD changes to MFC entry handling
- set im->im_vif correctly when sending IGMPMSG_WRONGVIF
- increment mrtstat.mrts_upcalls correctly
- return error from get_sg_cnt() if there is no matching forwarding entry

ok henning@ brad@ naddy@


Revision tags: OPENBSD_3_6_BASE
# 1.37 24-Aug-2004 brad

Don't allow SIOCGET{VIF,SG}CNT from sockets other than the multicast router.

From NetBSD
Fixes PR 3825

ok mcbride@ canacar@ claudio@


Revision tags: OPENBSD_3_5_BASE SMP_SYNC_A SMP_SYNC_B
# 1.36 06-Jan-2004 markus

fix vlan destroy for MROUTING; report spamme@wouz.dk via tedu; ok itojun


# 1.35 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


# 1.34 10-Dec-2003 itojun

de-register. deraadt ok


Revision tags: OPENBSD_3_4_BASE
# 1.33 09-Jul-2003 itojun

do not flip ip_len/ip_off in netinet stack. deraadt ok.
(please test, especially PF portion)


# 1.32 09-Jul-2003 itojun

better vif_delete (no dangling ref to struct ifnet). deraadt ok
it won't affect default GENERIC build - as MROUTING is not defined


# 1.31 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: UBC_SYNC_A
# 1.30 14-May-2003 itojun

KNF. markus ok


# 1.29 06-May-2003 deraadt

string cleaning; tedu ok


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_B
# 1.28 28-Aug-2002 pefo

Fix a problem where passing NULL as a pointer with varargs does not promote
NULL to full 64 bits on a 64 bit address system. Solution is to add a
(void *) cast before NULL. This makes a 64 bit MIPS kernel work and will
probably help future 64 bit ports as well.

OK from art@


# 1.27 31-Jul-2002 itojun

remove $Id$


# 1.26 09-Jun-2002 itojun

whitespace


Revision tags: OPENBSD_3_1_BASE
# 1.25 15-Mar-2002 millert

Kill #if __STDC__ used to do K&R vs. ANSI varargs/stdarg; just do things
the ANSI way.


# 1.24 14-Mar-2002 millert

First round of __P removal in sys


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.23 26-Sep-2001 deraadt

branches: 1.23.4;
bring back the old copyright notice


# 1.22 19-Aug-2001 miod

More old timeouts removal, mainly affected unused/unmaintained code.


# 1.21 23-Jun-2001 fgsch

Remove unneeded ip_id conversions.
Instead of using HTONS macro in some places, use htons directly in the
struct member and save us a few bytes.
Fix comment.


Revision tags: OPENBSD_2_9_BASE
# 1.20 10-Nov-2000 provos

seperate -> separate, okay aaron@


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE SMP_BASE
# 1.19 21-Jan-2000 angelos

branches: 1.19.2;
Rename the ip4_* routines to ipip_*, make it so GIF tunnels are not
affected by net.inet.ipip.allow (the sysctl formerly known as
net.inet.ip4.allow), rename the VIF ipip_input to ipip_mroute_input.


Revision tags: OPENBSD_2_6_BASE kame_19991208
# 1.18 08-Aug-1999 niklas

undeclared variable


# 1.17 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.16 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


# 1.15 20-Apr-1999 niklas

Merge MROUTING and IPSEC wrt handling of IP-in-IP tunnelled packets.
Fix a panic case in the MROUTING code too. Drop M_TUNNEL support, nothing
ever uses it.


Revision tags: OPENBSD_2_5_BASE
# 1.14 05-Feb-1999 angelos

Clear mfchashtbl after deallocation (mycroft@netbsd)


# 1.13 08-Jan-1999 provos

dont call ip_randomid() in htons().


# 1.12 08-Jan-1999 deraadt

rip_input() should be called with a 0 terminator; cmetz


# 1.11 26-Dec-1998 provos

make ip_id random but ensure that ids dont repeat for some period.


Revision tags: OPENBSD_2_4_BASE
# 1.10 29-Jul-1998 angelos

Proper handling of IP in IP and checksumming.


# 1.9 03-Jul-1998 deraadt

wrong endian conversion caused vif stats to be wrong; jonny@jonny.eng.br


# 1.8 18-May-1998 provos

first step to the setsockopt/getsockopt interface as described in
draft-mcdonald-simple-ipsec-api, kernel notifies (EMT_REQUESTSA) signal
userland key management applications when security services are requested.
this is only for outgoing connections at the moment, incoming packets
are not yet checked against the selected socket policy.


Revision tags: OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 28-Sep-1997 deraadt

more \n in log()


Revision tags: OPENBSD_2_1_BASE
# 1.6 21-Feb-1997 angelos

Couple of missing ifdefs.


# 1.5 20-Feb-1997 deraadt

IPSEC package by John Ioannidis and Angelos D. Keromytis. Written in
Greece. From ftp.funet.fi:/pub/unix/security/net/ip/BSDipsec.tar.gz


Revision tags: OPENBSD_2_0_BASE
# 1.4 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.3 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.133 30-Apr-2022 claudio

Convert the 2nd rttimer callback from struct rttimer to u_int rtableid.
The callback only needs to know the rtableid all the other info from
struct rtableid is not needed.
Also change the default rttimer callback to only delete routes that are
RTF_HOST and RTF_DYNAMIC. This way 2 of the ICMP handlers can use NULL
as the callback.
OK bluhm@


# 1.132 28-Apr-2022 claudio

In the multicast router code don't allocate a rt timer queue for each
rdomain. The rttimer API is rtable/rdomain aware and so there is no need
to have so many queues.
Also init the two queues (one for IPv4 and one for IPv6) early on. This
will allow the rttable code to become simpler.
OK bluhm@


Revision tags: OPENBSD_7_1_BASE
# 1.131 15-Dec-2021 deraadt

structure pads can leak uninitialized memory to userland via copyout,
therefore the mandatory idiom is completely clearing structs before
building them for copyout -- that means ALMOST ALL STRUCTS, because
we never know when some architecture will pad a struct.. In two more
cases, the clearing wasn't performed.
from Reno Robert ZDI
ok millert bluhm


Revision tags: OPENBSD_6_8_BASE OPENBSD_6_9_BASE OPENBSD_7_0_BASE
# 1.130 27-May-2020 mpi

branches: 1.130.2; 1.130.6;
Document the various flavors of NET_LOCK() and rename the reader version.

Since our last concurrency mistake only ioctl(2) and sysctl(2) code path
take the reader lock. This is mostly for documentation purpose as long as
the softnet thread is converted back to use a read lock.

dlg@ said that comments should be good enough.

ok sashan@


Revision tags: OPENBSD_6_7_BASE
# 1.129 15-Mar-2020 visa

Guard SIOCDELMULTI if_ioctl calls with KERNEL_LOCK() where the call is
made from socket close path. Most device drivers are not MP-safe yet,
and the closing of AF_INET and AF_INET6 sockets is no longer under the
kernel lock.

This fixes a panic seen by jcs@.

OK mpi@


Revision tags: OPENBSD_6_6_BASE
# 1.128 02-Sep-2019 bluhm

Fix a route use after free in multicast route. Move the rt_mcast_del()
out of the rtable_walk(). This avoids recursion to prevent stack
overflow. Also it allows freeing the route outside of the walk.
Now mrt_mcast_del() frees the route only when it is deleted from
the routing table. If that fails, it must not be freed. After the
route is returned by mfc_find(), it is reference counted. Then we
need a rtfree(), but not in the other cases.
Move rt_timer_remove_all() into rt_mcast_del().
OK mpi@


# 1.127 21-Jun-2019 mpi

Prevent recursions by not deleting entries inside rtable_walk(9).

rtable_walk(9) now passes a routing entry back to the caller when
a non zero value is returned and if it asked for it.
This allows us to call rtdeletemsg()/rtrequest_delete() from the
caller without creating a recursion because of rtflushclone().

Multicast code hasn't been adapted and is still possibly creating
recursions. However multicast route entries aren't cloned so if
a recursion exists it isn't because of rtflushclone().

Fix stack exhaustion triggered by the use of "-msave-args".

Issue reported by Dániel Lévai on bugs@ confirmed by and ok bluhm@.


# 1.126 04-Jun-2019 anton

Add missing NULL check for the protocol control block (pcb) pointer in
mrt{6,}_ioctl. Calling shutdown(2) on the socket prior to the ioctl
command can cause it to be NULL.

ok bluhm@ claudio@

Reported-by: syzbot+bdc489ecb509995a21ed@syzkaller.appspotmail.com
Reported-by: syzbot+156405fdea9f2ab15d40@syzkaller.appspotmail.com


Revision tags: OPENBSD_6_5_BASE
# 1.125 13-Feb-2019 dlg

change rt_ifa_add and rt_ifa_del so they take an rdomain argument.

this allows mpls interfaces (mpe, mpw) to pass the rdomain they
wish the local label to be in, rather than have it implicitly forced
to 0 by these functions. right now they'll pass 0, but it will soon
be possible to have them rx packets in other rdomains.

previously the functions used ifp->if_rdomain for the rdomain.
everything other than mpls still passes ifp->if_rdomain.

ok mpi@


# 1.124 10-Feb-2019 dlg

remove the implicit RTF_MPATH flag that rt_ifa_add() sets on new routes.

MPLS interfaces (ab)use rt_ifa_add for adding the local MPLS label
that they listen on for incoming packets, while every other use of
rt_ifa_add is for adding addresses on local interfaces. MPLS does
this cos the addresses involved are in basically the same shape as
ones used for setting up local addresses.

It is appropriate for interfaces to want RTF_MPATH on local addresses,
but in the MPLS case it means you can have multiple local things
listening on the same label, which doesn't actually work. mpe in
particular keeps track of in use labels so it can handle collisions,
however, mpw does not. It is currently possible to have multiple
mpw interfaces on the same local label, and sharing the same label
as mpe or possibly normal forwarding labels.

Moving the RTF_MPATH flag out of rt_ifa_add means all the callers
that still want it need to pass it themselves. The mpe and mpw
callers are left alone without the flag, and will now get EEXIST
from rt_ifa_add when a label is already in use.

ok (and a huge amount of patience and help) mpi@
claudio@ is ok with the idea, but saw a much much earlier solution
to the problem


Revision tags: OPENBSD_6_4_BASE
# 1.123 10-Oct-2018 reyk

RT_TABLEID_MAX is 255, fix places that assumed that it is less than 255.

rtable 255 is a valid routing table or domain id that wasn't handled
by the ip[6]_mroute code or by snmpd. The arrays in the ip[6]_mroute
code were off by one and didn't allocate space for rtable 255; snmpd
simply ignored rtable 255. All other places in the tree seem to
handle RT_TABLEID_MAX correctly.

OK florian@ benno@ henning@ deraadt@


# 1.122 30-Apr-2018 tb

Reduce the scope of the NET_LOCK() in in_control(). Two functions were
protected: mrt_ioctl() and in_ioctl(). The former has no other callers
and only needs a read lock. The latter will need refactoring to reduce
the lock's scope further. In a first step, establish a single exit point
and protect most of the function body with the NET_LOCK() while removing
the NET_LOCK() from a handful of callers.

suggested by & ok mpi, ok visa


Revision tags: OPENBSD_6_2_BASE OPENBSD_6_3_BASE
# 1.121 01-Sep-2017 mpi

Change sosetopt() to no longer free the mbuf it receives and change
all the callers to call m_freem(9).

Support from deraadt@ and tedu@, ok visa@, bluhm@


# 1.120 26-Jun-2017 mpi

Assert that the corresponding socket is locked when manipulating socket
buffers.

This is one step towards unlocking TCP input path. Note that all the
functions asserting for the socket lock are not necessarily MP-safe.
All the fields of 'struct socket' aren't protected.

Introduce a new kernel-only kqueue hint, NOTE_SUBMIT, to be able to
tell when a filter needs to lock the underlying data structures. Logic
and name taken from NetBSD.

Tested by Hrvoje Popovski.

ok claudio@, bluhm@, mikeb@


# 1.119 19-Jun-2017 bluhm

The IP multicast forward functions return an errno, call the variable
error. Make the ip_mforward() return value consistent. Simplify
the caller logic in ipv6_input() like in IPv4.
OK mpi@


# 1.118 16-May-2017 rzalamena

Sync three changes that were caught by IPv6 multicast routing review:

* use a variable to allow disabling debugs on run-time
* fix a potential memory leak on copyout() failure
* don't just blindly use the first address provided by ifalist

ok bluhm@


# 1.117 16-May-2017 rzalamena

Make return values more meaningful by using errno instead of -1 or 1.

ok bluhm@


# 1.116 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.115 16-May-2017 rzalamena

Let malloc() block when the caller of the add route function is
setsockopt(), otherwise use non-blocking malloc() for network stack
calls.

ok bluhm@


# 1.114 16-May-2017 rzalamena

Call rtfree() after each use of routes and make sure the route is valid
when finding one. Since rtfree() is being called and rt_llinfo being
removed, add checks everywhere to make sure we are using a route that is
not being removed.

ok bluhm@


# 1.113 06-Apr-2017 dhill

Convert bcopy to memcpy where the memory does not overlap, otherwise,
use memmove. While here, change some previous conversions to a simple
assignment.

ok deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.112 17-Mar-2017 rzalamena

Be more strict on all route iterations, lets always make sure that we
are not going to get a unicast route by accident.

ok mpi@


# 1.111 14-Mar-2017 rzalamena

Make mfc_find() more strict when looking for routes, fixes a problem
causing ip_mforward() not to send packets to the userland multicast
routing daemon.

Reported and tested by Paul de Weerd.

ok bluhm@, claudio@


# 1.110 09-Feb-2017 rzalamena

Unbreak 'netstat -g' and make multicast route stats sysctl more robust.

ok mpi@


# 1.109 08-Feb-2017 jsg

Test for NULL before dereferencing a pointer not after.
ok krw@


# 1.108 01-Feb-2017 dhill

In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with
the netlock held. This also changes the prototypes of the *ctloutput
functions to take an mbuf instead of an mbuf pointer.

help, guidance from bluhm@ and mpi@
ok bluhm@


# 1.107 12-Jan-2017 rzalamena

Clean up multicast files from unused definitions and comments.

ok mpi@


# 1.106 11-Jan-2017 rzalamena

Remove mfc hash tables and use the OpenBSD routing table for multicast
routes. Beside the code simplification and removal, we also get to see
the multicast routes now in the route(8) utility.

ok mpi@


# 1.105 06-Jan-2017 rzalamena

Remove the global viftable vector that holds the virtual interfaces
configuration and instead use ifnet to store the configuration and
counters. With this we can safely use multicast routing daemons on
multiple domains without vif id collisions.

ok mpi@


# 1.104 06-Jan-2017 rzalamena

Simplify code by removing some old pullup macro, killing some variables
and using m_dup_pkt() instead of m_copym() with max_linkhdr space adjust
on packet sending to avoid more mbuf allocations.

with input from millert@ and mikeb@,
ok mikeb@


# 1.103 06-Jan-2017 mpi

Kill various splsoftnet().

ok rzalamena@, visa@


# 1.102 05-Jan-2017 rzalamena

Remove some unnecessary code abstractions and while here remove a
splsoftnet.

ok mikeb@


# 1.101 22-Dec-2016 rzalamena

Remove PIM support from the multicast stack.

ok mpi@


# 1.100 21-Dec-2016 mpi

Fix build without PIM defined.


# 1.99 21-Dec-2016 rzalamena

Fix PIM compilation even though it is disabled.

ok bluhm@


# 1.98 20-Dec-2016 rzalamena

Call the multicast timer callback per domain instead of for all domains
this way we save doing big tables walk and iterating tables that we don't
need to.

ok mpi@


# 1.97 20-Dec-2016 rzalamena

Remove unused timeout that was never being set.

ok reyk@


# 1.96 19-Dec-2016 rzalamena

Kill unused function.

ok mpi@


# 1.95 19-Dec-2016 rzalamena

Extend the multicast sockets and multicast hash table support to multiple
domains. This is one step towards supporting to run more than one multicast
socket in different domains at the same time.

ok mpi@


# 1.94 13-Dec-2016 rzalamena

Propagate the routing table id in ip_mrouter_set() so the MRT_ADD_VIF
calls won't fail anymore when doing from a different rdomain.

ok mpi@


# 1.93 29-Nov-2016 mpi

Kill unused 'struct route'.


# 1.92 29-Nov-2016 jsg

m_free() and m_freem() test for NULL. Simplify callers which had their own
NULL tests.

ok mpi@


# 1.91 24-Sep-2016 tedu

use hashfree. from Mathieu -
ok guenther


Revision tags: OPENBSD_6_0_BASE
# 1.90 07-Mar-2016 naddy

Sync no-argument function declaration and definition by adding (void).
ok mpi@ millert@


Revision tags: OPENBSD_5_9_BASE
# 1.89 14-Nov-2015 mpi

Remove mrtdebug and reduce differences with the v6 version.

Debug information can already be accessed via mrtstat and pimstat.


# 1.88 13-Nov-2015 mpi

Do not cast malloc(9) results.


# 1.87 13-Nov-2015 mpi

Kill another tunnel leftover and keep PIM stuff inside #ifdef PIM.


# 1.86 12-Nov-2015 mpi

Kill another leftover from the tunnel support removal and add more PIM.


# 1.85 12-Nov-2015 mpi

Sync headers and get rid of #ifdef MROUTING.


# 1.84 12-Nov-2015 mpi

Remove VIFF_TUNNEL leftovers, tunnels aren't supported since 2006.

Even pimd(8) no longer supports them.


# 1.83 12-Nov-2015 mpi

Fix PIM build.


# 1.82 12-Sep-2015 mpi

Introduce if_input_local() a function to feed local traffic back to
the protocol queues.

It basically does what looutput() was doing but having a generic
function will allow us to get rid of the loopback hack overwriting
the rt_ifp field of RTF_LOCAL routes.

ok mikeb@, dlg@, claudio@


# 1.81 01-Sep-2015 bluhm

Replace sockaddr casts with the proper satosin(), ... calls.
From David Hill; OK mpi@; tested kspillner@; tweaks bluhm@


# 1.80 24-Aug-2015 bluhm

In kernel initialize struct sockaddr_in and sockaddr_in6 to zero
everywhere to avoid passing around pointers to uninitialized stack
memory. While there, fix the call to in6_recoverscope() in
fill_drlist().
OK deraadt@ mpi@


Revision tags: OPENBSD_5_8_BASE
# 1.79 15-Jul-2015 deraadt

rename mbuf ** parameter from m to mp, to match other similar code


# 1.78 30-Jun-2015 mpi

Get rid of the undocumented & temporary* m_copy() macro added for
compatibility with 4.3BSD in September 1989.

*Pick your own definition for "temporary".

ok bluhm@, claudio@, dlg@


Revision tags: OPENBSD_5_7_BASE
# 1.77 09-Feb-2015 claudio

Implement 2 sysctl to retrieve the multicast forwarding cache (mfc) and the
virtual interface table (vif). Will be used by netstat soon.
Looked over by guenther@


# 1.76 08-Feb-2015 claudio

De-static to make ddb hangman harder. OK phessler, henning


# 1.75 07-Feb-2015 dlg

mechanical conversion of this code to using siphash instead of some xors.

ok tedu@ claudio@


# 1.74 17-Dec-2014 mpi

Remove the "multicast_" prefix from the fields of a multicast-only struct.

Prodded by claudio@ and mikeb@


# 1.73 17-Dec-2014 mpi

Use an interface index instead of a pointer for multicast options.

Output interface (port) selection for multicast traffic is not done via
route lookups. Instead the output ifp is registered when setsockopt(2)
is called with the IP{V6,}_MULTICAST_IF option. But since there is no
mechanism to invalidate such pointer stored in a pcb when an interface
is destroyed/removed, it might lead your kernel to fault.

Prevent a fault upon resume reported by frantisek holop, thanks!

ok mikeb@, claudio@


# 1.72 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


# 1.71 30-Sep-2014 jsg

add back the sys/sysctl.h include removed in rev 1.60
fixes the kernel build when PIM is defined


# 1.70 14-Aug-2014 mpi

No need for raw_cb.h


# 1.69 14-Aug-2014 mpi

Kill MRT_{ADD,DEL}_BW_UPCALL interfaces and the bandwidth monitoring
code that comes with them.

ok mikeb@, henning@


Revision tags: OPENBSD_5_6_BASE
# 1.68 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.67 12-Jul-2014 tedu

add a size argument to free. will be used soon, but for now default to 0.
after discussions with beck deraadt kettenis.


# 1.66 21-Apr-2014 henning

ip_output() using varargs always struck me as bizarre, esp since it's only
ever used to pass on uint32 (for ipsec). stop that madness and just pass
the uint32, 0 in all cases but the two that pass the ipsec flowinfo.
ok deraadt reyk guenther


# 1.65 21-Apr-2014 henning

we'll do fine without casting NULL to struct foo * / void *
ok gcc & md5 (alas, no binary change)


Revision tags: OPENBSD_5_5_BASE
# 1.64 09-Jan-2014 tedu

bzero/bcmp -> memset/memcmp. ok matthew


# 1.63 27-Oct-2013 deraadt

delete UPCALL_TIMING debug code from the dark ages


# 1.62 23-Oct-2013 mpi

Reduce the number of in_var.h inclusions by moving some functions and
global variables to in.h.

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_4_BASE
# 1.61 02-May-2013 mpi

tedu broken Resource Reservation Protocol code that was ifdef RSVP_ISI.

ok deraadt@, tedu@ (implicit)


# 1.60 28-Mar-2013 tedu

no need for a lot of code to include proc.h


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.59 04-Apr-2011 henning

de-guttenberg our stack a bit
we don't need 7 f***ing copies of the same code to do the protocol checksums
(or not, depending on hw capabilities). claudio ok


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.58 02-Jul-2010 blambert

m_copyback can fail to allocate memory, but is a void function so gymnastics
are required to detect that.

Change the function to take a wait argument (used in nfs server, but
M_NOWAIT everywhere else for now) and to return an error

ok claudio@ henning@ krw@


# 1.57 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


Revision tags: OPENBSD_4_7_BASE
# 1.56 01-Aug-2009 blambert

timeout_add -> timeout_add_msec

ok michele@ claudio@


# 1.55 13-Jul-2009 michele

Get rid of the token bucket filter.
Traffic shaping code should not be inside routing code.
If you want to rate-limit use altq instead.

ok claudio@ henning@ dlg@


# 1.54 09-Jul-2009 michele

Use MAXTTL instead of the hardcoded value.


Revision tags: OPENBSD_4_6_BASE
# 1.53 05-Jun-2009 claudio

Initial support for routing domains. This allows binding interfaces to
an alternate routing table and separating them from other interfaces in
distinct routing tables. The same network can now be used in any domain at
the same time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks across
net and netinet. L2 and IPv4 are mostly covered; pf and IPv6 are still missing.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


Revision tags: OPENBSD_4_5_BASE
# 1.52 16-Sep-2008 chl

remove another dead store.

spotted by markus@

ok henning@ mpf@


# 1.51 15-Sep-2008 chl

remove dead stores and newly created unused variables.

Found by LLVM/Clang Static Analyzer.

ok mpf@ looks good mk@ ok henning@


Revision tags: OPENBSD_4_3_BASE OPENBSD_4_4_BASE
# 1.50 02-Jan-2008 brad

return with ENOTTY instead of EINVAL for unknown ioctl requests.

ok claudio@ krw@ dlg@


# 1.49 14-Dec-2007 deraadt

add sysctl entry points into various network layers, in particular to
provide netstat(1) with data it needs; ok claudio reyk


Revision tags: OPENBSD_4_2_BASE
# 1.48 22-May-2007 michele

ip_mroute.c is in bad shape.
This first step makes it style(9) compliant.
Just a whitespace diff, no binary change.

OK claudio@ norby@


# 1.47 10-Apr-2007 miod

``it's'' -> ``its'' when the grammar gods require this change.


Revision tags: OPENBSD_4_1_BASE
# 1.46 14-Feb-2007 jsg

Consistently spell FALLTHROUGH to appease lint.
ok kettenis@ cloder@ tom@ henning@


Revision tags: OPENBSD_4_0_BASE
# 1.45 15-Jun-2006 pascoe

Change cast of last vararg to ip_output to match what ip_output expects,
for clarity.

henning@ claudio@ ok


# 1.44 11-May-2006 hshoexer

fix corruption of pim register packets. From Hideki ONO, thanks!

ok mcbride@ itojun@


# 1.43 25-Apr-2006 claudio

Remove virtual tunnel support from the mrouting code. The virtual tunnel
code breaks multicast on gif(4) interfaces and it is far better to configure
a real gif(4) tunnel instead of a multicast tunnel as the latter is almost
not manageable. OK norby@, mblamer@


Revision tags: OPENBSD_3_8_BASE OPENBSD_3_9_BASE
# 1.42 25-Apr-2005 brad

csum -> csum_flags

ok krw@ canacar@


Revision tags: OPENBSD_3_7_BASE
# 1.41 15-Jan-2005 brad

fix comment


# 1.40 14-Jan-2005 mcbride

Duplicate nested if statement in PIM code.

From brad@


# 1.39 14-Jan-2005 mcbride

Add kernel support for Protocol Independent Multicast (PIM)
Information: http://netweb.usc.edu/pim/

From Pavlin Radoslavov <pavlin@icir.org>

ok deraadt@ brad@


# 1.38 24-Nov-2004 mcbride

Multicast routing cleanup from Pavlin Radoslavov
- sync ip_mroute.c with NetBSD
- import some FreeBSD changes to MFC entry handling
- set im->im_vif correctly when sending IGMPMSG_WRONGVIF
- increment mrtstat.mrts_upcalls correctly
- return error from get_sg_cnt() if there is no matching forwarding entry

ok henning@ brad@ naddy@


Revision tags: OPENBSD_3_6_BASE
# 1.37 24-Aug-2004 brad

Don't allow SIOCGET{VIF,SG}CNT from sockets other than the multicast router.

From NetBSD
Fixes PR 3825

ok mcbride@ canacar@ claudio@


Revision tags: OPENBSD_3_5_BASE SMP_SYNC_A SMP_SYNC_B
# 1.36 06-Jan-2004 markus

fix vlan destroy for MROUTING; report spamme@wouz.dk via tedu; ok itojun


# 1.35 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


# 1.34 10-Dec-2003 itojun

de-register. deraadt ok


Revision tags: OPENBSD_3_4_BASE
# 1.33 09-Jul-2003 itojun

do not flip ip_len/ip_off in netinet stack. deraadt ok.
(please test, especially PF portion)


# 1.32 09-Jul-2003 itojun

better vif_delete (no dangling ref to struct ifnet). deraadt ok
it won't affect default GENERIC build - as MROUTING is not defined


# 1.31 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: UBC_SYNC_A
# 1.30 14-May-2003 itojun

KNF. markus ok


# 1.29 06-May-2003 deraadt

string cleaning; tedu ok


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_B
# 1.28 28-Aug-2002 pefo

Fix a problem where passing NULL as a pointer with varargs does not promote
NULL to full 64 bits on a 64 bit address system. Solution is to add a
(void *) cast before NULL. This makes a 64 bit MIPS kernel work and will
probably help future 64 bit ports as well.

OK from art@
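
Editor's illustration of the varargs problem described in 1.28, as a minimal
sketch and not the committed code. The function demo_count() is hypothetical.
A bare NULL passed through "..." may be an int-sized 0 on some platforms, so
the terminator is cast to a pointer type explicitly.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/*
 * Hypothetical helper: count string arguments up to a NULL terminator.
 * The callee reads each variadic argument as a pointer, so the caller
 * must pass a pointer-sized terminator.
 */
static int
demo_count(const char *first, ...)
{
	va_list ap;
	const char *s;
	int n = 0;

	va_start(ap, first);
	for (s = first; s != NULL; s = va_arg(ap, const char *))
		n++;
	va_end(ap);
	return n;
}
```

A caller would write demo_count("a", "b", (void *)NULL) so that the
terminator is a full-width pointer regardless of how NULL is defined.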


# 1.27 31-Jul-2002 itojun

remove $Id$


# 1.26 09-Jun-2002 itojun

whitespace


Revision tags: OPENBSD_3_1_BASE
# 1.25 15-Mar-2002 millert

Kill #if __STDC__ used to do K&R vs. ANSI varargs/stdarg; just do things
the ANSI way.


# 1.24 14-Mar-2002 millert

First round of __P removal in sys


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.23 26-Sep-2001 deraadt

branches: 1.23.4;
bring back the old copyright notice


# 1.22 19-Aug-2001 miod

More old timeouts removal, mainly affected unused/unmaintained code.


# 1.21 23-Jun-2001 fgsch

Remove unneeded ip_id conversions.
Instead of using HTONS macro in some places, use htons directly in the
struct member and save us a few bytes.
Fix comment.


Revision tags: OPENBSD_2_9_BASE
# 1.20 10-Nov-2000 provos

seperate -> separate, okay aaron@


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE SMP_BASE
# 1.19 21-Jan-2000 angelos

branches: 1.19.2;
Rename the ip4_* routines to ipip_*, make it so GIF tunnels are not
affected by net.inet.ipip.allow (the sysctl formerly known as
net.inet.ip4.allow), rename the VIF ipip_input to ipip_mroute_input.


Revision tags: OPENBSD_2_6_BASE kame_19991208
# 1.18 08-Aug-1999 niklas

undeclared variable


# 1.17 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.16 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


# 1.15 20-Apr-1999 niklas

Merge MROUTING and IPSEC wrt handling of IP-in-IP tunnelled packets.
Fix a panic case in the MROUTING code too. Drop M_TUNNEL support, nothing
ever uses it.


Revision tags: OPENBSD_2_5_BASE
# 1.14 05-Feb-1999 angelos

Clear mfchashtbl after deallocation (mycroft@netbsd)


# 1.13 08-Jan-1999 provos

dont call ip_randomid() in htons().


# 1.12 08-Jan-1999 deraadt

rip_input() should be called with a 0 terminator; cmetz


# 1.11 26-Dec-1998 provos

make ip_id random but ensure that ids dont repeat for some period.


Revision tags: OPENBSD_2_4_BASE
# 1.10 29-Jul-1998 angelos

Proper handling of IP in IP and checksumming.


# 1.9 03-Jul-1998 deraadt

wrong endian conversion caused vif stats to be wrong; jonny@jonny.eng.br


# 1.8 18-May-1998 provos

first step to the setsockopt/getsockopt interface as described in
draft-mcdonald-simple-ipsec-api, kernel notifies (EMT_REQUESTSA) signal
userland key management applications when security services are requested.
this is only for outgoing connections at the moment, incoming packets
are not yet checked against the selected socket policy.


Revision tags: OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 28-Sep-1997 deraadt

more \n in log()


Revision tags: OPENBSD_2_1_BASE
# 1.6 21-Feb-1997 angelos

Couple of missing ifdefs.


# 1.5 20-Feb-1997 deraadt

IPSEC package by John Ioannidis and Angelos D. Keromytis. Written in
Greece. From ftp.funet.fi:/pub/unix/security/net/ip/BSDipsec.tar.gz


Revision tags: OPENBSD_2_0_BASE
# 1.4 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.3 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.132 28-Apr-2022 claudio

In the multicast router code don't allocate a rt timer queue for each
rdomain. The rttimer API is rtable/rdomain aware and so there is no need
to have so many queues.
Also init the two queues (one for IPv4 and one for IPv6) early on. This
will allow the rttable code to become simpler.
OK bluhm@


Revision tags: OPENBSD_7_1_BASE
# 1.131 15-Dec-2021 deraadt

structure pads can leak uninitialized memory to userland via copyout,
therefore the mandatory idiom is completely clearing structs before
building them for copyout -- that means ALMOST ALL STRUCTS, because
we never know when some architecture will pad a struct.. In two more
cases, the clearing wasn't performed.
from Reno Robert ZDI
ok millert bluhm
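
Editor's illustration of the idiom described in 1.131, as a userland sketch
under stated assumptions. struct demo_info and both functions are
hypothetical; the real code clears kernel structs before copyout(9). The
point is that memset() clears padding bytes as well as named fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record; most ABIs insert padding after `flag`. */
struct demo_info {
	uint8_t		flag;
	uint32_t	count;
};

/*
 * Build a record for export. Clearing the whole struct first means
 * padding bytes never carry stale memory to the consumer.
 */
static void
demo_fill(struct demo_info *out, uint8_t flag, uint32_t count)
{
	memset(out, 0, sizeof(*out));
	out->flag = flag;
	out->count = count;
}

/* Return 1 if every byte between `flag` and `count` is zero. */
static int
demo_padding_is_zero(const struct demo_info *info)
{
	const unsigned char *p = (const unsigned char *)info;
	size_t i;

	for (i = offsetof(struct demo_info, flag) + sizeof(info->flag);
	    i < offsetof(struct demo_info, count); i++)
		if (p[i] != 0)
			return 0;
	return 1;
}
```

Assigning the fields one by one without the memset() would leave the padding
with whatever the stack or heap held before.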


Revision tags: OPENBSD_6_8_BASE OPENBSD_6_9_BASE OPENBSD_7_0_BASE
# 1.130 27-May-2020 mpi

branches: 1.130.2; 1.130.6;
Document the various flavors of NET_LOCK() and rename the reader version.

Since our last concurrency mistake only ioctl(2) and sysctl(2) code paths
take the reader lock. This is mostly for documentation purpose as long as
the softnet thread is converted back to use a read lock.

dlg@ said that comments should be good enough.

ok sashan@


Revision tags: OPENBSD_6_7_BASE
# 1.129 15-Mar-2020 visa

Guard SIOCDELMULTI if_ioctl calls with KERNEL_LOCK() where the call is
made from socket close path. Most device drivers are not MP-safe yet,
and the closing of AF_INET and AF_INET6 sockets is no longer under the
kernel lock.

This fixes a panic seen by jcs@.

OK mpi@


Revision tags: OPENBSD_6_6_BASE
# 1.128 02-Sep-2019 bluhm

Fix a route use after free in multicast route. Move the rt_mcast_del()
out of the rtable_walk(). This avoids recursion to prevent stack
overflow. Also it allows freeing the route outside of the walk.
Now mrt_mcast_del() frees the route only when it is deleted from
the routing table. If that fails, it must not be freed. After the
route is returned by mfc_find(), it is reference counted. Then we
need a rtfree(), but not in the other cases.
Move rt_timer_remove_all() into rt_mcast_del().
OK mpi@
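
Editor's illustration of the pattern behind 1.128 and 1.127, as a sketch on a
plain linked list rather than the routing table. All demo_* names are
hypothetical. The walk only reports a matching entry back to the caller; the
caller deletes it after the walk has returned, so nothing is freed while the
walk is still iterating.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct demo_entry {
	struct demo_entry	*next;
	int			 expired;
};

/* Walk the list; stop at the first flagged entry and hand it back. */
static int
demo_walk(struct demo_entry *head, int (*cb)(struct demo_entry *),
    struct demo_entry **found)
{
	struct demo_entry *e;

	for (e = head; e != NULL; e = e->next) {
		if (cb(e)) {
			if (found != NULL)
				*found = e;
			return 1;
		}
	}
	return 0;
}

static int
demo_is_expired(struct demo_entry *e)
{
	return e->expired;
}

/* Unlink and free one entry; called only outside of demo_walk(). */
static void
demo_remove(struct demo_entry **headp, struct demo_entry *victim)
{
	struct demo_entry **pp;

	for (pp = headp; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == victim) {
			*pp = victim->next;
			free(victim);
			return;
		}
	}
}

/* Restart the walk after each deletion until nothing matches. */
static int
demo_purge(struct demo_entry **headp)
{
	struct demo_entry *victim;
	int n = 0;

	while (demo_walk(*headp, demo_is_expired, &victim)) {
		demo_remove(headp, victim);
		n++;
	}
	return n;
}
```

Restarting the walk costs extra passes, but the walk never touches an entry
that has already been freed and the delete path cannot recurse into it.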


# 1.127 21-Jun-2019 mpi

Prevent recursions by not deleting entries inside rtable_walk(9).

rtable_walk(9) now passes a routing entry back to the caller when
a non zero value is returned and if it asked for it.
This allows us to call rtdeletemsg()/rtrequest_delete() from the
caller without creating a recursion because of rtflushclone().

Multicast code hasn't been adapted and is still possibly creating
recursions. However multicast route entries aren't cloned so if
a recursion exists it isn't because of rtflushclone().

Fix stack exhaustion triggered by the use of "-msave-args".

Issue reported by Dániel Lévai on bugs@ confirmed by and ok bluhm@.


# 1.126 04-Jun-2019 anton

Add missing NULL check for the protocol control block (pcb) pointer in
mrt{6,}_ioctl. Calling shutdown(2) on the socket prior to the ioctl
command can cause it to be NULL.

ok bluhm@ claudio@

Reported-by: syzbot+bdc489ecb509995a21ed@syzkaller.appspotmail.com
Reported-by: syzbot+156405fdea9f2ab15d40@syzkaller.appspotmail.com


Revision tags: OPENBSD_6_5_BASE
# 1.125 13-Feb-2019 dlg

change rt_ifa_add and rt_ifa_del so they take an rdomain argument.

this allows mpls interfaces (mpe, mpw) to pass the rdomain they
wish the local label to be in, rather than have it implicitly forced
to 0 by these functions. right now they'll pass 0, but it will soon
be possible to have them rx packets in other rdomains.

previously the functions used ifp->if_rdomain for the rdomain.
everything other than mpls still passes ifp->if_rdomain.

ok mpi@


# 1.124 10-Feb-2019 dlg

remove the implicit RTF_MPATH flag that rt_ifa_add() sets on new routes.

MPLS interfaces (ab)use rt_ifa_add for adding the local MPLS label
that they listen on for incoming packets, while every other use of
rt_ifa_add is for adding addresses on local interfaces. MPLS does
this cos the addresses involved are in basically the same shape as
ones used for setting up local addresses.

It is appropriate for interfaces to want RTF_MPATH on local addresses,
but in the MPLS case it means you can have multiple local things
listening on the same label, which doesn't actually work. mpe in
particular keeps track of in use labels so it can handle collisions,
however, mpw does not. It is currently possible to have multiple
mpw interfaces on the same local label, and sharing the same label
as mpe or possibly normal forwarding labels.

Moving the RTF_MPATH flag out of rt_ifa_add means all the callers
that still want it need to pass it themselves. The mpe and mpw
callers are left alone without the flag, and will now get EEXIST
from rt_ifa_add when a label is already in use.

ok (and a huge amount of patience and help) mpi@
claudio@ is ok with the idea, but saw a much much earlier solution
to the problem


Revision tags: OPENBSD_6_4_BASE
# 1.123 10-Oct-2018 reyk

RT_TABLEID_MAX is 255, fix places that assumed that it is less than 255.

rtable 255 is a valid routing table or domain id that wasn't handled
by the ip[6]_mroute code or by snmpd. The arrays in the ip[6]_mroute
code were off by one and didn't allocate space for rtable 255; snmpd
simply ignored rtable 255. All other places in the tree seem to
handle RT_TABLEID_MAX correctly.

OK florian@ benno@ henning@ deraadt@
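
Editor's illustration of the off-by-one described in 1.123, as a minimal
sketch. DEMO_TABLEID_MAX stands in for RT_TABLEID_MAX, and the array and
setter are hypothetical. Because the maximum id is itself valid, a per-table
array needs MAX + 1 slots and the bounds check must use ">" and not ">=".

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_TABLEID_MAX	255	/* stand-in for RT_TABLEID_MAX */

/* One slot per table id 0..255 inclusive, hence 256 entries. */
static int demo_slots[DEMO_TABLEID_MAX + 1];

/* Store a value for a table id; reject ids above the maximum. */
static int
demo_set(unsigned int id, int value)
{
	if (id > DEMO_TABLEID_MAX)
		return -1;
	demo_slots[id] = value;
	return 0;
}
```

Sizing the array as DEMO_TABLEID_MAX would leave id 255 without a slot, which
is the shape of the bug the entry describes.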


# 1.122 30-Apr-2018 tb

Reduce the scope of the NET_LOCK() in in_control(). Two functions were
protected: mrt_ioctl() and in_ioctl(). The former has no other callers
and only needs a read lock. The latter will need refactoring to reduce
the lock's scope further. In a first step, establish a single exit point
and protect most of the function body with the NET_LOCK() while removing
the NET_LOCK() from a handful of callers.

suggested by & ok mpi, ok visa


Revision tags: OPENBSD_6_2_BASE OPENBSD_6_3_BASE
# 1.121 01-Sep-2017 mpi

Change sosetopt() to no longer free the mbuf it receives and change
all the callers to call m_freem(9).

Support from deraadt@ and tedu@, ok visa@, bluhm@


# 1.120 26-Jun-2017 mpi

Assert that the corresponding socket is locked when manipulating socket
buffers.

This is one step towards unlocking TCP input path. Note that all the
functions asserting for the socket lock are not necessarily MP-safe.
All the fields of 'struct socket' aren't protected.

Introduce a new kernel-only kqueue hint, NOTE_SUBMIT, to be able to
tell when a filter needs to lock the underlying data structures. Logic
and name taken from NetBSD.

Tested by Hrvoje Popovski.

ok claudio@, bluhm@, mikeb@


# 1.119 19-Jun-2017 bluhm

The IP multicast forward functions return an errno, call the variable
error. Make the ip_mforward() return value consistent. Simplify
the caller logic in ipv6_input() like in IPv4.
OK mpi@


# 1.118 16-May-2017 rzalamena

Sync three changes that were caught by IPv6 multicast routing review:

* use a variable to allow disabling debugs on run-time
* fix a potential memory leak on copyout() failure
* don't just blindly use the first address provided by ifalist

ok bluhm@


# 1.117 16-May-2017 rzalamena

Make return values more meaningful by using errno instead of -1 or 1.

ok bluhm@


# 1.116 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.115 16-May-2017 rzalamena

Let malloc() block when the caller of the add route function is
setsockopt(), otherwise use non-blocking malloc() for network stack
calls.

ok bluhm@


# 1.114 16-May-2017 rzalamena

Call rtfree() after each use of routes and make sure the route is valid
when finding one. Since rtfree() is being called and rt_llinfo being
removed, add checks everywhere to make sure we are using a route that is
not being removed.

ok bluhm@


# 1.113 06-Apr-2017 dhill

Convert bcopy to memcpy where the memory does not overlap, otherwise,
use memmove. While here, change some previous conversions to a simple
assignment.

ok deraadt@
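
Editor's illustration of the rule applied in 1.113, as a minimal sketch with
hypothetical helper names. memcpy() is only defined for regions that do not
overlap; memmove() handles overlap, as in a shift within one buffer.

```c
#include <assert.h>
#include <string.h>

/*
 * Shift `len` bytes one position to the right inside the same buffer.
 * Source and destination overlap, so memmove() is required.
 * The buffer must hold at least len + 1 bytes.
 */
static void
demo_shift_right(char *buf, size_t len)
{
	memmove(buf + 1, buf, len);
}

/* Copy between two distinct buffers; no overlap, so memcpy() is fine. */
static void
demo_copy_out(char *dst, const char *src, size_t len)
{
	memcpy(dst, src, len);
}
```

The old bcopy() tolerated overlap, which is why each call site had to be
checked before choosing one replacement or the other.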


Revision tags: OPENBSD_6_1_BASE
# 1.112 17-Mar-2017 rzalamena

Be more strict on all route iterations, let's always make sure that we
are not going to get a unicast route by accident.

ok mpi@


# 1.111 14-Mar-2017 rzalamena

Make mfc_find() more strict when looking for routes, fixes a problem
causing ip_mforward() not to send packets to the userland multicast
routing daemon.

Reported and tested by Paul de Weerd.

ok bluhm@, claudio@


# 1.110 09-Feb-2017 rzalamena

Unbreak 'netstat -g' and make multicast route stats sysctl more robust.

ok mpi@


# 1.109 08-Feb-2017 jsg

Test for NULL before dereferencing a pointer not after.
ok krw@


# 1.108 01-Feb-2017 dhill

In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with
the netlock held. This also changes the prototypes of the *ctloutput
functions to take an mbuf instead of an mbuf pointer.

help, guidance from bluhm@ and mpi@
ok bluhm@


# 1.107 12-Jan-2017 rzalamena

Clean up multicast files from unused definitions and comments.

ok mpi@


# 1.106 11-Jan-2017 rzalamena

Remove mfc hash tables and use the OpenBSD routing table for multicast
routes. Beside the code simplification and removal, we also get to see
the multicast routes now in the route(8) utility.

ok mpi@


# 1.105 06-Jan-2017 rzalamena

Remove the global viftable vector that holds the virtual interfaces
configuration and instead use ifnet to store the configuration and
counters. With this we can safely use multicast routing daemons on
multiple domains without vif id colisions.

ok mpi@


# 1.104 06-Jan-2017 rzalamena

Simplify code by removing some old pullup macro, killing some variables
and using m_dup_pkt() instead of m_copym() with max_linkhdr space adjust
on packet sending to avoid more mbuf allocations.

with input from millert@ and mikeb@,
ok mikeb@


# 1.103 06-Jan-2017 mpi

Kill various splsoftnet().

ok rzalamena@, visa@


# 1.102 05-Jan-2017 rzalamena

Remove some unnecessary code abstractions and while here remove a
splsoftnet.

ok mikeb@


# 1.101 22-Dec-2016 rzalamena

Remove PIM support from the multicast stack.

ok mpi@


# 1.100 21-Dec-2016 mpi

Fix build without PIM defined.


# 1.99 21-Dec-2016 rzalamena

Fix PIM compilation even though it is disabled.

ok bluhm@


# 1.98 20-Dec-2016 rzalamena

Call the multicast timer callback per domain instead of for all domains
this way we save doing big tables walk and iterating tables that we don't
need to.

ok mpi@


# 1.97 20-Dec-2016 rzalamena

Remove unused timeout that was never being set.

ok reyk@


# 1.96 19-Dec-2016 rzalamena

Kill unused function.

ok mpi@


# 1.95 19-Dec-2016 rzalamena

Extend the multicast sockets and multicast hash table support to multiple
domains. This is one step towards supporting to run more than one multicast
socket in different domains at the same time.

ok mpi@


# 1.94 13-Dec-2016 rzalamena

Propagate the routing table id in ip_mrouter_set() so the MRT_ADD_VIF
calls won't fail anymore when doing from a different rdomain.

ok mpi@


# 1.93 29-Nov-2016 mpi

Kill unused 'struct route'.


# 1.92 29-Nov-2016 jsg

m_free() and m_freem() test for NULL. Simplify callers which had their own
NULL tests.

ok mpi@


# 1.91 24-Sep-2016 tedu

use hashfree. from Mathieu -
ok guenther


Revision tags: OPENBSD_6_0_BASE
# 1.90 07-Mar-2016 naddy

Sync no-argument function declaration and definition by adding (void).
ok mpi@ millert@


Revision tags: OPENBSD_5_9_BASE
# 1.89 14-Nov-2015 mpi

Remove mrtdebug and reduce differences with the v6 version.

Debug informations can already be accessed via mrtstat and pimstat.


# 1.88 13-Nov-2015 mpi

Do not cast malloc(9) results.


# 1.87 13-Nov-2015 mpi

Kill another tunnel leftover and keep PIM stuff inside #ifdef PIM.


# 1.86 12-Nov-2015 mpi

Kill another leftover from the tunnel support removal and add more PIM.


# 1.85 12-Nov-2015 mpi

Sync headers and get rid of #ifdef MROUTING.


# 1.84 12-Nov-2015 mpi

Remove VIFF_TUNNEL leftovers, tunnels aren't supported since 2006.

Even pimd(8) no longer support them.


# 1.83 12-Nov-2015 mpi

Fix PIM build.


# 1.82 12-Sep-2015 mpi

Introduce if_input_local() a function to feed local traffic back to
the protocol queues.

It basically does what looutput() was doing but having a generic
function will allow us to get rid of the loopback hack overwwritting
the rt_ifp field of RTF_LOCAL routes.

ok mikeb@, dlg@, claudio@


# 1.81 01-Sep-2015 bluhm

Replace sockaddr casts with the proper satosin(), ... calls.
From David Hill; OK mpi@; tested kspillner@; tweaks bluhm@


# 1.80 24-Aug-2015 bluhm

In kernel initialize struct sockaddr_in and sockaddr_in6 to zero
everywhere to avoid passing around pointers to uninitialized stack
memory. While there, fix the call to in6_recoverscope() in
fill_drlist().
OK deraadt@ mpi@


Revision tags: OPENBSD_5_8_BASE
# 1.79 15-Jul-2015 deraadt

rename mbuf ** parameter from m to mp, to match other similar code


# 1.78 30-Jun-2015 mpi

Get rid of the undocumented & temporary* m_copy() macro added for
compatibility with 4.3BSD in September 1989.

*Pick your own definition for "temporary".

ok bluhm@, claudio@, dlg@


Revision tags: OPENBSD_5_7_BASE
# 1.77 09-Feb-2015 claudio

Implement 2 sysctl to retrieve the multicast forwarding cache (mfc) and the
virtual interface table (vif). Will be used by netstat soon.
Looked over by guenther@


# 1.76 08-Feb-2015 claudio

De-static to make ddb hangman harder. OK phessler, henning


# 1.75 07-Feb-2015 dlg

mechanical conversion of this code to using siphash instead of some xors.

ok tedu@ claudio@


# 1.74 17-Dec-2014 mpi

Remove the "multicast_" prefix from the fields a multicast-only struct.

Prodded by claudio@ and mikeb@


# 1.73 17-Dec-2014 mpi

Use an interface index instead of a pointer for multicast options.

Output interface (port) selection for multicast traffic is not done via
route lookups. Instead the output ifp is registred when setsockopt(2)
is called with the IP{V6,}_MULTICAST_IF option. But since there is no
mechanism to invalidate such pointer stored in a pcb when an interface
is destroyed/removed, it might lead your kernel to fault.

Prevent a fault upon resume reported by frantisek holop, thanks!

ok mikeb@, claudio@


# 1.72 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


# 1.71 30-Sep-2014 jsg

add back the sys/sysctl.h include removed in rev 1.60
fixes the kernel build when PIM is defined


# 1.70 14-Aug-2014 mpi

No need for raw_cb.h


# 1.69 14-Aug-2014 mpi

Kill MRT_{ADD,DEL}_BW_UPCALL interfaces and the bandwidth monitoring
code that comes with them.

ok mikeb@, henning@


Revision tags: OPENBSD_5_6_BASE
# 1.68 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.67 12-Jul-2014 tedu

add a size argument to free. will be used soon, but for now default to 0.
after discussions with beck deraadt kettenis.


# 1.66 21-Apr-2014 henning

ip_output() using varargs always struck me as bizarre, esp since it's only
ever used to pass on uint32 (for ipsec). stop that madness and just pass
the uint32, 0 in all cases but the two that pass the ipsec flowinfo.
ok deraadt reyk guenther


# 1.65 21-Apr-2014 henning

we'll do fine without casting NULL to struct foo * / void *
ok gcc & md5 (alas, no binary change)


Revision tags: OPENBSD_5_5_BASE
# 1.64 09-Jan-2014 tedu

bzero/bcmp -> memset/memcmp. ok matthew


# 1.63 27-Oct-2013 deraadt

delete UPCALL_TIMING debug code from a the dark ages


# 1.62 23-Oct-2013 mpi

Remove the number of in_var.h inclusions by moving some functions and
global variables to in.h.

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_4_BASE
# 1.61 02-May-2013 mpi

tedu broken Resource Reservation Protocol code that was ifdef RSVP_ISI.

ok deraadt@, tedu@ (implicit)


# 1.60 28-Mar-2013 tedu

no need for a lot of code to include proc.h


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.59 04-Apr-2011 henning

de-guttenberg our stack a bit
we don't need 7 f***ing copies of the same code to do the protocol checksums
(or not, depending on hw capabilities). claudio ok


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.58 02-Jul-2010 blambert

m_copyback can fail to allocate memory, but is a void fucntion so gymnastics
are required to detect that.

Change the function to take a wait argument (used in nfs server, but
M_NOWAIT everywhere else for now) and to return an error

ok claudio@ henning@ krw@


# 1.57 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


Revision tags: OPENBSD_4_7_BASE
# 1.56 01-Aug-2009 blambert

timeout_add -> timeout_add_msec

ok michele@ claudio@


# 1.55 13-Jul-2009 michele

Get rid of the token bucket filter.
Traffic shaping code should not be inside routing code.
If you want to rate-limit use altq instead.

ok claudio@ henning@ dlg@


# 1.54 09-Jul-2009 michele

Use MAXTTL instead of the hardcoded value.


Revision tags: OPENBSD_4_6_BASE
# 1.53 05-Jun-2009 claudio

Initial support for routing domains. This allows to bind interfaces to
alternate routing table and separate them from other interfaces in distinct
routing tables. The same network can now be used in any doamin at the same
time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks accross
net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


Revision tags: OPENBSD_4_5_BASE
# 1.52 16-Sep-2008 chl

remove another dead store.

spotted by markus@

ok henning@ mpf@


# 1.51 15-Sep-2008 chl

remove dead stores and newly created unused variables.

Found by LLVM/Clang Static Analyzer.

ok mpf@ looks good mk@ ok henning@


Revision tags: OPENBSD_4_3_BASE OPENBSD_4_4_BASE
# 1.50 02-Jan-2008 brad

return with ENOTTY instead of EINVAL for unknown ioctl requests.

ok claudio@ krw@ dlg@


# 1.49 14-Dec-2007 deraadt

add sysctl entry points into various network layers, in particular to
provide netstat(1) with data it needs; ok claudio reyk


Revision tags: OPENBSD_4_2_BASE
# 1.48 22-May-2007 michele

ip_mroute.c is in bad shape.
This first step makes it style(9) compliant.
Just a whitespace diff, no binary change.

OK claudio@ norby@


# 1.47 10-Apr-2007 miod

``it's'' -> ``its'' when the grammar gods require this change.


Revision tags: OPENBSD_4_1_BASE
# 1.46 14-Feb-2007 jsg

Consistently spell FALLTHROUGH to appease lint.
ok kettenis@ cloder@ tom@ henning@


Revision tags: OPENBSD_4_0_BASE
# 1.45 15-Jun-2006 pascoe

Change cast of last vararg to ip_output to match what ip_output expects,
for clarity.

henning@ claudio@ ok


# 1.44 11-May-2006 hshoexer

fix corruption of pim register packets. From Hideki ONO, thanks!

ok mcbride@ itojun@


# 1.43 25-Apr-2006 claudio

Remove virtual tunnel support from the mrouting code. The virtual tunnel
code breaks multicast on gif(4) interfaces and it is far better to configure
a real gif(4) tunnel instead of a multicast tunnel as the latter is almost
not manageable. OK norby@, mblamer@


Revision tags: OPENBSD_3_8_BASE OPENBSD_3_9_BASE
# 1.42 25-Apr-2005 brad

csum -> csum_flags

ok krw@ canacar@


Revision tags: OPENBSD_3_7_BASE
# 1.41 15-Jan-2005 brad

fix comment


# 1.40 14-Jan-2005 mcbride

Duplicate nested if statement in PIM code.

From brad@


# 1.39 14-Jan-2005 mcbride

Add kernel support for Protocol Independent Multicast (PIM)
Information: http://netweb.usc.edu/pim/

From Pavlin Radoslavov <pavlin@icir.org>

ok deraadt@ brad@


# 1.38 24-Nov-2004 mcbride

Multicast routing cleanup from Pavlin Radoslavov
- sync ip_mroute.c with NetBSD
- import some FreeBSD changes to MFC entry handling
- set im->im_vif correctly when sending IGMPMSG_WRONGVIF
- increment mrtstat.mrts_upcalls correctly
- return error from get_sg_cnt() if there is no matching forwarding entry

ok henning@ brad@ naddy@


Revision tags: OPENBSD_3_6_BASE
# 1.37 24-Aug-2004 brad

Don't allow SIOCGET{VIF,SG}CNT from sockets other than the multicast router.

From NetBSD
Fixes PR 3825

ok mcbride@ canacar@ claudio@


Revision tags: OPENBSD_3_5_BASE SMP_SYNC_A SMP_SYNC_B
# 1.36 06-Jan-2004 markus

fix vlan destroy for MROUTING; report spamme@wouz.dk via tedu; ok itojun


# 1.35 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


# 1.34 10-Dec-2003 itojun

de-register. deraadt ok


Revision tags: OPENBSD_3_4_BASE
# 1.33 09-Jul-2003 itojun

do not flip ip_len/ip_off in netinet stack. deraadt ok.
(please test, especially PF portion)


# 1.32 09-Jul-2003 itojun

better vif_delete (no dangling ref to struct ifnet). deraadt ok
it won't affect default GENERIC build - as MROUTING is not defined


# 1.31 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: UBC_SYNC_A
# 1.30 14-May-2003 itojun

KNF. markus ok


# 1.29 06-May-2003 deraadt

string cleaning; tedu ok


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_B
# 1.28 28-Aug-2002 pefo

Fix a problem where passing NULL as a pointer with varargs does not promote
NULL to full 64 bits on a 64 bit address system. Solution is to add a
(void *) cast before NULL. This makes a 64 bit MIPS kernel work and will
probably help future 64 bit ports as well.

OK from art@


# 1.27 31-Jul-2002 itojun

remove $Id$


# 1.26 09-Jun-2002 itojun

whitespace


Revision tags: OPENBSD_3_1_BASE
# 1.25 15-Mar-2002 millert

Kill #if __STDC__ used to do K&R vs. ANSI varargs/stdarg; just do things
the ANSI way.


# 1.24 14-Mar-2002 millert

First round of __P removal in sys


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.23 26-Sep-2001 deraadt

branches: 1.23.4;
bring back the old copyright notice


# 1.22 19-Aug-2001 miod

More old timeouts removal, mainly affected unused/unmaintained code.


# 1.21 23-Jun-2001 fgsch

Remove unneeded ip_id conversions.
Instead of using HTONS macro in some places, use htons directly in the
struct member and save us a few bytes.
Fix comment.


Revision tags: OPENBSD_2_9_BASE
# 1.20 10-Nov-2000 provos

seperate -> separate, okay aaron@


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE SMP_BASE
# 1.19 21-Jan-2000 angelos

branches: 1.19.2;
Rename the ip4_* routines to ipip_*, make it so GIF tunnels are not
affected by net.inet.ipip.allow (the sysctl formerly known as
net.inet.ip4.allow), rename the VIF ipip_input to ipip_mroute_input.


Revision tags: OPENBSD_2_6_BASE kame_19991208
# 1.18 08-Aug-1999 niklas

undeclared variable


# 1.17 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.16 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


# 1.15 20-Apr-1999 niklas

Merge MROUTING and IPSEC wrt handling of IP-in-IP tunnelled packets.
Fix a panic case in the MROUTING code too. Drop M_TUNNEL support, nothing
ever uses it.


Revision tags: OPENBSD_2_5_BASE
# 1.14 05-Feb-1999 angelos

Clear mfchashtbl after deallocation (mycroft@netbsd)


# 1.13 08-Jan-1999 provos

dont call ip_randomid() in htons().


# 1.12 08-Jan-1999 deraadt

rip_input() should be called with a 0 terminator; cmetz


# 1.11 26-Dec-1998 provos

make ip_id random but ensure that ids dont repeat for some period.


Revision tags: OPENBSD_2_4_BASE
# 1.10 29-Jul-1998 angelos

Proper handling of IP in IP and checksumming.


# 1.9 03-Jul-1998 deraadt

wrong endian conversion caused vif stats to be wrong; jonny@jonny.eng.br


# 1.8 18-May-1998 provos

first step to the setsockopt/getsockopt interface as described in
draft-mcdonald-simple-ipsec-api, kernel notifies (EMT_REQUESTSA) signal
userland key management applications when security services are requested.
this is only for outgoing connections at the moment, incoming packets
are not yet checked against the selected socket policy.


Revision tags: OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 28-Sep-1997 deraadt

more \n in log()


Revision tags: OPENBSD_2_1_BASE
# 1.6 21-Feb-1997 angelos

Couple of missing ifdefs.


# 1.5 20-Feb-1997 deraadt

IPSEC package by John Ioannidis and Angelos D. Keromytis. Written in
Greece. From ftp.funet.fi:/pub/unix/security/net/ip/BSDipsec.tar.gz


Revision tags: OPENBSD_2_0_BASE
# 1.4 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.3 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.131 15-Dec-2021 deraadt

structure pads can leak uninitialized memory to userland via copyout,
therefore the mandatory idiom is completely clearing structs before
building them for copyout -- that means ALMOST ALL STRUCTS, because
we never know when some architecture will pad a struct.. In two more
cases, the clearing wasn't performed.
from Reno Robert ZDI
ok millert bluhm


Revision tags: OPENBSD_6_8_BASE OPENBSD_6_9_BASE OPENBSD_7_0_BASE
# 1.130 27-May-2020 mpi

branches: 1.130.2; 1.130.6;
Document the various flavors of NET_LOCK() and rename the reader version.

Since our last concurrency mistake only ioctl(2) and sysctl(2) code path
take the reader lock. This is mostly for documentation purpose as long as
the softnet thread is converted back to use a read lock.

dlg@ said that comments should be good enough.

ok sashan@


Revision tags: OPENBSD_6_7_BASE
# 1.129 15-Mar-2020 visa

Guard SIOCDELMULTI if_ioctl calls with KERNEL_LOCK() where the call is
made from socket close path. Most device drivers are not MP-safe yet,
and the closing of AF_INET and AF_INET6 sockets is no longer under the
kernel lock.

This fixes a panic seen by jcs@.

OK mpi@


Revision tags: OPENBSD_6_6_BASE
# 1.128 02-Sep-2019 bluhm

Fix a route use after free in multicast route. Move the rt_mcast_del()
out of the rtable_walk(). This avoids recursion to prevent stack
overflow. Also it allows freeing the route outside of the walk.
Now mrt_mcast_del() frees the route only when it is deleted from
the routing table. If that fails, it must not be freed. After the
route is returned by mfc_find(), it is reference counted. Then we
need a rtfree(), but not in the other cases.
Move rt_timer_remove_all() into rt_mcast_del().
OK mpi@


# 1.127 21-Jun-2019 mpi

Prevent recursions by not deleting entries inside rtable_walk(9).

rtable_walk(9) now passes a routing entry back to the caller when
a non zero value is returned and if it asked for it.
This allows us to call rtdeletemsg()/rtrequest_delete() from the
caller without creating a recursion because of rtflushclone().

Multicast code hasn't been adapted and is still possibly creating
recursions. However multicast route entries aren't cloned so if
a recursion exists it isn't because of rtflushclone().

Fix stack exhaustion triggered by the use of "-msave-args".

Issue reported by Dániel Lévai on bugs@ confirmed by and ok bluhm@.


# 1.126 04-Jun-2019 anton

Add missing NULL check for the protocol control block (pcb) pointer in
mrt{6,}_ioctl. Calling shutdown(2) on the socket prior to the ioctl
command can cause it to be NULL.

ok bluhm@ claudio@

Reported-by: syzbot+bdc489ecb509995a21ed@syzkaller.appspotmail.com
Reported-by: syzbot+156405fdea9f2ab15d40@syzkaller.appspotmail.com


Revision tags: OPENBSD_6_5_BASE
# 1.125 13-Feb-2019 dlg

change rt_ifa_add and rt_ifa_del so they take an rdomain argument.

this allows mpls interfaces (mpe, mpw) to pass the rdomain they
wish the local label to be in, rather than have it implicitly forced
to 0 by these functions. right now they'll pass 0, but it will soon
be possible to have them rx packets in other rdomains.

previously the functions used ifp->if_rdomain for the rdomain.
everything other than mpls still passes ifp->if_rdomain.

ok mpi@


# 1.124 10-Feb-2019 dlg

remove the implicit RTF_MPATH flag that rt_ifa_add() sets on new routes.

MPLS interfaces (ab)use rt_ifa_add for adding the local MPLS label
that they listen on for incoming packets, while every other use of
rt_ifa_add is for adding addresses on local interfaces. MPLS does
this cos the addresses involved are in basically the same shape as
ones used for setting up local addresses.

It is appropriate for interfaces to want RTF_MPATH on local addresses,
but in the MPLS case it means you can have multiple local things
listening on the same label, which doesn't actually work. mpe in
particular keeps track of in use labels so it can handle collisions,
however, mpw does not. It is currently possible to have multiple
mpw interfaces on the same local label, and sharing the same label
as mpe or possible normal forwarding labels.

Moving the RTF_MPATH flag out of rt_ifa_add means all the callers
that still want it need to pass it themselves. The mpe and mpw
callers are left alone without the flag, and will now get EEXIST
from rt_ifa_add when a label is already in use.

ok (and a huge amount of patience and help) mpi@
claudio@ is ok with the idea, but saw a much much earlier solution
to the problem


Revision tags: OPENBSD_6_4_BASE
# 1.123 10-Oct-2018 reyk

RT_TABLEID_MAX is 255, fix places that assumed that it is less than 255.

rtable 255 is a valid routing table or domain id that wasn't handled
by the ip[6]_mroute code or by snmpd. The arrays in the ip[6]_mroute
code were off by one and didn't allocate space for rtable 255; snmpd
simply ignored rtable 255. All other places in the tree seem to
handle RT_TABLEID_MAX correctly.

OK florian@ benno@ henning@ deraadt@


# 1.122 30-Apr-2018 tb

Reduce the scope of the NET_LOCK() in in_control(). Two functions were
protected: mrt_ioctl() and in_ioctl(). The former has no other callers
and only needs a read lock. The latter will need refactoring to reduce
the lock's scope further. In a first step, establish a single exit point
and protect most of the function body with the NET_LOCK() while removing
the NET_LOCK() from a handful of callers.

suggested by & ok mpi, ok visa


Revision tags: OPENBSD_6_2_BASE OPENBSD_6_3_BASE
# 1.121 01-Sep-2017 mpi

Change sosetopt() to no longer free the mbuf it receives and change
all the callers to call m_freem(9).

Support from deraadt@ and tedu@, ok visa@, bluhm@


# 1.120 26-Jun-2017 mpi

Assert that the corresponding socket is locked when manipulating socket
buffers.

This is one step towards unlocking TCP input path. Note that all the
functions asserting for the socket lock are not necessarily MP-safe.
All the fields of 'struct socket' aren't protected.

Introduce a new kernel-only kqueue hint, NOTE_SUBMIT, to be able to
tell when a filter needs to lock the underlying data structures. Logic
and name taken from NetBSD.

Tested by Hrvoje Popovski.

ok claudio@, bluhm@, mikeb@


# 1.119 19-Jun-2017 bluhm

The IP multicast forward functions return an errno, call the variable
error. Make the ip_mforward() return value consistent. Simplify
the caller logic in ipv6_input() like in IPv4.
OK mpi@


# 1.118 16-May-2017 rzalamena

Sync three changes that were caught by IPv6 multicast routing review:

* use a variable to allow disabling debugs on run-time
* fix a potential memory leak on copyout() failure
* don't just blindly use the first address provided by ifalist

ok bluhm@


# 1.117 16-May-2017 rzalamena

Make return values more meaningful by using errno instead of -1 or 1.

ok bluhm@


# 1.116 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.115 16-May-2017 rzalamena

Let malloc() block when the caller of the add route function is
setsockopt(), otherwise use non-blocking malloc() for network stack
calls.

ok bluhm@


# 1.114 16-May-2017 rzalamena

Call rtfree() after each use of routes and make sure the route is valid
when finding one. Since rtfree() is being called and rt_llinfo being
removed, add checks everywhere to make sure we are using a route that is
not being removed.

ok bluhm@


# 1.113 06-Apr-2017 dhill

Convert bcopy to memcpy where the memory does not overlap, otherwise,
use memmove. While here, change some previous conversions to a simple
assignment.

ok deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.112 17-Mar-2017 rzalamena

Be more strict on all route iterations, lets always make sure that we
are not going to get a unicast route by accident.

ok mpi@


# 1.111 14-Mar-2017 rzalamena

Make mfc_find() more strict when looking for routes, fixes a problem
causing ip_mforward() not to send packets to the userland multicast
routing daemon.

Reported and tested by Paul de Weerd.

ok bluhm@, claudio@


# 1.110 09-Feb-2017 rzalamena

Unbreak 'netstat -g' and make multicast route stats sysctl more robust.

ok mpi@


# 1.109 08-Feb-2017 jsg

Test for NULL before dereferencing a pointer not after.
ok krw@


# 1.108 01-Feb-2017 dhill

In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with
the netlock held. This also changes the prototypes of the *ctloutput
functions to take an mbuf instead of an mbuf pointer.

help, guidance from bluhm@ and mpi@
ok bluhm@


# 1.107 12-Jan-2017 rzalamena

Clean up multicast files from unused definitions and comments.

ok mpi@


# 1.106 11-Jan-2017 rzalamena

Remove mfc hash tables and use the OpenBSD routing table for multicast
routes. Beside the code simplification and removal, we also get to see
the multicast routes now in the route(8) utility.

ok mpi@


# 1.105 06-Jan-2017 rzalamena

Remove the global viftable vector that holds the virtual interfaces
configuration and instead use ifnet to store the configuration and
counters. With this we can safely use multicast routing daemons on
multiple domains without vif id collisions.

ok mpi@


# 1.104 06-Jan-2017 rzalamena

Simplify code by removing some old pullup macro, killing some variables
and using m_dup_pkt() instead of m_copym() with max_linkhdr space adjust
on packet sending to avoid more mbuf allocations.

with input from millert@ and mikeb@,
ok mikeb@


# 1.103 06-Jan-2017 mpi

Kill various splsoftnet().

ok rzalamena@, visa@


# 1.102 05-Jan-2017 rzalamena

Remove some unnecessary code abstractions and while here remove a
splsoftnet.

ok mikeb@


# 1.101 22-Dec-2016 rzalamena

Remove PIM support from the multicast stack.

ok mpi@


# 1.100 21-Dec-2016 mpi

Fix build without PIM defined.


# 1.99 21-Dec-2016 rzalamena

Fix PIM compilation even though it is disabled.

ok bluhm@


# 1.98 20-Dec-2016 rzalamena

Call the multicast timer callback per domain instead of for all domains
this way we save doing big tables walk and iterating tables that we don't
need to.

ok mpi@


# 1.97 20-Dec-2016 rzalamena

Remove unused timeout that was never being set.

ok reyk@


# 1.96 19-Dec-2016 rzalamena

Kill unused function.

ok mpi@


# 1.95 19-Dec-2016 rzalamena

Extend the multicast sockets and multicast hash table support to multiple
domains. This is one step towards supporting to run more than one multicast
socket in different domains at the same time.

ok mpi@


# 1.94 13-Dec-2016 rzalamena

Propagate the routing table id in ip_mrouter_set() so the MRT_ADD_VIF
calls won't fail anymore when doing from a different rdomain.

ok mpi@


# 1.93 29-Nov-2016 mpi

Kill unused 'struct route'.


# 1.92 29-Nov-2016 jsg

m_free() and m_freem() test for NULL. Simplify callers which had their own
NULL tests.

ok mpi@


# 1.91 24-Sep-2016 tedu

use hashfree. from Mathieu -
ok guenther


Revision tags: OPENBSD_6_0_BASE
# 1.90 07-Mar-2016 naddy

Sync no-argument function declaration and definition by adding (void).
ok mpi@ millert@


Revision tags: OPENBSD_5_9_BASE
# 1.89 14-Nov-2015 mpi

Remove mrtdebug and reduce differences with the v6 version.

Debug informations can already be accessed via mrtstat and pimstat.


# 1.88 13-Nov-2015 mpi

Do not cast malloc(9) results.


# 1.87 13-Nov-2015 mpi

Kill another tunnel leftover and keep PIM stuff inside #ifdef PIM.


# 1.86 12-Nov-2015 mpi

Kill another leftover from the tunnel support removal and add more PIM.


# 1.85 12-Nov-2015 mpi

Sync headers and get rid of #ifdef MROUTING.


# 1.84 12-Nov-2015 mpi

Remove VIFF_TUNNEL leftovers, tunnels aren't supported since 2006.

Even pimd(8) no longer support them.


# 1.83 12-Nov-2015 mpi

Fix PIM build.


# 1.82 12-Sep-2015 mpi

Introduce if_input_local() a function to feed local traffic back to
the protocol queues.

It basically does what looutput() was doing but having a generic
function will allow us to get rid of the loopback hack overwriting
the rt_ifp field of RTF_LOCAL routes.

ok mikeb@, dlg@, claudio@


# 1.81 01-Sep-2015 bluhm

Replace sockaddr casts with the proper satosin(), ... calls.
From David Hill; OK mpi@; tested kspillner@; tweaks bluhm@


# 1.80 24-Aug-2015 bluhm

In kernel initialize struct sockaddr_in and sockaddr_in6 to zero
everywhere to avoid passing around pointers to uninitialized stack
memory. While there, fix the call to in6_recoverscope() in
fill_drlist().
OK deraadt@ mpi@


Revision tags: OPENBSD_5_8_BASE
# 1.79 15-Jul-2015 deraadt

rename mbuf ** parameter from m to mp, to match other similar code


# 1.78 30-Jun-2015 mpi

Get rid of the undocumented & temporary* m_copy() macro added for
compatibility with 4.3BSD in September 1989.

*Pick your own definition for "temporary".

ok bluhm@, claudio@, dlg@


Revision tags: OPENBSD_5_7_BASE
# 1.77 09-Feb-2015 claudio

Implement 2 sysctl to retrieve the multicast forwarding cache (mfc) and the
virtual interface table (vif). Will be used by netstat soon.
Looked over by guenther@


# 1.76 08-Feb-2015 claudio

De-static to make ddb hangman harder. OK phessler, henning


# 1.75 07-Feb-2015 dlg

mechanical conversion of this code to using siphash instead of some xors.

ok tedu@ claudio@


# 1.74 17-Dec-2014 mpi

Remove the "multicast_" prefix from the fields of a multicast-only struct.

Prodded by claudio@ and mikeb@


# 1.73 17-Dec-2014 mpi

Use an interface index instead of a pointer for multicast options.

Output interface (port) selection for multicast traffic is not done via
route lookups. Instead the output ifp is registered when setsockopt(2)
is called with the IP{V6,}_MULTICAST_IF option. But since there is no
mechanism to invalidate such pointer stored in a pcb when an interface
is destroyed/removed, it might lead your kernel to fault.

Prevent a fault upon resume reported by frantisek holop, thanks!

ok mikeb@, claudio@


# 1.72 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


# 1.71 30-Sep-2014 jsg

add back the sys/sysctl.h include removed in rev 1.60
fixes the kernel build when PIM is defined


# 1.70 14-Aug-2014 mpi

No need for raw_cb.h


# 1.69 14-Aug-2014 mpi

Kill MRT_{ADD,DEL}_BW_UPCALL interfaces and the bandwidth monitoring
code that comes with them.

ok mikeb@, henning@


Revision tags: OPENBSD_5_6_BASE
# 1.68 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.67 12-Jul-2014 tedu

add a size argument to free. will be used soon, but for now default to 0.
after discussions with beck deraadt kettenis.


# 1.66 21-Apr-2014 henning

ip_output() using varargs always struck me as bizarre, esp since it's only
ever used to pass on uint32 (for ipsec). stop that madness and just pass
the uint32, 0 in all cases but the two that pass the ipsec flowinfo.
ok deraadt reyk guenther


# 1.65 21-Apr-2014 henning

we'll do fine without casting NULL to struct foo * / void *
ok gcc & md5 (alas, no binary change)


Revision tags: OPENBSD_5_5_BASE
# 1.64 09-Jan-2014 tedu

bzero/bcmp -> memset/memcmp. ok matthew


# 1.63 27-Oct-2013 deraadt

delete UPCALL_TIMING debug code from the dark ages


# 1.62 23-Oct-2013 mpi

Remove the number of in_var.h inclusions by moving some functions and
global variables to in.h.

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_4_BASE
# 1.61 02-May-2013 mpi

tedu broken Resource Reservation Protocol code that was ifdef RSVP_ISI.

ok deraadt@, tedu@ (implicit)


# 1.60 28-Mar-2013 tedu

no need for a lot of code to include proc.h


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.59 04-Apr-2011 henning

de-guttenberg our stack a bit
we don't need 7 f***ing copies of the same code to do the protocol checksums
(or not, depending on hw capabilities). claudio ok


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.58 02-Jul-2010 blambert

m_copyback can fail to allocate memory, but is a void function so gymnastics
are required to detect that.

Change the function to take a wait argument (used in nfs server, but
M_NOWAIT everywhere else for now) and to return an error

ok claudio@ henning@ krw@


# 1.57 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


Revision tags: OPENBSD_4_7_BASE
# 1.56 01-Aug-2009 blambert

timeout_add -> timeout_add_msec

ok michele@ claudio@


# 1.55 13-Jul-2009 michele

Get rid of the token bucket filter.
Traffic shaping code should not be inside routing code.
If you want to rate-limit use altq instead.

ok claudio@ henning@ dlg@


# 1.54 09-Jul-2009 michele

Use MAXTTL instead of the hardcoded value.


Revision tags: OPENBSD_4_6_BASE
# 1.53 05-Jun-2009 claudio

Initial support for routing domains. This allows to bind interfaces to
alternate routing table and separate them from other interfaces in distinct
routing tables. The same network can now be used in any domain at the same
time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks across
net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


Revision tags: OPENBSD_4_5_BASE
# 1.52 16-Sep-2008 chl

remove another dead store.

spotted by markus@

ok henning@ mpf@


# 1.51 15-Sep-2008 chl

remove dead stores and newly created unused variables.

Found by LLVM/Clang Static Analyzer.

ok mpf@ looks good mk@ ok henning@


Revision tags: OPENBSD_4_3_BASE OPENBSD_4_4_BASE
# 1.50 02-Jan-2008 brad

return with ENOTTY instead of EINVAL for unknown ioctl requests.

ok claudio@ krw@ dlg@


# 1.49 14-Dec-2007 deraadt

add sysctl entry points into various network layers, in particular to
provide netstat(1) with data it needs; ok claudio reyk


Revision tags: OPENBSD_4_2_BASE
# 1.48 22-May-2007 michele

ip_mroute.c is in bad shape.
This first step makes it style(9) compliant.
Just a whitespace diff, no binary change.

OK claudio@ norby@


# 1.47 10-Apr-2007 miod

``it's'' -> ``its'' when the grammar gods require this change.


Revision tags: OPENBSD_4_1_BASE
# 1.46 14-Feb-2007 jsg

Consistently spell FALLTHROUGH to appease lint.
ok kettenis@ cloder@ tom@ henning@


Revision tags: OPENBSD_4_0_BASE
# 1.45 15-Jun-2006 pascoe

Change cast of last vararg to ip_output to match what ip_output expects,
for clarity.

henning@ claudio@ ok


# 1.44 11-May-2006 hshoexer

fix corruption of pim register packets. From Hideki ONO, thanks!

ok mcbride@ itojun@


# 1.43 25-Apr-2006 claudio

Remove virtual tunnel support from the mrouting code. The virtual tunnel
code breaks multicast on gif(4) interfaces and it is far better to configure
a real gif(4) tunnel instead of a multicast tunnel as the latter is almost
not manageable. OK norby@, mblamer@


Revision tags: OPENBSD_3_8_BASE OPENBSD_3_9_BASE
# 1.42 25-Apr-2005 brad

csum -> csum_flags

ok krw@ canacar@


Revision tags: OPENBSD_3_7_BASE
# 1.41 15-Jan-2005 brad

fix comment


# 1.40 14-Jan-2005 mcbride

Duplicate nested if statement in PIM code.

From brad@


# 1.39 14-Jan-2005 mcbride

Add kernel support for Protocol Independent Multicast (PIM)
Information: http://netweb.usc.edu/pim/

From Pavlin Radoslavov <pavlin@icir.org>

ok deraadt@ brad@


# 1.38 24-Nov-2004 mcbride

Multicast routing cleanup from Pavlin Radoslavov
- sync ip_mroute.c with NetBSD
- import some FreeBSD changes to MFC entry handling
- set im->im_vif correctly when sending IGMPMSG_WRONGVIF
- increment mrtstat.mrts_upcalls correctly
- return error from get_sg_cnt() if there is no matching forwarding entry

ok henning@ brad@ naddy@


Revision tags: OPENBSD_3_6_BASE
# 1.37 24-Aug-2004 brad

Don't allow SIOCGET{VIF,SG}CNT from sockets other than the multicast router.

From NetBSD
Fixes PR 3825

ok mcbride@ canacar@ claudio@


Revision tags: OPENBSD_3_5_BASE SMP_SYNC_A SMP_SYNC_B
# 1.36 06-Jan-2004 markus

fix vlan destroy for MROUTING; report spamme@wouz.dk via tedu; ok itojun


# 1.35 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


# 1.34 10-Dec-2003 itojun

de-register. deraadt ok


Revision tags: OPENBSD_3_4_BASE
# 1.33 09-Jul-2003 itojun

do not flip ip_len/ip_off in netinet stack. deraadt ok.
(please test, especially PF portion)


# 1.32 09-Jul-2003 itojun

better vif_delete (no dangling ref to struct ifnet). deraadt ok
it won't affect default GENERIC build - as MROUTING is not defined


# 1.31 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: UBC_SYNC_A
# 1.30 14-May-2003 itojun

KNF. markus ok


# 1.29 06-May-2003 deraadt

string cleaning; tedu ok


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_B
# 1.28 28-Aug-2002 pefo

Fix a problem where passing NULL as a pointer with varargs does not promote
NULL to full 64 bits on a 64 bit address system. Solution is to add a
(void *) cast before NULL. This makes a 64 bit MIPS kernel work and will
probably help future 64 bit ports as well.

OK from art@


# 1.27 31-Jul-2002 itojun

remove $Id$


# 1.26 09-Jun-2002 itojun

whitespace


Revision tags: OPENBSD_3_1_BASE
# 1.25 15-Mar-2002 millert

Kill #if __STDC__ used to do K&R vs. ANSI varargs/stdarg; just do things
the ANSI way.


# 1.24 14-Mar-2002 millert

First round of __P removal in sys


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.23 26-Sep-2001 deraadt

branches: 1.23.4;
bring back the old copyright notice


# 1.22 19-Aug-2001 miod

More old timeouts removal, mainly affected unused/unmaintained code.


# 1.21 23-Jun-2001 fgsch

Remove unneeded ip_id conversions.
Instead of using HTONS macro in some places, use htons directly in the
struct member and save us a few bytes.
Fix comment.


Revision tags: OPENBSD_2_9_BASE
# 1.20 10-Nov-2000 provos

seperate -> separate, okay aaron@


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE SMP_BASE
# 1.19 21-Jan-2000 angelos

branches: 1.19.2;
Rename the ip4_* routines to ipip_*, make it so GIF tunnels are not
affected by net.inet.ipip.allow (the sysctl formerly known as
net.inet.ip4.allow), rename the VIF ipip_input to ipip_mroute_input.


Revision tags: OPENBSD_2_6_BASE kame_19991208
# 1.18 08-Aug-1999 niklas

undeclared variable


# 1.17 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.16 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


# 1.15 20-Apr-1999 niklas

Merge MROUTING and IPSEC wrt handling of IP-in-IP tunnelled packets.
Fix a panic case in the MROUTING code too. Drop M_TUNNEL support, nothing
ever uses it.


Revision tags: OPENBSD_2_5_BASE
# 1.14 05-Feb-1999 angelos

Clear mfchashtbl after deallocation (mycroft@netbsd)


# 1.13 08-Jan-1999 provos

dont call ip_randomid() in htons().


# 1.12 08-Jan-1999 deraadt

rip_input() should be called with a 0 terminator; cmetz


# 1.11 26-Dec-1998 provos

make ip_id random but ensure that ids dont repeat for some period.


Revision tags: OPENBSD_2_4_BASE
# 1.10 29-Jul-1998 angelos

Proper handling of IP in IP and checksumming.


# 1.9 03-Jul-1998 deraadt

wrong endian conversion caused vif stats to be wrong; jonny@jonny.eng.br


# 1.8 18-May-1998 provos

first step to the setsockopt/getsockopt interface as described in
draft-mcdonald-simple-ipsec-api, kernel notifies (EMT_REQUESTSA) signal
userland key management applications when security services are requested.
this is only for outgoing connections at the moment, incoming packets
are not yet checked against the selected socket policy.


Revision tags: OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 28-Sep-1997 deraadt

more \n in log()


Revision tags: OPENBSD_2_1_BASE
# 1.6 21-Feb-1997 angelos

Couple of missing ifdefs.


# 1.5 20-Feb-1997 deraadt

IPSEC package by John Ioannidis and Angelos D. Keromytis. Written in
Greece. From ftp.funet.fi:/pub/unix/security/net/ip/BSDipsec.tar.gz


Revision tags: OPENBSD_2_0_BASE
# 1.4 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.3 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.130 27-May-2020 mpi

Document the various flavors of NET_LOCK() and rename the reader version.

Since our last concurrency mistake only ioctl(2) and sysctl(2) code path
take the reader lock. This is mostly for documentation purpose as long as
the softnet thread is converted back to use a read lock.

dlg@ said that comments should be good enough.

ok sashan@


Revision tags: OPENBSD_6_7_BASE
# 1.129 15-Mar-2020 visa

Guard SIOCDELMULTI if_ioctl calls with KERNEL_LOCK() where the call is
made from socket close path. Most device drivers are not MP-safe yet,
and the closing of AF_INET and AF_INET6 sockets is no longer under the
kernel lock.

This fixes a panic seen by jcs@.

OK mpi@


Revision tags: OPENBSD_6_6_BASE
# 1.128 02-Sep-2019 bluhm

Fix a route use after free in multicast route. Move the rt_mcast_del()
out of the rtable_walk(). This avoids recursion to prevent stack
overflow. Also it allows freeing the route outside of the walk.
Now mrt_mcast_del() frees the route only when it is deleted from
the routing table. If that fails, it must not be freed. After the
route is returned by mfc_find(), it is reference counted. Then we
need a rtfree(), but not in the other cases.
Move rt_timer_remove_all() into rt_mcast_del().
OK mpi@


# 1.127 21-Jun-2019 mpi

Prevent recursions by not deleting entries inside rtable_walk(9).

rtable_walk(9) now passes a routing entry back to the caller when
a non zero value is returned and if it asked for it.
This allows us to call rtdeletemsg()/rtrequest_delete() from the
caller without creating a recursion because of rtflushclone().

Multicast code hasn't been adapted and is still possibly creating
recursions. However multicast route entries aren't cloned so if
a recursion exists it isn't because of rtflushclone().

Fix stack exhaustion triggered by the use of "-msave-args".

Issue reported by Dániel Lévai on bugs@ confirmed by and ok bluhm@.


# 1.126 04-Jun-2019 anton

Add missing NULL check for the protocol control block (pcb) pointer in
mrt{6,}_ioctl. Calling shutdown(2) on the socket prior to the ioctl
command can cause it to be NULL.

ok bluhm@ claudio@

Reported-by: syzbot+bdc489ecb509995a21ed@syzkaller.appspotmail.com
Reported-by: syzbot+156405fdea9f2ab15d40@syzkaller.appspotmail.com


Revision tags: OPENBSD_6_5_BASE
# 1.125 13-Feb-2019 dlg

change rt_ifa_add and rt_ifa_del so they take an rdomain argument.

this allows mpls interfaces (mpe, mpw) to pass the rdomain they
wish the local label to be in, rather than have it implicitly forced
to 0 by these functions. right now they'll pass 0, but it will soon
be possible to have them rx packets in other rdomains.

previously the functions used ifp->if_rdomain for the rdomain.
everything other than mpls still passes ifp->if_rdomain.

ok mpi@


# 1.124 10-Feb-2019 dlg

remove the implicit RTF_MPATH flag that rt_ifa_add() sets on new routes.

MPLS interfaces (ab)use rt_ifa_add for adding the local MPLS label
that they listen on for incoming packets, while every other use of
rt_ifa_add is for adding addresses on local interfaces. MPLS does
this cos the addresses involved are in basically the same shape as
ones used for setting up local addresses.

It is appropriate for interfaces to want RTF_MPATH on local addresses,
but in the MPLS case it means you can have multiple local things
listening on the same label, which doesn't actually work. mpe in
particular keeps track of in use labels so it can handle collisions,
however, mpw does not. It is currently possible to have multiple
mpw interfaces on the same local label, and sharing the same label
as mpe or possibly normal forwarding labels.

Moving the RTF_MPATH flag out of rt_ifa_add means all the callers
that still want it need to pass it themselves. The mpe and mpw
callers are left alone without the flag, and will now get EEXIST
from rt_ifa_add when a label is already in use.

ok (and a huge amount of patience and help) mpi@
claudio@ is ok with the idea, but saw a much much earlier solution
to the problem


Revision tags: OPENBSD_6_4_BASE
# 1.123 10-Oct-2018 reyk

RT_TABLEID_MAX is 255, fix places that assumed that it is less than 255.

rtable 255 is a valid routing table or domain id that wasn't handled
by the ip[6]_mroute code or by snmpd. The arrays in the ip[6]_mroute
code were off by one and didn't allocate space for rtable 255; snmpd
simply ignored rtable 255. All other places in the tree seem to
handle RT_TABLEID_MAX correctly.

OK florian@ benno@ henning@ deraadt@


# 1.122 30-Apr-2018 tb

Reduce the scope of the NET_LOCK() in in_control(). Two functions were
protected: mrt_ioctl() and in_ioctl(). The former has no other callers
and only needs a read lock. The latter will need refactoring to reduce
the lock's scope further. In a first step, establish a single exit point
and protect most of the function body with the NET_LOCK() while removing
the NET_LOCK() from a handful of callers.

suggested by & ok mpi, ok visa


Revision tags: OPENBSD_6_2_BASE OPENBSD_6_3_BASE
# 1.121 01-Sep-2017 mpi

Change sosetopt() to no longer free the mbuf it receives and change
all the callers to call m_freem(9).

Support from deraadt@ and tedu@, ok visa@, bluhm@


# 1.120 26-Jun-2017 mpi

Assert that the corresponding socket is locked when manipulating socket
buffers.

This is one step towards unlocking TCP input path. Note that all the
functions asserting for the socket lock are not necessarily MP-safe.
All the fields of 'struct socket' aren't protected.

Introduce a new kernel-only kqueue hint, NOTE_SUBMIT, to be able to
tell when a filter needs to lock the underlying data structures. Logic
and name taken from NetBSD.

Tested by Hrvoje Popovski.

ok claudio@, bluhm@, mikeb@


# 1.119 19-Jun-2017 bluhm

The IP multicast forward functions return an errno, call the variable
error. Make the ip_mforward() return value consistent. Simplify
the caller logic in ipv6_input() like in IPv4.
OK mpi@


# 1.118 16-May-2017 rzalamena

Sync three changes that were caught by IPv6 multicast routing review:

* use a variable to allow disabling debugs on run-time
* fix a potential memory leak on copyout() failure
* don't just blindly use the first address provided by ifalist

ok bluhm@


# 1.117 16-May-2017 rzalamena

Make return values more meaningful by using errno instead of -1 or 1.

ok bluhm@


# 1.116 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.115 16-May-2017 rzalamena

Let malloc() block when the caller of the add route function is
setsockopt(), otherwise use non-blocking malloc() for network stack
calls.

ok bluhm@


# 1.114 16-May-2017 rzalamena

Call rtfree() after each use of routes and make sure the route is valid
when finding one. Since rtfree() is being called and rt_llinfo being
removed, add checks everywhere to make sure we are using a route that is
not being removed.

ok bluhm@


# 1.113 06-Apr-2017 dhill

Convert bcopy to memcpy where the memory does not overlap, otherwise,
use memmove. While here, change some previous conversions to a simple
assignment.

ok deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.112 17-Mar-2017 rzalamena

Be more strict on all route iterations, lets always make sure that we
are not going to get a unicast route by accident.

ok mpi@


# 1.111 14-Mar-2017 rzalamena

Make mfc_find() more strict when looking for routes, fixes a problem
causing ip_mforward() not to send packets to the userland multicast
routing daemon.

Reported and tested by Paul de Weerd.

ok bluhm@, claudio@


# 1.110 09-Feb-2017 rzalamena

Unbreak 'netstat -g' and make multicast route stats sysctl more robust.

ok mpi@


# 1.109 08-Feb-2017 jsg

Test for NULL before dereferencing a pointer not after.
ok krw@


# 1.108 01-Feb-2017 dhill

In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with
the netlock held. This also changes the prototypes of the *ctloutput
functions to take an mbuf instead of an mbuf pointer.

help, guidance from bluhm@ and mpi@
ok bluhm@


# 1.107 12-Jan-2017 rzalamena

Clean up multicast files from unused definitions and comments.

ok mpi@


# 1.106 11-Jan-2017 rzalamena

Remove mfc hash tables and use the OpenBSD routing table for multicast
routes. Beside the code simplification and removal, we also get to see
the multicast routes now in the route(8) utility.

ok mpi@


# 1.105 06-Jan-2017 rzalamena

Remove the global viftable vector that holds the virtual interfaces
configuration and instead use ifnet to store the configuration and
counters. With this we can safely use multicast routing daemons on
multiple domains without vif id collisions.

ok mpi@


# 1.104 06-Jan-2017 rzalamena

Simplify code by removing some old pullup macro, killing some variables
and using m_dup_pkt() instead of m_copym() with max_linkhdr space adjust
on packet sending to avoid more mbuf allocations.

with input from millert@ and mikeb@,
ok mikeb@


# 1.103 06-Jan-2017 mpi

Kill various splsoftnet().

ok rzalamena@, visa@


# 1.102 05-Jan-2017 rzalamena

Remove some unnecessary code abstractions and while here remove a
splsoftnet.

ok mikeb@


# 1.101 22-Dec-2016 rzalamena

Remove PIM support from the multicast stack.

ok mpi@


# 1.100 21-Dec-2016 mpi

Fix build without PIM defined.


# 1.99 21-Dec-2016 rzalamena

Fix PIM compilation even though it is disabled.

ok bluhm@


# 1.98 20-Dec-2016 rzalamena

Call the multicast timer callback per domain instead of for all domains
this way we save doing big tables walk and iterating tables that we don't
need to.

ok mpi@


# 1.97 20-Dec-2016 rzalamena

Remove unused timeout that was never being set.

ok reyk@


# 1.96 19-Dec-2016 rzalamena

Kill unused function.

ok mpi@


# 1.95 19-Dec-2016 rzalamena

Extend the multicast sockets and multicast hash table support to multiple
domains. This is one step towards supporting to run more than one multicast
socket in different domains at the same time.

ok mpi@


# 1.94 13-Dec-2016 rzalamena

Propagate the routing table id in ip_mrouter_set() so the MRT_ADD_VIF
calls won't fail anymore when doing from a different rdomain.

ok mpi@


# 1.93 29-Nov-2016 mpi

Kill unused 'struct route'.


# 1.92 29-Nov-2016 jsg

m_free() and m_freem() test for NULL. Simplify callers which had their own
NULL tests.

ok mpi@


# 1.91 24-Sep-2016 tedu

use hashfree. from Mathieu -
ok guenther


Revision tags: OPENBSD_6_0_BASE
# 1.90 07-Mar-2016 naddy

Sync no-argument function declaration and definition by adding (void).
ok mpi@ millert@


Revision tags: OPENBSD_5_9_BASE
# 1.89 14-Nov-2015 mpi

Remove mrtdebug and reduce differences with the v6 version.

Debug informations can already be accessed via mrtstat and pimstat.


# 1.88 13-Nov-2015 mpi

Do not cast malloc(9) results.


# 1.87 13-Nov-2015 mpi

Kill another tunnel leftover and keep PIM stuff inside #ifdef PIM.


# 1.86 12-Nov-2015 mpi

Kill another leftover from the tunnel support removal and add more PIM.


# 1.85 12-Nov-2015 mpi

Sync headers and get rid of #ifdef MROUTING.


# 1.84 12-Nov-2015 mpi

Remove VIFF_TUNNEL leftovers, tunnels aren't supported since 2006.

Even pimd(8) no longer support them.


# 1.83 12-Nov-2015 mpi

Fix PIM build.


# 1.82 12-Sep-2015 mpi

Introduce if_input_local() a function to feed local traffic back to
the protocol queues.

It basically does what looutput() was doing but having a generic
function will allow us to get rid of the loopback hack overwriting
the rt_ifp field of RTF_LOCAL routes.

ok mikeb@, dlg@, claudio@


# 1.81 01-Sep-2015 bluhm

Replace sockaddr casts with the proper satosin(), ... calls.
From David Hill; OK mpi@; tested kspillner@; tweaks bluhm@


# 1.80 24-Aug-2015 bluhm

In kernel initialize struct sockaddr_in and sockaddr_in6 to zero
everywhere to avoid passing around pointers to uninitialized stack
memory. While there, fix the call to in6_recoverscope() in
fill_drlist().
OK deraadt@ mpi@


Revision tags: OPENBSD_5_8_BASE
# 1.79 15-Jul-2015 deraadt

rename mbuf ** parameter from m to mp, to match other similar code


# 1.78 30-Jun-2015 mpi

Get rid of the undocumented & temporary* m_copy() macro added for
compatibility with 4.3BSD in September 1989.

*Pick your own definition for "temporary".

ok bluhm@, claudio@, dlg@


Revision tags: OPENBSD_5_7_BASE
# 1.77 09-Feb-2015 claudio

Implement 2 sysctl to retrieve the multicast forwarding cache (mfc) and the
virtual interface table (vif). Will be used by netstat soon.
Looked over by guenther@


# 1.76 08-Feb-2015 claudio

De-static to make ddb hangman harder. OK phessler, henning


# 1.75 07-Feb-2015 dlg

mechanical conversion of this code to using siphash instead of some xors.

ok tedu@ claudio@


# 1.74 17-Dec-2014 mpi

Remove the "multicast_" prefix from the fields of a multicast-only struct.

Prodded by claudio@ and mikeb@


# 1.73 17-Dec-2014 mpi

Use an interface index instead of a pointer for multicast options.

Output interface (port) selection for multicast traffic is not done via
route lookups. Instead the output ifp is registered when setsockopt(2)
is called with the IP{V6,}_MULTICAST_IF option. But since there is no
mechanism to invalidate such pointer stored in a pcb when an interface
is destroyed/removed, it might lead your kernel to fault.

Prevent a fault upon resume reported by frantisek holop, thanks!

ok mikeb@, claudio@


# 1.72 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


# 1.71 30-Sep-2014 jsg

add back the sys/sysctl.h include removed in rev 1.60
fixes the kernel build when PIM is defined


# 1.70 14-Aug-2014 mpi

No need for raw_cb.h


# 1.69 14-Aug-2014 mpi

Kill MRT_{ADD,DEL}_BW_UPCALL interfaces and the bandwidth monitoring
code that comes with them.

ok mikeb@, henning@


Revision tags: OPENBSD_5_6_BASE
# 1.68 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.67 12-Jul-2014 tedu

add a size argument to free. will be used soon, but for now default to 0.
after discussions with beck deraadt kettenis.


# 1.66 21-Apr-2014 henning

ip_output() using varargs always struck me as bizarre, esp since it's only
ever used to pass on uint32 (for ipsec). stop that madness and just pass
the uint32, 0 in all cases but the two that pass the ipsec flowinfo.
ok deraadt reyk guenther


# 1.65 21-Apr-2014 henning

we'll do fine without casting NULL to struct foo * / void *
ok gcc & md5 (alas, no binary change)


Revision tags: OPENBSD_5_5_BASE
# 1.64 09-Jan-2014 tedu

bzero/bcmp -> memset/memcmp. ok matthew


# 1.63 27-Oct-2013 deraadt

delete UPCALL_TIMING debug code from the dark ages


# 1.62 23-Oct-2013 mpi

Remove the number of in_var.h inclusions by moving some functions and
global variables to in.h.

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_4_BASE
# 1.61 02-May-2013 mpi

tedu broken Resource Reservation Protocol code that was ifdef RSVP_ISI.

ok deraadt@, tedu@ (implicit)


# 1.60 28-Mar-2013 tedu

no need for a lot of code to include proc.h


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.59 04-Apr-2011 henning

de-guttenberg our stack a bit
we don't need 7 f***ing copies of the same code to do the protocol checksums
(or not, depending on hw capabilities). claudio ok


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.58 02-Jul-2010 blambert

m_copyback can fail to allocate memory, but is a void function so gymnastics
are required to detect that.

Change the function to take a wait argument (used in nfs server, but
M_NOWAIT everywhere else for now) and to return an error

ok claudio@ henning@ krw@


# 1.57 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


Revision tags: OPENBSD_4_7_BASE
# 1.56 01-Aug-2009 blambert

timeout_add -> timeout_add_msec

ok michele@ claudio@


# 1.55 13-Jul-2009 michele

Get rid of the token bucket filter.
Traffic shaping code should not be inside routing code.
If you want to rate-limit use altq instead.

ok claudio@ henning@ dlg@


# 1.54 09-Jul-2009 michele

Use MAXTTL instead of the hardcoded value.


Revision tags: OPENBSD_4_6_BASE
# 1.53 05-Jun-2009 claudio

Initial support for routing domains. This allows to bind interfaces to
alternate routing table and separate them from other interfaces in distinct
routing tables. The same network can now be used in any domain at the same
time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks across
net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


Revision tags: OPENBSD_4_5_BASE
# 1.52 16-Sep-2008 chl

remove another dead store.

spotted by markus@

ok henning@ mpf@


# 1.51 15-Sep-2008 chl

remove dead stores and newly created unused variables.

Found by LLVM/Clang Static Analyzer.

ok mpf@ looks good mk@ ok henning@


Revision tags: OPENBSD_4_3_BASE OPENBSD_4_4_BASE
# 1.50 02-Jan-2008 brad

return with ENOTTY instead of EINVAL for unknown ioctl requests.

ok claudio@ krw@ dlg@


# 1.49 14-Dec-2007 deraadt

add sysctl entry points into various network layers, in particular to
provide netstat(1) with data it needs; ok claudio reyk


Revision tags: OPENBSD_4_2_BASE
# 1.48 22-May-2007 michele

ip_mroute.c is in bad shape.
This first step makes it style(9) compliant.
Just a whitespace diff, no binary change.

OK claudio@ norby@


# 1.47 10-Apr-2007 miod

``it's'' -> ``its'' when the grammar gods require this change.


Revision tags: OPENBSD_4_1_BASE
# 1.46 14-Feb-2007 jsg

Consistently spell FALLTHROUGH to appease lint.
ok kettenis@ cloder@ tom@ henning@


Revision tags: OPENBSD_4_0_BASE
# 1.45 15-Jun-2006 pascoe

Change cast of last vararg to ip_output to match what ip_output expects,
for clarity.

henning@ claudio@ ok


# 1.44 11-May-2006 hshoexer

fix corruption of pim register packets. From Hideki ONO, thanks!

ok mcbride@ itojun@


# 1.43 25-Apr-2006 claudio

Remove virtual tunnel support from the mrouting code. The virtual tunnel
code breaks multicast on gif(4) interfaces and it is far better to configure
a real gif(4) tunnel instead of a multicast tunnel as the latter is almost
not manageable. OK norby@, mblamer@


Revision tags: OPENBSD_3_8_BASE OPENBSD_3_9_BASE
# 1.42 25-Apr-2005 brad

csum -> csum_flags

ok krw@ canacar@


Revision tags: OPENBSD_3_7_BASE
# 1.41 15-Jan-2005 brad

fix comment


# 1.40 14-Jan-2005 mcbride

Duplicate nested if statement in PIM code.

From brad@


# 1.39 14-Jan-2005 mcbride

Add kernel support for Protocol Independent Multicast (PIM)
Information: http://netweb.usc.edu/pim/

From Pavlin Radoslavov <pavlin@icir.org>

ok deraadt@ brad@


# 1.38 24-Nov-2004 mcbride

Multicast routing cleanup from Pavlin Radoslavov
- sync ip_mroute.c with NetBSD
- import some FreeBSD changes to MFC entry handling
- set im->im_vif correctly when sending IGMPMSG_WRONGVIF
- increment mrtstat.mrts_upcalls correctly
- return error from get_sg_cnt() if there is no matching forwarding entry

ok henning@ brad@ naddy@


Revision tags: OPENBSD_3_6_BASE
# 1.37 24-Aug-2004 brad

Don't allow SIOCGET{VIF,SG}CNT from sockets other than the multicast router.

From NetBSD
Fixes PR 3825

ok mcbride@ canacar@ claudio@


Revision tags: OPENBSD_3_5_BASE SMP_SYNC_A SMP_SYNC_B
# 1.36 06-Jan-2004 markus

fix vlan destroy for MROUTING; report spamme@wouz.dk via tedu; ok itojun


# 1.35 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


# 1.34 10-Dec-2003 itojun

de-register. deraadt ok


Revision tags: OPENBSD_3_4_BASE
# 1.33 09-Jul-2003 itojun

do not flip ip_len/ip_off in netinet stack. deraadt ok.
(please test, especially PF portion)


# 1.32 09-Jul-2003 itojun

better vif_delete (no dangling ref to struct ifnet). deraadt ok
it won't affect default GENERIC build - as MROUTING is not defined


# 1.31 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: UBC_SYNC_A
# 1.30 14-May-2003 itojun

KNF. markus ok


# 1.29 06-May-2003 deraadt

string cleaning; tedu ok


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_B
# 1.28 28-Aug-2002 pefo

Fix a problem where passing NULL as a pointer with varargs does not promote
NULL to full 64 bits on a 64 bit address system. Solution is to add a
(void *) cast before NULL. This makes a 64 bit MIPS kernel work and will
probably help future 64 bit ports as well.

OK from art@


# 1.27 31-Jul-2002 itojun

remove $Id$


# 1.26 09-Jun-2002 itojun

whitespace


Revision tags: OPENBSD_3_1_BASE
# 1.25 15-Mar-2002 millert

Kill #if __STDC__ used to do K&R vs. ANSI varargs/stdarg; just do things
the ANSI way.


# 1.24 14-Mar-2002 millert

First round of __P removal in sys


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.23 26-Sep-2001 deraadt

branches: 1.23.4;
bring back the old copyright notice


# 1.22 19-Aug-2001 miod

More old timeouts removal, mainly affected unused/unmaintained code.


# 1.21 23-Jun-2001 fgsch

Remove unneeded ip_id conversions.
Instead of using HTONS macro in some places, use htons directly in the
struct member and save us a few bytes.
Fix comment.


Revision tags: OPENBSD_2_9_BASE
# 1.20 10-Nov-2000 provos

seperate -> separate, okay aaron@


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE SMP_BASE
# 1.19 21-Jan-2000 angelos

branches: 1.19.2;
Rename the ip4_* routines to ipip_*, make it so GIF tunnels are not
affected by net.inet.ipip.allow (the sysctl formerly known as
net.inet.ip4.allow), rename the VIF ipip_input to ipip_mroute_input.


Revision tags: OPENBSD_2_6_BASE kame_19991208
# 1.18 08-Aug-1999 niklas

undeclared variable


# 1.17 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.16 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


# 1.15 20-Apr-1999 niklas

Merge MROUTING and IPSEC wrt handling of IP-in-IP tunnelled packets.
Fix a panic case in the MROUTING code too. Drop M_TUNNEL support, nothing
ever uses it.


Revision tags: OPENBSD_2_5_BASE
# 1.14 05-Feb-1999 angelos

Clear mfchashtbl after deallocation (mycroft@netbsd)


# 1.13 08-Jan-1999 provos

dont call ip_randomid() in htons().


# 1.12 08-Jan-1999 deraadt

rip_input() should be called with a 0 terminator; cmetz


# 1.11 26-Dec-1998 provos

make ip_id random but ensure that ids dont repeat for some period.


Revision tags: OPENBSD_2_4_BASE
# 1.10 29-Jul-1998 angelos

Proper handling of IP in IP and checksumming.


# 1.9 03-Jul-1998 deraadt

wrong endian conversion caused vif stats to be wrong; jonny@jonny.eng.br


# 1.8 18-May-1998 provos

first step to the setsockopt/getsockopt interface as described in
draft-mcdonald-simple-ipsec-api, kernel notifies (EMT_REQUESTSA) signal
userland key management applications when security services are requested.
this is only for outgoing connections at the moment, incoming packets
are not yet checked against the selected socket policy.


Revision tags: OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 28-Sep-1997 deraadt

more \n in log()


Revision tags: OPENBSD_2_1_BASE
# 1.6 21-Feb-1997 angelos

Couple of missing ifdefs.


# 1.5 20-Feb-1997 deraadt

IPSEC package by John Ioannidis and Angelos D. Keromytis. Written in
Greece. From ftp.funet.fi:/pub/unix/security/net/ip/BSDipsec.tar.gz


Revision tags: OPENBSD_2_0_BASE
# 1.4 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.3 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.129 15-Mar-2020 visa

Guard SIOCDELMULTI if_ioctl calls with KERNEL_LOCK() where the call is
made from socket close path. Most device drivers are not MP-safe yet,
and the closing of AF_INET and AF_INET6 sockets is no longer under the
kernel lock.

This fixes a panic seen by jcs@.

OK mpi@


Revision tags: OPENBSD_6_6_BASE
# 1.128 02-Sep-2019 bluhm

Fix a route use after free in multicast route. Move the rt_mcast_del()
out of the rtable_walk(). This avoids recursion to prevent stack
overflow. Also it allows freeing the route outside of the walk.
Now mrt_mcast_del() frees the route only when it is deleted from
the routing table. If that fails, it must not be freed. After the
route is returned by mfc_find(), it is reference counted. Then we
need a rtfree(), but not in the other caes.
Move rt_timer_remove_all() into rt_mcast_del().
OK mpi@


# 1.127 21-Jun-2019 mpi

Prevent recursions by not deleting entries inside rtable_walk(9).

rtable_walk(9) now passes a routing entry back to the caller when
a non zero value is returned and if it asked for it.
This allows us to call rtdeletemsg()/rtrequest_delete() from the
caller without creating a recursion because of rtflushclone().

Multicast code hasn't been adapted and is still possibly creating
recursions. However multicast route entries aren't cloned so if
a recursion exists it isn't because of rtflushclone().

Fix stack exhaustion triggered by the use of "-msave-args".

Issue reported by D��niel L��vai on bugs@ confirmed by and ok bluhm@.


# 1.126 04-Jun-2019 anton

Add missing NULL check for the protocol control block (pcb) pointer in
mrt{6,}_ioctl. Calling shutdown(2) on the socket prior to the ioctl
command can cause it to be NULL.

ok bluhm@ claudio@

Reported-by: syzbot+bdc489ecb509995a21ed@syzkaller.appspotmail.com
Reported-by: syzbot+156405fdea9f2ab15d40@syzkaller.appspotmail.com


Revision tags: OPENBSD_6_5_BASE
# 1.125 13-Feb-2019 dlg

change rt_ifa_add and rt_ifa_del so they take an rdomain argument.

this allows mpls interfaces (mpe, mpw) to pass the rdomain they
wish the local label to be in, rather than have it implicitly forced
to 0 by these functions. right now they'll pass 0, but it will soon
be possible to have them rx packets in other rdomains.

previously the functions used ifp->if_rdomain for the rdomain.
everything other than mpls still passes ifp->if_rdomain.

ok mpi@


# 1.124 10-Feb-2019 dlg

remove the implict RTF_MPATH flag that rt_ifa_add() sets on new routes.

MPLS interfaces (ab)use rt_ifa_add for adding the local MPLS label
that they listen on for incoming packets, while every other use of
rt_ifa_add is for adding addresses on local interfaces. MPLS does
this cos the addresses involved are in basically the same shape as
ones used for setting up local addresses.

It is appropriate for interfaces to want RTF_MPATH on local addresses,
but in the MPLS case it means you can have multiple local things
listening on the same label, which doesn't actually work. mpe in
particular keeps track of in use labels to it can handle collisions,
however, mpw does not. It is currently possible to have multiple
mpw interfaces on the same local label, and sharing the same label
as mpe or possible normal forwarding labels.

Moving the RTF_MPATH flag out of rt_ifa_add means all the callers
that still want it need to pass it themselves. The mpe and mpw
callers are left alone without the flag, and will now get EEXIST
from rt_ifa_add when a label is already in use.

ok (and a huge amount of patience and help) mpi@
claudio@ is ok with the idea, but saw a much much earlier solution
to the problem


Revision tags: OPENBSD_6_4_BASE
# 1.123 10-Oct-2018 reyk

RT_TABLEID_MAX is 255, fix places that assumed that it is less than 255.

rtable 255 is a valid routing table or domain id that wasn't handled
by the ip[6]_mroute code or by snmpd. The arrays in the ip[6]_mroute
code where off by one and didn't allocate space for rtable 255; snmpd
simply ignored rtable 255. All other places in the tree seem to
handle RT_TABLEID_MAX correctly.

OK florian@ benno@ henning@ deraadt@


# 1.122 30-Apr-2018 tb

Reduce the scope of the NET_LOCK() in in_control(). Two functions were
protected: mrt_ioctl() and in_ioctl(). The former has no other callers
and only needs a read lock. The latter will need refactoring to reduce
the lock's scope further. In a first step, establish a single exit point
and protect most of the function body with the NET_LOCK() while removing
the NET_LOCK() from a handful of callers.

suggested by & ok mpi, ok visa


Revision tags: OPENBSD_6_2_BASE OPENBSD_6_3_BASE
# 1.121 01-Sep-2017 mpi

Change sosetopt() to no longer free the mbuf it receives and change
all the callers to call m_freem(9).

Support from deraadt@ and tedu@, ok visa@, bluhm@


# 1.120 26-Jun-2017 mpi

Assert that the corresponding socket is locked when manipulating socket
buffers.

This is one step towards unlocking TCP input path. Note that all the
functions asserting for the socket lock are not necessarilly MP-safe.
All the fields of 'struct socket' aren't protected.

Introduce a new kernel-only kqueue hint, NOTE_SUBMIT, to be able to
tell when a filter needs to lock the underlying data structures. Logic
and name taken from NetBSD.

Tested by Hrvoje Popovski.

ok claudio@, bluhm@, mikeb@


# 1.119 19-Jun-2017 bluhm

The IP multicast forward functions return an errno, call the variable
error. Make the ip_mforward() return value consistent. Simplify
the caller logic in ipv6_input() like in IPv4.
OK mpi@


# 1.118 16-May-2017 rzalamena

Sync three changes that were caught by IPv6 multicast routing review:

* use a variable to allow disabling debugs on run-time
* fix a potential memory leak on copyout() failure
* don't just blindly use the first address provided by ifalist

ok bluhm@


# 1.117 16-May-2017 rzalamena

Make return values more meaningful by using errno instead of -1 or 1.

ok bluhm@


# 1.116 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.115 16-May-2017 rzalamena

Let malloc() block when the caller of the add route function is
setsockopt(), otherwise use non-blocking malloc() for network stack
calls.

ok bluhm@


# 1.114 16-May-2017 rzalamena

Call rtfree() after each use of routes and make sure the route is valid
when finding one. Since rtfree() is being called and rt_llinfo being
removed, add checks everywhere to make sure we are using a route that is
not being removed.

ok bluhm@


# 1.113 06-Apr-2017 dhill

Convert bcopy to memcpy where the memory does not overlap, otherwise,
use memmove. While here, change some previous conversions to a simple
assignment.

ok deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.112 17-Mar-2017 rzalamena

Be more strict on all route iterations, lets always make sure that we
are not going to get a unicast route by accident.

ok mpi@


# 1.111 14-Mar-2017 rzalamena

Make mfc_find() more strict when looking for routes, fixes a problem
causing ip_mforward() not to send packets to the userland multicast
routing daemon.

Reported and tested by Paul de Weerd.

ok bluhm@, claudio@


# 1.110 09-Feb-2017 rzalamena

Unbreak 'netstat -g' and make multicast route stats sysctl more robust.

ok mpi@


# 1.109 08-Feb-2017 jsg

Test for NULL before dereferencing a pointer not after.
ok krw@


# 1.108 01-Feb-2017 dhill

In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with
the netlock held. This also changes the prototypes of the *ctloutput
functions to take an mbuf instead of an mbuf pointer.

help, guidance from bluhm@ and mpi@
ok bluhm@


# 1.107 12-Jan-2017 rzalamena

Clean up multicast files from unused definitions and comments.

ok mpi@


# 1.106 11-Jan-2017 rzalamena

Remove mfc hash tables and use the OpenBSD routing table for multicast
routes. Beside the code simplification and removal, we also get to see
the multicast routes now in the route(8) utility.

ok mpi@


# 1.105 06-Jan-2017 rzalamena

Remove the global viftable vector that holds the virtual interfaces
configuration and instead use ifnet to store the configuration and
counters. With this we can safely use multicast routing daemons on
multiple domains without vif id colisions.

ok mpi@


# 1.104 06-Jan-2017 rzalamena

Simplify code by removing some old pullup macro, killing some variables
and using m_dup_pkt() instead of m_copym() with max_linkhdr space adjust
on packet sending to avoid more mbuf allocations.

with input from millert@ and mikeb@,
ok mikeb@


# 1.103 06-Jan-2017 mpi

Kill various splsoftnet().

ok rzalamena@, visa@


# 1.102 05-Jan-2017 rzalamena

Remove some unnecessary code abstractions and while here remove a
splsoftnet.

ok mikeb@


# 1.101 22-Dec-2016 rzalamena

Remove PIM support from the multicast stack.

ok mpi@


# 1.100 21-Dec-2016 mpi

Fix build without PIM defined.


# 1.99 21-Dec-2016 rzalamena

Fix PIM compilation even though it is disabled.

ok bluhm@


# 1.98 20-Dec-2016 rzalamena

Call the multicast timer callback per domain instead of for all domains
this way we save doing big tables walk and iterating tables that we don't
need to.

ok mpi@


# 1.97 20-Dec-2016 rzalamena

Remove unused timeout that was never being set.

ok reyk@


# 1.96 19-Dec-2016 rzalamena

Kill unused function.

ok mpi@


# 1.95 19-Dec-2016 rzalamena

Extend the multicast sockets and multicast hash table support to multiple
domains. This is one step towards supporting to run more than one multicast
socket in different domains at the same time.

ok mpi@


# 1.94 13-Dec-2016 rzalamena

Propagate the routing table id in ip_mrouter_set() so the MRT_ADD_VIF
calls won't fail anymore when doing from a different rdomain.

ok mpi@


# 1.93 29-Nov-2016 mpi

Kill unused 'struct route'.


# 1.92 29-Nov-2016 jsg

m_free() and m_freem() test for NULL. Simplify callers which had their own
NULL tests.

ok mpi@


# 1.91 24-Sep-2016 tedu

use hashfree. from Mathieu -
ok guenther


Revision tags: OPENBSD_6_0_BASE
# 1.90 07-Mar-2016 naddy

Sync no-argument function declaration and definition by adding (void).
ok mpi@ millert@


Revision tags: OPENBSD_5_9_BASE
# 1.89 14-Nov-2015 mpi

Remove mrtdebug and reduce differences with the v6 version.

Debug information can already be accessed via mrtstat and pimstat.


# 1.88 13-Nov-2015 mpi

Do not cast malloc(9) results.


# 1.87 13-Nov-2015 mpi

Kill another tunnel leftover and keep PIM stuff inside #ifdef PIM.


# 1.86 12-Nov-2015 mpi

Kill another leftover from the tunnel support removal and add more PIM.


# 1.85 12-Nov-2015 mpi

Sync headers and get rid of #ifdef MROUTING.


# 1.84 12-Nov-2015 mpi

Remove VIFF_TUNNEL leftovers, tunnels aren't supported since 2006.

Even pimd(8) no longer supports them.


# 1.83 12-Nov-2015 mpi

Fix PIM build.


# 1.82 12-Sep-2015 mpi

Introduce if_input_local() a function to feed local traffic back to
the protocol queues.

It basically does what looutput() was doing but having a generic
function will allow us to get rid of the loopback hack overwriting
the rt_ifp field of RTF_LOCAL routes.

ok mikeb@, dlg@, claudio@


# 1.81 01-Sep-2015 bluhm

Replace sockaddr casts with the proper satosin(), ... calls.
From David Hill; OK mpi@; tested kspillner@; tweaks bluhm@


# 1.80 24-Aug-2015 bluhm

In kernel initialize struct sockaddr_in and sockaddr_in6 to zero
everywhere to avoid passing around pointers to uninitialized stack
memory. While there, fix the call to in6_recoverscope() in
fill_drlist().
OK deraadt@ mpi@


Revision tags: OPENBSD_5_8_BASE
# 1.79 15-Jul-2015 deraadt

rename mbuf ** parameter from m to mp, to match other similar code


# 1.78 30-Jun-2015 mpi

Get rid of the undocumented & temporary* m_copy() macro added for
compatibility with 4.3BSD in September 1989.

*Pick your own definition for "temporary".

ok bluhm@, claudio@, dlg@


Revision tags: OPENBSD_5_7_BASE
# 1.77 09-Feb-2015 claudio

Implement 2 sysctl to retrieve the multicast forwarding cache (mfc) and the
virtual interface table (vif). Will be used by netstat soon.
Looked over by guenther@


# 1.76 08-Feb-2015 claudio

De-static to make ddb hangman harder. OK phessler, henning


# 1.75 07-Feb-2015 dlg

mechanical conversion of this code to using siphash instead of some xors.

ok tedu@ claudio@


# 1.74 17-Dec-2014 mpi

Remove the "multicast_" prefix from the fields of a multicast-only struct.

Prodded by claudio@ and mikeb@


# 1.73 17-Dec-2014 mpi

Use an interface index instead of a pointer for multicast options.

Output interface (port) selection for multicast traffic is not done via
route lookups. Instead the output ifp is registered when setsockopt(2)
is called with the IP{V6,}_MULTICAST_IF option. But since there is no
mechanism to invalidate such pointer stored in a pcb when an interface
is destroyed/removed, it might lead your kernel to fault.

Prevent a fault upon resume reported by frantisek holop, thanks!

ok mikeb@, claudio@


# 1.72 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


# 1.71 30-Sep-2014 jsg

add back the sys/sysctl.h include removed in rev 1.60
fixes the kernel build when PIM is defined


# 1.70 14-Aug-2014 mpi

No need for raw_cb.h


# 1.69 14-Aug-2014 mpi

Kill MRT_{ADD,DEL}_BW_UPCALL interfaces and the bandwidth monitoring
code that comes with them.

ok mikeb@, henning@


Revision tags: OPENBSD_5_6_BASE
# 1.68 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.67 12-Jul-2014 tedu

add a size argument to free. will be used soon, but for now default to 0.
after discussions with beck deraadt kettenis.


# 1.66 21-Apr-2014 henning

ip_output() using varargs always struck me as bizarre, esp since it's only
ever used to pass on uint32 (for ipsec). stop that madness and just pass
the uint32, 0 in all cases but the two that pass the ipsec flowinfo.
ok deraadt reyk guenther


# 1.65 21-Apr-2014 henning

we'll do fine without casting NULL to struct foo * / void *
ok gcc & md5 (alas, no binary change)


Revision tags: OPENBSD_5_5_BASE
# 1.64 09-Jan-2014 tedu

bzero/bcmp -> memset/memcmp. ok matthew


# 1.63 27-Oct-2013 deraadt

delete UPCALL_TIMING debug code from the dark ages


# 1.62 23-Oct-2013 mpi

Reduce the number of in_var.h inclusions by moving some functions and
global variables to in.h.

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_4_BASE
# 1.61 02-May-2013 mpi

tedu broken Resource Reservation Protocol code that was ifdef RSVP_ISI.

ok deraadt@, tedu@ (implicit)


# 1.60 28-Mar-2013 tedu

no need for a lot of code to include proc.h


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.59 04-Apr-2011 henning

de-guttenberg our stack a bit
we don't need 7 f***ing copies of the same code to do the protocol checksums
(or not, depending on hw capabilities). claudio ok


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.58 02-Jul-2010 blambert

m_copyback can fail to allocate memory, but is a void function so gymnastics
are required to detect that.

Change the function to take a wait argument (used in nfs server, but
M_NOWAIT everywhere else for now) and to return an error

ok claudio@ henning@ krw@


# 1.57 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


Revision tags: OPENBSD_4_7_BASE
# 1.56 01-Aug-2009 blambert

timeout_add -> timeout_add_msec

ok michele@ claudio@


# 1.55 13-Jul-2009 michele

Get rid of the token bucket filter.
Traffic shaping code should not be inside routing code.
If you want to rate-limit use altq instead.

ok claudio@ henning@ dlg@


# 1.54 09-Jul-2009 michele

Use MAXTTL instead of the hardcoded value.


Revision tags: OPENBSD_4_6_BASE
# 1.53 05-Jun-2009 claudio

Initial support for routing domains. This allows binding interfaces to
alternate routing tables and separating them from other interfaces in distinct
routing tables. The same network can now be used in any domain at the same
time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks across
net and netinet. L2 and IPv4 are mostly covered; pf and IPv6 are still missing.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


Revision tags: OPENBSD_4_5_BASE
# 1.52 16-Sep-2008 chl

remove another dead store.

spotted by markus@

ok henning@ mpf@


# 1.51 15-Sep-2008 chl

remove dead stores and newly created unused variables.

Found by LLVM/Clang Static Analyzer.

ok mpf@ looks good mk@ ok henning@


Revision tags: OPENBSD_4_3_BASE OPENBSD_4_4_BASE
# 1.50 02-Jan-2008 brad

return with ENOTTY instead of EINVAL for unknown ioctl requests.

ok claudio@ krw@ dlg@


# 1.49 14-Dec-2007 deraadt

add sysctl entry points into various network layers, in particular to
provide netstat(1) with data it needs; ok claudio reyk


Revision tags: OPENBSD_4_2_BASE
# 1.48 22-May-2007 michele

ip_mroute.c is in bad shape.
This first step makes it style(9) compliant.
Just a whitespace diff, no binary change.

OK claudio@ norby@


# 1.47 10-Apr-2007 miod

``it's'' -> ``its'' when the grammar gods require this change.


Revision tags: OPENBSD_4_1_BASE
# 1.46 14-Feb-2007 jsg

Consistently spell FALLTHROUGH to appease lint.
ok kettenis@ cloder@ tom@ henning@


Revision tags: OPENBSD_4_0_BASE
# 1.45 15-Jun-2006 pascoe

Change cast of last vararg to ip_output to match what ip_output expects,
for clarity.

henning@ claudio@ ok


# 1.44 11-May-2006 hshoexer

fix corruption of pim register packets. From Hideki ONO, thanks!

ok mcbride@ itojun@


# 1.43 25-Apr-2006 claudio

Remove virtual tunnel support from the mrouting code. The virtual tunnel
code breaks multicast on gif(4) interfaces and it is far better to configure
a real gif(4) tunnel instead of a multicast tunnel as the latter is almost
not manageable. OK norby@, mblamer@


Revision tags: OPENBSD_3_8_BASE OPENBSD_3_9_BASE
# 1.42 25-Apr-2005 brad

csum -> csum_flags

ok krw@ canacar@


Revision tags: OPENBSD_3_7_BASE
# 1.41 15-Jan-2005 brad

fix comment


# 1.40 14-Jan-2005 mcbride

Duplicate nested if statement in PIM code.

From brad@


# 1.39 14-Jan-2005 mcbride

Add kernel support for Protocol Independent Multicast (PIM)
Information: http://netweb.usc.edu/pim/

From Pavlin Radoslavov <pavlin@icir.org>

ok deraadt@ brad@


# 1.38 24-Nov-2004 mcbride

Multicast routing cleanup from Pavlin Radoslavov
- sync ip_mroute.c with NetBSD
- import some FreeBSD changes to MFC entry handling
- set im->im_vif correctly when sending IGMPMSG_WRONGVIF
- increment mrtstat.mrts_upcalls correctly
- return error from get_sg_cnt() if there is no matching forwarding entry

ok henning@ brad@ naddy@


Revision tags: OPENBSD_3_6_BASE
# 1.37 24-Aug-2004 brad

Don't allow SIOCGET{VIF,SG}CNT from sockets other than the multicast router.

From NetBSD
Fixes PR 3825

ok mcbride@ canacar@ claudio@


Revision tags: OPENBSD_3_5_BASE SMP_SYNC_A SMP_SYNC_B
# 1.36 06-Jan-2004 markus

fix vlan destroy for MROUTING; report spamme@wouz.dk via tedu; ok itojun


# 1.35 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


# 1.34 10-Dec-2003 itojun

de-register. deraadt ok


Revision tags: OPENBSD_3_4_BASE
# 1.33 09-Jul-2003 itojun

do not flip ip_len/ip_off in netinet stack. deraadt ok.
(please test, especially PF portion)


# 1.32 09-Jul-2003 itojun

better vif_delete (no dangling ref to struct ifnet). deraadt ok
it won't affect default GENERIC build - as MROUTING is not defined


# 1.31 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: UBC_SYNC_A
# 1.30 14-May-2003 itojun

KNF. markus ok


# 1.29 06-May-2003 deraadt

string cleaning; tedu ok


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_B
# 1.28 28-Aug-2002 pefo

Fix a problem where passing NULL as a pointer with varargs does not promote
NULL to full 64 bits on a 64 bit address system. Solution is to add a
(void *) cast before NULL. This makes a 64 bit MIPS kernel work and will
probably help future 64 bit ports as well.

OK from art@
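
The varargs problem described in this entry can be illustrated with a small userland sketch. It assumes a hypothetical variadic function `count_args()` that reads string pointers until it sees a null pointer; it is not the kernel code that was changed. If NULL expands to a plain integer 0, an uncast NULL in the variable part of the argument list is passed as an int, so the callee may read a pointer-sized value that was only partly written. The explicit cast makes the caller pass a full-width pointer.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/*
 * Count the string arguments up to the terminating null pointer.
 * Every variadic argument is read back as a pointer, so the caller
 * must pass a pointer-typed terminator.
 */
static int
count_args(const char *first, ...)
{
	va_list ap;
	const char *s;
	int n = 0;

	va_start(ap, first);
	for (s = first; s != NULL; s = va_arg(ap, char *))
		n++;
	va_end(ap);
	return n;
}
```

A call then looks like `count_args("a", "b", (void *)NULL)`; the first parameter is covered by the prototype, so only the variadic terminator needs the cast.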


# 1.27 31-Jul-2002 itojun

remove $Id$


# 1.26 09-Jun-2002 itojun

whitespace


Revision tags: OPENBSD_3_1_BASE
# 1.25 15-Mar-2002 millert

Kill #if __STDC__ used to do K&R vs. ANSI varargs/stdarg; just do things
the ANSI way.


# 1.24 14-Mar-2002 millert

First round of __P removal in sys


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.23 26-Sep-2001 deraadt

branches: 1.23.4;
bring back the old copyright notice


# 1.22 19-Aug-2001 miod

More old timeouts removal, mainly affected unused/unmaintained code.


# 1.21 23-Jun-2001 fgsch

Remove unneeded ip_id conversions.
Instead of using HTONS macro in some places, use htons directly in the
struct member and save us a few bytes.
Fix comment.


Revision tags: OPENBSD_2_9_BASE
# 1.20 10-Nov-2000 provos

seperate -> separate, okay aaron@


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE SMP_BASE
# 1.19 21-Jan-2000 angelos

branches: 1.19.2;
Rename the ip4_* routines to ipip_*, make it so GIF tunnels are not
affected by net.inet.ipip.allow (the sysctl formerly known as
net.inet.ip4.allow), rename the VIF ipip_input to ipip_mroute_input.


Revision tags: OPENBSD_2_6_BASE kame_19991208
# 1.18 08-Aug-1999 niklas

undeclared variable


# 1.17 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.16 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


# 1.15 20-Apr-1999 niklas

Merge MROUTING and IPSEC wrt handling of IP-in-IP tunnelled packets.
Fix a panic case in the MROUTING code too. Drop M_TUNNEL support, nothing
ever uses it.


Revision tags: OPENBSD_2_5_BASE
# 1.14 05-Feb-1999 angelos

Clear mfchashtbl after deallocation (mycroft@netbsd)


# 1.13 08-Jan-1999 provos

dont call ip_randomid() in htons().


# 1.12 08-Jan-1999 deraadt

rip_input() should be called with a 0 terminator; cmetz


# 1.11 26-Dec-1998 provos

make ip_id random but ensure that ids dont repeat for some period.


Revision tags: OPENBSD_2_4_BASE
# 1.10 29-Jul-1998 angelos

Proper handling of IP in IP and checksumming.


# 1.9 03-Jul-1998 deraadt

wrong endian conversion caused vif stats to be wrong; jonny@jonny.eng.br


# 1.8 18-May-1998 provos

first step to the setsockopt/getsockopt interface as described in
draft-mcdonald-simple-ipsec-api, kernel notifies (EMT_REQUESTSA) signal
userland key management applications when security services are requested.
this is only for outgoing connections at the moment, incoming packets
are not yet checked against the selected socket policy.


Revision tags: OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 28-Sep-1997 deraadt

more \n in log()


Revision tags: OPENBSD_2_1_BASE
# 1.6 21-Feb-1997 angelos

Couple of missing ifdefs.


# 1.5 20-Feb-1997 deraadt

IPSEC package by John Ioannidis and Angelos D. Keromytis. Written in
Greece. From ftp.funet.fi:/pub/unix/security/net/ip/BSDipsec.tar.gz


Revision tags: OPENBSD_2_0_BASE
# 1.4 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.3 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.128 02-Sep-2019 bluhm

Fix a route use after free in multicast route. Move the rt_mcast_del()
out of the rtable_walk(). This avoids recursion to prevent stack
overflow. Also it allows freeing the route outside of the walk.
Now mrt_mcast_del() frees the route only when it is deleted from
the routing table. If that fails, it must not be freed. After the
route is returned by mfc_find(), it is reference counted. Then we
need a rtfree(), but not in the other cases.
Move rt_timer_remove_all() into rt_mcast_del().
OK mpi@


# 1.127 21-Jun-2019 mpi

Prevent recursions by not deleting entries inside rtable_walk(9).

rtable_walk(9) now passes a routing entry back to the caller when
a non zero value is returned and if it asked for it.
This allows us to call rtdeletemsg()/rtrequest_delete() from the
caller without creating a recursion because of rtflushclone().

Multicast code hasn't been adapted and is still possibly creating
recursions. However multicast route entries aren't cloned so if
a recursion exists it isn't because of rtflushclone().

Fix stack exhaustion triggered by the use of "-msave-args".

Issue reported by Dániel Lévai on bugs@ confirmed by and ok bluhm@.
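
The pattern this entry describes, where the walk only reports an entry and the caller deletes it afterwards, can be sketched on a plain linked list. Everything below is a hypothetical userland illustration (`struct entry`, `list_walk()`, `purge_even()` and the other names are assumptions); it does not reproduce rtable_walk(9) or its signature.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical list entry standing in for a routing entry. */
struct entry {
	int key;
	struct entry *next;
};

/* Prepend a new entry; abort on allocation failure to keep this short. */
static struct entry *
list_push(struct entry *head, int key)
{
	struct entry *e = malloc(sizeof(*e));

	if (e == NULL)
		abort();
	e->key = key;
	e->next = head;
	return e;
}

/*
 * Walk the list without modifying it. When fn returns non-zero the
 * walk stops and, if the caller asked for it, the entry is passed back.
 */
static int
list_walk(struct entry *head, struct entry **found,
    int (*fn)(struct entry *, void *), void *arg)
{
	struct entry *e;
	int rv;

	if (found != NULL)
		*found = NULL;
	for (e = head; e != NULL; e = e->next) {
		rv = fn(e, arg);
		if (rv != 0) {
			if (found != NULL)
				*found = e;
			return rv;
		}
	}
	return 0;
}

/* Unlink and free one entry; called only from outside the walk. */
static void
list_remove(struct entry **headp, struct entry *victim)
{
	struct entry **pp;

	for (pp = headp; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == victim) {
			*pp = victim->next;
			free(victim);
			return;
		}
	}
}

/* Walk callback: flag entries whose key is even. */
static int
match_even(struct entry *e, void *arg)
{
	(void)arg;
	return (e->key % 2 == 0);
}

/* Restart the walk after each deletion until nothing matches. */
static int
purge_even(struct entry **headp)
{
	struct entry *victim;
	int removed = 0;

	while (list_walk(*headp, &victim, match_even, NULL) != 0) {
		list_remove(headp, victim);
		removed++;
	}
	return removed;
}
```

Because the callback never deletes anything itself, the walk cannot re-enter itself through the delete path. The cost is that the walk restarts after every deletion.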


# 1.126 04-Jun-2019 anton

Add missing NULL check for the protocol control block (pcb) pointer in
mrt{6,}_ioctl. Calling shutdown(2) on the socket prior to the ioctl
command can cause it to be NULL.

ok bluhm@ claudio@

Reported-by: syzbot+bdc489ecb509995a21ed@syzkaller.appspotmail.com
Reported-by: syzbot+156405fdea9f2ab15d40@syzkaller.appspotmail.com


Revision tags: OPENBSD_6_5_BASE
# 1.125 13-Feb-2019 dlg

change rt_ifa_add and rt_ifa_del so they take an rdomain argument.

this allows mpls interfaces (mpe, mpw) to pass the rdomain they
wish the local label to be in, rather than have it implicitly forced
to 0 by these functions. right now they'll pass 0, but it will soon
be possible to have them rx packets in other rdomains.

previously the functions used ifp->if_rdomain for the rdomain.
everything other than mpls still passes ifp->if_rdomain.

ok mpi@


# 1.124 10-Feb-2019 dlg

remove the implicit RTF_MPATH flag that rt_ifa_add() sets on new routes.

MPLS interfaces (ab)use rt_ifa_add for adding the local MPLS label
that they listen on for incoming packets, while every other use of
rt_ifa_add is for adding addresses on local interfaces. MPLS does
this cos the addresses involved are in basically the same shape as
ones used for setting up local addresses.

It is appropriate for interfaces to want RTF_MPATH on local addresses,
but in the MPLS case it means you can have multiple local things
listening on the same label, which doesn't actually work. mpe in
particular keeps track of in use labels so it can handle collisions,
however, mpw does not. It is currently possible to have multiple
mpw interfaces on the same local label, and sharing the same label
as mpe or possibly normal forwarding labels.

Moving the RTF_MPATH flag out of rt_ifa_add means all the callers
that still want it need to pass it themselves. The mpe and mpw
callers are left alone without the flag, and will now get EEXIST
from rt_ifa_add when a label is already in use.

ok (and a huge amount of patience and help) mpi@
claudio@ is ok with the idea, but saw a much much earlier solution
to the problem


Revision tags: OPENBSD_6_4_BASE
# 1.123 10-Oct-2018 reyk

RT_TABLEID_MAX is 255, fix places that assumed that it is less than 255.

rtable 255 is a valid routing table or domain id that wasn't handled
by the ip[6]_mroute code or by snmpd. The arrays in the ip[6]_mroute
code were off by one and didn't allocate space for rtable 255; snmpd
simply ignored rtable 255. All other places in the tree seem to
handle RT_TABLEID_MAX correctly.

OK florian@ benno@ henning@ deraadt@
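
The off-by-one described in this entry comes from treating an inclusive maximum id as an element count. A minimal sketch, assuming a hypothetical `TABLEID_MAX` constant with the value 255 given above and a hypothetical per-table counter array (neither is taken from the kernel source):

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-in for an inclusive maximum table id. */
#define TABLEID_MAX	255

/*
 * Ids run from 0 to TABLEID_MAX inclusive, so the array needs
 * TABLEID_MAX + 1 elements. Sizing it with TABLEID_MAX alone would
 * leave no slot for id 255.
 */
static int counters[TABLEID_MAX + 1];

/* Bump the counter for one table id; reject ids beyond the maximum. */
static int
counter_bump(unsigned int id)
{
	if (id > TABLEID_MAX)
		return -1;
	return ++counters[id];
}
```

The range check uses `>` rather than `>=` for the same reason: the maximum itself is a valid id.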


# 1.122 30-Apr-2018 tb

Reduce the scope of the NET_LOCK() in in_control(). Two functions were
protected: mrt_ioctl() and in_ioctl(). The former has no other callers
and only needs a read lock. The latter will need refactoring to reduce
the lock's scope further. In a first step, establish a single exit point
and protect most of the function body with the NET_LOCK() while removing
the NET_LOCK() from a handful of callers.

suggested by & ok mpi, ok visa


Revision tags: OPENBSD_6_2_BASE OPENBSD_6_3_BASE
# 1.121 01-Sep-2017 mpi

Change sosetopt() to no longer free the mbuf it receives and change
all the callers to call m_freem(9).

Support from deraadt@ and tedu@, ok visa@, bluhm@


# 1.120 26-Jun-2017 mpi

Assert that the corresponding socket is locked when manipulating socket
buffers.

This is one step towards unlocking TCP input path. Note that all the
functions asserting for the socket lock are not necessarily MP-safe.
All the fields of 'struct socket' aren't protected.

Introduce a new kernel-only kqueue hint, NOTE_SUBMIT, to be able to
tell when a filter needs to lock the underlying data structures. Logic
and name taken from NetBSD.

Tested by Hrvoje Popovski.

ok claudio@, bluhm@, mikeb@


# 1.119 19-Jun-2017 bluhm

The IP multicast forward functions return an errno, call the variable
error. Make the ip_mforward() return value consistent. Simplify
the caller logic in ipv6_input() like in IPv4.
OK mpi@


# 1.118 16-May-2017 rzalamena

Sync three changes that were caught by IPv6 multicast routing review:

* use a variable to allow disabling debugs on run-time
* fix a potential memory leak on copyout() failure
* don't just blindly use the first address provided by ifalist

ok bluhm@


# 1.117 16-May-2017 rzalamena

Make return values more meaningful by using errno instead of -1 or 1.

ok bluhm@


# 1.116 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.115 16-May-2017 rzalamena

Let malloc() block when the caller of the add route function is
setsockopt(), otherwise use non-blocking malloc() for network stack
calls.

ok bluhm@


# 1.114 16-May-2017 rzalamena

Call rtfree() after each use of routes and make sure the route is valid
when finding one. Since rtfree() is being called and rt_llinfo being
removed, add checks everywhere to make sure we are using a route that is
not being removed.

ok bluhm@


# 1.113 06-Apr-2017 dhill

Convert bcopy to memcpy where the memory does not overlap, otherwise,
use memmove. While here, change some previous conversions to a simple
assignment.

ok deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.112 17-Mar-2017 rzalamena

Be more strict on all route iterations, lets always make sure that we
are not going to get a unicast route by accident.

ok mpi@


# 1.111 14-Mar-2017 rzalamena

Make mfc_find() more strict when looking for routes, fixes a problem
causing ip_mforward() not to send packets to the userland multicast
routing daemon.

Reported and tested by Paul de Weerd.

ok bluhm@, claudio@


# 1.110 09-Feb-2017 rzalamena

Unbreak 'netstat -g' and make multicast route stats sysctl more robust.

ok mpi@


# 1.109 08-Feb-2017 jsg

Test for NULL before dereferencing a pointer not after.
ok krw@


# 1.108 01-Feb-2017 dhill

In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with
the netlock held. This also changes the prototypes of the *ctloutput
functions to take an mbuf instead of an mbuf pointer.

help, guidance from bluhm@ and mpi@
ok bluhm@


# 1.107 12-Jan-2017 rzalamena

Clean up multicast files from unused definitions and comments.

ok mpi@


# 1.106 11-Jan-2017 rzalamena

Remove mfc hash tables and use the OpenBSD routing table for multicast
routes. Beside the code simplification and removal, we also get to see
the multicast routes now in the route(8) utility.

ok mpi@


# 1.105 06-Jan-2017 rzalamena

Remove the global viftable vector that holds the virtual interfaces
configuration and instead use ifnet to store the configuration and
counters. With this we can safely use multicast routing daemons on
multiple domains without vif id colisions.

ok mpi@


# 1.104 06-Jan-2017 rzalamena

Simplify code by removing some old pullup macro, killing some variables
and using m_dup_pkt() instead of m_copym() with max_linkhdr space adjust
on packet sending to avoid more mbuf allocations.

with input from millert@ and mikeb@,
ok mikeb@


# 1.103 06-Jan-2017 mpi

Kill various splsoftnet().

ok rzalamena@, visa@


# 1.102 05-Jan-2017 rzalamena

Remove some unnecessary code abstractions and while here remove a
splsoftnet.

ok mikeb@


# 1.101 22-Dec-2016 rzalamena

Remove PIM support from the multicast stack.

ok mpi@


# 1.100 21-Dec-2016 mpi

Fix build without PIM defined.


# 1.99 21-Dec-2016 rzalamena

Fix PIM compilation even though it is disabled.

ok bluhm@


# 1.98 20-Dec-2016 rzalamena

Call the multicast timer callback per domain instead of for all domains
this way we save doing big tables walk and iterating tables that we don't
need to.

ok mpi@


# 1.97 20-Dec-2016 rzalamena

Remove unused timeout that was never being set.

ok reyk@


# 1.96 19-Dec-2016 rzalamena

Kill unused function.

ok mpi@


# 1.95 19-Dec-2016 rzalamena

Extend the multicast sockets and multicast hash table support to multiple
domains. This is one step towards supporting to run more than one multicast
socket in different domains at the same time.

ok mpi@


# 1.94 13-Dec-2016 rzalamena

Propagate the routing table id in ip_mrouter_set() so the MRT_ADD_VIF
calls won't fail anymore when doing from a different rdomain.

ok mpi@


# 1.93 29-Nov-2016 mpi

Kill unused 'struct route'.


# 1.92 29-Nov-2016 jsg

m_free() and m_freem() test for NULL. Simplify callers which had their own
NULL tests.

ok mpi@


# 1.91 24-Sep-2016 tedu

use hashfree. from Mathieu -
ok guenther


Revision tags: OPENBSD_6_0_BASE
# 1.90 07-Mar-2016 naddy

Sync no-argument function declaration and definition by adding (void).
ok mpi@ millert@


Revision tags: OPENBSD_5_9_BASE
# 1.89 14-Nov-2015 mpi

Remove mrtdebug and reduce differences with the v6 version.

Debug informations can already be accessed via mrtstat and pimstat.


# 1.88 13-Nov-2015 mpi

Do not cast malloc(9) results.


# 1.87 13-Nov-2015 mpi

Kill another tunnel leftover and keep PIM stuff inside #ifdef PIM.


# 1.86 12-Nov-2015 mpi

Kill another leftover from the tunnel support removal and add more PIM.


# 1.85 12-Nov-2015 mpi

Sync headers and get rid of #ifdef MROUTING.


# 1.84 12-Nov-2015 mpi

Remove VIFF_TUNNEL leftovers, tunnels aren't supported since 2006.

Even pimd(8) no longer support them.


# 1.83 12-Nov-2015 mpi

Fix PIM build.


# 1.82 12-Sep-2015 mpi

Introduce if_input_local() a function to feed local traffic back to
the protocol queues.

It basically does what looutput() was doing but having a generic
function will allow us to get rid of the loopback hack overwwritting
the rt_ifp field of RTF_LOCAL routes.

ok mikeb@, dlg@, claudio@


# 1.81 01-Sep-2015 bluhm

Replace sockaddr casts with the proper satosin(), ... calls.
From David Hill; OK mpi@; tested kspillner@; tweaks bluhm@


# 1.80 24-Aug-2015 bluhm

In kernel initialize struct sockaddr_in and sockaddr_in6 to zero
everywhere to avoid passing around pointers to uninitialized stack
memory. While there, fix the call to in6_recoverscope() in
fill_drlist().
OK deraadt@ mpi@


Revision tags: OPENBSD_5_8_BASE
# 1.79 15-Jul-2015 deraadt

rename mbuf ** parameter from m to mp, to match other similar code


# 1.78 30-Jun-2015 mpi

Get rid of the undocumented & temporary* m_copy() macro added for
compatibility with 4.3BSD in September 1989.

*Pick your own definition for "temporary".

ok bluhm@, claudio@, dlg@


Revision tags: OPENBSD_5_7_BASE
# 1.77 09-Feb-2015 claudio

Implement 2 sysctl to retrieve the multicast forwarding cache (mfc) and the
virtual interface table (vif). Will be used by netstat soon.
Looked over by guenther@


# 1.76 08-Feb-2015 claudio

De-static to make ddb hangman harder. OK phessler, henning


# 1.75 07-Feb-2015 dlg

mechanical conversion of this code to using siphash instead of some xors.

ok tedu@ claudio@


# 1.74 17-Dec-2014 mpi

Remove the "multicast_" prefix from the fields a multicast-only struct.

Prodded by claudio@ and mikeb@


# 1.73 17-Dec-2014 mpi

Use an interface index instead of a pointer for multicast options.

Output interface (port) selection for multicast traffic is not done via
route lookups. Instead the output ifp is registred when setsockopt(2)
is called with the IP{V6,}_MULTICAST_IF option. But since there is no
mechanism to invalidate such pointer stored in a pcb when an interface
is destroyed/removed, it might lead your kernel to fault.

Prevent a fault upon resume reported by frantisek holop, thanks!

ok mikeb@, claudio@


# 1.72 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


# 1.71 30-Sep-2014 jsg

add back the sys/sysctl.h include removed in rev 1.60
fixes the kernel build when PIM is defined


# 1.70 14-Aug-2014 mpi

No need for raw_cb.h


# 1.69 14-Aug-2014 mpi

Kill MRT_{ADD,DEL}_BW_UPCALL interfaces and the bandwidth monitoring
code that comes with them.

ok mikeb@, henning@


Revision tags: OPENBSD_5_6_BASE
# 1.68 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.67 12-Jul-2014 tedu

add a size argument to free. will be used soon, but for now default to 0.
after discussions with beck deraadt kettenis.


# 1.66 21-Apr-2014 henning

ip_output() using varargs always struck me as bizarre, esp since it's only
ever used to pass on uint32 (for ipsec). stop that madness and just pass
the uint32, 0 in all cases but the two that pass the ipsec flowinfo.
ok deraadt reyk guenther


# 1.65 21-Apr-2014 henning

we'll do fine without casting NULL to struct foo * / void *
ok gcc & md5 (alas, no binary change)


Revision tags: OPENBSD_5_5_BASE
# 1.64 09-Jan-2014 tedu

bzero/bcmp -> memset/memcmp. ok matthew


# 1.63 27-Oct-2013 deraadt

delete UPCALL_TIMING debug code from a the dark ages


# 1.62 23-Oct-2013 mpi

Remove the number of in_var.h inclusions by moving some functions and
global variables to in.h.

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_4_BASE
# 1.61 02-May-2013 mpi

tedu broken Resource Reservation Protocol code that was ifdef RSVP_ISI.

ok deraadt@, tedu@ (implicit)


# 1.60 28-Mar-2013 tedu

no need for a lot of code to include proc.h


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.59 04-Apr-2011 henning

de-guttenberg our stack a bit
we don't need 7 f***ing copies of the same code to do the protocol checksums
(or not, depending on hw capabilities). claudio ok


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.58 02-Jul-2010 blambert

m_copyback can fail to allocate memory, but is a void fucntion so gymnastics
are required to detect that.

Change the function to take a wait argument (used in nfs server, but
M_NOWAIT everywhere else for now) and to return an error

ok claudio@ henning@ krw@


# 1.57 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


Revision tags: OPENBSD_4_7_BASE
# 1.56 01-Aug-2009 blambert

timeout_add -> timeout_add_msec

ok michele@ claudio@


# 1.55 13-Jul-2009 michele

Get rid of the token bucket filter.
Traffic shaping code should not be inside routing code.
If you want to rate-limit use altq instead.

ok claudio@ henning@ dlg@


# 1.54 09-Jul-2009 michele

Use MAXTTL instead of the hardcoded value.


Revision tags: OPENBSD_4_6_BASE
# 1.53 05-Jun-2009 claudio

Initial support for routing domains. This allows to bind interfaces to
alternate routing table and separate them from other interfaces in distinct
routing tables. The same network can now be used in any doamin at the same
time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks accross
net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


Revision tags: OPENBSD_4_5_BASE
# 1.52 16-Sep-2008 chl

remove another dead store.

spotted by markus@

ok henning@ mpf@


# 1.51 15-Sep-2008 chl

remove dead stores and newly created unused variables.

Found by LLVM/Clang Static Analyzer.

ok mpf@ looks good mk@ ok henning@


Revision tags: OPENBSD_4_3_BASE OPENBSD_4_4_BASE
# 1.50 02-Jan-2008 brad

return with ENOTTY instead of EINVAL for unknown ioctl requests.

ok claudio@ krw@ dlg@


# 1.49 14-Dec-2007 deraadt

add sysctl entry points into various network layers, in particular to
provide netstat(1) with data it needs; ok claudio reyk


Revision tags: OPENBSD_4_2_BASE
# 1.48 22-May-2007 michele

ip_mroute.c is in bad shape.
This first step makes it style(9) compliant.
Just a whitespace diff, no binary change.

OK claudio@ norby@


# 1.47 10-Apr-2007 miod

``it's'' -> ``its'' when the grammar gods require this change.


Revision tags: OPENBSD_4_1_BASE
# 1.46 14-Feb-2007 jsg

Consistently spell FALLTHROUGH to appease lint.
ok kettenis@ cloder@ tom@ henning@


Revision tags: OPENBSD_4_0_BASE
# 1.45 15-Jun-2006 pascoe

Change cast of last vararg to ip_output to match what ip_output expects,
for clarity.

henning@ claudio@ ok


# 1.44 11-May-2006 hshoexer

fix corruption of pim register packets. From Hideki ONO, thanks!

ok mcbride@ itojun@


# 1.43 25-Apr-2006 claudio

Remove virtual tunnel support from the mrouting code. The virtual tunnel
code breaks multicast on gif(4) interfaces and it is far better to configure
a real gif(4) tunnel instead of a multicast tunnel as the latter is almost
not manageable. OK norby@, mblamer@


Revision tags: OPENBSD_3_8_BASE OPENBSD_3_9_BASE
# 1.42 25-Apr-2005 brad

csum -> csum_flags

ok krw@ canacar@


Revision tags: OPENBSD_3_7_BASE
# 1.41 15-Jan-2005 brad

fix comment


# 1.40 14-Jan-2005 mcbride

Duplicate nested if statement in PIM code.

From brad@


# 1.39 14-Jan-2005 mcbride

Add kernel support for Protocol Independent Multicast (PIM)
Information: http://netweb.usc.edu/pim/

From Pavlin Radoslavov <pavlin@icir.org>

ok deraadt@ brad@


# 1.38 24-Nov-2004 mcbride

Multicast routing cleanup from Pavlin Radoslavov
- sync ip_mroute.c with NetBSD
- import some FreeBSD changes to MFC entry handling
- set im->im_vif correctly when sending IGMPMSG_WRONGVIF
- increment mrtstat.mrts_upcalls correctly
- return error from get_sg_cnt() if there is no matching forwarding entry

ok henning@ brad@ naddy@


Revision tags: OPENBSD_3_6_BASE
# 1.37 24-Aug-2004 brad

Don't allow SIOCGET{VIF,SG}CNT from sockets other than the multicast router.

From NetBSD
Fixes PR 3825

ok mcbride@ canacar@ claudio@


Revision tags: OPENBSD_3_5_BASE SMP_SYNC_A SMP_SYNC_B
# 1.36 06-Jan-2004 markus

fix vlan destroy for MROUTING; report spamme@wouz.dk via tedu; ok itojun


# 1.35 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


# 1.34 10-Dec-2003 itojun

de-register. deraadt ok


Revision tags: OPENBSD_3_4_BASE
# 1.33 09-Jul-2003 itojun

do not flip ip_len/ip_off in netinet stack. deraadt ok.
(please test, especially PF portion)


# 1.32 09-Jul-2003 itojun

better vif_delete (no dangling ref to struct ifnet). deraadt ok
it won't affect default GENERIC build - as MROUTING is not defined


# 1.31 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: UBC_SYNC_A
# 1.30 14-May-2003 itojun

KNF. markus ok


# 1.29 06-May-2003 deraadt

string cleaning; tedu ok


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_B
# 1.28 28-Aug-2002 pefo

Fix a problem where passing NULL as a pointer with varargs does not promote
NULL to full 64 bits on a 64 bit address system. Solution is to add a
(void *) cast before NULL. This makes a 64 bit MIPS kernel work and will
probably help future 64 bit ports as well.

OK from art@


# 1.27 31-Jul-2002 itojun

remove $Id$


# 1.26 09-Jun-2002 itojun

whitespace


Revision tags: OPENBSD_3_1_BASE
# 1.25 15-Mar-2002 millert

Kill #if __STDC__ used to do K&R vs. ANSI varargs/stdarg; just do things
the ANSI way.


# 1.24 14-Mar-2002 millert

First round of __P removal in sys


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.23 26-Sep-2001 deraadt

branches: 1.23.4;
bring back the old copyright notice


# 1.22 19-Aug-2001 miod

More old timeouts removal, mainly affected unused/unmaintained code.


# 1.21 23-Jun-2001 fgsch

Remove unneeded ip_id conversions.
Instead of using HTONS macro in some places, use htons directly in the
struct member and save us a few bytes.
Fix comment.


Revision tags: OPENBSD_2_9_BASE
# 1.20 10-Nov-2000 provos

seperate -> separate, okay aaron@


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE SMP_BASE
# 1.19 21-Jan-2000 angelos

branches: 1.19.2;
Rename the ip4_* routines to ipip_*, make it so GIF tunnels are not
affected by net.inet.ipip.allow (the sysctl formerly known as
net.inet.ip4.allow), rename the VIF ipip_input to ipip_mroute_input.


Revision tags: OPENBSD_2_6_BASE kame_19991208
# 1.18 08-Aug-1999 niklas

undeclared variable


# 1.17 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.16 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


# 1.15 20-Apr-1999 niklas

Merge MROUTING and IPSEC wrt handling of IP-in-IP tunnelled packets.
Fix a panic case in the MROUTING code too. Drop M_TUNNEL support, nothing
ever uses it.


Revision tags: OPENBSD_2_5_BASE
# 1.14 05-Feb-1999 angelos

Clear mfchashtbl after deallocation (mycroft@netbsd)


# 1.13 08-Jan-1999 provos

dont call ip_randomid() in htons().


# 1.12 08-Jan-1999 deraadt

rip_input() should be called with a 0 terminator; cmetz


# 1.11 26-Dec-1998 provos

make ip_id random but ensure that ids dont repeat for some period.


Revision tags: OPENBSD_2_4_BASE
# 1.10 29-Jul-1998 angelos

Proper handling of IP in IP and checksumming.


# 1.9 03-Jul-1998 deraadt

wrong endian conversion caused vif stats to be wrong; jonny@jonny.eng.br


# 1.8 18-May-1998 provos

first step to the setsockopt/getsockopt interface as described in
draft-mcdonald-simple-ipsec-api, kernel notifies (EMT_REQUESTSA) signal
userland key management applications when security services are requested.
this is only for outgoing connections at the moment, incoming packets
are not yet checked against the selected socket policy.


Revision tags: OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 28-Sep-1997 deraadt

more \n in log()


Revision tags: OPENBSD_2_1_BASE
# 1.6 21-Feb-1997 angelos

Couple of missing ifdefs.


# 1.5 20-Feb-1997 deraadt

IPSEC package by John Ioannidis and Angelos D. Keromytis. Written in
Greece. From ftp.funet.fi:/pub/unix/security/net/ip/BSDipsec.tar.gz


Revision tags: OPENBSD_2_0_BASE
# 1.4 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.3 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.127 21-Jun-2019 mpi

Prevent recursions by not deleting entries inside rtable_walk(9).

rtable_walk(9) now passes a routing entry back to the caller when
a non zero value is returned and if it asked for it.
This allows us to call rtdeletemsg()/rtrequest_delete() from the
caller without creating a recursion because of rtflushclone().

Multicast code hasn't been adapted and is still possibly creating
recursions. However multicast route entries aren't cloned so if
a recursion exists it isn't because of rtflushclone().

Fix stack exhaustion triggered by the use of "-msave-args".

Issue reported by Dániel Lévai on bugs@ confirmed by and ok bluhm@.


# 1.126 04-Jun-2019 anton

Add missing NULL check for the protocol control block (pcb) pointer in
mrt{6,}_ioctl. Calling shutdown(2) on the socket prior to the ioctl
command can cause it to be NULL.

ok bluhm@ claudio@

Reported-by: syzbot+bdc489ecb509995a21ed@syzkaller.appspotmail.com
Reported-by: syzbot+156405fdea9f2ab15d40@syzkaller.appspotmail.com


Revision tags: OPENBSD_6_5_BASE
# 1.125 13-Feb-2019 dlg

change rt_ifa_add and rt_ifa_del so they take an rdomain argument.

this allows mpls interfaces (mpe, mpw) to pass the rdomain they
wish the local label to be in, rather than have it implicitly forced
to 0 by these functions. right now they'll pass 0, but it will soon
be possible to have them rx packets in other rdomains.

previously the functions used ifp->if_rdomain for the rdomain.
everything other than mpls still passes ifp->if_rdomain.

ok mpi@


# 1.124 10-Feb-2019 dlg

remove the implicit RTF_MPATH flag that rt_ifa_add() sets on new routes.

MPLS interfaces (ab)use rt_ifa_add for adding the local MPLS label
that they listen on for incoming packets, while every other use of
rt_ifa_add is for adding addresses on local interfaces. MPLS does
this cos the addresses involved are in basically the same shape as
ones used for setting up local addresses.

It is appropriate for interfaces to want RTF_MPATH on local addresses,
but in the MPLS case it means you can have multiple local things
listening on the same label, which doesn't actually work. mpe in
particular keeps track of in use labels so it can handle collisions,
however, mpw does not. It is currently possible to have multiple
mpw interfaces on the same local label, and sharing the same label
as mpe or possible normal forwarding labels.

Moving the RTF_MPATH flag out of rt_ifa_add means all the callers
that still want it need to pass it themselves. The mpe and mpw
callers are left alone without the flag, and will now get EEXIST
from rt_ifa_add when a label is already in use.

ok (and a huge amount of patience and help) mpi@
claudio@ is ok with the idea, but saw a much much earlier solution
to the problem


Revision tags: OPENBSD_6_4_BASE
# 1.123 10-Oct-2018 reyk

RT_TABLEID_MAX is 255, fix places that assumed that it is less than 255.

rtable 255 is a valid routing table or domain id that wasn't handled
by the ip[6]_mroute code or by snmpd. The arrays in the ip[6]_mroute
code were off by one and didn't allocate space for rtable 255; snmpd
simply ignored rtable 255. All other places in the tree seem to
handle RT_TABLEID_MAX correctly.

OK florian@ benno@ henning@ deraadt@


# 1.122 30-Apr-2018 tb

Reduce the scope of the NET_LOCK() in in_control(). Two functions were
protected: mrt_ioctl() and in_ioctl(). The former has no other callers
and only needs a read lock. The latter will need refactoring to reduce
the lock's scope further. In a first step, establish a single exit point
and protect most of the function body with the NET_LOCK() while removing
the NET_LOCK() from a handful of callers.

suggested by & ok mpi, ok visa


Revision tags: OPENBSD_6_2_BASE OPENBSD_6_3_BASE
# 1.121 01-Sep-2017 mpi

Change sosetopt() to no longer free the mbuf it receives and change
all the callers to call m_freem(9).

Support from deraadt@ and tedu@, ok visa@, bluhm@


# 1.120 26-Jun-2017 mpi

Assert that the corresponding socket is locked when manipulating socket
buffers.

This is one step towards unlocking TCP input path. Note that all the
functions asserting for the socket lock are not necessarily MP-safe.
All the fields of 'struct socket' aren't protected.

Introduce a new kernel-only kqueue hint, NOTE_SUBMIT, to be able to
tell when a filter needs to lock the underlying data structures. Logic
and name taken from NetBSD.

Tested by Hrvoje Popovski.

ok claudio@, bluhm@, mikeb@


# 1.119 19-Jun-2017 bluhm

The IP multicast forward functions return an errno, call the variable
error. Make the ip_mforward() return value consistent. Simplify
the caller logic in ipv6_input() like in IPv4.
OK mpi@


# 1.118 16-May-2017 rzalamena

Sync three changes that were caught by IPv6 multicast routing review:

* use a variable to allow disabling debugs on run-time
* fix a potential memory leak on copyout() failure
* don't just blindly use the first address provided by ifalist

ok bluhm@


# 1.117 16-May-2017 rzalamena

Make return values more meaningful by using errno instead of -1 or 1.

ok bluhm@


# 1.116 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.115 16-May-2017 rzalamena

Let malloc() block when the caller of the add route function is
setsockopt(), otherwise use non-blocking malloc() for network stack
calls.

ok bluhm@


# 1.114 16-May-2017 rzalamena

Call rtfree() after each use of routes and make sure the route is valid
when finding one. Since rtfree() is being called and rt_llinfo being
removed, add checks everywhere to make sure we are using a route that is
not being removed.

ok bluhm@


# 1.113 06-Apr-2017 dhill

Convert bcopy to memcpy where the memory does not overlap, otherwise,
use memmove. While here, change some previous conversions to a simple
assignment.

ok deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.112 17-Mar-2017 rzalamena

Be more strict on all route iterations, lets always make sure that we
are not going to get a unicast route by accident.

ok mpi@


# 1.111 14-Mar-2017 rzalamena

Make mfc_find() more strict when looking for routes, fixes a problem
causing ip_mforward() not to send packets to the userland multicast
routing daemon.

Reported and tested by Paul de Weerd.

ok bluhm@, claudio@


# 1.110 09-Feb-2017 rzalamena

Unbreak 'netstat -g' and make multicast route stats sysctl more robust.

ok mpi@


# 1.109 08-Feb-2017 jsg

Test for NULL before dereferencing a pointer not after.
ok krw@


# 1.108 01-Feb-2017 dhill

In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with
the netlock held. This also changes the prototypes of the *ctloutput
functions to take an mbuf instead of an mbuf pointer.

help, guidance from bluhm@ and mpi@
ok bluhm@


# 1.107 12-Jan-2017 rzalamena

Clean up multicast files from unused definitions and comments.

ok mpi@


# 1.106 11-Jan-2017 rzalamena

Remove mfc hash tables and use the OpenBSD routing table for multicast
routes. Beside the code simplification and removal, we also get to see
the multicast routes now in the route(8) utility.

ok mpi@


# 1.105 06-Jan-2017 rzalamena

Remove the global viftable vector that holds the virtual interfaces
configuration and instead use ifnet to store the configuration and
counters. With this we can safely use multicast routing daemons on
multiple domains without vif id collisions.

ok mpi@


# 1.104 06-Jan-2017 rzalamena

Simplify code by removing some old pullup macro, killing some variables
and using m_dup_pkt() instead of m_copym() with max_linkhdr space adjust
on packet sending to avoid more mbuf allocations.

with input from millert@ and mikeb@,
ok mikeb@


# 1.103 06-Jan-2017 mpi

Kill various splsoftnet().

ok rzalamena@, visa@


# 1.102 05-Jan-2017 rzalamena

Remove some unnecessary code abstractions and while here remove a
splsoftnet.

ok mikeb@


# 1.101 22-Dec-2016 rzalamena

Remove PIM support from the multicast stack.

ok mpi@


# 1.100 21-Dec-2016 mpi

Fix build without PIM defined.


# 1.99 21-Dec-2016 rzalamena

Fix PIM compilation even though it is disabled.

ok bluhm@


# 1.98 20-Dec-2016 rzalamena

Call the multicast timer callback per domain instead of for all domains
this way we save doing big tables walk and iterating tables that we don't
need to.

ok mpi@


# 1.97 20-Dec-2016 rzalamena

Remove unused timeout that was never being set.

ok reyk@


# 1.96 19-Dec-2016 rzalamena

Kill unused function.

ok mpi@


# 1.95 19-Dec-2016 rzalamena

Extend the multicast sockets and multicast hash table support to multiple
domains. This is one step towards supporting to run more than one multicast
socket in different domains at the same time.

ok mpi@


# 1.94 13-Dec-2016 rzalamena

Propagate the routing table id in ip_mrouter_set() so the MRT_ADD_VIF
calls won't fail anymore when doing from a different rdomain.

ok mpi@


# 1.93 29-Nov-2016 mpi

Kill unused 'struct route'.


# 1.92 29-Nov-2016 jsg

m_free() and m_freem() test for NULL. Simplify callers which had their own
NULL tests.

ok mpi@


# 1.91 24-Sep-2016 tedu

use hashfree. from Mathieu -
ok guenther


Revision tags: OPENBSD_6_0_BASE
# 1.90 07-Mar-2016 naddy

Sync no-argument function declaration and definition by adding (void).
ok mpi@ millert@


Revision tags: OPENBSD_5_9_BASE
# 1.89 14-Nov-2015 mpi

Remove mrtdebug and reduce differences with the v6 version.

Debug information can already be accessed via mrtstat and pimstat.


# 1.88 13-Nov-2015 mpi

Do not cast malloc(9) results.


# 1.87 13-Nov-2015 mpi

Kill another tunnel leftover and keep PIM stuff inside #ifdef PIM.


# 1.86 12-Nov-2015 mpi

Kill another leftover from the tunnel support removal and add more PIM.


# 1.85 12-Nov-2015 mpi

Sync headers and get rid of #ifdef MROUTING.


# 1.84 12-Nov-2015 mpi

Remove VIFF_TUNNEL leftovers, tunnels aren't supported since 2006.

Even pimd(8) no longer supports them.


# 1.83 12-Nov-2015 mpi

Fix PIM build.


# 1.82 12-Sep-2015 mpi

Introduce if_input_local() a function to feed local traffic back to
the protocol queues.

It basically does what looutput() was doing but having a generic
function will allow us to get rid of the loopback hack overwriting
the rt_ifp field of RTF_LOCAL routes.

ok mikeb@, dlg@, claudio@


# 1.81 01-Sep-2015 bluhm

Replace sockaddr casts with the proper satosin(), ... calls.
From David Hill; OK mpi@; tested kspillner@; tweaks bluhm@


# 1.80 24-Aug-2015 bluhm

In kernel initialize struct sockaddr_in and sockaddr_in6 to zero
everywhere to avoid passing around pointers to uninitialized stack
memory. While there, fix the call to in6_recoverscope() in
fill_drlist().
OK deraadt@ mpi@


Revision tags: OPENBSD_5_8_BASE
# 1.79 15-Jul-2015 deraadt

rename mbuf ** parameter from m to mp, to match other similar code


# 1.78 30-Jun-2015 mpi

Get rid of the undocumented & temporary* m_copy() macro added for
compatibility with 4.3BSD in September 1989.

*Pick your own definition for "temporary".

ok bluhm@, claudio@, dlg@


Revision tags: OPENBSD_5_7_BASE
# 1.77 09-Feb-2015 claudio

Implement 2 sysctl to retrieve the multicast forwarding cache (mfc) and the
virtual interface table (vif). Will be used by netstat soon.
Looked over by guenther@


# 1.76 08-Feb-2015 claudio

De-static to make ddb hangman harder. OK phessler, henning


# 1.75 07-Feb-2015 dlg

mechanical conversion of this code to using siphash instead of some xors.

ok tedu@ claudio@


# 1.74 17-Dec-2014 mpi

Remove the "multicast_" prefix from the fields of a multicast-only struct.

Prodded by claudio@ and mikeb@


# 1.73 17-Dec-2014 mpi

Use an interface index instead of a pointer for multicast options.

Output interface (port) selection for multicast traffic is not done via
route lookups. Instead the output ifp is registered when setsockopt(2)
is called with the IP{V6,}_MULTICAST_IF option. But since there is no
mechanism to invalidate such pointer stored in a pcb when an interface
is destroyed/removed, it might lead your kernel to fault.

Prevent a fault upon resume reported by frantisek holop, thanks!

ok mikeb@, claudio@


# 1.72 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


# 1.71 30-Sep-2014 jsg

add back the sys/sysctl.h include removed in rev 1.60
fixes the kernel build when PIM is defined


# 1.70 14-Aug-2014 mpi

No need for raw_cb.h


# 1.69 14-Aug-2014 mpi

Kill MRT_{ADD,DEL}_BW_UPCALL interfaces and the bandwidth monitoring
code that comes with them.

ok mikeb@, henning@


Revision tags: OPENBSD_5_6_BASE
# 1.68 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.67 12-Jul-2014 tedu

add a size argument to free. will be used soon, but for now default to 0.
after discussions with beck deraadt kettenis.


# 1.66 21-Apr-2014 henning

ip_output() using varargs always struck me as bizarre, esp since it's only
ever used to pass on uint32 (for ipsec). stop that madness and just pass
the uint32, 0 in all cases but the two that pass the ipsec flowinfo.
ok deraadt reyk guenther


# 1.65 21-Apr-2014 henning

we'll do fine without casting NULL to struct foo * / void *
ok gcc & md5 (alas, no binary change)


Revision tags: OPENBSD_5_5_BASE
# 1.64 09-Jan-2014 tedu

bzero/bcmp -> memset/memcmp. ok matthew


# 1.63 27-Oct-2013 deraadt

delete UPCALL_TIMING debug code from the dark ages


# 1.62 23-Oct-2013 mpi

Remove the number of in_var.h inclusions by moving some functions and
global variables to in.h.

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_4_BASE
# 1.61 02-May-2013 mpi

tedu broken Resource Reservation Protocol code that was ifdef RSVP_ISI.

ok deraadt@, tedu@ (implicit)


# 1.60 28-Mar-2013 tedu

no need for a lot of code to include proc.h


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.59 04-Apr-2011 henning

de-guttenberg our stack a bit
we don't need 7 f***ing copies of the same code to do the protocol checksums
(or not, depending on hw capabilities). claudio ok


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.58 02-Jul-2010 blambert

m_copyback can fail to allocate memory, but is a void function so gymnastics
are required to detect that.

Change the function to take a wait argument (used in nfs server, but
M_NOWAIT everywhere else for now) and to return an error

ok claudio@ henning@ krw@


# 1.126 04-Jun-2019 anton

Add missing NULL check for the protocol control block (pcb) pointer in
mrt{6,}_ioctl. Calling shutdown(2) on the socket prior to the ioctl
command can cause it to be NULL.

ok bluhm@ claudio@

Reported-by: syzbot+bdc489ecb509995a21ed@syzkaller.appspotmail.com
Reported-by: syzbot+156405fdea9f2ab15d40@syzkaller.appspotmail.com


Revision tags: OPENBSD_6_5_BASE
# 1.125 13-Feb-2019 dlg

change rt_ifa_add and rt_ifa_del so they take an rdomain argument.

this allows mpls interfaces (mpe, mpw) to pass the rdomain they
wish the local label to be in, rather than have it implicitly forced
to 0 by these functions. right now they'll pass 0, but it will soon
be possible to have them rx packets in other rdomains.

previously the functions used ifp->if_rdomain for the rdomain.
everything other than mpls still passes ifp->if_rdomain.

ok mpi@


# 1.124 10-Feb-2019 dlg

remove the implict RTF_MPATH flag that rt_ifa_add() sets on new routes.

MPLS interfaces (ab)use rt_ifa_add for adding the local MPLS label
that they listen on for incoming packets, while every other use of
rt_ifa_add is for adding addresses on local interfaces. MPLS does
this cos the addresses involved are in basically the same shape as
ones used for setting up local addresses.

It is appropriate for interfaces to want RTF_MPATH on local addresses,
but in the MPLS case it means you can have multiple local things
listening on the same label, which doesn't actually work. mpe in
particular keeps track of in use labels to it can handle collisions,
however, mpw does not. It is currently possible to have multiple
mpw interfaces on the same local label, and sharing the same label
as mpe or possible normal forwarding labels.

Moving the RTF_MPATH flag out of rt_ifa_add means all the callers
that still want it need to pass it themselves. The mpe and mpw
callers are left alone without the flag, and will now get EEXIST
from rt_ifa_add when a label is already in use.

ok (and a huge amount of patience and help) mpi@
claudio@ is ok with the idea, but saw a much much earlier solution
to the problem


Revision tags: OPENBSD_6_4_BASE
# 1.123 10-Oct-2018 reyk

RT_TABLEID_MAX is 255, fix places that assumed that it is less than 255.

rtable 255 is a valid routing table or domain id that wasn't handled
by the ip[6]_mroute code or by snmpd. The arrays in the ip[6]_mroute
code where off by one and didn't allocate space for rtable 255; snmpd
simply ignored rtable 255. All other places in the tree seem to
handle RT_TABLEID_MAX correctly.

OK florian@ benno@ henning@ deraadt@


# 1.122 30-Apr-2018 tb

Reduce the scope of the NET_LOCK() in in_control(). Two functions were
protected: mrt_ioctl() and in_ioctl(). The former has no other callers
and only needs a read lock. The latter will need refactoring to reduce
the lock's scope further. In a first step, establish a single exit point
and protect most of the function body with the NET_LOCK() while removing
the NET_LOCK() from a handful of callers.

suggested by & ok mpi, ok visa


Revision tags: OPENBSD_6_2_BASE OPENBSD_6_3_BASE
# 1.121 01-Sep-2017 mpi

Change sosetopt() to no longer free the mbuf it receives and change
all the callers to call m_freem(9).

Support from deraadt@ and tedu@, ok visa@, bluhm@


# 1.120 26-Jun-2017 mpi

Assert that the corresponding socket is locked when manipulating socket
buffers.

This is one step towards unlocking TCP input path. Note that all the
functions asserting for the socket lock are not necessarily MP-safe.
All the fields of 'struct socket' aren't protected.

Introduce a new kernel-only kqueue hint, NOTE_SUBMIT, to be able to
tell when a filter needs to lock the underlying data structures. Logic
and name taken from NetBSD.

Tested by Hrvoje Popovski.

ok claudio@, bluhm@, mikeb@


# 1.119 19-Jun-2017 bluhm

The IP multicast forward functions return an errno, call the variable
error. Make the ip_mforward() return value consistent. Simplify
the caller logic in ipv6_input() like in IPv4.
OK mpi@


# 1.118 16-May-2017 rzalamena

Sync three changes that were caught by IPv6 multicast routing review:

* use a variable to allow disabling debugs on run-time
* fix a potential memory leak on copyout() failure
* don't just blindly use the first address provided by ifalist

ok bluhm@


# 1.117 16-May-2017 rzalamena

Make return values more meaningful by using errno instead of -1 or 1.

ok bluhm@


# 1.116 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.115 16-May-2017 rzalamena

Let malloc() block when the caller of the add route function is
setsockopt(), otherwise use non-blocking malloc() for network stack
calls.

ok bluhm@


# 1.114 16-May-2017 rzalamena

Call rtfree() after each use of routes and make sure the route is valid
when finding one. Since rtfree() is being called and rt_llinfo being
removed, add checks everywhere to make sure we are using a route that is
not being removed.

ok bluhm@


# 1.113 06-Apr-2017 dhill

Convert bcopy to memcpy where the memory does not overlap, otherwise,
use memmove. While here, change some previous conversions to a simple
assignment.

ok deraadt@
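
The distinction drawn in 1.113 can be sketched as follows. The helpers are hypothetical and are not taken from the kernel sources; they only show the standard C rule that memcpy() requires non-overlapping buffers while memmove() tolerates overlap, and that a fixed-size struct can be copied by plain assignment.

```c
#include <string.h>

/* Hypothetical fixed-size record, used to show copy-by-assignment. */
struct example_addr {
	unsigned char bytes[4];
};

/*
 * Shift the first n bytes of buf one position to the right.
 * Source and destination overlap, so memmove() is required;
 * buf must have room for at least n + 1 bytes.
 */
void
example_shift_right(char *buf, size_t n)
{
	memmove(buf + 1, buf, n);
}

/* Copy between two distinct objects: memcpy() is sufficient. */
void
example_copy(char *dst, const char *src, size_t n)
{
	memcpy(dst, src, n);
}

/* For a whole struct, a simple assignment replaces the copy call. */
void
example_assign(struct example_addr *dst, const struct example_addr *src)
{
	*dst = *src;
}
```

Note that bcopy() takes its arguments as (src, dst, len), the reverse of memcpy() and memmove(), so a conversion of this kind also has to swap the first two arguments.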


Revision tags: OPENBSD_6_1_BASE
# 1.112 17-Mar-2017 rzalamena

Be more strict on all route iterations, let's always make sure that we
are not going to get a unicast route by accident.

ok mpi@


# 1.111 14-Mar-2017 rzalamena

Make mfc_find() more strict when looking for routes, fixes a problem
causing ip_mforward() not to send packets to the userland multicast
routing daemon.

Reported and tested by Paul de Weerd.

ok bluhm@, claudio@


# 1.110 09-Feb-2017 rzalamena

Unbreak 'netstat -g' and make multicast route stats sysctl more robust.

ok mpi@


# 1.109 08-Feb-2017 jsg

Test for NULL before dereferencing a pointer not after.
ok krw@


# 1.108 01-Feb-2017 dhill

In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with
the netlock held. This also changes the prototypes of the *ctloutput
functions to take an mbuf instead of an mbuf pointer.

help, guidance from bluhm@ and mpi@
ok bluhm@


# 1.107 12-Jan-2017 rzalamena

Clean up multicast files from unused definitions and comments.

ok mpi@


# 1.106 11-Jan-2017 rzalamena

Remove mfc hash tables and use the OpenBSD routing table for multicast
routes. Beside the code simplification and removal, we also get to see
the multicast routes now in the route(8) utility.

ok mpi@


# 1.105 06-Jan-2017 rzalamena

Remove the global viftable vector that holds the virtual interfaces
configuration and instead use ifnet to store the configuration and
counters. With this we can safely use multicast routing daemons on
multiple domains without vif id collisions.

ok mpi@


# 1.104 06-Jan-2017 rzalamena

Simplify code by removing some old pullup macro, killing some variables
and using m_dup_pkt() instead of m_copym() with max_linkhdr space adjust
on packet sending to avoid more mbuf allocations.

with input from millert@ and mikeb@,
ok mikeb@


# 1.103 06-Jan-2017 mpi

Kill various splsoftnet().

ok rzalamena@, visa@


# 1.102 05-Jan-2017 rzalamena

Remove some unnecessary code abstractions and while here remove a
splsoftnet.

ok mikeb@


# 1.101 22-Dec-2016 rzalamena

Remove PIM support from the multicast stack.

ok mpi@


# 1.100 21-Dec-2016 mpi

Fix build without PIM defined.


# 1.99 21-Dec-2016 rzalamena

Fix PIM compilation even though it is disabled.

ok bluhm@


# 1.98 20-Dec-2016 rzalamena

Call the multicast timer callback per domain instead of for all domains;
this way we avoid walking big tables and iterating over tables that we
don't need to.

ok mpi@


# 1.97 20-Dec-2016 rzalamena

Remove unused timeout that was never being set.

ok reyk@


# 1.96 19-Dec-2016 rzalamena

Kill unused function.

ok mpi@


# 1.95 19-Dec-2016 rzalamena

Extend the multicast sockets and multicast hash table support to multiple
domains. This is one step towards supporting to run more than one multicast
socket in different domains at the same time.

ok mpi@


# 1.94 13-Dec-2016 rzalamena

Propagate the routing table id in ip_mrouter_set() so the MRT_ADD_VIF
calls won't fail anymore when done from a different rdomain.

ok mpi@


# 1.93 29-Nov-2016 mpi

Kill unused 'struct route'.


# 1.92 29-Nov-2016 jsg

m_free() and m_freem() test for NULL. Simplify callers which had their own
NULL tests.

ok mpi@


# 1.91 24-Sep-2016 tedu

use hashfree. from Mathieu -
ok guenther


Revision tags: OPENBSD_6_0_BASE
# 1.90 07-Mar-2016 naddy

Sync no-argument function declaration and definition by adding (void).
ok mpi@ millert@


Revision tags: OPENBSD_5_9_BASE
# 1.89 14-Nov-2015 mpi

Remove mrtdebug and reduce differences with the v6 version.

Debug information can already be accessed via mrtstat and pimstat.


# 1.88 13-Nov-2015 mpi

Do not cast malloc(9) results.


# 1.87 13-Nov-2015 mpi

Kill another tunnel leftover and keep PIM stuff inside #ifdef PIM.


# 1.86 12-Nov-2015 mpi

Kill another leftover from the tunnel support removal and add more PIM.


# 1.85 12-Nov-2015 mpi

Sync headers and get rid of #ifdef MROUTING.


# 1.84 12-Nov-2015 mpi

Remove VIFF_TUNNEL leftovers, tunnels aren't supported since 2006.

Even pimd(8) no longer supports them.


# 1.83 12-Nov-2015 mpi

Fix PIM build.


# 1.82 12-Sep-2015 mpi

Introduce if_input_local() a function to feed local traffic back to
the protocol queues.

It basically does what looutput() was doing but having a generic
function will allow us to get rid of the loopback hack overwriting
the rt_ifp field of RTF_LOCAL routes.

ok mikeb@, dlg@, claudio@


# 1.81 01-Sep-2015 bluhm

Replace sockaddr casts with the proper satosin(), ... calls.
From David Hill; OK mpi@; tested kspillner@; tweaks bluhm@


# 1.80 24-Aug-2015 bluhm

In kernel initialize struct sockaddr_in and sockaddr_in6 to zero
everywhere to avoid passing around pointers to uninitialized stack
memory. While there, fix the call to in6_recoverscope() in
fill_drlist().
OK deraadt@ mpi@


Revision tags: OPENBSD_5_8_BASE
# 1.79 15-Jul-2015 deraadt

rename mbuf ** parameter from m to mp, to match other similar code


# 1.78 30-Jun-2015 mpi

Get rid of the undocumented & temporary* m_copy() macro added for
compatibility with 4.3BSD in September 1989.

*Pick your own definition for "temporary".

ok bluhm@, claudio@, dlg@


Revision tags: OPENBSD_5_7_BASE
# 1.77 09-Feb-2015 claudio

Implement 2 sysctl to retrieve the multicast forwarding cache (mfc) and the
virtual interface table (vif). Will be used by netstat soon.
Looked over by guenther@


# 1.76 08-Feb-2015 claudio

De-static to make ddb hangman harder. OK phessler, henning


# 1.75 07-Feb-2015 dlg

mechanical conversion of this code to using siphash instead of some xors.

ok tedu@ claudio@


# 1.74 17-Dec-2014 mpi

Remove the "multicast_" prefix from the fields of a multicast-only struct.

Prodded by claudio@ and mikeb@


# 1.73 17-Dec-2014 mpi

Use an interface index instead of a pointer for multicast options.

Output interface (port) selection for multicast traffic is not done via
route lookups. Instead the output ifp is registered when setsockopt(2)
is called with the IP{V6,}_MULTICAST_IF option. But since there is no
mechanism to invalidate such pointer stored in a pcb when an interface
is destroyed/removed, it might lead your kernel to fault.

Prevent a fault upon resume reported by frantisek holop, thanks!

ok mikeb@, claudio@


# 1.72 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


# 1.71 30-Sep-2014 jsg

add back the sys/sysctl.h include removed in rev 1.60
fixes the kernel build when PIM is defined


# 1.70 14-Aug-2014 mpi

No need for raw_cb.h


# 1.69 14-Aug-2014 mpi

Kill MRT_{ADD,DEL}_BW_UPCALL interfaces and the bandwidth monitoring
code that comes with them.

ok mikeb@, henning@


Revision tags: OPENBSD_5_6_BASE
# 1.68 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.67 12-Jul-2014 tedu

add a size argument to free. will be used soon, but for now default to 0.
after discussions with beck deraadt kettenis.


# 1.66 21-Apr-2014 henning

ip_output() using varargs always struck me as bizarre, esp since it's only
ever used to pass on uint32 (for ipsec). stop that madness and just pass
the uint32, 0 in all cases but the two that pass the ipsec flowinfo.
ok deraadt reyk guenther


# 1.65 21-Apr-2014 henning

we'll do fine without casting NULL to struct foo * / void *
ok gcc & md5 (alas, no binary change)


Revision tags: OPENBSD_5_5_BASE
# 1.64 09-Jan-2014 tedu

bzero/bcmp -> memset/memcmp. ok matthew


# 1.63 27-Oct-2013 deraadt

delete UPCALL_TIMING debug code from the dark ages


# 1.62 23-Oct-2013 mpi

Reduce the number of in_var.h inclusions by moving some functions and
global variables to in.h.

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_4_BASE
# 1.61 02-May-2013 mpi

tedu broken Resource Reservation Protocol code that was ifdef RSVP_ISI.

ok deraadt@, tedu@ (implicit)


# 1.60 28-Mar-2013 tedu

no need for a lot of code to include proc.h


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.59 04-Apr-2011 henning

de-guttenberg our stack a bit
we don't need 7 f***ing copies of the same code to do the protocol checksums
(or not, depending on hw capabilities). claudio ok


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.58 02-Jul-2010 blambert

m_copyback can fail to allocate memory, but is a void function so gymnastics
are required to detect that.

Change the function to take a wait argument (used in nfs server, but
M_NOWAIT everywhere else for now) and to return an error

ok claudio@ henning@ krw@


# 1.57 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


Revision tags: OPENBSD_4_7_BASE
# 1.56 01-Aug-2009 blambert

timeout_add -> timeout_add_msec

ok michele@ claudio@


# 1.55 13-Jul-2009 michele

Get rid of the token bucket filter.
Traffic shaping code should not be inside routing code.
If you want to rate-limit use altq instead.

ok claudio@ henning@ dlg@


# 1.54 09-Jul-2009 michele

Use MAXTTL instead of the hardcoded value.


Revision tags: OPENBSD_4_6_BASE
# 1.53 05-Jun-2009 claudio

Initial support for routing domains. This allows to bind interfaces to
alternate routing table and separate them from other interfaces in distinct
routing tables. The same network can now be used in any domain at the same
time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks across
net and netinet. L2 and IPv4 are mostly covered; pf and IPv6 are still missing.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


Revision tags: OPENBSD_4_5_BASE
# 1.52 16-Sep-2008 chl

remove another dead store.

spotted by markus@

ok henning@ mpf@


# 1.51 15-Sep-2008 chl

remove dead stores and newly created unused variables.

Found by LLVM/Clang Static Analyzer.

ok mpf@ looks good mk@ ok henning@


Revision tags: OPENBSD_4_3_BASE OPENBSD_4_4_BASE
# 1.50 02-Jan-2008 brad

return with ENOTTY instead of EINVAL for unknown ioctl requests.

ok claudio@ krw@ dlg@


# 1.49 14-Dec-2007 deraadt

add sysctl entry points into various network layers, in particular to
provide netstat(1) with data it needs; ok claudio reyk


Revision tags: OPENBSD_4_2_BASE
# 1.48 22-May-2007 michele

ip_mroute.c is in bad shape.
This first step makes it style(9) compliant.
Just a whitespace diff, no binary change.

OK claudio@ norby@


# 1.47 10-Apr-2007 miod

``it's'' -> ``its'' when the grammar gods require this change.


Revision tags: OPENBSD_4_1_BASE
# 1.46 14-Feb-2007 jsg

Consistently spell FALLTHROUGH to appease lint.
ok kettenis@ cloder@ tom@ henning@


Revision tags: OPENBSD_4_0_BASE
# 1.45 15-Jun-2006 pascoe

Change cast of last vararg to ip_output to match what ip_output expects,
for clarity.

henning@ claudio@ ok


# 1.44 11-May-2006 hshoexer

fix corruption of pim register packets. From Hideki ONO, thanks!

ok mcbride@ itojun@


# 1.43 25-Apr-2006 claudio

Remove virtual tunnel support from the mrouting code. The virtual tunnel
code breaks multicast on gif(4) interfaces and it is far better to configure
a real gif(4) tunnel instead of a multicast tunnel as the latter is almost
not manageable. OK norby@, mblamer@


Revision tags: OPENBSD_3_8_BASE OPENBSD_3_9_BASE
# 1.42 25-Apr-2005 brad

csum -> csum_flags

ok krw@ canacar@


Revision tags: OPENBSD_3_7_BASE
# 1.41 15-Jan-2005 brad

fix comment


# 1.40 14-Jan-2005 mcbride

Duplicate nested if statement in PIM code.

From brad@


# 1.39 14-Jan-2005 mcbride

Add kernel support for Protocol Independent Multicast (PIM)
Information: http://netweb.usc.edu/pim/

From Pavlin Radoslavov <pavlin@icir.org>

ok deraadt@ brad@


# 1.38 24-Nov-2004 mcbride

Multicast routing cleanup from Pavlin Radoslavov
- sync ip_mroute.c with NetBSD
- import some FreeBSD changes to MFC entry handling
- set im->im_vif correctly when sending IGMPMSG_WRONGVIF
- increment mrtstat.mrts_upcalls correctly
- return error from get_sg_cnt() if there is no matching forwarding entry

ok henning@ brad@ naddy@


Revision tags: OPENBSD_3_6_BASE
# 1.37 24-Aug-2004 brad

Don't allow SIOCGET{VIF,SG}CNT from sockets other than the multicast router.

From NetBSD
Fixes PR 3825

ok mcbride@ canacar@ claudio@


Revision tags: OPENBSD_3_5_BASE SMP_SYNC_A SMP_SYNC_B
# 1.36 06-Jan-2004 markus

fix vlan destroy for MROUTING; report spamme@wouz.dk via tedu; ok itojun


# 1.35 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


# 1.34 10-Dec-2003 itojun

de-register. deraadt ok


Revision tags: OPENBSD_3_4_BASE
# 1.33 09-Jul-2003 itojun

do not flip ip_len/ip_off in netinet stack. deraadt ok.
(please test, especially PF portion)


# 1.32 09-Jul-2003 itojun

better vif_delete (no dangling ref to struct ifnet). deraadt ok
it won't affect default GENERIC build - as MROUTING is not defined


# 1.31 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: UBC_SYNC_A
# 1.30 14-May-2003 itojun

KNF. markus ok


# 1.29 06-May-2003 deraadt

string cleaning; tedu ok


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_B
# 1.28 28-Aug-2002 pefo

Fix a problem where passing NULL as a pointer with varargs does not promote
NULL to full 64 bits on a 64 bit address system. Solution is to add a
(void *) cast before NULL. This makes a 64 bit MIPS kernel work and will
probably help future 64 bit ports as well.

OK from art@


# 1.27 31-Jul-2002 itojun

remove $Id$


# 1.26 09-Jun-2002 itojun

whitespace


Revision tags: OPENBSD_3_1_BASE
# 1.25 15-Mar-2002 millert

Kill #if __STDC__ used to do K&R vs. ANSI varargs/stdarg; just do things
the ANSI way.


# 1.24 14-Mar-2002 millert

First round of __P removal in sys


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.23 26-Sep-2001 deraadt

branches: 1.23.4;
bring back the old copyright notice


# 1.22 19-Aug-2001 miod

More old timeouts removal, mainly affected unused/unmaintained code.


# 1.21 23-Jun-2001 fgsch

Remove unneeded ip_id conversions.
Instead of using HTONS macro in some places, use htons directly in the
struct member and save us a few bytes.
Fix comment.


Revision tags: OPENBSD_2_9_BASE
# 1.20 10-Nov-2000 provos

seperate -> separate, okay aaron@


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE SMP_BASE
# 1.19 21-Jan-2000 angelos

branches: 1.19.2;
Rename the ip4_* routines to ipip_*, make it so GIF tunnels are not
affected by net.inet.ipip.allow (the sysctl formerly known as
net.inet.ip4.allow), rename the VIF ipip_input to ipip_mroute_input.


Revision tags: OPENBSD_2_6_BASE kame_19991208
# 1.18 08-Aug-1999 niklas

undeclared variable


# 1.17 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.16 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


# 1.15 20-Apr-1999 niklas

Merge MROUTING and IPSEC wrt handling of IP-in-IP tunnelled packets.
Fix a panic case in the MROUTING code too. Drop M_TUNNEL support, nothing
ever uses it.


Revision tags: OPENBSD_2_5_BASE
# 1.14 05-Feb-1999 angelos

Clear mfchashtbl after deallocation (mycroft@netbsd)


# 1.13 08-Jan-1999 provos

don't call ip_randomid() in htons().


# 1.12 08-Jan-1999 deraadt

rip_input() should be called with a 0 terminator; cmetz


# 1.11 26-Dec-1998 provos

make ip_id random but ensure that ids don't repeat for some period.


Revision tags: OPENBSD_2_4_BASE
# 1.10 29-Jul-1998 angelos

Proper handling of IP in IP and checksumming.


# 1.9 03-Jul-1998 deraadt

wrong endian conversion caused vif stats to be wrong; jonny@jonny.eng.br


# 1.8 18-May-1998 provos

first step to the setsockopt/getsockopt interface as described in
draft-mcdonald-simple-ipsec-api, kernel notifies (EMT_REQUESTSA) signal
userland key management applications when security services are requested.
this is only for outgoing connections at the moment, incoming packets
are not yet checked against the selected socket policy.


Revision tags: OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 28-Sep-1997 deraadt

more \n in log()


Revision tags: OPENBSD_2_1_BASE
# 1.6 21-Feb-1997 angelos

Couple of missing ifdefs.


# 1.5 20-Feb-1997 deraadt

IPSEC package by John Ioannidis and Angelos D. Keromytis. Written in
Greece. From ftp.funet.fi:/pub/unix/security/net/ip/BSDipsec.tar.gz


Revision tags: OPENBSD_2_0_BASE
# 1.4 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.3 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.125 13-Feb-2019 dlg

change rt_ifa_add and rt_ifa_del so they take an rdomain argument.

this allows mpls interfaces (mpe, mpw) to pass the rdomain they
wish the local label to be in, rather than have it implicitly forced
to 0 by these functions. right now they'll pass 0, but it will soon
be possible to have them rx packets in other rdomains.

previously the functions used ifp->if_rdomain for the rdomain.
everything other than mpls still passes ifp->if_rdomain.

ok mpi@


# 1.124 10-Feb-2019 dlg

remove the implict RTF_MPATH flag that rt_ifa_add() sets on new routes.

MPLS interfaces (ab)use rt_ifa_add for adding the local MPLS label
that they listen on for incoming packets, while every other use of
rt_ifa_add is for adding addresses on local interfaces. MPLS does
this cos the addresses involved are in basically the same shape as
ones used for setting up local addresses.

It is appropriate for interfaces to want RTF_MPATH on local addresses,
but in the MPLS case it means you can have multiple local things
listening on the same label, which doesn't actually work. mpe in
particular keeps track of in use labels to it can handle collisions,
however, mpw does not. It is currently possible to have multiple
mpw interfaces on the same local label, and sharing the same label
as mpe or possible normal forwarding labels.

Moving the RTF_MPATH flag out of rt_ifa_add means all the callers
that still want it need to pass it themselves. The mpe and mpw
callers are left alone without the flag, and will now get EEXIST
from rt_ifa_add when a label is already in use.

ok (and a huge amount of patience and help) mpi@
claudio@ is ok with the idea, but saw a much much earlier solution
to the problem


Revision tags: OPENBSD_6_4_BASE
# 1.123 10-Oct-2018 reyk

RT_TABLEID_MAX is 255, fix places that assumed that it is less than 255.

rtable 255 is a valid routing table or domain id that wasn't handled
by the ip[6]_mroute code or by snmpd. The arrays in the ip[6]_mroute
code where off by one and didn't allocate space for rtable 255; snmpd
simply ignored rtable 255. All other places in the tree seem to
handle RT_TABLEID_MAX correctly.

OK florian@ benno@ henning@ deraadt@


# 1.122 30-Apr-2018 tb

Reduce the scope of the NET_LOCK() in in_control(). Two functions were
protected: mrt_ioctl() and in_ioctl(). The former has no other callers
and only needs a read lock. The latter will need refactoring to reduce
the lock's scope further. In a first step, establish a single exit point
and protect most of the function body with the NET_LOCK() while removing
the NET_LOCK() from a handful of callers.

suggested by & ok mpi, ok visa


Revision tags: OPENBSD_6_2_BASE OPENBSD_6_3_BASE
# 1.121 01-Sep-2017 mpi

Change sosetopt() to no longer free the mbuf it receives and change
all the callers to call m_freem(9).

Support from deraadt@ and tedu@, ok visa@, bluhm@


# 1.120 26-Jun-2017 mpi

Assert that the corresponding socket is locked when manipulating socket
buffers.

This is one step towards unlocking TCP input path. Note that all the
functions asserting for the socket lock are not necessarilly MP-safe.
All the fields of 'struct socket' aren't protected.

Introduce a new kernel-only kqueue hint, NOTE_SUBMIT, to be able to
tell when a filter needs to lock the underlying data structures. Logic
and name taken from NetBSD.

Tested by Hrvoje Popovski.

ok claudio@, bluhm@, mikeb@


# 1.119 19-Jun-2017 bluhm

The IP multicast forward functions return an errno, call the variable
error. Make the ip_mforward() return value consistent. Simplify
the caller logic in ipv6_input() like in IPv4.
OK mpi@


# 1.118 16-May-2017 rzalamena

Sync three changes that were caught by IPv6 multicast routing review:

* use a variable to allow disabling debugs on run-time
* fix a potential memory leak on copyout() failure
* don't just blindly use the first address provided by ifalist

ok bluhm@


# 1.117 16-May-2017 rzalamena

Make return values more meaningful by using errno instead of -1 or 1.

ok bluhm@


# 1.116 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.115 16-May-2017 rzalamena

Let malloc() block when the caller of the add route function is
setsockopt(), otherwise use non-blocking malloc() for network stack
calls.

ok bluhm@


# 1.114 16-May-2017 rzalamena

Call rtfree() after each use of routes and make sure the route is valid
when finding one. Since rtfree() is being called and rt_llinfo being
removed, add checks everywhere to make sure we are using a route that is
not being removed.

ok bluhm@


# 1.113 06-Apr-2017 dhill

Convert bcopy to memcpy where the memory does not overlap, otherwise,
use memmove. While here, change some previous conversions to a simple
assignment.

ok deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.112 17-Mar-2017 rzalamena

Be more strict on all route iterations, lets always make sure that we
are not going to get a unicast route by accident.

ok mpi@


# 1.111 14-Mar-2017 rzalamena

Make mfc_find() more strict when looking for routes, fixes a problem
causing ip_mforward() not to send packets to the userland multicast
routing daemon.

Reported and tested by Paul de Weerd.

ok bluhm@, claudio@


# 1.110 09-Feb-2017 rzalamena

Unbreak 'netstat -g' and make multicast route stats sysctl more robust.

ok mpi@


# 1.109 08-Feb-2017 jsg

Test for NULL before dereferencing a pointer not after.
ok krw@


# 1.108 01-Feb-2017 dhill

In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with
the netlock held. This also changes the prototypes of the *ctloutput
functions to take an mbuf instead of an mbuf pointer.

help, guidance from bluhm@ and mpi@
ok bluhm@


# 1.107 12-Jan-2017 rzalamena

Clean up multicast files from unused definitions and comments.

ok mpi@


# 1.106 11-Jan-2017 rzalamena

Remove mfc hash tables and use the OpenBSD routing table for multicast
routes. Beside the code simplification and removal, we also get to see
the multicast routes now in the route(8) utility.

ok mpi@


# 1.105 06-Jan-2017 rzalamena

Remove the global viftable vector that holds the virtual interfaces
configuration and instead use ifnet to store the configuration and
counters. With this we can safely use multicast routing daemons on
multiple domains without vif id colisions.

ok mpi@


# 1.104 06-Jan-2017 rzalamena

Simplify code by removing some old pullup macro, killing some variables
and using m_dup_pkt() instead of m_copym() with max_linkhdr space adjust
on packet sending to avoid more mbuf allocations.

with input from millert@ and mikeb@,
ok mikeb@


# 1.103 06-Jan-2017 mpi

Kill various splsoftnet().

ok rzalamena@, visa@


# 1.102 05-Jan-2017 rzalamena

Remove some unnecessary code abstractions and while here remove a
splsoftnet.

ok mikeb@


# 1.101 22-Dec-2016 rzalamena

Remove PIM support from the multicast stack.

ok mpi@


# 1.100 21-Dec-2016 mpi

Fix build without PIM defined.


# 1.99 21-Dec-2016 rzalamena

Fix PIM compilation even though it is disabled.

ok bluhm@


# 1.98 20-Dec-2016 rzalamena

Call the multicast timer callback per domain instead of for all domains
this way we save doing big tables walk and iterating tables that we don't
need to.

ok mpi@


# 1.97 20-Dec-2016 rzalamena

Remove unused timeout that was never being set.

ok reyk@


# 1.96 19-Dec-2016 rzalamena

Kill unused function.

ok mpi@


# 1.95 19-Dec-2016 rzalamena

Extend the multicast sockets and multicast hash table support to multiple
domains. This is one step towards supporting to run more than one multicast
socket in different domains at the same time.

ok mpi@


# 1.94 13-Dec-2016 rzalamena

Propagate the routing table id in ip_mrouter_set() so the MRT_ADD_VIF
calls won't fail anymore when doing from a different rdomain.

ok mpi@


# 1.93 29-Nov-2016 mpi

Kill unused 'struct route'.


# 1.92 29-Nov-2016 jsg

m_free() and m_freem() test for NULL. Simplify callers which had their own
NULL tests.

ok mpi@


# 1.91 24-Sep-2016 tedu

use hashfree. from Mathieu -
ok guenther


Revision tags: OPENBSD_6_0_BASE
# 1.90 07-Mar-2016 naddy

Sync no-argument function declaration and definition by adding (void).
ok mpi@ millert@


Revision tags: OPENBSD_5_9_BASE
# 1.89 14-Nov-2015 mpi

Remove mrtdebug and reduce differences with the v6 version.

Debug informations can already be accessed via mrtstat and pimstat.


# 1.88 13-Nov-2015 mpi

Do not cast malloc(9) results.


# 1.87 13-Nov-2015 mpi

Kill another tunnel leftover and keep PIM stuff inside #ifdef PIM.


# 1.86 12-Nov-2015 mpi

Kill another leftover from the tunnel support removal and add more PIM.


# 1.85 12-Nov-2015 mpi

Sync headers and get rid of #ifdef MROUTING.


# 1.84 12-Nov-2015 mpi

Remove VIFF_TUNNEL leftovers, tunnels aren't supported since 2006.

Even pimd(8) no longer support them.


# 1.83 12-Nov-2015 mpi

Fix PIM build.


# 1.82 12-Sep-2015 mpi

Introduce if_input_local() a function to feed local traffic back to
the protocol queues.

It basically does what looutput() was doing but having a generic
function will allow us to get rid of the loopback hack overwwritting
the rt_ifp field of RTF_LOCAL routes.

ok mikeb@, dlg@, claudio@


# 1.81 01-Sep-2015 bluhm

Replace sockaddr casts with the proper satosin(), ... calls.
From David Hill; OK mpi@; tested kspillner@; tweaks bluhm@


# 1.80 24-Aug-2015 bluhm

In kernel initialize struct sockaddr_in and sockaddr_in6 to zero
everywhere to avoid passing around pointers to uninitialized stack
memory. While there, fix the call to in6_recoverscope() in
fill_drlist().
OK deraadt@ mpi@


Revision tags: OPENBSD_5_8_BASE
# 1.79 15-Jul-2015 deraadt

rename mbuf ** parameter from m to mp, to match other similar code


# 1.78 30-Jun-2015 mpi

Get rid of the undocumented & temporary* m_copy() macro added for
compatibility with 4.3BSD in September 1989.

*Pick your own definition for "temporary".

ok bluhm@, claudio@, dlg@


Revision tags: OPENBSD_5_7_BASE
# 1.77 09-Feb-2015 claudio

Implement 2 sysctl to retrieve the multicast forwarding cache (mfc) and the
virtual interface table (vif). Will be used by netstat soon.
Looked over by guenther@


# 1.76 08-Feb-2015 claudio

De-static to make ddb hangman harder. OK phessler, henning


# 1.75 07-Feb-2015 dlg

mechanical conversion of this code to using siphash instead of some xors.

ok tedu@ claudio@


# 1.74 17-Dec-2014 mpi

Remove the "multicast_" prefix from the fields a multicast-only struct.

Prodded by claudio@ and mikeb@


# 1.73 17-Dec-2014 mpi

Use an interface index instead of a pointer for multicast options.

Output interface (port) selection for multicast traffic is not done via
route lookups. Instead the output ifp is registred when setsockopt(2)
is called with the IP{V6,}_MULTICAST_IF option. But since there is no
mechanism to invalidate such pointer stored in a pcb when an interface
is destroyed/removed, it might lead your kernel to fault.

Prevent a fault upon resume reported by frantisek holop, thanks!

ok mikeb@, claudio@


# 1.72 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


# 1.71 30-Sep-2014 jsg

add back the sys/sysctl.h include removed in rev 1.60
fixes the kernel build when PIM is defined


# 1.70 14-Aug-2014 mpi

No need for raw_cb.h


# 1.69 14-Aug-2014 mpi

Kill MRT_{ADD,DEL}_BW_UPCALL interfaces and the bandwidth monitoring
code that comes with them.

ok mikeb@, henning@


Revision tags: OPENBSD_5_6_BASE
# 1.68 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.67 12-Jul-2014 tedu

add a size argument to free. will be used soon, but for now default to 0.
after discussions with beck deraadt kettenis.


# 1.66 21-Apr-2014 henning

ip_output() using varargs always struck me as bizarre, esp since it's only
ever used to pass on uint32 (for ipsec). stop that madness and just pass
the uint32, 0 in all cases but the two that pass the ipsec flowinfo.
ok deraadt reyk guenther


# 1.65 21-Apr-2014 henning

we'll do fine without casting NULL to struct foo * / void *
ok gcc & md5 (alas, no binary change)


Revision tags: OPENBSD_5_5_BASE
# 1.64 09-Jan-2014 tedu

bzero/bcmp -> memset/memcmp. ok matthew


# 1.63 27-Oct-2013 deraadt

delete UPCALL_TIMING debug code from a the dark ages


# 1.62 23-Oct-2013 mpi

Remove the number of in_var.h inclusions by moving some functions and
global variables to in.h.

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_4_BASE
# 1.61 02-May-2013 mpi

tedu broken Resource Reservation Protocol code that was ifdef RSVP_ISI.

ok deraadt@, tedu@ (implicit)


# 1.60 28-Mar-2013 tedu

no need for a lot of code to include proc.h


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.59 04-Apr-2011 henning

de-guttenberg our stack a bit
we don't need 7 f***ing copies of the same code to do the protocol checksums
(or not, depending on hw capabilities). claudio ok


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.58 02-Jul-2010 blambert

m_copyback can fail to allocate memory, but is a void function so gymnastics
are required to detect that.

Change the function to take a wait argument (used in nfs server, but
M_NOWAIT everywhere else for now) and to return an error

ok claudio@ henning@ krw@


# 1.57 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


Revision tags: OPENBSD_4_7_BASE
# 1.56 01-Aug-2009 blambert

timeout_add -> timeout_add_msec

ok michele@ claudio@


# 1.55 13-Jul-2009 michele

Get rid of the token bucket filter.
Traffic shaping code should not be inside routing code.
If you want to rate-limit use altq instead.

ok claudio@ henning@ dlg@


# 1.54 09-Jul-2009 michele

Use MAXTTL instead of the hardcoded value.


Revision tags: OPENBSD_4_6_BASE
# 1.53 05-Jun-2009 claudio

Initial support for routing domains. This allows binding interfaces to an
alternate routing table and separating them from other interfaces in distinct
routing tables. The same network can now be used in any domain at the same
time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks across
net and netinet. L2 and IPv4 are mostly covered; still missing pf and IPv6.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


Revision tags: OPENBSD_4_5_BASE
# 1.52 16-Sep-2008 chl

remove another dead store.

spotted by markus@

ok henning@ mpf@


# 1.51 15-Sep-2008 chl

remove dead stores and newly created unused variables.

Found by LLVM/Clang Static Analyzer.

ok mpf@ looks good mk@ ok henning@


Revision tags: OPENBSD_4_3_BASE OPENBSD_4_4_BASE
# 1.50 02-Jan-2008 brad

return with ENOTTY instead of EINVAL for unknown ioctl requests.

ok claudio@ krw@ dlg@


# 1.49 14-Dec-2007 deraadt

add sysctl entry points into various network layers, in particular to
provide netstat(1) with data it needs; ok claudio reyk


Revision tags: OPENBSD_4_2_BASE
# 1.48 22-May-2007 michele

ip_mroute.c is in bad shape.
This first step makes it style(9) compliant.
Just a whitespace diff, no binary change.

OK claudio@ norby@


# 1.47 10-Apr-2007 miod

``it's'' -> ``its'' when the grammar gods require this change.


Revision tags: OPENBSD_4_1_BASE
# 1.46 14-Feb-2007 jsg

Consistently spell FALLTHROUGH to appease lint.
ok kettenis@ cloder@ tom@ henning@


Revision tags: OPENBSD_4_0_BASE
# 1.45 15-Jun-2006 pascoe

Change cast of last vararg to ip_output to match what ip_output expects,
for clarity.

henning@ claudio@ ok


# 1.44 11-May-2006 hshoexer

fix corruption of pim register packets. From Hideki ONO, thanks!

ok mcbride@ itojun@


# 1.43 25-Apr-2006 claudio

Remove virtual tunnel support from the mrouting code. The virtual tunnel
code breaks multicast on gif(4) interfaces and it is far better to configure
a real gif(4) tunnel instead of a multicast tunnel as the latter is almost
not manageable. OK norby@, mblamer@


Revision tags: OPENBSD_3_8_BASE OPENBSD_3_9_BASE
# 1.42 25-Apr-2005 brad

csum -> csum_flags

ok krw@ canacar@


Revision tags: OPENBSD_3_7_BASE
# 1.41 15-Jan-2005 brad

fix comment


# 1.40 14-Jan-2005 mcbride

Duplicate nested if statement in PIM code.

From brad@


# 1.39 14-Jan-2005 mcbride

Add kernel support for Protocol Independent Multicast (PIM)
Information: http://netweb.usc.edu/pim/

From Pavlin Radoslavov <pavlin@icir.org>

ok deraadt@ brad@


# 1.38 24-Nov-2004 mcbride

Multicast routing cleanup from Pavlin Radoslavov
- sync ip_mroute.c with NetBSD
- import some FreeBSD changes to MFC entry handling
- set im->im_vif correctly when sending IGMPMSG_WRONGVIF
- increment mrtstat.mrts_upcalls correctly
- return error from get_sg_cnt() if there is no matching forwarding entry

ok henning@ brad@ naddy@


Revision tags: OPENBSD_3_6_BASE
# 1.37 24-Aug-2004 brad

Don't allow SIOCGET{VIF,SG}CNT from sockets other than the multicast router.

From NetBSD
Fixes PR 3825

ok mcbride@ canacar@ claudio@


Revision tags: OPENBSD_3_5_BASE SMP_SYNC_A SMP_SYNC_B
# 1.36 06-Jan-2004 markus

fix vlan destroy for MROUTING; report spamme@wouz.dk via tedu; ok itojun


# 1.35 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


# 1.34 10-Dec-2003 itojun

de-register. deraadt ok


Revision tags: OPENBSD_3_4_BASE
# 1.33 09-Jul-2003 itojun

do not flip ip_len/ip_off in netinet stack. deraadt ok.
(please test, especially PF portion)


# 1.32 09-Jul-2003 itojun

better vif_delete (no dangling ref to struct ifnet). deraadt ok
it won't affect default GENERIC build - as MROUTING is not defined


# 1.31 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: UBC_SYNC_A
# 1.30 14-May-2003 itojun

KNF. markus ok


# 1.29 06-May-2003 deraadt

string cleaning; tedu ok


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_B
# 1.28 28-Aug-2002 pefo

Fix a problem where passing NULL as a pointer with varargs does not promote
NULL to full 64 bits on a 64 bit address system. Solution is to add a
(void *) cast before NULL. This makes a 64 bit MIPS kernel work and will
probably help future 64 bit ports as well.

OK from art@


# 1.27 31-Jul-2002 itojun

remove $Id$


# 1.26 09-Jun-2002 itojun

whitespace


Revision tags: OPENBSD_3_1_BASE
# 1.25 15-Mar-2002 millert

Kill #if __STDC__ used to do K&R vs. ANSI varargs/stdarg; just do things
the ANSI way.


# 1.24 14-Mar-2002 millert

First round of __P removal in sys


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.23 26-Sep-2001 deraadt

branches: 1.23.4;
bring back the old copyright notice


# 1.22 19-Aug-2001 miod

More old timeouts removal, mainly affected unused/unmaintained code.


# 1.21 23-Jun-2001 fgsch

Remove unneeded ip_id conversions.
Instead of using HTONS macro in some places, use htons directly in the
struct member and save us a few bytes.
Fix comment.


Revision tags: OPENBSD_2_9_BASE
# 1.20 10-Nov-2000 provos

seperate -> separate, okay aaron@


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE SMP_BASE
# 1.19 21-Jan-2000 angelos

branches: 1.19.2;
Rename the ip4_* routines to ipip_*, make it so GIF tunnels are not
affected by net.inet.ipip.allow (the sysctl formerly known as
net.inet.ip4.allow), rename the VIF ipip_input to ipip_mroute_input.


Revision tags: OPENBSD_2_6_BASE kame_19991208
# 1.18 08-Aug-1999 niklas

undeclared variable


# 1.17 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.16 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


# 1.15 20-Apr-1999 niklas

Merge MROUTING and IPSEC wrt handling of IP-in-IP tunnelled packets.
Fix a panic case in the MROUTING code too. Drop M_TUNNEL support, nothing
ever uses it.


Revision tags: OPENBSD_2_5_BASE
# 1.14 05-Feb-1999 angelos

Clear mfchashtbl after deallocation (mycroft@netbsd)


# 1.13 08-Jan-1999 provos

don't call ip_randomid() in htons().


# 1.12 08-Jan-1999 deraadt

rip_input() should be called with a 0 terminator; cmetz


# 1.11 26-Dec-1998 provos

make ip_id random but ensure that ids don't repeat for some period.


Revision tags: OPENBSD_2_4_BASE
# 1.10 29-Jul-1998 angelos

Proper handling of IP in IP and checksumming.


# 1.9 03-Jul-1998 deraadt

wrong endian conversion caused vif stats to be wrong; jonny@jonny.eng.br


# 1.8 18-May-1998 provos

first step to the setsockopt/getsockopt interface as described in
draft-mcdonald-simple-ipsec-api, kernel notifies (EMT_REQUESTSA) signal
userland key management applications when security services are requested.
this is only for outgoing connections at the moment, incoming packets
are not yet checked against the selected socket policy.


Revision tags: OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 28-Sep-1997 deraadt

more \n in log()


Revision tags: OPENBSD_2_1_BASE
# 1.6 21-Feb-1997 angelos

Couple of missing ifdefs.


# 1.5 20-Feb-1997 deraadt

IPSEC package by John Ioannidis and Angelos D. Keromytis. Written in
Greece. From ftp.funet.fi:/pub/unix/security/net/ip/BSDipsec.tar.gz


Revision tags: OPENBSD_2_0_BASE
# 1.4 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.3 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.124 10-Feb-2019 dlg

remove the implicit RTF_MPATH flag that rt_ifa_add() sets on new routes.

MPLS interfaces (ab)use rt_ifa_add for adding the local MPLS label
that they listen on for incoming packets, while every other use of
rt_ifa_add is for adding addresses on local interfaces. MPLS does
this cos the addresses involved are in basically the same shape as
ones used for setting up local addresses.

It is appropriate for interfaces to want RTF_MPATH on local addresses,
but in the MPLS case it means you can have multiple local things
listening on the same label, which doesn't actually work. mpe in
particular keeps track of in use labels so it can handle collisions,
however, mpw does not. It is currently possible to have multiple
mpw interfaces on the same local label, and sharing the same label
as mpe or possibly normal forwarding labels.

Moving the RTF_MPATH flag out of rt_ifa_add means all the callers
that still want it need to pass it themselves. The mpe and mpw
callers are left alone without the flag, and will now get EEXIST
from rt_ifa_add when a label is already in use.

ok (and a huge amount of patience and help) mpi@
claudio@ is ok with the idea, but saw a much much earlier solution
to the problem


Revision tags: OPENBSD_6_4_BASE
# 1.123 10-Oct-2018 reyk

RT_TABLEID_MAX is 255, fix places that assumed that it is less than 255.

rtable 255 is a valid routing table or domain id that wasn't handled
by the ip[6]_mroute code or by snmpd. The arrays in the ip[6]_mroute
code were off by one and didn't allocate space for rtable 255; snmpd
simply ignored rtable 255. All other places in the tree seem to
handle RT_TABLEID_MAX correctly.

OK florian@ benno@ henning@ deraadt@


# 1.122 30-Apr-2018 tb

Reduce the scope of the NET_LOCK() in in_control(). Two functions were
protected: mrt_ioctl() and in_ioctl(). The former has no other callers
and only needs a read lock. The latter will need refactoring to reduce
the lock's scope further. In a first step, establish a single exit point
and protect most of the function body with the NET_LOCK() while removing
the NET_LOCK() from a handful of callers.

suggested by & ok mpi, ok visa


Revision tags: OPENBSD_6_2_BASE OPENBSD_6_3_BASE
# 1.121 01-Sep-2017 mpi

Change sosetopt() to no longer free the mbuf it receives and change
all the callers to call m_freem(9).

Support from deraadt@ and tedu@, ok visa@, bluhm@


# 1.120 26-Jun-2017 mpi

Assert that the corresponding socket is locked when manipulating socket
buffers.

This is one step towards unlocking TCP input path. Note that all the
functions asserting for the socket lock are not necessarily MP-safe.
All the fields of 'struct socket' aren't protected.

Introduce a new kernel-only kqueue hint, NOTE_SUBMIT, to be able to
tell when a filter needs to lock the underlying data structures. Logic
and name taken from NetBSD.

Tested by Hrvoje Popovski.

ok claudio@, bluhm@, mikeb@


# 1.119 19-Jun-2017 bluhm

The IP multicast forward functions return an errno, call the variable
error. Make the ip_mforward() return value consistent. Simplify
the caller logic in ipv6_input() like in IPv4.
OK mpi@


# 1.118 16-May-2017 rzalamena

Sync three changes that were caught by IPv6 multicast routing review:

* use a variable to allow disabling debugs on run-time
* fix a potential memory leak on copyout() failure
* don't just blindly use the first address provided by ifalist

ok bluhm@


# 1.117 16-May-2017 rzalamena

Make return values more meaningful by using errno instead of -1 or 1.

ok bluhm@


# 1.116 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.115 16-May-2017 rzalamena

Let malloc() block when the caller of the add route function is
setsockopt(), otherwise use non-blocking malloc() for network stack
calls.

ok bluhm@


# 1.114 16-May-2017 rzalamena

Call rtfree() after each use of routes and make sure the route is valid
when finding one. Since rtfree() is being called and rt_llinfo being
removed, add checks everywhere to make sure we are using a route that is
not being removed.

ok bluhm@


# 1.113 06-Apr-2017 dhill

Convert bcopy to memcpy where the memory does not overlap, otherwise,
use memmove. While here, change some previous conversions to a simple
assignment.

ok deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.112 17-Mar-2017 rzalamena

Be more strict on all route iterations, let's always make sure that we
are not going to get a unicast route by accident.

ok mpi@


# 1.111 14-Mar-2017 rzalamena

Make mfc_find() more strict when looking for routes, fixes a problem
causing ip_mforward() not to send packets to the userland multicast
routing daemon.

Reported and tested by Paul de Weerd.

ok bluhm@, claudio@


# 1.110 09-Feb-2017 rzalamena

Unbreak 'netstat -g' and make multicast route stats sysctl more robust.

ok mpi@


# 1.109 08-Feb-2017 jsg

Test for NULL before dereferencing a pointer not after.
ok krw@


# 1.108 01-Feb-2017 dhill

In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with
the netlock held. This also changes the prototypes of the *ctloutput
functions to take an mbuf instead of an mbuf pointer.

help, guidance from bluhm@ and mpi@
ok bluhm@


# 1.107 12-Jan-2017 rzalamena

Clean up multicast files from unused definitions and comments.

ok mpi@


# 1.106 11-Jan-2017 rzalamena

Remove mfc hash tables and use the OpenBSD routing table for multicast
routes. Beside the code simplification and removal, we also get to see
the multicast routes now in the route(8) utility.

ok mpi@


# 1.105 06-Jan-2017 rzalamena

Remove the global viftable vector that holds the virtual interfaces
configuration and instead use ifnet to store the configuration and
counters. With this we can safely use multicast routing daemons on
multiple domains without vif id collisions.

ok mpi@


# 1.104 06-Jan-2017 rzalamena

Simplify code by removing some old pullup macro, killing some variables
and using m_dup_pkt() instead of m_copym() with max_linkhdr space adjust
on packet sending to avoid more mbuf allocations.

with input from millert@ and mikeb@,
ok mikeb@


# 1.103 06-Jan-2017 mpi

Kill various splsoftnet().

ok rzalamena@, visa@


# 1.102 05-Jan-2017 rzalamena

Remove some unnecessary code abstractions and while here remove a
splsoftnet.

ok mikeb@


# 1.101 22-Dec-2016 rzalamena

Remove PIM support from the multicast stack.

ok mpi@


# 1.100 21-Dec-2016 mpi

Fix build without PIM defined.


# 1.99 21-Dec-2016 rzalamena

Fix PIM compilation even though it is disabled.

ok bluhm@


# 1.98 20-Dec-2016 rzalamena

Call the multicast timer callback per domain instead of for all domains.
This way we avoid walking big tables and iterating over tables that we
don't need to.

ok mpi@


# 1.97 20-Dec-2016 rzalamena

Remove unused timeout that was never being set.

ok reyk@


# 1.96 19-Dec-2016 rzalamena

Kill unused function.

ok mpi@


# 1.95 19-Dec-2016 rzalamena

Extend the multicast sockets and multicast hash table support to multiple
domains. This is one step towards supporting to run more than one multicast
socket in different domains at the same time.

ok mpi@


# 1.94 13-Dec-2016 rzalamena

Propagate the routing table id in ip_mrouter_set() so the MRT_ADD_VIF
calls won't fail anymore when doing from a different rdomain.

ok mpi@


# 1.93 29-Nov-2016 mpi

Kill unused 'struct route'.


# 1.92 29-Nov-2016 jsg

m_free() and m_freem() test for NULL. Simplify callers which had their own
NULL tests.

ok mpi@


# 1.91 24-Sep-2016 tedu

use hashfree. from Mathieu -
ok guenther


Revision tags: OPENBSD_6_0_BASE
# 1.90 07-Mar-2016 naddy

Sync no-argument function declaration and definition by adding (void).
ok mpi@ millert@


Revision tags: OPENBSD_5_9_BASE
# 1.89 14-Nov-2015 mpi

Remove mrtdebug and reduce differences with the v6 version.

Debug information can already be accessed via mrtstat and pimstat.


# 1.88 13-Nov-2015 mpi

Do not cast malloc(9) results.


# 1.87 13-Nov-2015 mpi

Kill another tunnel leftover and keep PIM stuff inside #ifdef PIM.


# 1.86 12-Nov-2015 mpi

Kill another leftover from the tunnel support removal and add more PIM.


# 1.85 12-Nov-2015 mpi

Sync headers and get rid of #ifdef MROUTING.


# 1.84 12-Nov-2015 mpi

Remove VIFF_TUNNEL leftovers, tunnels aren't supported since 2006.

Even pimd(8) no longer supports them.


# 1.83 12-Nov-2015 mpi

Fix PIM build.


# 1.82 12-Sep-2015 mpi

Introduce if_input_local() a function to feed local traffic back to
the protocol queues.

It basically does what looutput() was doing but having a generic
function will allow us to get rid of the loopback hack overwriting
the rt_ifp field of RTF_LOCAL routes.

ok mikeb@, dlg@, claudio@


# 1.81 01-Sep-2015 bluhm

Replace sockaddr casts with the proper satosin(), ... calls.
From David Hill; OK mpi@; tested kspillner@; tweaks bluhm@


# 1.80 24-Aug-2015 bluhm

In kernel initialize struct sockaddr_in and sockaddr_in6 to zero
everywhere to avoid passing around pointers to uninitialized stack
memory. While there, fix the call to in6_recoverscope() in
fill_drlist().
OK deraadt@ mpi@


Revision tags: OPENBSD_5_8_BASE
# 1.79 15-Jul-2015 deraadt

rename mbuf ** parameter from m to mp, to match other similar code


# 1.78 30-Jun-2015 mpi

Get rid of the undocumented & temporary* m_copy() macro added for
compatibility with 4.3BSD in September 1989.

*Pick your own definition for "temporary".

ok bluhm@, claudio@, dlg@


Revision tags: OPENBSD_5_7_BASE
# 1.77 09-Feb-2015 claudio

Implement 2 sysctl to retrieve the multicast forwarding cache (mfc) and the
virtual interface table (vif). Will be used by netstat soon.
Looked over by guenther@


# 1.76 08-Feb-2015 claudio

De-static to make ddb hangman harder. OK phessler, henning


# 1.75 07-Feb-2015 dlg

mechanical conversion of this code to using siphash instead of some xors.

ok tedu@ claudio@


# 1.74 17-Dec-2014 mpi

Remove the "multicast_" prefix from the fields of a multicast-only struct.

Prodded by claudio@ and mikeb@


# 1.73 17-Dec-2014 mpi

Use an interface index instead of a pointer for multicast options.

Output interface (port) selection for multicast traffic is not done via
route lookups. Instead the output ifp is registred when setsockopt(2)
is called with the IP{V6,}_MULTICAST_IF option. But since there is no
mechanism to invalidate such pointer stored in a pcb when an interface
is destroyed/removed, it might lead your kernel to fault.

Prevent a fault upon resume reported by frantisek holop, thanks!

ok mikeb@, claudio@


# 1.72 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


# 1.71 30-Sep-2014 jsg

add back the sys/sysctl.h include removed in rev 1.60
fixes the kernel build when PIM is defined


# 1.70 14-Aug-2014 mpi

No need for raw_cb.h


# 1.69 14-Aug-2014 mpi

Kill MRT_{ADD,DEL}_BW_UPCALL interfaces and the bandwidth monitoring
code that comes with them.

ok mikeb@, henning@


Revision tags: OPENBSD_5_6_BASE
# 1.68 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.67 12-Jul-2014 tedu

add a size argument to free. will be used soon, but for now default to 0.
after discussions with beck deraadt kettenis.


# 1.66 21-Apr-2014 henning

ip_output() using varargs always struck me as bizarre, esp since it's only
ever used to pass on uint32 (for ipsec). stop that madness and just pass
the uint32, 0 in all cases but the two that pass the ipsec flowinfo.
ok deraadt reyk guenther


# 1.65 21-Apr-2014 henning

we'll do fine without casting NULL to struct foo * / void *
ok gcc & md5 (alas, no binary change)


Revision tags: OPENBSD_5_5_BASE
# 1.64 09-Jan-2014 tedu

bzero/bcmp -> memset/memcmp. ok matthew


# 1.63 27-Oct-2013 deraadt

delete UPCALL_TIMING debug code from a the dark ages


# 1.62 23-Oct-2013 mpi

Remove the number of in_var.h inclusions by moving some functions and
global variables to in.h.

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_4_BASE
# 1.61 02-May-2013 mpi

tedu broken Resource Reservation Protocol code that was ifdef RSVP_ISI.

ok deraadt@, tedu@ (implicit)


# 1.60 28-Mar-2013 tedu

no need for a lot of code to include proc.h


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.59 04-Apr-2011 henning

de-guttenberg our stack a bit
we don't need 7 f***ing copies of the same code to do the protocol checksums
(or not, depending on hw capabilities). claudio ok


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.58 02-Jul-2010 blambert

m_copyback can fail to allocate memory, but is a void fucntion so gymnastics
are required to detect that.

Change the function to take a wait argument (used in nfs server, but
M_NOWAIT everywhere else for now) and to return an error

ok claudio@ henning@ krw@


# 1.57 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


Revision tags: OPENBSD_4_7_BASE
# 1.56 01-Aug-2009 blambert

timeout_add -> timeout_add_msec

ok michele@ claudio@


# 1.55 13-Jul-2009 michele

Get rid of the token bucket filter.
Traffic shaping code should not be inside routing code.
If you want to rate-limit use altq instead.

ok claudio@ henning@ dlg@


# 1.54 09-Jul-2009 michele

Use MAXTTL instead of the hardcoded value.


Revision tags: OPENBSD_4_6_BASE
# 1.53 05-Jun-2009 claudio

Initial support for routing domains. This allows to bind interfaces to
alternate routing table and separate them from other interfaces in distinct
routing tables. The same network can now be used in any doamin at the same
time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks accross
net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


Revision tags: OPENBSD_4_5_BASE
# 1.52 16-Sep-2008 chl

remove another dead store.

spotted by markus@

ok henning@ mpf@


# 1.51 15-Sep-2008 chl

remove dead stores and newly created unused variables.

Found by LLVM/Clang Static Analyzer.

ok mpf@ looks good mk@ ok henning@


Revision tags: OPENBSD_4_3_BASE OPENBSD_4_4_BASE
# 1.50 02-Jan-2008 brad

return with ENOTTY instead of EINVAL for unknown ioctl requests.

ok claudio@ krw@ dlg@


# 1.49 14-Dec-2007 deraadt

add sysctl entry points into various network layers, in particular to
provide netstat(1) with data it needs; ok claudio reyk


Revision tags: OPENBSD_4_2_BASE
# 1.48 22-May-2007 michele

ip_mroute.c is in bad shape.
This first step makes it style(9) compliant.
Just a whitespace diff, no binary change.

OK claudio@ norby@


# 1.47 10-Apr-2007 miod

``it's'' -> ``its'' when the grammar gods require this change.


Revision tags: OPENBSD_4_1_BASE
# 1.46 14-Feb-2007 jsg

Consistently spell FALLTHROUGH to appease lint.
ok kettenis@ cloder@ tom@ henning@


Revision tags: OPENBSD_4_0_BASE
# 1.45 15-Jun-2006 pascoe

Change cast of last vararg to ip_output to match what ip_output expects,
for clarity.

henning@ claudio@ ok


# 1.44 11-May-2006 hshoexer

fix corruption of pim register packets. From Hideki ONO, thanks!

ok mcbride@ itojun@


# 1.43 25-Apr-2006 claudio

Remove virtual tunnel support from the mrouting code. The virtual tunnel
code breaks multicast on gif(4) interfaces and it is far better to configure
a real gif(4) tunnel instead of a multicast tunnel as the latter is almost
not manageable. OK norby@, mblamer@


Revision tags: OPENBSD_3_8_BASE OPENBSD_3_9_BASE
# 1.42 25-Apr-2005 brad

csum -> csum_flags

ok krw@ canacar@


Revision tags: OPENBSD_3_7_BASE
# 1.41 15-Jan-2005 brad

fix comment


# 1.40 14-Jan-2005 mcbride

Duplicate nested if statement in PIM code.

From brad@


# 1.39 14-Jan-2005 mcbride

Add kernel support for Protocol Independant Multicast (PIM)
Information: http://netweb.usc.edu/pim/

From Pavlin Radoslavov <pavlin@icir.org>

ok deraadt@ brad@


# 1.38 24-Nov-2004 mcbride

Multicast routing cleanup from Pavlin Radoslavov
- sync ip_mroute.c with NetBSD
- import some FreeBSD changes to MFC entry handling
- set im->im_vif correctly when sending IGMPMSG_WRONGVIF
- increment mrtstat.mrts_upcalls correctly
- return error from get_sg_cnt() if there is no matching forwarding entry

ok henning@ brad@ naddy@


Revision tags: OPENBSD_3_6_BASE
# 1.37 24-Aug-2004 brad

Don't allow SIOCGET{VIF,SG}CNT from sockets other than the multicast router.

From NetBSD
Fixes PR 3825

ok mcbride@ canacar@ claudio@


Revision tags: OPENBSD_3_5_BASE SMP_SYNC_A SMP_SYNC_B
# 1.36 06-Jan-2004 markus

fix vlan destroy for MROUTING; report spamme@wouz.dk via tedu; ok itojun


# 1.35 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


# 1.34 10-Dec-2003 itojun

de-register. deraadt ok


Revision tags: OPENBSD_3_4_BASE
# 1.33 09-Jul-2003 itojun

do not flip ip_len/ip_off in netinet stack. deraadt ok.
(please test, especially PF portion)


# 1.32 09-Jul-2003 itojun

better vif_delete (no dangling ref to struct ifnet). deraadt ok
it won't affect default GENERIC build - as MROUTING is not defined


# 1.31 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: UBC_SYNC_A
# 1.30 14-May-2003 itojun

KNF. markus ok


# 1.29 06-May-2003 deraadt

string cleaning; tedu ok


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_B
# 1.28 28-Aug-2002 pefo

Fix a problem where passing NULL as a pointer with varargs does not promote
NULL to full 64 bits on a 64 bit address system. Soultion is to add a
(void *) cast before NULL. This makes a 64 bit MIPS kernel work and will
probably help future 64 bit ports as well.

OK from art@


# 1.27 31-Jul-2002 itojun

remove $Id$


# 1.26 09-Jun-2002 itojun

whitespace


Revision tags: OPENBSD_3_1_BASE
# 1.25 15-Mar-2002 millert

Kill #if __STDC__ used to do K&R vs. ANSI varargs/stdarg; just do things
the ANSI way.


# 1.24 14-Mar-2002 millert

First round of __P removal in sys


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.23 26-Sep-2001 deraadt

branches: 1.23.4;
bring back the old copyright notice


# 1.22 19-Aug-2001 miod

More old timeouts removal, mainly affected unused/unmaintained code.


# 1.21 23-Jun-2001 fgsch

Remove unneeded ip_id convertions.
Instead of using HTONS macro in some places, use htons directly in the
struct member and save us a few bytes.
Fix comment.


Revision tags: OPENBSD_2_9_BASE
# 1.20 10-Nov-2000 provos

seperate -> separate, okay aaron@


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE SMP_BASE
# 1.19 21-Jan-2000 angelos

branches: 1.19.2;
Rename the ip4_* routines to ipip_*, make it so GIF tunnels are not
affected by net.inet.ipip.allow (the sysctl formerly known as
net.inet.ip4.allow), rename the VIF ipip_input to ipip_mroute_input.


Revision tags: OPENBSD_2_6_BASE kame_19991208
# 1.18 08-Aug-1999 niklas

undeclared variable


# 1.17 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.16 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


# 1.15 20-Apr-1999 niklas

Merge MROUTING and IPSEC wrt handling of IP-in-IP tunnelled packets.
Fix a panic case in the MROUTING code too. Drop M_TUNNEL support, nothing
ever uses it.


Revision tags: OPENBSD_2_5_BASE
# 1.14 05-Feb-1999 angelos

Clear mfchashtbl after deallocation (mycroft@netbsd)


# 1.13 08-Jan-1999 provos

dont call ip_randomid() in htons().


# 1.12 08-Jan-1999 deraadt

rip_input() should be called with a 0 terminator; cmetz


# 1.11 26-Dec-1998 provos

make ip_id random but ensure that ids dont repeat for some period.


Revision tags: OPENBSD_2_4_BASE
# 1.10 29-Jul-1998 angelos

Proper handling of IP in IP and checksumming.


# 1.9 03-Jul-1998 deraadt

wrong endian conversion caused vif stats to be wrong; jonny@jonny.eng.br


# 1.8 18-May-1998 provos

first step to the setsockopt/getsockopt interface as described in
draft-mcdonald-simple-ipsec-api, kernel notifies (EMT_REQUESTSA) signal
userland key management applications when security services are requested.
this is only for outgoing connections at the moment, incoming packets
are not yet checked against the selected socket policy.


Revision tags: OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 28-Sep-1997 deraadt

more \n in log()


Revision tags: OPENBSD_2_1_BASE
# 1.6 21-Feb-1997 angelos

Couple of missing ifdefs.


# 1.5 20-Feb-1997 deraadt

IPSEC package by John Ioannidis and Angelos D. Keromytis. Written in
Greece. From ftp.funet.fi:/pub/unix/security/net/ip/BSDipsec.tar.gz


Revision tags: OPENBSD_2_0_BASE
# 1.4 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.3 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


Revision tags: OPENBSD_6_4_BASE
# 1.123 10-Oct-2018 reyk

RT_TABLEID_MAX is 255, fix places that assumed that it is less than 255.

rtable 255 is a valid routing table or domain id that wasn't handled
by the ip[6]_mroute code or by snmpd. The arrays in the ip[6]_mroute
code where off by one and didn't allocate space for rtable 255; snmpd
simply ignored rtable 255. All other places in the tree seem to
handle RT_TABLEID_MAX correctly.

OK florian@ benno@ henning@ deraadt@


# 1.122 30-Apr-2018 tb

Reduce the scope of the NET_LOCK() in in_control(). Two functions were
protected: mrt_ioctl() and in_ioctl(). The former has no other callers
and only needs a read lock. The latter will need refactoring to reduce
the lock's scope further. In a first step, establish a single exit point
and protect most of the function body with the NET_LOCK() while removing
the NET_LOCK() from a handful of callers.

suggested by & ok mpi, ok visa


Revision tags: OPENBSD_6_2_BASE OPENBSD_6_3_BASE
# 1.121 01-Sep-2017 mpi

Change sosetopt() to no longer free the mbuf it receives and change
all the callers to call m_freem(9).

Support from deraadt@ and tedu@, ok visa@, bluhm@


# 1.120 26-Jun-2017 mpi

Assert that the corresponding socket is locked when manipulating socket
buffers.

This is one step towards unlocking TCP input path. Note that all the
functions asserting for the socket lock are not necessarilly MP-safe.
All the fields of 'struct socket' aren't protected.

Introduce a new kernel-only kqueue hint, NOTE_SUBMIT, to be able to
tell when a filter needs to lock the underlying data structures. Logic
and name taken from NetBSD.

Tested by Hrvoje Popovski.

ok claudio@, bluhm@, mikeb@


# 1.119 19-Jun-2017 bluhm

The IP multicast forward functions return an errno, call the variable
error. Make the ip_mforward() return value consistent. Simplify
the caller logic in ipv6_input() like in IPv4.
OK mpi@


# 1.118 16-May-2017 rzalamena

Sync three changes that were caught by IPv6 multicast routing review:

* use a variable to allow disabling debugs on run-time
* fix a potential memory leak on copyout() failure
* don't just blindly use the first address provided by ifalist

ok bluhm@


# 1.117 16-May-2017 rzalamena

Make return values more meaningful by using errno instead of -1 or 1.

ok bluhm@


# 1.116 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.115 16-May-2017 rzalamena

Let malloc() block when the caller of the add route function is
setsockopt(), otherwise use non-blocking malloc() for network stack
calls.

ok bluhm@


# 1.114 16-May-2017 rzalamena

Call rtfree() after each use of routes and make sure the route is valid
when finding one. Since rtfree() is being called and rt_llinfo being
removed, add checks everywhere to make sure we are using a route that is
not being removed.

ok bluhm@


# 1.113 06-Apr-2017 dhill

Convert bcopy to memcpy where the memory does not overlap, otherwise,
use memmove. While here, change some previous conversions to a simple
assignment.

ok deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.112 17-Mar-2017 rzalamena

Be more strict on all route iterations, let's always make sure that we
are not going to get a unicast route by accident.

ok mpi@


# 1.111 14-Mar-2017 rzalamena

Make mfc_find() more strict when looking for routes, fixes a problem
causing ip_mforward() not to send packets to the userland multicast
routing daemon.

Reported and tested by Paul de Weerd.

ok bluhm@, claudio@


# 1.110 09-Feb-2017 rzalamena

Unbreak 'netstat -g' and make multicast route stats sysctl more robust.

ok mpi@


# 1.109 08-Feb-2017 jsg

Test for NULL before dereferencing a pointer not after.
ok krw@


# 1.108 01-Feb-2017 dhill

In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with
the netlock held. This also changes the prototypes of the *ctloutput
functions to take an mbuf instead of an mbuf pointer.

help, guidance from bluhm@ and mpi@
ok bluhm@


# 1.107 12-Jan-2017 rzalamena

Clean up multicast files from unused definitions and comments.

ok mpi@


# 1.106 11-Jan-2017 rzalamena

Remove mfc hash tables and use the OpenBSD routing table for multicast
routes. Beside the code simplification and removal, we also get to see
the multicast routes now in the route(8) utility.

ok mpi@


# 1.105 06-Jan-2017 rzalamena

Remove the global viftable vector that holds the virtual interfaces
configuration and instead use ifnet to store the configuration and
counters. With this we can safely use multicast routing daemons on
multiple domains without vif id collisions.

ok mpi@


# 1.104 06-Jan-2017 rzalamena

Simplify code by removing some old pullup macro, killing some variables
and using m_dup_pkt() instead of m_copym() with max_linkhdr space adjust
on packet sending to avoid more mbuf allocations.

with input from millert@ and mikeb@,
ok mikeb@


# 1.103 06-Jan-2017 mpi

Kill various splsoftnet().

ok rzalamena@, visa@


# 1.102 05-Jan-2017 rzalamena

Remove some unnecessary code abstractions and while here remove a
splsoftnet.

ok mikeb@


# 1.101 22-Dec-2016 rzalamena

Remove PIM support from the multicast stack.

ok mpi@


# 1.100 21-Dec-2016 mpi

Fix build without PIM defined.


# 1.99 21-Dec-2016 rzalamena

Fix PIM compilation even though it is disabled.

ok bluhm@


# 1.98 20-Dec-2016 rzalamena

Call the multicast timer callback per domain instead of for all domains
this way we save doing big tables walk and iterating tables that we don't
need to.

ok mpi@


# 1.97 20-Dec-2016 rzalamena

Remove unused timeout that was never being set.

ok reyk@


# 1.96 19-Dec-2016 rzalamena

Kill unused function.

ok mpi@


# 1.95 19-Dec-2016 rzalamena

Extend the multicast sockets and multicast hash table support to multiple
domains. This is one step towards supporting to run more than one multicast
socket in different domains at the same time.

ok mpi@


# 1.94 13-Dec-2016 rzalamena

Propagate the routing table id in ip_mrouter_set() so the MRT_ADD_VIF
calls won't fail anymore when made from a different rdomain.

ok mpi@


# 1.93 29-Nov-2016 mpi

Kill unused 'struct route'.


# 1.92 29-Nov-2016 jsg

m_free() and m_freem() test for NULL. Simplify callers which had their own
NULL tests.

ok mpi@


# 1.91 24-Sep-2016 tedu

use hashfree. from Mathieu -
ok guenther


Revision tags: OPENBSD_6_0_BASE
# 1.90 07-Mar-2016 naddy

Sync no-argument function declaration and definition by adding (void).
ok mpi@ millert@


Revision tags: OPENBSD_5_9_BASE
# 1.89 14-Nov-2015 mpi

Remove mrtdebug and reduce differences with the v6 version.

Debug information can already be accessed via mrtstat and pimstat.


# 1.88 13-Nov-2015 mpi

Do not cast malloc(9) results.


# 1.87 13-Nov-2015 mpi

Kill another tunnel leftover and keep PIM stuff inside #ifdef PIM.


# 1.86 12-Nov-2015 mpi

Kill another leftover from the tunnel support removal and add more PIM.


# 1.85 12-Nov-2015 mpi

Sync headers and get rid of #ifdef MROUTING.


# 1.84 12-Nov-2015 mpi

Remove VIFF_TUNNEL leftovers, tunnels aren't supported since 2006.

Even pimd(8) no longer supports them.


# 1.83 12-Nov-2015 mpi

Fix PIM build.


# 1.82 12-Sep-2015 mpi

Introduce if_input_local() a function to feed local traffic back to
the protocol queues.

It basically does what looutput() was doing but having a generic
function will allow us to get rid of the loopback hack overwriting
the rt_ifp field of RTF_LOCAL routes.

ok mikeb@, dlg@, claudio@


# 1.81 01-Sep-2015 bluhm

Replace sockaddr casts with the proper satosin(), ... calls.
From David Hill; OK mpi@; tested kspillner@; tweaks bluhm@


# 1.80 24-Aug-2015 bluhm

In kernel initialize struct sockaddr_in and sockaddr_in6 to zero
everywhere to avoid passing around pointers to uninitialized stack
memory. While there, fix the call to in6_recoverscope() in
fill_drlist().
OK deraadt@ mpi@


Revision tags: OPENBSD_5_8_BASE
# 1.79 15-Jul-2015 deraadt

rename mbuf ** parameter from m to mp, to match other similar code


# 1.78 30-Jun-2015 mpi

Get rid of the undocumented & temporary* m_copy() macro added for
compatibility with 4.3BSD in September 1989.

*Pick your own definition for "temporary".

ok bluhm@, claudio@, dlg@


Revision tags: OPENBSD_5_7_BASE
# 1.77 09-Feb-2015 claudio

Implement 2 sysctl to retrieve the multicast forwarding cache (mfc) and the
virtual interface table (vif). Will be used by netstat soon.
Looked over by guenther@


# 1.76 08-Feb-2015 claudio

De-static to make ddb hangman harder. OK phessler, henning


# 1.75 07-Feb-2015 dlg

mechanical conversion of this code to using siphash instead of some xors.

ok tedu@ claudio@


# 1.74 17-Dec-2014 mpi

Remove the "multicast_" prefix from the fields of a multicast-only struct.

Prodded by claudio@ and mikeb@


# 1.73 17-Dec-2014 mpi

Use an interface index instead of a pointer for multicast options.

Output interface (port) selection for multicast traffic is not done via
route lookups. Instead the output ifp is registered when setsockopt(2)
is called with the IP{V6,}_MULTICAST_IF option. But since there is no
mechanism to invalidate such pointer stored in a pcb when an interface
is destroyed/removed, it might lead your kernel to fault.

Prevent a fault upon resume reported by frantisek holop, thanks!

ok mikeb@, claudio@


# 1.72 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


# 1.71 30-Sep-2014 jsg

add back the sys/sysctl.h include removed in rev 1.60
fixes the kernel build when PIM is defined


# 1.70 14-Aug-2014 mpi

No need for raw_cb.h


# 1.69 14-Aug-2014 mpi

Kill MRT_{ADD,DEL}_BW_UPCALL interfaces and the bandwidth monitoring
code that comes with them.

ok mikeb@, henning@


Revision tags: OPENBSD_5_6_BASE
# 1.68 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.67 12-Jul-2014 tedu

add a size argument to free. will be used soon, but for now default to 0.
after discussions with beck deraadt kettenis.


# 1.66 21-Apr-2014 henning

ip_output() using varargs always struck me as bizarre, esp since it's only
ever used to pass on uint32 (for ipsec). stop that madness and just pass
the uint32, 0 in all cases but the two that pass the ipsec flowinfo.
ok deraadt reyk guenther


# 1.65 21-Apr-2014 henning

we'll do fine without casting NULL to struct foo * / void *
ok gcc & md5 (alas, no binary change)


Revision tags: OPENBSD_5_5_BASE
# 1.64 09-Jan-2014 tedu

bzero/bcmp -> memset/memcmp. ok matthew


# 1.63 27-Oct-2013 deraadt

delete UPCALL_TIMING debug code from the dark ages


# 1.62 23-Oct-2013 mpi

Reduce the number of in_var.h inclusions by moving some functions and
global variables to in.h.

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_4_BASE
# 1.61 02-May-2013 mpi

tedu broken Resource Reservation Protocol code that was ifdef RSVP_ISI.

ok deraadt@, tedu@ (implicit)


# 1.60 28-Mar-2013 tedu

no need for a lot of code to include proc.h


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.59 04-Apr-2011 henning

de-guttenberg our stack a bit
we don't need 7 f***ing copies of the same code to do the protocol checksums
(or not, depending on hw capabilities). claudio ok


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.58 02-Jul-2010 blambert

m_copyback can fail to allocate memory, but is a void function so gymnastics
are required to detect that.

Change the function to take a wait argument (used in nfs server, but
M_NOWAIT everywhere else for now) and to return an error

ok claudio@ henning@ krw@


# 1.57 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


Revision tags: OPENBSD_4_7_BASE
# 1.56 01-Aug-2009 blambert

timeout_add -> timeout_add_msec

ok michele@ claudio@


# 1.55 13-Jul-2009 michele

Get rid of the token bucket filter.
Traffic shaping code should not be inside routing code.
If you want to rate-limit use altq instead.

ok claudio@ henning@ dlg@


# 1.54 09-Jul-2009 michele

Use MAXTTL instead of the hardcoded value.


Revision tags: OPENBSD_4_6_BASE
# 1.53 05-Jun-2009 claudio

Initial support for routing domains. This allows to bind interfaces to
alternate routing table and separate them from other interfaces in distinct
routing tables. The same network can now be used in any domain at the same
time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks across
net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


Revision tags: OPENBSD_4_5_BASE
# 1.52 16-Sep-2008 chl

remove another dead store.

spotted by markus@

ok henning@ mpf@


# 1.51 15-Sep-2008 chl

remove dead stores and newly created unused variables.

Found by LLVM/Clang Static Analyzer.

ok mpf@ looks good mk@ ok henning@


Revision tags: OPENBSD_4_3_BASE OPENBSD_4_4_BASE
# 1.50 02-Jan-2008 brad

return with ENOTTY instead of EINVAL for unknown ioctl requests.

ok claudio@ krw@ dlg@


# 1.49 14-Dec-2007 deraadt

add sysctl entry points into various network layers, in particular to
provide netstat(1) with data it needs; ok claudio reyk


Revision tags: OPENBSD_4_2_BASE
# 1.48 22-May-2007 michele

ip_mroute.c is in bad shape.
This first step makes it style(9) compliant.
Just a whitespace diff, no binary change.

OK claudio@ norby@


# 1.47 10-Apr-2007 miod

``it's'' -> ``its'' when the grammar gods require this change.


Revision tags: OPENBSD_4_1_BASE
# 1.46 14-Feb-2007 jsg

Consistently spell FALLTHROUGH to appease lint.
ok kettenis@ cloder@ tom@ henning@


Revision tags: OPENBSD_4_0_BASE
# 1.45 15-Jun-2006 pascoe

Change cast of last vararg to ip_output to match what ip_output expects,
for clarity.

henning@ claudio@ ok


# 1.44 11-May-2006 hshoexer

fix corruption of pim register packets. From Hideki ONO, thanks!

ok mcbride@ itojun@


# 1.43 25-Apr-2006 claudio

Remove virtual tunnel support from the mrouting code. The virtual tunnel
code breaks multicast on gif(4) interfaces and it is far better to configure
a real gif(4) tunnel instead of a multicast tunnel as the latter is almost
not manageable. OK norby@, mblamer@


Revision tags: OPENBSD_3_8_BASE OPENBSD_3_9_BASE
# 1.42 25-Apr-2005 brad

csum -> csum_flags

ok krw@ canacar@


Revision tags: OPENBSD_3_7_BASE
# 1.41 15-Jan-2005 brad

fix comment


# 1.40 14-Jan-2005 mcbride

Duplicate nested if statement in PIM code.

From brad@


# 1.39 14-Jan-2005 mcbride

Add kernel support for Protocol Independent Multicast (PIM)
Information: http://netweb.usc.edu/pim/

From Pavlin Radoslavov <pavlin@icir.org>

ok deraadt@ brad@


# 1.38 24-Nov-2004 mcbride

Multicast routing cleanup from Pavlin Radoslavov
- sync ip_mroute.c with NetBSD
- import some FreeBSD changes to MFC entry handling
- set im->im_vif correctly when sending IGMPMSG_WRONGVIF
- increment mrtstat.mrts_upcalls correctly
- return error from get_sg_cnt() if there is no matching forwarding entry

ok henning@ brad@ naddy@


Revision tags: OPENBSD_3_6_BASE
# 1.37 24-Aug-2004 brad

Don't allow SIOCGET{VIF,SG}CNT from sockets other than the multicast router.

From NetBSD
Fixes PR 3825

ok mcbride@ canacar@ claudio@


Revision tags: OPENBSD_3_5_BASE SMP_SYNC_A SMP_SYNC_B
# 1.36 06-Jan-2004 markus

fix vlan destroy for MROUTING; report spamme@wouz.dk via tedu; ok itojun


# 1.35 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


# 1.34 10-Dec-2003 itojun

de-register. deraadt ok


Revision tags: OPENBSD_3_4_BASE
# 1.33 09-Jul-2003 itojun

do not flip ip_len/ip_off in netinet stack. deraadt ok.
(please test, especially PF portion)


# 1.32 09-Jul-2003 itojun

better vif_delete (no dangling ref to struct ifnet). deraadt ok
it won't affect default GENERIC build - as MROUTING is not defined


# 1.31 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: UBC_SYNC_A
# 1.30 14-May-2003 itojun

KNF. markus ok


# 1.29 06-May-2003 deraadt

string cleaning; tedu ok


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_B
# 1.28 28-Aug-2002 pefo

Fix a problem where passing NULL as a pointer with varargs does not promote
NULL to full 64 bits on a 64 bit address system. Solution is to add a
(void *) cast before NULL. This makes a 64 bit MIPS kernel work and will
probably help future 64 bit ports as well.

OK from art@


# 1.27 31-Jul-2002 itojun

remove $Id$


# 1.26 09-Jun-2002 itojun

whitespace


Revision tags: OPENBSD_3_1_BASE
# 1.25 15-Mar-2002 millert

Kill #if __STDC__ used to do K&R vs. ANSI varargs/stdarg; just do things
the ANSI way.


# 1.24 14-Mar-2002 millert

First round of __P removal in sys


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.23 26-Sep-2001 deraadt

branches: 1.23.4;
bring back the old copyright notice


# 1.22 19-Aug-2001 miod

More old timeouts removal, mainly affected unused/unmaintained code.


# 1.21 23-Jun-2001 fgsch

Remove unneeded ip_id conversions.
Instead of using HTONS macro in some places, use htons directly in the
struct member and save us a few bytes.
Fix comment.


Revision tags: OPENBSD_2_9_BASE
# 1.20 10-Nov-2000 provos

seperate -> separate, okay aaron@


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE SMP_BASE
# 1.19 21-Jan-2000 angelos

branches: 1.19.2;
Rename the ip4_* routines to ipip_*, make it so GIF tunnels are not
affected by net.inet.ipip.allow (the sysctl formerly known as
net.inet.ip4.allow), rename the VIF ipip_input to ipip_mroute_input.


Revision tags: OPENBSD_2_6_BASE kame_19991208
# 1.18 08-Aug-1999 niklas

undeclared variable


# 1.17 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.16 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


# 1.15 20-Apr-1999 niklas

Merge MROUTING and IPSEC wrt handling of IP-in-IP tunnelled packets.
Fix a panic case in the MROUTING code too. Drop M_TUNNEL support, nothing
ever uses it.


Revision tags: OPENBSD_2_5_BASE
# 1.14 05-Feb-1999 angelos

Clear mfchashtbl after deallocation (mycroft@netbsd)


# 1.13 08-Jan-1999 provos

dont call ip_randomid() in htons().


# 1.12 08-Jan-1999 deraadt

rip_input() should be called with a 0 terminator; cmetz


# 1.11 26-Dec-1998 provos

make ip_id random but ensure that ids dont repeat for some period.


Revision tags: OPENBSD_2_4_BASE
# 1.10 29-Jul-1998 angelos

Proper handling of IP in IP and checksumming.


# 1.9 03-Jul-1998 deraadt

wrong endian conversion caused vif stats to be wrong; jonny@jonny.eng.br


# 1.8 18-May-1998 provos

first step to the setsockopt/getsockopt interface as described in
draft-mcdonald-simple-ipsec-api, kernel notifies (EMT_REQUESTSA) signal
userland key management applications when security services are requested.
this is only for outgoing connections at the moment, incoming packets
are not yet checked against the selected socket policy.


Revision tags: OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 28-Sep-1997 deraadt

more \n in log()


Revision tags: OPENBSD_2_1_BASE
# 1.6 21-Feb-1997 angelos

Couple of missing ifdefs.


# 1.5 20-Feb-1997 deraadt

IPSEC package by John Ioannidis and Angelos D. Keromytis. Written in
Greece. From ftp.funet.fi:/pub/unix/security/net/ip/BSDipsec.tar.gz


Revision tags: OPENBSD_2_0_BASE
# 1.4 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.3 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.122 30-Apr-2018 tb

Reduce the scope of the NET_LOCK() in in_control(). Two functions were
protected: mrt_ioctl() and in_ioctl(). The former has no other callers
and only needs a read lock. The latter will need refactoring to reduce
the lock's scope further. In a first step, establish a single exit point
and protect most of the function body with the NET_LOCK() while removing
the NET_LOCK() from a handful of callers.

suggested by & ok mpi, ok visa


Revision tags: OPENBSD_6_2_BASE OPENBSD_6_3_BASE
# 1.121 01-Sep-2017 mpi

Change sosetopt() to no longer free the mbuf it receives and change
all the callers to call m_freem(9).

Support from deraadt@ and tedu@, ok visa@, bluhm@


# 1.120 26-Jun-2017 mpi

Assert that the corresponding socket is locked when manipulating socket
buffers.

This is one step towards unlocking TCP input path. Note that all the
functions asserting for the socket lock are not necessarilly MP-safe.
All the fields of 'struct socket' aren't protected.

Introduce a new kernel-only kqueue hint, NOTE_SUBMIT, to be able to
tell when a filter needs to lock the underlying data structures. Logic
and name taken from NetBSD.

Tested by Hrvoje Popovski.

ok claudio@, bluhm@, mikeb@


# 1.119 19-Jun-2017 bluhm

The IP multicast forward functions return an errno, call the variable
error. Make the ip_mforward() return value consistent. Simplify
the caller logic in ipv6_input() like in IPv4.
OK mpi@


# 1.118 16-May-2017 rzalamena

Sync three changes that were caught by IPv6 multicast routing review:

* use a variable to allow disabling debugs on run-time
* fix a potential memory leak on copyout() failure
* don't just blindly use the first address provided by ifalist

ok bluhm@


# 1.117 16-May-2017 rzalamena

Make return values more meaningful by using errno instead of -1 or 1.

ok bluhm@


# 1.116 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.115 16-May-2017 rzalamena

Let malloc() block when the caller of the add route function is
setsockopt(), otherwise use non-blocking malloc() for network stack
calls.

ok bluhm@


# 1.114 16-May-2017 rzalamena

Call rtfree() after each use of routes and make sure the route is valid
when finding one. Since rtfree() is being called and rt_llinfo being
removed, add checks everywhere to make sure we are using a route that is
not being removed.

ok bluhm@


# 1.113 06-Apr-2017 dhill

Convert bcopy to memcpy where the memory does not overlap, otherwise,
use memmove. While here, change some previous conversions to a simple
assignment.

ok deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.112 17-Mar-2017 rzalamena

Be more strict on all route iterations, lets always make sure that we
are not going to get a unicast route by accident.

ok mpi@


# 1.111 14-Mar-2017 rzalamena

Make mfc_find() more strict when looking for routes, fixes a problem
causing ip_mforward() not to send packets to the userland multicast
routing daemon.

Reported and tested by Paul de Weerd.

ok bluhm@, claudio@


# 1.110 09-Feb-2017 rzalamena

Unbreak 'netstat -g' and make multicast route stats sysctl more robust.

ok mpi@


# 1.109 08-Feb-2017 jsg

Test for NULL before dereferencing a pointer not after.
ok krw@


# 1.108 01-Feb-2017 dhill

In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with
the netlock held. This also changes the prototypes of the *ctloutput
functions to take an mbuf instead of an mbuf pointer.

help, guidance from bluhm@ and mpi@
ok bluhm@


# 1.107 12-Jan-2017 rzalamena

Clean up multicast files from unused definitions and comments.

ok mpi@


# 1.106 11-Jan-2017 rzalamena

Remove mfc hash tables and use the OpenBSD routing table for multicast
routes. Beside the code simplification and removal, we also get to see
the multicast routes now in the route(8) utility.

ok mpi@


# 1.105 06-Jan-2017 rzalamena

Remove the global viftable vector that holds the virtual interfaces
configuration and instead use ifnet to store the configuration and
counters. With this we can safely use multicast routing daemons on
multiple domains without vif id colisions.

ok mpi@


# 1.104 06-Jan-2017 rzalamena

Simplify code by removing some old pullup macro, killing some variables
and using m_dup_pkt() instead of m_copym() with max_linkhdr space adjust
on packet sending to avoid more mbuf allocations.

with input from millert@ and mikeb@,
ok mikeb@


# 1.103 06-Jan-2017 mpi

Kill various splsoftnet().

ok rzalamena@, visa@


# 1.102 05-Jan-2017 rzalamena

Remove some unnecessary code abstractions and while here remove a
splsoftnet.

ok mikeb@


# 1.101 22-Dec-2016 rzalamena

Remove PIM support from the multicast stack.

ok mpi@


# 1.100 21-Dec-2016 mpi

Fix build without PIM defined.


# 1.99 21-Dec-2016 rzalamena

Fix PIM compilation even though it is disabled.

ok bluhm@


# 1.98 20-Dec-2016 rzalamena

Call the multicast timer callback per domain instead of for all domains
this way we save doing big tables walk and iterating tables that we don't
need to.

ok mpi@


# 1.97 20-Dec-2016 rzalamena

Remove unused timeout that was never being set.

ok reyk@


# 1.96 19-Dec-2016 rzalamena

Kill unused function.

ok mpi@


# 1.95 19-Dec-2016 rzalamena

Extend the multicast sockets and multicast hash table support to multiple
domains. This is one step towards supporting to run more than one multicast
socket in different domains at the same time.

ok mpi@


# 1.94 13-Dec-2016 rzalamena

Propagate the routing table id in ip_mrouter_set() so the MRT_ADD_VIF
calls won't fail anymore when doing from a different rdomain.

ok mpi@


# 1.93 29-Nov-2016 mpi

Kill unused 'struct route'.


# 1.92 29-Nov-2016 jsg

m_free() and m_freem() test for NULL. Simplify callers which had their own
NULL tests.

ok mpi@


# 1.91 24-Sep-2016 tedu

use hashfree. from Mathieu -
ok guenther


Revision tags: OPENBSD_6_0_BASE
# 1.90 07-Mar-2016 naddy

Sync no-argument function declaration and definition by adding (void).
ok mpi@ millert@


Revision tags: OPENBSD_5_9_BASE
# 1.89 14-Nov-2015 mpi

Remove mrtdebug and reduce differences with the v6 version.

Debug informations can already be accessed via mrtstat and pimstat.


# 1.88 13-Nov-2015 mpi

Do not cast malloc(9) results.


# 1.87 13-Nov-2015 mpi

Kill another tunnel leftover and keep PIM stuff inside #ifdef PIM.


# 1.86 12-Nov-2015 mpi

Kill another leftover from the tunnel support removal and add more PIM.


# 1.85 12-Nov-2015 mpi

Sync headers and get rid of #ifdef MROUTING.


# 1.84 12-Nov-2015 mpi

Remove VIFF_TUNNEL leftovers, tunnels aren't supported since 2006.

Even pimd(8) no longer support them.


# 1.83 12-Nov-2015 mpi

Fix PIM build.


# 1.82 12-Sep-2015 mpi

Introduce if_input_local() a function to feed local traffic back to
the protocol queues.

It basically does what looutput() was doing but having a generic
function will allow us to get rid of the loopback hack overwwritting
the rt_ifp field of RTF_LOCAL routes.

ok mikeb@, dlg@, claudio@


# 1.81 01-Sep-2015 bluhm

Replace sockaddr casts with the proper satosin(), ... calls.
From David Hill; OK mpi@; tested kspillner@; tweaks bluhm@


# 1.80 24-Aug-2015 bluhm

In kernel initialize struct sockaddr_in and sockaddr_in6 to zero
everywhere to avoid passing around pointers to uninitialized stack
memory. While there, fix the call to in6_recoverscope() in
fill_drlist().
OK deraadt@ mpi@


Revision tags: OPENBSD_5_8_BASE
# 1.79 15-Jul-2015 deraadt

rename mbuf ** parameter from m to mp, to match other similar code


# 1.78 30-Jun-2015 mpi

Get rid of the undocumented & temporary* m_copy() macro added for
compatibility with 4.3BSD in September 1989.

*Pick your own definition for "temporary".

ok bluhm@, claudio@, dlg@


Revision tags: OPENBSD_5_7_BASE
# 1.77 09-Feb-2015 claudio

Implement 2 sysctl to retrieve the multicast forwarding cache (mfc) and the
virtual interface table (vif). Will be used by netstat soon.
Looked over by guenther@


# 1.76 08-Feb-2015 claudio

De-static to make ddb hangman harder. OK phessler, henning


# 1.75 07-Feb-2015 dlg

mechanical conversion of this code to using siphash instead of some xors.

ok tedu@ claudio@


# 1.74 17-Dec-2014 mpi

Remove the "multicast_" prefix from the fields a multicast-only struct.

Prodded by claudio@ and mikeb@


# 1.73 17-Dec-2014 mpi

Use an interface index instead of a pointer for multicast options.

Output interface (port) selection for multicast traffic is not done via
route lookups. Instead the output ifp is registred when setsockopt(2)
is called with the IP{V6,}_MULTICAST_IF option. But since there is no
mechanism to invalidate such pointer stored in a pcb when an interface
is destroyed/removed, it might lead your kernel to fault.

Prevent a fault upon resume reported by frantisek holop, thanks!

ok mikeb@, claudio@


# 1.72 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


# 1.71 30-Sep-2014 jsg

add back the sys/sysctl.h include removed in rev 1.60
fixes the kernel build when PIM is defined


# 1.70 14-Aug-2014 mpi

No need for raw_cb.h


# 1.69 14-Aug-2014 mpi

Kill MRT_{ADD,DEL}_BW_UPCALL interfaces and the bandwidth monitoring
code that comes with them.

ok mikeb@, henning@


Revision tags: OPENBSD_5_6_BASE
# 1.68 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.67 12-Jul-2014 tedu

add a size argument to free. will be used soon, but for now default to 0.
after discussions with beck deraadt kettenis.


# 1.66 21-Apr-2014 henning

ip_output() using varargs always struck me as bizarre, esp since it's only
ever used to pass on uint32 (for ipsec). stop that madness and just pass
the uint32, 0 in all cases but the two that pass the ipsec flowinfo.
ok deraadt reyk guenther


# 1.65 21-Apr-2014 henning

we'll do fine without casting NULL to struct foo * / void *
ok gcc & md5 (alas, no binary change)


Revision tags: OPENBSD_5_5_BASE
# 1.64 09-Jan-2014 tedu

bzero/bcmp -> memset/memcmp. ok matthew


# 1.63 27-Oct-2013 deraadt

delete UPCALL_TIMING debug code from a the dark ages


# 1.62 23-Oct-2013 mpi

Remove the number of in_var.h inclusions by moving some functions and
global variables to in.h.

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_4_BASE
# 1.61 02-May-2013 mpi

tedu broken Resource Reservation Protocol code that was ifdef RSVP_ISI.

ok deraadt@, tedu@ (implicit)


# 1.60 28-Mar-2013 tedu

no need for a lot of code to include proc.h


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.59 04-Apr-2011 henning

de-guttenberg our stack a bit
we don't need 7 f***ing copies of the same code to do the protocol checksums
(or not, depending on hw capabilities). claudio ok


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.58 02-Jul-2010 blambert

m_copyback can fail to allocate memory, but is a void fucntion so gymnastics
are required to detect that.

Change the function to take a wait argument (used in nfs server, but
M_NOWAIT everywhere else for now) and to return an error

ok claudio@ henning@ krw@


# 1.57 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


Revision tags: OPENBSD_4_7_BASE
# 1.56 01-Aug-2009 blambert

timeout_add -> timeout_add_msec

ok michele@ claudio@


# 1.55 13-Jul-2009 michele

Get rid of the token bucket filter.
Traffic shaping code should not be inside routing code.
If you want to rate-limit use altq instead.

ok claudio@ henning@ dlg@


# 1.54 09-Jul-2009 michele

Use MAXTTL instead of the hardcoded value.


Revision tags: OPENBSD_4_6_BASE
# 1.53 05-Jun-2009 claudio

Initial support for routing domains. This allows to bind interfaces to
alternate routing table and separate them from other interfaces in distinct
routing tables. The same network can now be used in any doamin at the same
time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks accross
net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


Revision tags: OPENBSD_4_5_BASE
# 1.52 16-Sep-2008 chl

remove another dead store.

spotted by markus@

ok henning@ mpf@


# 1.51 15-Sep-2008 chl

remove dead stores and newly created unused variables.

Found by LLVM/Clang Static Analyzer.

ok mpf@ looks good mk@ ok henning@


Revision tags: OPENBSD_4_3_BASE OPENBSD_4_4_BASE
# 1.50 02-Jan-2008 brad

return with ENOTTY instead of EINVAL for unknown ioctl requests.

ok claudio@ krw@ dlg@


# 1.49 14-Dec-2007 deraadt

add sysctl entry points into various network layers, in particular to
provide netstat(1) with data it needs; ok claudio reyk


Revision tags: OPENBSD_4_2_BASE
# 1.48 22-May-2007 michele

ip_mroute.c is in bad shape.
This first step makes it style(9) compliant.
Just a whitespace diff, no binary change.

OK claudio@ norby@


# 1.47 10-Apr-2007 miod

``it's'' -> ``its'' when the grammar gods require this change.


Revision tags: OPENBSD_4_1_BASE
# 1.46 14-Feb-2007 jsg

Consistently spell FALLTHROUGH to appease lint.
ok kettenis@ cloder@ tom@ henning@


Revision tags: OPENBSD_4_0_BASE
# 1.45 15-Jun-2006 pascoe

Change cast of last vararg to ip_output to match what ip_output expects,
for clarity.

henning@ claudio@ ok


# 1.44 11-May-2006 hshoexer

fix corruption of pim register packets. From Hideki ONO, thanks!

ok mcbride@ itojun@


# 1.43 25-Apr-2006 claudio

Remove virtual tunnel support from the mrouting code. The virtual tunnel
code breaks multicast on gif(4) interfaces and it is far better to configure
a real gif(4) tunnel instead of a multicast tunnel as the latter is almost
not manageable. OK norby@, mblamer@


Revision tags: OPENBSD_3_8_BASE OPENBSD_3_9_BASE
# 1.42 25-Apr-2005 brad

csum -> csum_flags

ok krw@ canacar@


Revision tags: OPENBSD_3_7_BASE
# 1.41 15-Jan-2005 brad

fix comment


# 1.40 14-Jan-2005 mcbride

Duplicate nested if statement in PIM code.

From brad@


# 1.39 14-Jan-2005 mcbride

Add kernel support for Protocol Independent Multicast (PIM)
Information: http://netweb.usc.edu/pim/

From Pavlin Radoslavov <pavlin@icir.org>

ok deraadt@ brad@


# 1.38 24-Nov-2004 mcbride

Multicast routing cleanup from Pavlin Radoslavov
- sync ip_mroute.c with NetBSD
- import some FreeBSD changes to MFC entry handling
- set im->im_vif correctly when sending IGMPMSG_WRONGVIF
- increment mrtstat.mrts_upcalls correctly
- return error from get_sg_cnt() if there is no matching forwarding entry

ok henning@ brad@ naddy@


Revision tags: OPENBSD_3_6_BASE
# 1.37 24-Aug-2004 brad

Don't allow SIOCGET{VIF,SG}CNT from sockets other than the multicast router.

From NetBSD
Fixes PR 3825

ok mcbride@ canacar@ claudio@


Revision tags: OPENBSD_3_5_BASE SMP_SYNC_A SMP_SYNC_B
# 1.36 06-Jan-2004 markus

fix vlan destroy for MROUTING; report spamme@wouz.dk via tedu; ok itojun


# 1.35 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


# 1.34 10-Dec-2003 itojun

de-register. deraadt ok


Revision tags: OPENBSD_3_4_BASE
# 1.33 09-Jul-2003 itojun

do not flip ip_len/ip_off in netinet stack. deraadt ok.
(please test, especially PF portion)


# 1.32 09-Jul-2003 itojun

better vif_delete (no dangling ref to struct ifnet). deraadt ok
it won't affect default GENERIC build - as MROUTING is not defined


# 1.31 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: UBC_SYNC_A
# 1.30 14-May-2003 itojun

KNF. markus ok


# 1.29 06-May-2003 deraadt

string cleaning; tedu ok


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_B
# 1.28 28-Aug-2002 pefo

Fix a problem where passing NULL as a pointer with varargs does not promote
NULL to full 64 bits on a 64 bit address system. Solution is to add a
(void *) cast before NULL. This makes a 64 bit MIPS kernel work and will
probably help future 64 bit ports as well.

OK from art@
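
For illustration only, a minimal sketch of the varargs problem described in
1.28. The names count_ptrs() and demo() are hypothetical and do not come from
ip_mroute.c. A variadic callee that reads a pointer with va_arg() needs the
caller to pass a pointer-sized value; on a system where NULL expands to a
plain integer 0 (an assumption about the affected platform), only an int
would be passed, so the terminator is cast explicitly:

```c
#include <stdarg.h>
#include <stddef.h>

/*
 * Hypothetical variadic helper (not from ip_mroute.c): counts the
 * pointer arguments that precede a null-pointer terminator.
 */
int
count_ptrs(const void *first, ...)
{
	va_list ap;
	const void *p;
	int n = 0;

	va_start(ap, first);
	/* Each va_arg() reads a full pointer-sized argument. */
	for (p = first; p != NULL; p = va_arg(ap, void *))
		n++;
	va_end(ap);
	return n;
}

/* The caller spells the terminator with an explicit pointer cast. */
int
demo(void)
{
	return count_ptrs("a", "b", "c", (void *)NULL);
}
```

With the cast in place the terminator has pointer type regardless of how the
platform defines NULL.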


# 1.27 31-Jul-2002 itojun

remove $Id$


# 1.26 09-Jun-2002 itojun

whitespace


Revision tags: OPENBSD_3_1_BASE
# 1.25 15-Mar-2002 millert

Kill #if __STDC__ used to do K&R vs. ANSI varargs/stdarg; just do things
the ANSI way.


# 1.24 14-Mar-2002 millert

First round of __P removal in sys


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.23 26-Sep-2001 deraadt

branches: 1.23.4;
bring back the old copyright notice


# 1.22 19-Aug-2001 miod

More old timeouts removal, mainly affected unused/unmaintained code.


# 1.21 23-Jun-2001 fgsch

Remove unneeded ip_id conversions.
Instead of using HTONS macro in some places, use htons directly in the
struct member and save us a few bytes.
Fix comment.


Revision tags: OPENBSD_2_9_BASE
# 1.20 10-Nov-2000 provos

seperate -> separate, okay aaron@


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE SMP_BASE
# 1.19 21-Jan-2000 angelos

branches: 1.19.2;
Rename the ip4_* routines to ipip_*, make it so GIF tunnels are not
affected by net.inet.ipip.allow (the sysctl formerly known as
net.inet.ip4.allow), rename the VIF ipip_input to ipip_mroute_input.


Revision tags: OPENBSD_2_6_BASE kame_19991208
# 1.18 08-Aug-1999 niklas

undeclared variable


# 1.17 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.16 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


# 1.15 20-Apr-1999 niklas

Merge MROUTING and IPSEC wrt handling of IP-in-IP tunnelled packets.
Fix a panic case in the MROUTING code too. Drop M_TUNNEL support, nothing
ever uses it.


Revision tags: OPENBSD_2_5_BASE
# 1.14 05-Feb-1999 angelos

Clear mfchashtbl after deallocation (mycroft@netbsd)


# 1.13 08-Jan-1999 provos

dont call ip_randomid() in htons().


# 1.12 08-Jan-1999 deraadt

rip_input() should be called with a 0 terminator; cmetz


# 1.11 26-Dec-1998 provos

make ip_id random but ensure that ids dont repeat for some period.


Revision tags: OPENBSD_2_4_BASE
# 1.10 29-Jul-1998 angelos

Proper handling of IP in IP and checksumming.


# 1.9 03-Jul-1998 deraadt

wrong endian conversion caused vif stats to be wrong; jonny@jonny.eng.br


# 1.8 18-May-1998 provos

first step to the setsockopt/getsockopt interface as described in
draft-mcdonald-simple-ipsec-api, kernel notifies (EMT_REQUESTSA) signal
userland key management applications when security services are requested.
this is only for outgoing connections at the moment, incoming packets
are not yet checked against the selected socket policy.


Revision tags: OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 28-Sep-1997 deraadt

more \n in log()


Revision tags: OPENBSD_2_1_BASE
# 1.6 21-Feb-1997 angelos

Couple of missing ifdefs.


# 1.5 20-Feb-1997 deraadt

IPSEC package by John Ioannidis and Angelos D. Keromytis. Written in
Greece. From ftp.funet.fi:/pub/unix/security/net/ip/BSDipsec.tar.gz


Revision tags: OPENBSD_2_0_BASE
# 1.4 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.3 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


Revision tags: OPENBSD_6_2_BASE
# 1.121 01-Sep-2017 mpi

Change sosetopt() to no longer free the mbuf it receives and change
all the callers to call m_freem(9).

Support from deraadt@ and tedu@, ok visa@, bluhm@


# 1.120 26-Jun-2017 mpi

Assert that the corresponding socket is locked when manipulating socket
buffers.

This is one step towards unlocking TCP input path. Note that all the
functions asserting for the socket lock are not necessarily MP-safe.
All the fields of 'struct socket' aren't protected.

Introduce a new kernel-only kqueue hint, NOTE_SUBMIT, to be able to
tell when a filter needs to lock the underlying data structures. Logic
and name taken from NetBSD.

Tested by Hrvoje Popovski.

ok claudio@, bluhm@, mikeb@


# 1.119 19-Jun-2017 bluhm

The IP multicast forward functions return an errno, call the variable
error. Make the ip_mforward() return value consistent. Simplify
the caller logic in ipv6_input() like in IPv4.
OK mpi@


# 1.118 16-May-2017 rzalamena

Sync three changes that were caught by IPv6 multicast routing review:

* use a variable to allow disabling debugs on run-time
* fix a potential memory leak on copyout() failure
* don't just blindly use the first address provided by ifalist

ok bluhm@


# 1.117 16-May-2017 rzalamena

Make return values more meaningful by using errno instead of -1 or 1.

ok bluhm@


# 1.116 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.115 16-May-2017 rzalamena

Let malloc() block when the caller of the add route function is
setsockopt(), otherwise use non-blocking malloc() for network stack
calls.

ok bluhm@


# 1.114 16-May-2017 rzalamena

Call rtfree() after each use of routes and make sure the route is valid
when finding one. Since rtfree() is being called and rt_llinfo being
removed, add checks everywhere to make sure we are using a route that is
not being removed.

ok bluhm@


# 1.113 06-Apr-2017 dhill

Convert bcopy to memcpy where the memory does not overlap, otherwise,
use memmove. While here, change some previous conversions to a simple
assignment.

ok deraadt@
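
As a rough sketch of the three cases named in 1.113 (the struct and function
names below are hypothetical, not taken from ip_mroute.c): a plain assignment
for whole objects, memcpy() for buffers known not to overlap, and memmove()
when source and destination may overlap. Note that bcopy() takes
(src, dst, len) while memcpy() and memmove() take (dst, src, len), so a
conversion also swaps the first two arguments.

```c
#include <string.h>

/* Hypothetical structure used only for this illustration. */
struct addr_pair {
	unsigned int src;
	unsigned int dst;
};

/* Whole-object copy: a simple assignment replaces bcopy(). */
void
pair_copy(struct addr_pair *to, const struct addr_pair *from)
{
	*to = *from;
}

/* Caller guarantees the buffers do not overlap: memcpy() suffices. */
void
buf_copy(char *dst, const char *src, size_t len)
{
	memcpy(dst, src, len);
}

/* Shifting data inside one buffer overlaps, so memmove() is needed. */
void
buf_shift_left(char *buf, size_t len, size_t n)
{
	if (n < len)
		memmove(buf, buf + n, len - n);
}
```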


Revision tags: OPENBSD_6_1_BASE
# 1.112 17-Mar-2017 rzalamena

Be more strict on all route iterations; let's always make sure that we
are not going to get a unicast route by accident.

ok mpi@


# 1.111 14-Mar-2017 rzalamena

Make mfc_find() more strict when looking for routes, fixes a problem
causing ip_mforward() not to send packets to the userland multicast
routing daemon.

Reported and tested by Paul de Weerd.

ok bluhm@, claudio@


# 1.110 09-Feb-2017 rzalamena

Unbreak 'netstat -g' and make multicast route stats sysctl more robust.

ok mpi@


# 1.109 08-Feb-2017 jsg

Test for NULL before dereferencing a pointer not after.
ok krw@


# 1.108 01-Feb-2017 dhill

In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with
the netlock held. This also changes the prototypes of the *ctloutput
functions to take an mbuf instead of an mbuf pointer.

help, guidance from bluhm@ and mpi@
ok bluhm@


# 1.107 12-Jan-2017 rzalamena

Clean up multicast files from unused definitions and comments.

ok mpi@


# 1.106 11-Jan-2017 rzalamena

Remove mfc hash tables and use the OpenBSD routing table for multicast
routes. Beside the code simplification and removal, we also get to see
the multicast routes now in the route(8) utility.

ok mpi@


# 1.105 06-Jan-2017 rzalamena

Remove the global viftable vector that holds the virtual interfaces
configuration and instead use ifnet to store the configuration and
counters. With this we can safely use multicast routing daemons on
multiple domains without vif id collisions.

ok mpi@


# 1.104 06-Jan-2017 rzalamena

Simplify code by removing some old pullup macro, killing some variables
and using m_dup_pkt() instead of m_copym() with max_linkhdr space adjust
on packet sending to avoid more mbuf allocations.

with input from millert@ and mikeb@,
ok mikeb@


# 1.103 06-Jan-2017 mpi

Kill various splsoftnet().

ok rzalamena@, visa@


# 1.102 05-Jan-2017 rzalamena

Remove some unnecessary code abstractions and while here remove a
splsoftnet.

ok mikeb@


# 1.101 22-Dec-2016 rzalamena

Remove PIM support from the multicast stack.

ok mpi@


# 1.100 21-Dec-2016 mpi

Fix build without PIM defined.


# 1.99 21-Dec-2016 rzalamena

Fix PIM compilation even though it is disabled.

ok bluhm@


# 1.98 20-Dec-2016 rzalamena

Call the multicast timer callback per domain instead of for all domains;
this way we avoid big table walks and iterating over tables that we don't
need to.

ok mpi@


# 1.97 20-Dec-2016 rzalamena

Remove unused timeout that was never being set.

ok reyk@


# 1.96 19-Dec-2016 rzalamena

Kill unused function.

ok mpi@


# 1.95 19-Dec-2016 rzalamena

Extend the multicast sockets and multicast hash table support to multiple
domains. This is one step towards supporting more than one multicast
socket running in different domains at the same time.

ok mpi@


# 1.94 13-Dec-2016 rzalamena

Propagate the routing table id in ip_mrouter_set() so the MRT_ADD_VIF
calls won't fail anymore when done from a different rdomain.

ok mpi@


# 1.93 29-Nov-2016 mpi

Kill unused 'struct route'.


# 1.92 29-Nov-2016 jsg

m_free() and m_freem() test for NULL. Simplify callers which had their own
NULL tests.

ok mpi@


# 1.91 24-Sep-2016 tedu

use hashfree. from Mathieu -
ok guenther


Revision tags: OPENBSD_6_0_BASE
# 1.90 07-Mar-2016 naddy

Sync no-argument function declaration and definition by adding (void).
ok mpi@ millert@


Revision tags: OPENBSD_5_9_BASE
# 1.89 14-Nov-2015 mpi

Remove mrtdebug and reduce differences with the v6 version.

Debug information can already be accessed via mrtstat and pimstat.


# 1.88 13-Nov-2015 mpi

Do not cast malloc(9) results.


# 1.87 13-Nov-2015 mpi

Kill another tunnel leftover and keep PIM stuff inside #ifdef PIM.


# 1.86 12-Nov-2015 mpi

Kill another leftover from the tunnel support removal and add more PIM.


# 1.85 12-Nov-2015 mpi

Sync headers and get rid of #ifdef MROUTING.


# 1.84 12-Nov-2015 mpi

Remove VIFF_TUNNEL leftovers, tunnels aren't supported since 2006.

Even pimd(8) no longer supports them.


# 1.83 12-Nov-2015 mpi

Fix PIM build.


# 1.82 12-Sep-2015 mpi

Introduce if_input_local() a function to feed local traffic back to
the protocol queues.

It basically does what looutput() was doing but having a generic
function will allow us to get rid of the loopback hack overwriting
the rt_ifp field of RTF_LOCAL routes.

ok mikeb@, dlg@, claudio@


# 1.81 01-Sep-2015 bluhm

Replace sockaddr casts with the proper satosin(), ... calls.
From David Hill; OK mpi@; tested kspillner@; tweaks bluhm@


# 1.80 24-Aug-2015 bluhm

In kernel initialize struct sockaddr_in and sockaddr_in6 to zero
everywhere to avoid passing around pointers to uninitialized stack
memory. While there, fix the call to in6_recoverscope() in
fill_drlist().
OK deraadt@ mpi@


Revision tags: OPENBSD_5_8_BASE
# 1.79 15-Jul-2015 deraadt

rename mbuf ** parameter from m to mp, to match other similar code


# 1.78 30-Jun-2015 mpi

Get rid of the undocumented & temporary* m_copy() macro added for
compatibility with 4.3BSD in September 1989.

*Pick your own definition for "temporary".

ok bluhm@, claudio@, dlg@


Revision tags: OPENBSD_5_7_BASE
# 1.77 09-Feb-2015 claudio

Implement 2 sysctl to retrieve the multicast forwarding cache (mfc) and the
virtual interface table (vif). Will be used by netstat soon.
Looked over by guenther@


# 1.76 08-Feb-2015 claudio

De-static to make ddb hangman harder. OK phessler, henning


# 1.75 07-Feb-2015 dlg

mechanical conversion of this code to using siphash instead of some xors.

ok tedu@ claudio@


# 1.74 17-Dec-2014 mpi

Remove the "multicast_" prefix from the fields of a multicast-only struct.

Prodded by claudio@ and mikeb@


# 1.73 17-Dec-2014 mpi

Use an interface index instead of a pointer for multicast options.

Output interface (port) selection for multicast traffic is not done via
route lookups. Instead the output ifp is registered when setsockopt(2)
is called with the IP{V6,}_MULTICAST_IF option. But since there is no
mechanism to invalidate such pointer stored in a pcb when an interface
is destroyed/removed, it might lead your kernel to fault.

Prevent a fault upon resume reported by frantisek holop, thanks!

ok mikeb@, claudio@


# 1.72 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


# 1.71 30-Sep-2014 jsg

add back the sys/sysctl.h include removed in rev 1.60
fixes the kernel build when PIM is defined


# 1.70 14-Aug-2014 mpi

No need for raw_cb.h


# 1.69 14-Aug-2014 mpi

Kill MRT_{ADD,DEL}_BW_UPCALL interfaces and the bandwidth monitoring
code that comes with them.

ok mikeb@, henning@


Revision tags: OPENBSD_5_6_BASE
# 1.68 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.67 12-Jul-2014 tedu

add a size argument to free. will be used soon, but for now default to 0.
after discussions with beck deraadt kettenis.


# 1.66 21-Apr-2014 henning

ip_output() using varargs always struck me as bizarre, esp since it's only
ever used to pass on uint32 (for ipsec). stop that madness and just pass
the uint32, 0 in all cases but the two that pass the ipsec flowinfo.
ok deraadt reyk guenther


# 1.65 21-Apr-2014 henning

we'll do fine without casting NULL to struct foo * / void *
ok gcc & md5 (alas, no binary change)


Revision tags: OPENBSD_5_5_BASE
# 1.64 09-Jan-2014 tedu

bzero/bcmp -> memset/memcmp. ok matthew


# 1.63 27-Oct-2013 deraadt

delete UPCALL_TIMING debug code from the dark ages


# 1.62 23-Oct-2013 mpi

Reduce the number of in_var.h inclusions by moving some functions and
global variables to in.h.

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_4_BASE
# 1.61 02-May-2013 mpi

tedu broken Resource Reservation Protocol code that was ifdef RSVP_ISI.

ok deraadt@, tedu@ (implicit)


# 1.60 28-Mar-2013 tedu

no need for a lot of code to include proc.h


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.59 04-Apr-2011 henning

de-guttenberg our stack a bit
we don't need 7 f***ing copies of the same code to do the protocol checksums
(or not, depending on hw capabilities). claudio ok


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.58 02-Jul-2010 blambert

m_copyback can fail to allocate memory, but is a void function so gymnastics
are required to detect that.

Change the function to take a wait argument (used in nfs server, but
M_NOWAIT everywhere else for now) and to return an error

ok claudio@ henning@ krw@


# 1.57 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


Revision tags: OPENBSD_4_7_BASE
# 1.56 01-Aug-2009 blambert

timeout_add -> timeout_add_msec

ok michele@ claudio@


# 1.55 13-Jul-2009 michele

Get rid of the token bucket filter.
Traffic shaping code should not be inside routing code.
If you want to rate-limit use altq instead.

ok claudio@ henning@ dlg@

